Llama 3.3 70B

by Meta

Flagship, Free & Open Source, Ranked #36 of 85

77.0

Overall Score

out of 100

About

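Llama 3.3 70B is Meta's 70B-parameter model, released in December 2024 and positioned as matching Llama 3.1 405B quality at a fraction of the compute cost. It is a common choice among open models for users whose hardware can hold 70B weights, which in practice usually means a quantized build and more memory than most single consumer GPUs provide.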
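To make "capable of running 70B weights" concrete, here is a rough back-of-the-envelope sketch of weight storage at a few common precisions. It assumes a nominal 70 billion parameters (taken from the model name, not an exact count) and covers the weights only; the KV cache, activations, and runtime overhead add to these figures.

```python
# Rough weight-memory estimate. Weights only: KV cache, activations,
# and framework overhead are NOT included.
PARAMS = 70e9  # assumption: nominal count implied by the "70B" name

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Return approximate weight storage in decimal gigabytes."""
    return params * bytes_per_param / 1e9


for name, width in BYTES_PER_PARAM.items():
    print(f"{name}: ~{weight_memory_gb(PARAMS, width):.0f} GB")
# fp16: ~140 GB
# int8: ~70 GB
# int4: ~35 GB
```

Even at 4-bit precision the weights alone come to roughly 35 GB, so treat any single-GPU deployment as dependent on the quantization level and the memory actually available.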

Key Metrics

Context Window

128K

tokens

Avg Response

1100

milliseconds

Input Cost

$0.59

per million tokens

Output Cost

$0.99

per million tokens

Arena ELO

1256

Chatbot Arena rating

MT-Bench

9.0

out of 10

Benchmark Scores

MMLU

86.0%

HumanEval

88.0%

MATH

77.0%

GPQA

50.5%

MT-Bench

9.0/10

Model Details

Provider Meta

Released 2024-12-06

Type Free & Open Source

Multimodal No

Tier Flagship

Global rank #36 / 85

Pricing (USD)

Input tokens $0.59/M

Output tokens $0.99/M

Per 1,000 tokens ≈ $0.0006 input / $0.0010 output
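As a quick check of the arithmetic, the sketch below converts the listed per-million rates into a per-request cost. The function name and the example token counts are hypothetical, chosen only for illustration; the two rates are the ones quoted on this page.

```python
# Listed rates from this page, in USD per million tokens.
INPUT_USD_PER_M = 0.59
OUTPUT_USD_PER_M = 0.99


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Per-1,000-token rates, before rounding to four decimals.
print(f"{request_cost_usd(1_000, 0):.5f}")  # 0.00059
print(f"{request_cost_usd(0, 1_000):.5f}")  # 0.00099

# Hypothetical request: 2,000 input tokens and 500 output tokens.
print(f"{request_cost_usd(2_000, 500):.6f}")  # 0.001675
```

The unrounded per-1,000 figures are $0.00059 and $0.00099, which round to the approximate values shown above.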

All Benchmarks

MMLU 86.0%

HumanEval 88.0%

MATH 77.0%

GPQA 50.5%

MT-Bench 9.0/10

Arena ELO 1256

Similar Models

You might also consider

OpenAI's most powerful reasoning model, using extended chain-of-thought to tackle the hardest problems in mathematics, science, and coding. o3 sets new standards on GPQA and competitive maths at the cost of higher latency and price.

Anthropic's most powerful and intelligent model, built for the most demanding tasks where quality outweighs cost. Claude Opus 4 leads on complex multi-step reasoning, graduate-level science, and nuanced long-form writing.

xAI's most capable model, trained on a 100,000-GPU cluster and setting new benchmarks in mathematics and scientific reasoning. Grok 3 integrates real-time data from the X platform and leads the Arena ELO leaderboard among commercial models.