Llama 3.3 70B

by Meta
Flagship · Free & Open Source · 🏆 Ranked #36 of 85
Overall Score 77.0 out of 100
About

Meta's 70B model, released in December 2024, which Meta describes as delivering quality comparable to Llama 3.1 405B at a fraction of the compute cost. Llama 3.3 70B is a go-to open-source model for users whose hardware can hold 70B weights, which in practice means a high-VRAM GPU.
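To give a rough sense of what holding 70B weights requires, the sketch below estimates the memory for the weights alone at a few precisions. It assumes exactly 70 billion parameters (the rounded figure in the model name) and an illustrative set of precisions; it ignores the KV cache, activations, and runtime overhead, so real requirements are higher.

```python
# Assumption: exactly 70 billion parameters (the rounded "70B" in the name).
PARAMS = 70_000_000_000

# Bytes per parameter for some common weight precisions (illustrative choices).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: int, bytes_per_param: float) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


for precision, nbytes in BYTES_PER_PARAM.items():
    # Weights only: KV cache and activations are not included.
    print(f"{precision}: {weight_memory_gb(PARAMS, nbytes):.0f} GB")
```

Under these assumptions the weights take about 140 GB at 16-bit precision and about 35 GB at 4-bit.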

Key Metrics
Context Window 128K tokens
Avg Response 1100 milliseconds
Input Cost $0.59 per million tokens
Output Cost $0.99 per million tokens
Arena ELO 1256 (Chatbot Arena rating)
MT-Bench 9.0 out of 10
Benchmark Scores
MMLU 86.0%
HumanEval 88.0%
MATH 77.0%
GPQA 50.5%
MT-Bench 9.0/10
Strengths & Limitations
Strengths
✓ Strong reasoning
✓ Efficient for size
✓ Open source
✓ Fine-tuneable
✓ Broad capability
Limitations
⚠ Requires high-VRAM GPU
⚠ Slower than smaller models
⚠ No official API SLA
Ideal Use Cases
On-premise deployment
Research
Privacy-sensitive tasks
Fine-tuning
Customer support
Model Details
Provider Meta
Released 2024-12-06
Type Free & Open Source
Multimodal No
Tier Flagship
Global rank #36 / 85
Pricing (USD)
Input tokens $0.59/M
Output tokens $0.99/M
Per 1,000 tokens ≈ $0.0006 input / $0.0010 output
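The per-token rates above translate into a per-request cost with simple arithmetic. The following is a minimal sketch using the prices listed on this page; the function name `estimate_cost` and the example token counts are hypothetical, chosen only for illustration.

```python
from decimal import Decimal

# Prices listed on this page, in USD per one million tokens.
INPUT_PRICE_PER_M = Decimal("0.59")
OUTPUT_PRICE_PER_M = Decimal("0.99")


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request at the listed rates."""
    million = Decimal(1_000_000)
    total = INPUT_PRICE_PER_M * input_tokens + OUTPUT_PRICE_PER_M * output_tokens
    return total / million


# Hypothetical request: 2,000 prompt tokens and 500 completion tokens.
cost = estimate_cost(2_000, 500)
print(f"${cost:.6f}")
```

`Decimal` is used instead of floats so that small per-token amounts do not pick up binary rounding noise when summed over many requests.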
All Benchmarks
MMLU 86.0%
HumanEval 88.0%
MATH 77.0%
GPQA 50.5%
MT-Bench 9.0/10
Arena ELO 1256