
Llama 4 Scout

by Meta
Efficient · Open source · Multimodal · Ranked #18 of 22
74.1
Overall Score
out of 100
About

Meta's efficient Llama 4 model, optimised for speed and cost. Despite being the lighter of the two Llama 4 releases, Scout achieves strong benchmark results and features an extraordinary 10 million token context window, the largest of any model at the time of its release.

Key Metrics
Context Window
10.0M
tokens
Avg Response
680
milliseconds
Input Cost
$0.08
per million tokens
Output Cost
$0.30
per million tokens
Arena ELO
1248
Chatbot Arena rating
MT-Bench
8.6
out of 10
Benchmark Scores
MMLU
87.1%
HumanEval
86.5%
MATH
67.4%
GPQA
47.1%
MT-Bench
8.6/10
Capability Profile
Strengths & Limitations
Strengths
✓ 10M context window
✓ Open source
✓ Fast and efficient
✓ MoE architecture
✓ Low API cost
Limitations
⚠ Less capable than Maverick on hard tasks
⚠ Very long contexts increase latency
⚠ Requires optimised hardware for full context
Ideal Use Cases
Massive document processing
Codebase-wide analysis
Research indexing
Efficient at-scale deployments
Long-horizon tasks
Model Details
Provider Meta
Released 2025-04-05
Open source Yes
Multimodal Yes
Tier Efficient
Global rank #18 / 22
Pricing (USD)
Input tokens $0.08/M
Output tokens $0.30/M
Per 1,000 tokens ≈ $0.00008 input / $0.0003 output
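The per-1,000-token figures follow directly from the per-million rates in the table above. A minimal sketch of the arithmetic (function and constant names are illustrative, not part of any official API):

```python
# USD per million tokens, taken from the pricing table above.
INPUT_RATE_PER_M = 0.08
OUTPUT_RATE_PER_M = 0.30

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost for one request at the listed rates."""
    return (input_tokens * INPUT_RATE_PER_M
            + output_tokens * OUTPUT_RATE_PER_M) / 1_000_000

# 1,000 tokens in and 1,000 tokens out:
# (1000 * 0.08 + 1000 * 0.30) / 1e6 = 0.00038 USD
print(estimate_cost(1000, 1000))
```

At these rates, a workload of one million input tokens and one million output tokens would cost roughly $0.38.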
All Benchmarks
MMLU 87.1%
HumanEval 86.5%
MATH 67.4%
GPQA 47.1%
MT-Bench 8.6/10
Arena ELO 1248