Compare Models

Select two models to compare their benchmarks, capabilities, pricing, and performance metrics.

Metric                      Gemini 1.5 Pro          Llama 4 Scout
Developer                   Google DeepMind         Meta
Tags                        Flagship, Multimodal    Efficient, Open source, Multimodal
Overall Score               74.4                    74.1
Human Votes (Arena ELO)     1266                    1248
General Knowledge (MMLU)    85.9%                   87.1%
Coding (HumanEval)          84.1%                   86.5%
Maths (MATH)                67.7%                   67.4%
Science (GPQA)              46.2%                   47.1%
Conversation (MT-Bench)     8.9/10                  8.6/10
Context Window              1.0M                    10.0M
Avg Response                920ms                   680ms
Input Cost / 1M             $3.5                    $0.08
Output Cost / 1M            $10.5                   $0.3
Free & Open Source          Paid API                ✓ Free
Multimodal                  ✓ Yes                   ✓ Yes
Highest Overall Score
Gemini 1.5 Pro 🏆
Scores 74.4 vs 74.1, a lead of 0.3 points out of 100
💡 Llama 4 Scout is free and open source, so it is worth considering if cost matters.
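
To get a feel for how the listed per-1M prices translate into a bill, here is a minimal sketch. It assumes "Cost / 1M" means US dollars per one million tokens; the function name `estimate_cost` and the workload of 2M input and 0.5M output tokens are hypothetical and chosen only for illustration.

```python
# Prices copied from the comparison above.
# Assumption: they are USD per 1M tokens.
PRICES = {
    "Gemini 1.5 Pro": {"input": 3.5, "output": 10.5},
    "Llama 4 Scout": {"input": 0.08, "output": 0.3},
}


def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated cost in USD for the given token counts."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens, 0.5M output tokens.
    for model in PRICES:
        cost = estimate_cost(model, 2_000_000, 500_000)
        print(f"{model}: ${cost:.2f}")
```

For this hypothetical workload the gap in cost is far larger than the 0.3-point gap in overall score, which is the trade-off the note above points at.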