Compare Models

Select two models to compare their benchmarks, capabilities, pricing, and performance metrics.

[Figure: Capability Radar]

Metric                    Gemini 1.5 Pro        GPT-4o
Developer                 Google DeepMind       OpenAI
Category                  Flagship Multimodal   Flagship Multimodal
Overall Score             74.4                  79.5
Arena ELO                 1266                  1285
MMLU                      85.9%                 88.7%
HumanEval                 84.1%                 90.2%
MATH                      67.7%                 76.6%
GPQA                      46.2%                 53.6%
MT-Bench                  8.9/10                9.0/10
Context Window            1.0M tokens           128K tokens
Avg Response              920 ms                850 ms
Input Cost / 1M tokens    $3.50                 $5.00
Output Cost / 1M tokens   $10.50                $15.00
Open source               No                    No
Multimodal                Yes                   Yes
Overall Winner: GPT-4o 🏆
GPT-4o leads by 5.1 points on Overall Score (79.5 vs 74.4).
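
The margin and the per-1M-token prices listed on this page lend themselves to a quick worked calculation. The Python sketch below is illustrative only: the `ModelCard` class, its `request_cost` method, and the 10,000-input / 1,000-output token workload are hypothetical names and numbers chosen for this example, not part of the comparison tool. The scores and prices are copied from the figures above; how the page derives its Overall Score is not stated, so the sketch only subtracts the two published values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCard:
    """Hypothetical container for the figures shown on this page."""
    name: str
    overall_score: float
    input_cost_per_1m: float   # USD per 1M input tokens
    output_cost_per_1m: float  # USD per 1M output tokens

    def request_cost(self, input_tokens: int, output_tokens: int) -> float:
        """Estimated USD cost of one request at the listed prices."""
        total = (input_tokens * self.input_cost_per_1m
                 + output_tokens * self.output_cost_per_1m)
        return total / 1_000_000


# Values copied from the comparison above.
gemini = ModelCard("Gemini 1.5 Pro", 74.4, 3.5, 10.5)
gpt4o = ModelCard("GPT-4o", 79.5, 5.0, 15.0)

# Score margin as reported under "Overall Winner".
margin = round(gpt4o.overall_score - gemini.overall_score, 1)

# Assumed workload for illustration: 10,000 input and 1,000 output tokens.
input_tokens, output_tokens = 10_000, 1_000
gemini_cost = gemini.request_cost(input_tokens, output_tokens)
gpt4o_cost = gpt4o.request_cost(input_tokens, output_tokens)

print(f"Score margin: {margin} points")
for card, cost in ((gemini, gemini_cost), (gpt4o, gpt4o_cost)):
    print(f"{card.name}: ${cost:.4f} per request")
```

For this assumed workload the estimate comes to roughly $0.0455 per request for Gemini 1.5 Pro and $0.0650 for GPT-4o, so the 5.1-point lead comes with a higher price at these list rates. Different input/output token mixes will shift the absolute costs but not the ordering, since GPT-4o is priced higher on both input and output.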