Model Leaderboard

Click any column heading to re-sort the table. Green marks the best value in each metric; lower cost and faster response count as advantages.

[Chart: Model Releases by Provider & Month]
| # | Model | Provider | Score | ELO | MMLU | Code | Maths | Ctx | Latency | TPS | Cost/M | Type |
|---|-------|----------|-------|-----|------|------|-------|-----|---------|-----|--------|------|
| 1 | o3 | OpenAI | 93.7 | 1391 | 91.6% | 96.4% | 97.8% | 200K | 4200ms | 15 t/s | $10.0 | Paid Vision |
| 2 | o4-mini | OpenAI | 91.5 | 1370 | 90.8% | 95.8% | 95.9% | 200K | 2200ms | 80 t/s | $1.1 | Paid Vision |
| 3 | Claude Opus 4 | Anthropic | 90.7 | 1395 | 93.2% | 95.6% | 86.0% | 200K | 1100ms | 30 t/s | $15.0 | Paid Vision |
| 4 | Grok 3 | xAI | 90.6 | 1402 | 93.3% | 91.8% | 93.3% | 131K | 900ms | 65 t/s | $3.0 | Paid Vision |
| 5 | o1 | OpenAI | 89.8 | 1350 | 92.3% | 92.4% | 94.8% | 128K | 4200ms | 28 t/s | $15.0 | Paid |
| 6 | Claude Opus 4.6 | Anthropic | 89.3 | 1360 | 92.0% | 93.0% | 90.0% | 1.0M | 1100ms | 42 t/s | $18.0 | Paid Vision |
| 7 | Gemini 3 Pro | Google DeepMind | 88.8 | 1355 | 92.0% | 91.0% | 90.0% | 1.0M | 820ms | 62 t/s | $7.0 | Paid Vision |
| 8 | Claude 3.7 Sonnet | Anthropic | 88.6 | 1355 | 90.7% | 93.0% | 96.2% | 200K | 850ms | 80 t/s | $3.0 | Paid Vision |
| 9 | Claude Sonnet 4.6 | Anthropic | 88.3 | 1374 | 92.1% | 94.8% | 83.7% | 200K | 790ms | 85 t/s | $3.0 | Paid Vision |
| 10 | DeepSeek R1 | DeepSeek | 88.2 | 1358 | 90.8% | 92.1% | 97.3% | 64K | 2800ms | 20 t/s | Free | Free |
| 11 | Claude Opus 4.5 | Anthropic | 87.8 | 1345 | 91.5% | 91.5% | 89.0% | 200K | 950ms | 46 t/s | $15.0 | Paid Vision |
| 12 | Claude Sonnet 4.5 | Anthropic | 86.9 | 1362 | 91.7% | 94.2% | 81.5% | 200K | 800ms | 85 t/s | $3.0 | Paid Vision |
| 13 | Claude Opus 4.1 | Anthropic | 86.9 | 1340 | 91.0% | 91.0% | 88.0% | 200K | 900ms | 48 t/s | $15.0 | Paid Vision |
| 14 | Gemini 2.5 Pro | Google DeepMind | 85.9 | 1380 | 90.0% | 87.9% | 91.2% | 1.0M | 1050ms | 120 t/s | $1.25 | Paid Vision |
| 15 | Claude Sonnet 4 | Anthropic | 85.4 | 1345 | 91.0% | 93.5% | 79.2% | 200K | 820ms | 90 t/s | $3.0 | Paid Vision |
| 16 | GPT-4.1 | OpenAI | 85.2 | 1340 | 90.2% | 97.1% | 86.5% | 1.0M | 880ms | 75 t/s | $2.0 | Paid Vision |
| 17 | Grok 3 Mini | xAI | 85.0 | 1340 | 87.5% | 88.0% | 89.4% | 131K | 560ms | 150 t/s | $0.3 | Paid |
| 18 | Qwen3 72B | Alibaba | 84.9 | 1320 | 88.0% | 92.5% | 90.0% | 128K | 900ms | 50 t/s | Free | Free |
| 19 | DeepSeek V3.1 671B | DeepSeek | 83.6 | 1310 | 89.0% | 91.0% | 87.0% | 128K | 1100ms | 20 t/s | Free | Free |
| 20 | Qwen3.5 122B | Alibaba | 82.2 | 1280 | 87.0% | 94.0% | 89.0% | 128K | 1400ms | 22 t/s | Free | Free |
| 21 | Mistral Large 3 675B | Mistral AI | 82.1 | 1295 | 88.0% | 90.0% | 85.0% | 128K | 1200ms | 18 t/s | Free | Free |
| 22 | Gemini 2.5 Flash | Google DeepMind | 81.7 | 1300 | 86.0% | 89.0% | 84.0% | 1.0M | 540ms | 190 t/s | $0.15 | Paid Vision |
| 23 | DeepSeek V3 | DeepSeek | 81.0 | 1302 | 88.5% | 89.1% | 87.2% | 128K | 680ms | 60 t/s | Free | Free |
| 24 | GPT-OSS 120B | OpenAI | 80.9 | 1285 | 87.0% | 90.0% | 84.0% | 128K | 1300ms | 25 t/s | Free | Free |
| 25 | Claude 3.5 Sonnet | Anthropic | 80.1 | 1289 | 88.7% | 92.0% | 71.1% | 200K | 780ms | 75 t/s | $3.0 | Paid Vision |
| 26 | GPT-4o | OpenAI | 79.5 | 1285 | 88.7% | 90.2% | 76.6% | 128K | 850ms | 55 t/s | $5.0 | Paid Vision |
| 27 | Llama 4 Maverick | Meta | 78.9 | 1285 | 88.7% | 89.8% | 74.9% | 1.0M | 1150ms | 25 t/s | Free | Free Vision |
| 28 | DeepSeek R1 Distill 70B | DeepSeek | 78.5 | 1250 | 84.0% | 86.0% | 90.0% | 128K | 1800ms | 40 t/s | Free | Free |
| 29 | Qwen3.5 35B | Alibaba | 77.9 | 1245 | 83.0% | 92.0% | 86.0% | 128K | 850ms | 55 t/s | Free | Free |
| 30 | Llama 3.1 405B | Meta | 77.6 | 1266 | 88.6% | 89.0% | 73.8% | 128K | 1200ms | 15 t/s | Free | Free |
| 31 | Seed 1.6 | ByteDance | 77.5 | 1270 | 85.0% | 84.0% | 80.0% | 262K | 720ms | 95 t/s | $0.9 | Paid Vision |
| 32 | Devstral 2 123B | Mistral AI | 77.5 | 1255 | 82.0% | 94.0% | 79.0% | 128K | 1500ms | 22 t/s | Free | Free |
| 33 | Qwen 2.5 72B | Alibaba | 77.4 | 1259 | 86.0% | 86.6% | 83.1% | 128K | 750ms | 60 t/s | Free | Free |
| 34 | Phi-4 | Microsoft | 77.4 | 1280 | 84.8% | 82.6% | 80.4% | 16K | 480ms | 90 t/s | Free | Free |
| 35 | Grok-2 | xAI | 77.3 | 1248 | 87.5% | 88.4% | 76.1% | 131K | 890ms | 80 t/s | $2.0 | Paid Vision |
| 36 | Llama 3.3 70B | Meta | 77.0 | 1256 | 86.0% | 88.0% | 77.0% | 128K | 1100ms | 45 t/s | Free | Free |
| 37 | Grok 4.1 Fast | xAI | 76.9 | 1270 | 84.0% | 84.0% | 78.0% | 2.0M | 500ms | 140 t/s | $3.0 | Paid Vision |
| 38 | Qwen3 14B | Alibaba | 76.3 | 1230 | 82.0% | 87.0% | 88.0% | 128K | 680ms | 80 t/s | Free | Free |
| 39 | DeepSeek V2.5 236B | DeepSeek | 76.2 | 1268 | 80.4% | 89.0% | 75.7% | 128K | 1300ms | 25 t/s | Free | Free |
| 40 | Gemini 2.0 Flash | Google DeepMind | 75.7 | 1252 | 85.0% | 87.4% | 73.0% | 1.0M | 520ms | 250 t/s | $0.1 | Paid Vision |
| 41 | Gemma 3 27B | Google DeepMind | 75.6 | 1290 | 87.5% | 77.2% | 72.0% | 128K | 980ms | 45 t/s | Free | Free Vision |
| 42 | Qwen3 Coder 30B | Alibaba | 75.2 | 1240 | 78.0% | 93.0% | 82.0% | 128K | 800ms | 60 t/s | Free | Free |
| 43 | GPT-4.1 mini | OpenAI | 74.9 | 1230 | 83.5% | 90.0% | 79.8% | 1.0M | 430ms | 180 t/s | $0.4 | Paid Vision |
| 44 | Qwen3 VL 32B | Alibaba | 74.6 | 1225 | 81.0% | 84.0% | 85.0% | 128K | 820ms | 55 t/s | Free | Free Vision |
| 45 | Gemini 1.5 Pro | Google DeepMind | 74.4 | 1266 | 85.9% | 84.1% | 67.7% | 1.0M | 920ms | 70 t/s | $3.5 | Paid Vision |
| 46 | Llama 4 Scout | Meta | 74.1 | 1248 | 87.1% | 86.5% | 67.4% | 10.0M | 680ms | 80 t/s | Free | Free Vision |
| 47 | Mistral Large 2 | Mistral AI | 73.9 | 1232 | 84.0% | 92.0% | 69.3% | 128K | 650ms | 65 t/s | Free | Free |
| 48 | Cogito 70B | DeepCogito | 72.9 | 1230 | 82.0% | 82.0% | 75.0% | 128K | 1100ms | 42 t/s | Free | Free |
| 49 | Qwen 2.5 14B | Alibaba | 72.6 | 1210 | 79.5% | 86.0% | 83.0% | 128K | 620ms | 85 t/s | Free | Free |
| 50 | Llama 3.2 Vision 90B | Meta | 71.6 | 1228 | 83.0% | 81.0% | 69.0% | 128K | 1100ms | 38 t/s | Free | Free Vision |
| 51 | Llama 3.1 70B | Meta | 71.2 | 1220 | 83.6% | 80.5% | 66.4% | 128K | 1200ms | 48 t/s | Free | Free |
| 52 | GPT-4o mini | OpenAI | 70.0 | 1179 | 82.0% | 87.2% | 70.2% | 128K | 420ms | 130 t/s | $0.15 | Paid Vision |
| 53 | Claude Haiku 4.5 | Anthropic | 69.5 | 1230 | 78.0% | 78.0% | 68.0% | 200K | 380ms | 160 t/s | $0.8 | Paid Vision |
| 54 | Nemotron 3 Nano 30B | NVIDIA | 69.0 | 1210 | 78.0% | 80.0% | 72.0% | 128K | 680ms | 65 t/s | Free | Free |
| 55 | Qwen 2.5 7B | Alibaba | 68.7 | 1185 | 74.2% | 84.5% | 80.0% | 128K | 380ms | 140 t/s | Free | Free |
| 56 | Devstral Small 2 24B | Mistral AI | 67.9 | 1195 | 74.0% | 88.0% | 68.0% | 128K | 600ms | 75 t/s | Free | Free |
| 57 | Gemma 3 12B | Google DeepMind | 67.6 | 1200 | 78.0% | 76.0% | 72.0% | 128K | 560ms | 85 t/s | Free | Free Vision |
| 58 | Minimax M2.1 | MiniMax | 67.3 | 1215 | 78.0% | 74.0% | 70.0% | 256K | 450ms | 130 t/s | $0.2 | Paid |
| 59 | Mistral Small 3.1 | Mistral AI | 66.4 | 1198 | 81.0% | 74.5% | 67.0% | 128K | 410ms | 140 t/s | $0.1 | Paid Vision |
| 60 | Gemini 1.5 Flash | Google DeepMind | 66.0 | 1226 | 78.9% | 71.5% | 58.5% | 1.0M | 480ms | 210 t/s | $0.075 | Paid Vision |
| 61 | GLM 4.7 | Zhipu AI | 66.0 | 1200 | 75.0% | 76.0% | 70.0% | 128K | 580ms | 95 t/s | Free | Free |
| 62 | GPT-4.1 Nano | OpenAI | 65.8 | 1210 | 75.0% | 76.0% | 65.0% | 1.0M | 320ms | 250 t/s | $0.1 | Paid Vision |
| 63 | Qwen3 VL 8B | Alibaba | 64.8 | 1175 | 72.0% | 78.0% | 72.0% | 32K | 500ms | 100 t/s | Free | Free Vision |
| 64 | Ministral 3 14B | Mistral AI | 64.0 | 1185 | 73.0% | 78.0% | 62.0% | 128K | 550ms | 95 t/s | Free | Free |
| 65 | OLMo 3 32B | AllenAI | 64.0 | 1195 | 75.0% | 74.0% | 62.0% | 4K | 750ms | 60 t/s | Free | Free |
| 66 | Claude 3 Haiku | Anthropic | 63.2 | 1168 | 75.2% | 75.9% | 60.4% | 200K | 380ms | 140 t/s | $0.25 | Paid Vision |
| 67 | Mixtral 8x7B | Mistral AI | 63.0 | 1191 | 70.6% | 75.1% | 58.0% | 32K | 700ms | 55 t/s | Free | Free |
| 68 | Phi-3.5 Mini | Microsoft | 61.9 | 1150 | 69.0% | 78.0% | 69.0% | 128K | 200ms | 250 t/s | Free | Free |
| 69 | Gemma 2 9B | Google DeepMind | 61.7 | 1190 | 71.3% | 71.0% | 58.0% | 8K | 450ms | 110 t/s | Free | Free |
| 70 | Command R+ | Cohere | 61.6 | 1155 | 75.7% | 69.6% | 56.7% | 128K | 720ms | 55 t/s | $2.5 | Paid |
| 71 | Mathstral 7B | Mistral AI | 61.4 | 1165 | 64.0% | 60.0% | 86.0% | 32K | 400ms | 140 t/s | Free | Free |
| 72 | Llama 3.2 Vision 11B | Meta | 61.2 | 1175 | 73.0% | 72.0% | 58.0% | 128K | 580ms | 90 t/s | Free | Free Vision |
| 73 | Mistral Nemo 12B | Mistral AI | 60.9 | 1180 | 68.0% | 75.0% | 55.0% | 128K | 580ms | 95 t/s | Free | Free |
| 74 | Granite Code 34B | IBM | 60.6 | 1180 | 60.0% | 86.0% | 56.0% | 8K | 880ms | 55 t/s | Free | Free |
| 75 | Llama 3.1 8B | Meta | 60.5 | 1170 | 73.0% | 72.6% | 51.9% | 128K | 400ms | 120 t/s | Free | Free |
| 76 | Ministral 3 8B | Mistral AI | 58.2 | 1155 | 67.0% | 74.0% | 52.0% | 128K | 380ms | 160 t/s | Free | Free |
| 77 | Gemma 3 4B | Google DeepMind | 58.1 | 1160 | 68.0% | 64.0% | 62.0% | 128K | 220ms | 220 t/s | Free | Free Vision |
| 78 | CodeGemma 7B | Google DeepMind | 55.4 | 1145 | 54.0% | 82.0% | 50.0% | 8K | 380ms | 140 t/s | Free | Free |
| 79 | Mistral 7B | Mistral AI | 54.8 | 1141 | 64.2% | 73.0% | 40.5% | 32K | 320ms | 150 t/s | Free | Free |
| 80 | OLMo 3 7B | AllenAI | 53.4 | 1140 | 65.0% | 64.0% | 45.0% | 4K | 400ms | 130 t/s | Free | Free |
| 81 | Granite Code 8B | IBM | 50.3 | 1130 | 51.0% | 75.0% | 40.0% | 4K | 380ms | 140 t/s | Free | Free |
| 82 | Llama 3.2 3B | Meta | 49.8 | 1120 | 63.4% | 58.0% | 40.0% | 128K | 180ms | 280 t/s | Free | Free |
| 83 | Ministral 3 3B | Mistral AI | 48.5 | 1115 | 61.0% | 55.0% | 42.0% | 128K | 160ms | 320 t/s | Free | Free |
| 84 | Llama 3.2 1B | Meta | 35.8 | 1070 | 49.3% | 38.0% | 25.0% | 128K | 80ms | 550 t/s | Free | Free |
| 85 | Gemma 3 1B | Google DeepMind | 35.5 | 1050 | 44.0% | 40.0% | 32.0% | 32K | 90ms | 480 t/s | Free | Free |
[Charts: Overall Scores · Arena ELO Ratings · MATH Benchmark · HumanEval (Coding)]
Legend: ■ Green = best in that column · Free = free & open-source model (self-hosted) · Paid = paid API subscription · Vision = handles images · Speed (latency in ms) and price (Cost/M) columns: lower is better.
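
As a rough illustration of the legend's rule, the sketch below re-sorts a handful of rows and picks the "green" winner per column. It is not the site's actual code: the field names (`latency_ms`, `cost`, `tps`), the `LOWER_IS_BETTER` set, and the helper functions are hypothetical, and treating a "Free" price as 0.0 is an assumption made only so the values compare numerically. The five rows are copied from the leaderboard.

```python
# Hypothetical subset of leaderboard rows; "Free" is encoded as cost 0.0 (assumption).
rows = [
    {"model": "o3",            "score": 93.7, "latency_ms": 4200, "tps": 15, "cost": 10.0},
    {"model": "o4-mini",       "score": 91.5, "latency_ms": 2200, "tps": 80, "cost": 1.1},
    {"model": "Claude Opus 4", "score": 90.7, "latency_ms": 1100, "tps": 30, "cost": 15.0},
    {"model": "Grok 3",        "score": 90.6, "latency_ms": 900,  "tps": 65, "cost": 3.0},
    {"model": "DeepSeek R1",   "score": 88.2, "latency_ms": 2800, "tps": 20, "cost": 0.0},
]

# Per the legend, latency and price are "lower is better"; every other
# numeric column is treated as "higher is better".
LOWER_IS_BETTER = {"latency_ms", "cost"}


def sort_by(rows, column):
    """Return rows ordered best-first for the given column."""
    descending = column not in LOWER_IS_BETTER
    return sorted(rows, key=lambda r: r[column], reverse=descending)


def best_in_column(rows, column):
    """Name of the model that would be highlighted green in this column."""
    return sort_by(rows, column)[0]["model"]


if __name__ == "__main__":
    for col in ("score", "latency_ms", "tps", "cost"):
        print(col, "->", best_in_column(rows, col))
```

Keeping the sort direction in a single set means a new column only needs an entry there if lower values should win; ties are broken by original row order because Python's `sorted` is stable.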