o3

by OpenAI
Flagship Proprietary Multimodal 🏆 Ranked #1 of 22
Overall Score 93.7 out of 100
About

OpenAI's most powerful reasoning model, using extended chain-of-thought to tackle the hardest problems in mathematics, science, and coding. o3 sets new standards on GPQA and competitive maths at the cost of higher latency and price.

Key Metrics
Context Window 200K tokens
Avg Response 4200 milliseconds
Input Cost $10.0 per million tokens
Output Cost $40.0 per million tokens
Arena ELO 1391 (Chatbot Arena rating)
MT-Bench 9.1 out of 10
Benchmark Scores
MMLU 91.6%
HumanEval 96.4%
MATH 97.8%
GPQA 87.7%
MT-Bench 9.1/10
Strengths & Limitations
Strengths
✓ State-of-the-art reasoning
✓ Exceptional GPQA scores
✓ Best maths performance
✓ Deep thinking capability
✓ Science and coding
Limitations
⚠ Very high cost
⚠ Slower due to extended thinking
⚠ Overkill for conversational tasks
Ideal Use Cases
Frontier research
Competitive programming
Advanced mathematics
Scientific problem solving
Complex reasoning chains
Model Details
Provider OpenAI
Released 2025-04-16
Open source No
Multimodal Yes
Tier Flagship
Global rank #1 / 22
Pricing (USD)
Input tokens $10.0/M
Output tokens $40.0/M
Per 1,000 tokens ≈ $0.0100 input / $0.0400 output
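To make the pricing arithmetic concrete, here is a minimal sketch that estimates the cost of a single request from the per-million rates listed above. The function name and the example token counts are hypothetical, chosen only for illustration. It also assumes that any hidden reasoning tokens are billed at the output rate and are already included in the output count, which should be checked against the provider's current billing rules.

```python
# Rates copied from this card, in USD per million tokens.
INPUT_USD_PER_MILLION = 10.0
OUTPUT_USD_PER_MILLION = 40.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed rates.

    Assumption: output_tokens already includes any reasoning tokens
    that the provider bills at the output rate.
    """
    input_cost = input_tokens * INPUT_USD_PER_MILLION / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_MILLION / 1_000_000
    return input_cost + output_cost


# Hypothetical request: 3,000 prompt tokens and 1,500 completion tokens.
cost = estimate_cost_usd(3_000, 1_500)
print(f"${cost:.4f}")  # 0.03 input + 0.06 output = $0.0900
```

Because the output rate is four times the input rate, long completions and extended reasoning dominate the bill even when prompts are large.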
All Benchmarks
MMLU 91.6%
HumanEval 96.4%
MATH 97.8%
GPQA 87.7%
MT-Bench 9.1/10
Arena ELO 1391