All Models

Browse all 22 models grouped by provider.

22 models across 9 providers
OpenAI
4 models · Multimodal · Best score 93.7
o3 · OpenAI
Score 93.7 · Flagship · Multimodal · Top ranked

OpenAI's most powerful reasoning model, using extended chain-of-thought to tackle the hardest problems in mathematics, science, and coding. o3 sets new standards on GPQA and competitive maths at the cost of higher latency and price.

Context window: 200K · Avg response: 4200ms · Input: $10.00 per 1M tokens · Arena ELO: 1391
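The "Input per 1M tokens" figures above translate directly into per-request costs. A minimal sketch of that arithmetic, using the input prices listed on this page (the token counts are illustrative, and output pricing is not covered here):

```python
# Input prices in USD per 1M tokens, as listed on this page.
INPUT_PRICE_PER_1M = {
    "o3": 10.00,
    "GPT-4.1": 2.00,
    "GPT-4o": 5.00,
    "GPT-4o mini": 0.15,
    "Claude Opus 4": 15.00,
    "Claude Sonnet 4.6": 3.00,
}

def input_cost(model: str, input_tokens: int) -> float:
    """Estimated input cost in USD for a single request."""
    return INPUT_PRICE_PER_1M[model] * input_tokens / 1_000_000

# A 50K-token prompt costs $0.50 on o3 but under a cent on GPT-4o mini.
print(f"{input_cost('o3', 50_000):.4f}")           # 0.5000
print(f"{input_cost('GPT-4o mini', 50_000):.4f}")  # 0.0075
```

The spread matters at volume: the same 50K-token prompt is roughly 67× cheaper on GPT-4o mini than on o3.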
GPT-4.1 · OpenAI
Score 85.2 · Flagship · Multimodal

OpenAI's coding-focused flagship model with a 1 million token context window and top-tier performance on software engineering tasks. GPT-4.1 was specifically optimised for instruction following and agentic coding workflows.

Context window: 1.0M · Avg response: 880ms · Input: $2.00 per 1M tokens · Arena ELO: 1340
GPT-4o · OpenAI
Score 79.5 · Flagship · Multimodal

OpenAI's flagship multimodal model combining text, vision, and audio capabilities. GPT-4o delivers strong performance across reasoning, coding, and creative tasks whilst offering faster response times than its predecessors.

Context window: 128K · Avg response: 850ms · Input: $5.00 per 1M tokens · Arena ELO: 1285
GPT-4o mini · OpenAI
Score 70.0 · Efficient · Multimodal

OpenAI's lightweight, cost-efficient model that punches well above its weight class. GPT-4o mini makes advanced AI capabilities accessible for high-volume, cost-sensitive applications without sacrificing too much quality.

Context window: 128K · Avg response: 420ms · Input: $0.15 per 1M tokens · Arena ELO: 1179
Anthropic
6 models · Multimodal · Best score 90.7
Claude Opus 4 · Anthropic
Score 90.7 · Flagship · Multimodal · Top ranked

Anthropic's most powerful and intelligent model, built for the most demanding tasks where quality outweighs cost. Claude Opus 4 leads on complex multi-step reasoning, graduate-level science, and nuanced long-form writing.

Context window: 200K · Avg response: 1100ms · Input: $15.00 per 1M tokens · Arena ELO: 1395
Claude Sonnet 4.6 · Anthropic
Score 88.3 · Flagship · Multimodal · Top ranked

The latest and most capable Sonnet model to date. Claude Sonnet 4.6 brings further gains in mathematical reasoning and instruction following, making it Anthropic's most well-rounded model at the Sonnet price point.

Context window: 200K · Avg response: 790ms · Input: $3.00 per 1M tokens · Arena ELO: 1374
Claude Sonnet 4.5 · Anthropic
Score 86.9 · Flagship · Multimodal

A refined iteration of Claude Sonnet 4 with improved performance on graduate-level reasoning and coding benchmarks. Claude Sonnet 4.5 delivers notably stronger results on GPQA and competitive maths whilst maintaining the same pricing as its predecessor.

Context window: 200K · Avg response: 800ms · Input: $3.00 per 1M tokens · Arena ELO: 1362
Claude Sonnet 4 · Anthropic
Score 85.4 · Flagship · Multimodal

Anthropic's fourth-generation Sonnet model, offering a significant leap in reasoning depth and coding accuracy over the 3.x series. Claude Sonnet 4 introduces refined tool use and improved adherence to complex multi-step instructions.

Context window: 200K · Avg response: 820ms · Input: $3.00 per 1M tokens · Arena ELO: 1345
Claude 3.5 Sonnet · Anthropic
Score 80.1 · Flagship · Multimodal

An earlier-generation Sonnet model, excelling at complex reasoning and coding tasks. At release, Claude 3.5 Sonnet set new benchmarks for intelligence whilst maintaining the safety and harmlessness Anthropic is known for.

Context window: 200K · Avg response: 780ms · Input: $3.00 per 1M tokens · Arena ELO: 1289
Claude 3 Haiku · Anthropic
Score 63.2 · Efficient · Multimodal

Anthropic's fastest and most compact model, designed for near-instant responsiveness in demanding applications. Claude 3 Haiku delivers excellent value for tasks requiring speed at scale whilst maintaining Anthropic's commitment to safety.

Context window: 200K · Avg response: 380ms · Input: $0.25 per 1M tokens · Arena ELO: 1168