Grok 4.1 Fast

by xAI

Efficient Paid API Multimodal 🏆 Ranked #37 of 85

Overall Score 76.9 out of 100

About

xAI's fast vision-language model with a 2M token context window, combining visual reasoning with near real-time response speeds. Built for high-throughput production workloads.

Key Metrics

Context Window 2.0M tokens

Avg Response 500 milliseconds

Input Cost $3.0 per million tokens

Output Cost $15.0 per million tokens

Arena ELO 1270 (Chatbot Arena rating)

MT-Bench 8.8 out of 10

Benchmark Scores

MMLU 84.0%

HumanEval 84.0%

MATH 78.0%

GPQA 55.0%

MT-Bench 8.8/10

Model Details

Provider xAI

Released 2025-05-01

Type Paid API

Multimodal Yes

Tier Efficient

Global rank #37 / 85

Pricing (USD)

Input tokens $3.0/M

Output tokens $15.0/M

Per 1,000 tokens ≈ $0.0030 input / $0.0150 output
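
The per-token arithmetic above can be sketched in a few lines of Python. This is a minimal illustration that assumes only the two list prices shown on this page; the function name `estimate_cost_usd` is hypothetical and is not part of any xAI SDK, and real invoices may differ (for example through caching or tiered pricing, which this page does not describe).

```python
# List prices taken from this page (USD per million tokens).
INPUT_USD_PER_MILLION = 3.0
OUTPUT_USD_PER_MILLION = 15.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts.

    Hypothetical helper for illustration only: it multiplies token
    counts by the per-million list prices and adds the two parts.
    """
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # A request with 1,000 input tokens and 1,000 output tokens.
    print(f"{estimate_cost_usd(1_000, 1_000):.4f} USD")
    # A prompt that fills the full 2.0M-token context window, no output.
    print(f"{estimate_cost_usd(2_000_000, 0):.2f} USD")
```

At these rates output tokens cost five times as much as input tokens, so long generations dominate the bill even when prompts are large.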

All Benchmarks

MMLU 84.0%

HumanEval 84.0%

MATH 78.0%

GPQA 55.0%

MT-Bench 8.8/10

Arena ELO 1270

Similar Models

You might also consider

OpenAI's compact reasoning model achieving near-o3 performance at a fraction of the cost. o4-mini uses extended chain-of-thought and achieves exceptional results on mathematics, science, and coding — making advanced reasoning economically accessible.

xAI's compact reasoning model offering excellent maths and logic at a fraction of Grok 3's cost. Grok 3 Mini uses chain-of-thought reasoning and real-time X platform data to punch above its size class.

Google DeepMind's latest fast multimodal model with strong reasoning and a 1 million token context window. Bridges the gap between Flash speed and Pro capability, with thinking mode for harder tasks.