Gemma 3 12B

by Google DeepMind

Efficient · Free & Open Source · Multimodal · 🏆 Ranked #57 of 85

67.6

Overall Score

out of 100

About

Google DeepMind's capable 12B model from the latest Gemma 3 series. Features multimodal vision capabilities and a 128K context window, outperforming many larger models whilst fitting on a single consumer GPU.

Key Metrics

Context Window

128K

tokens

Avg Response

560

milliseconds

Input Cost

$0.09

per million tokens

Output Cost

$0.09

per million tokens

Arena ELO

1200

Chatbot Arena rating

MT-Bench

8.5

out of 10

Benchmark Scores

MMLU

78.0%

HumanEval

76.0%

MATH

72.0%

GPQA

38.0%

MT-Bench

8.5/10

Model Details

Provider Google DeepMind

Released 2025-03-12

Type Free & Open Source

Multimodal Yes

Tier Efficient

Global rank #57 / 85

Pricing (USD)

Input tokens $0.09/M

Output tokens $0.09/M

Per 1,000 tokens ≈ $0.00009 input / $0.00009 output
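
The per-token arithmetic behind these prices can be sketched as follows. This is a minimal illustration, not a billing tool: the function name `estimate_cost` is hypothetical, the prices are the listed $0.09 per million tokens, and "128K" is assumed here to mean 128,000 tokens.

```python
# Listed prices in USD per million tokens (taken from the pricing figures above).
INPUT_PRICE_PER_M = 0.09
OUTPUT_PRICE_PER_M = 0.09


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Cost of 1,000 input tokens on their own.
    print(f"1,000 input tokens: ${estimate_cost(1_000, 0):.5f}")
    # Assumption: a full context window is treated as 128,000 input tokens.
    print(f"Full context + 1,000 output tokens: ${estimate_cost(128_000, 1_000):.5f}")
```

At these rates, even a request that fills the assumed 128,000-token window costs on the order of one cent.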

All Benchmarks

MMLU 78.0%

HumanEval 76.0%

MATH 72.0%

GPQA 38.0%

MT-Bench 8.5/10

Arena ELO 1200

Similar Models

You might also consider

OpenAI's compact reasoning model achieving near-o3 performance at a fraction of the cost. o4-mini uses extended chain-of-thought and achieves exceptional results on mathematics, science, and coding — making advanced reasoning economically accessible.

xAI's compact reasoning model offering excellent maths and logic at a fraction of Grok 3's cost. Grok 3 Mini uses chain-of-thought reasoning and real-time X platform data to punch above its size class.

Google DeepMind's latest fast multimodal model with strong reasoning and a 1 million token context window. Bridges the gap between Flash speed and Pro capability, with thinking mode for harder tasks.