
Gemma 3 12B

by Google DeepMind
Efficient | Free & Open Source | Multimodal | 🏆 Ranked #57 of 85
Overall Score: 67.6 out of 100
About

Google DeepMind's 12B-parameter model from the Gemma 3 series. It accepts image input alongside text and has a 128K-token context window, and it outperforms many larger models while fitting on a single consumer GPU.

Key Metrics
Context Window: 128K tokens
Avg Response: 560 milliseconds
Input Cost: $0.09 per million tokens
Output Cost: $0.09 per million tokens
Arena ELO: 1200 (Chatbot Arena rating)
MT-Bench: 8.5 out of 10
Benchmark Scores
MMLU: 78.0%
HumanEval: 76.0%
MATH: 72.0%
GPQA: 38.0%
MT-Bench: 8.5/10
Strengths & Limitations
Strengths
✓ Multimodal vision
✓ Large context
✓ Open source
✓ Strong benchmark scores
✓ Efficient
Limitations
⚠ Requires 12GB VRAM
⚠ Less capable than 27B
⚠ Newer with fewer fine-tunes
Ideal Use Cases
Vision tasks
Document analysis
Code assistance
Personal AI
Research
Model Details
Provider: Google DeepMind
Released: 2025-03-12
Type: Free & Open Source
Multimodal: Yes
Tier: Efficient
Global rank: #57 / 85
Pricing (USD)
Input tokens: $0.09/M
Output tokens: $0.09/M
Per 1,000 tokens: ≈ $0.00009 input / $0.00009 output
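To show how the listed per-million rates translate into a per-request cost, here is a minimal sketch. The function name `estimate_cost` and the example token counts are illustrative assumptions, not part of any provider API; only the two $0.09/M rates come from this page.

```python
# Rates copied from the pricing listed on this page (USD per million tokens).
INPUT_USD_PER_MILLION = 0.09
OUTPUT_USD_PER_MILLION = 0.09


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Cost of 1,000 input tokens alone.
per_thousand_input = estimate_cost(1_000, 0)

# Assumed example request: 100,000 input tokens and 2,000 output tokens.
example_request = estimate_cost(100_000, 2_000)

print(f"1,000 input tokens: ${per_thousand_input:.5f}")
print(f"Example request:    ${example_request:.5f}")
```

Because input and output are priced identically here, the cost depends only on the total token count; the two rates are kept as separate constants so the sketch still works if they diverge.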
All Benchmarks
MMLU: 78.0%
HumanEval: 76.0%
MATH: 72.0%
GPQA: 38.0%
MT-Bench: 8.5/10
Arena ELO: 1200