Gemma 3 4B

by Google DeepMind
Efficient · Free & Open Source · Multimodal · 🏆 Ranked #77 of 85
Overall Score: 58.1 out of 100
About

Gemma 3 4B is the lightweight 4B model in Google DeepMind's Gemma 3 family. It is designed to run on phones and laptops, and it includes vision understanding and a 128K-token context window, both notable for a sub-5B model.

Key Metrics
Context Window: 128K tokens
Avg Response: 220 milliseconds
Input Cost: $0.03 per million tokens
Output Cost: $0.03 per million tokens
Arena ELO: 1160 (Chatbot Arena rating)
MT-Bench: 7.9 out of 10
Benchmark Scores
MMLU: 68.0%
HumanEval: 64.0%
MATH: 62.0%
GPQA: 30.0%
MT-Bench: 7.9/10
Strengths & Limitations
Strengths
✓ Runs on 4GB VRAM
✓ Vision capable
✓ Large context
✓ Open source
✓ Very fast
Limitations
⚠ Less capable than larger models
⚠ Limited complex reasoning
⚠ Smaller knowledge base
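
The 4GB VRAM figure listed under Strengths is easier to interpret with a rough weight-memory estimate. The sketch below is a back-of-the-envelope calculation, not a measurement: it assumes a nominal 4 billion parameters (read off the "4B" in the model name, not stated on this page) and counts weights only, ignoring activations, the KV cache for long contexts, and runtime overhead. `weight_memory_gb` is a hypothetical helper introduced for illustration.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: nominal parameter count taken from the "4B" in the model name.
NOMINAL_PARAMS = 4e9

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(NOMINAL_PARAMS, bits):.1f} GB")
```

Under these assumptions, 16-bit weights alone would need about 8 GB, so fitting in 4GB of VRAM most likely implies 8-bit or 4-bit quantized weights. Actual usage depends on the runtime and context length.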
Ideal Use Cases
Mobile AI, Edge deployment, Personal assistants, Quick summarisation, Lightweight vision
Model Details
Provider: Google DeepMind
Released: 2025-03-12
Type: Free & Open Source
Multimodal: Yes
Tier: Efficient
Global rank: #77 / 85
Pricing (USD)
Input tokens: $0.03/M
Output tokens: $0.03/M
Per 1,000 tokens: ≈ $0.00003 input / $0.00003 output
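
The per-1,000-token figure is small enough that rounding to four decimal places hides it, so it can help to compute costs directly from the per-million rates. The following is a minimal sketch that uses only the $0.03/M input and output prices listed here; `request_cost` and the constant names are hypothetical, and the token counts in the usage lines are illustrative, not measured workloads.

```python
from decimal import Decimal

# Listed prices in USD per million tokens.
INPUT_PRICE_PER_M = Decimal("0.03")
OUTPUT_PRICE_PER_M = Decimal("0.03")


def request_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of one request at the listed per-million rates."""
    million = Decimal(1_000_000)
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / million


# Illustrative usage: 1,000 input tokens, then a full 128K-token prompt
# with a 1,000-token reply.
print(request_cost(1_000, 0))
print(request_cost(128_000, 1_000))
```

`Decimal` is used instead of floats so that very small per-request amounts are not distorted by binary rounding.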
All Benchmarks
MMLU: 68.0%
HumanEval: 64.0%
MATH: 62.0%
GPQA: 30.0%
MT-Bench: 7.9/10
Arena ELO: 1160