
Llama 3.2 3B

by Meta
Efficient | Free & Open Source | 🏆 Ranked #82 of 85
Overall Score 49.8 out of 100
About

Llama 3.2 3B is Meta's compact 3-billion-parameter model, designed for edge and on-device deployment. It can run entirely on a CPU or a low-end GPU and offers capable text understanding for its size.

Key Metrics
Context Window 128K tokens
Avg Response 180 milliseconds
Input Cost $0.015 per million tokens
Output Cost $0.025 per million tokens
Arena ELO 1120 (Chatbot Arena rating)
MT-Bench 7.6 out of 10
Benchmark Scores
MMLU 63.4%
HumanEval 58.0%
MATH 40.0%
GPQA 24.0%
MT-Bench 7.6/10
Strengths & Limitations
Strengths
✓ Runs on CPU
✓ Tiny memory footprint
✓ Very fast
✓ Open source
✓ Edge deployment
Limitations
⚠ Limited reasoning
⚠ Smaller knowledge
⚠ Not suitable for complex tasks
Ideal Use Cases
Mobile AI
Edge devices
Offline assistants
Embedded applications
Lightweight summarisation
Model Details
Provider Meta
Released 2024-09-25
Type Free & Open Source
Multimodal No
Tier Efficient
Global rank #82 / 85
Pricing (USD)
Input tokens $0.015/M
Output tokens $0.025/M
Per 1,000 tokens ≈ $0.000015 input / $0.000025 output
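The per-million-token prices listed above convert to a per-request cost with simple arithmetic. The following is a minimal sketch of that calculation; the helper name `request_cost` and the example token counts are illustrative assumptions, not part of any provider API.

```python
# Prices in USD per million tokens, as listed in the pricing section.
INPUT_USD_PER_M = 0.015
OUTPUT_USD_PER_M = 0.025


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Assumed example: a 2,000-token prompt with a 500-token reply.
    print(f"${request_cost(2000, 500):.7f}")
```

At these rates, costs per 1,000 tokens sit well below one hundredth of a cent, so they need at least six decimal places to display as non-zero.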
All Benchmarks
MMLU 63.4%
HumanEval 58.0%
MATH 40.0%
GPQA 24.0%
MT-Bench 7.6/10
Arena ELO 1120