Phi-4

by Microsoft
Efficient · Free & Open Source · 🏆 Ranked #34 of 85
77.4
Overall Score
out of 100
About

Phi-4 is Microsoft's 14-billion-parameter model, positioned as competitive with models three times its size. It was trained on curated, high-quality synthetic data and scores 80.4% on MATH and 56.1% on GPQA, suggesting that data quality can compensate for raw scale.

Key Metrics
Context Window
16K
tokens
Avg Response
480
milliseconds
Input Cost
$0.0
per million tokens
Output Cost
$0.0
per million tokens
Arena ELO
1280
Chatbot Arena rating
MT-Bench
8.6
out of 10
Benchmark Scores
MMLU
84.8%
HumanEval
82.6%
MATH
80.4%
GPQA
56.1%
MT-Bench
8.6/10
Strengths & Limitations
Strengths
✓ Exceptional maths for size
✓ Open source
✓ Low resource requirements
✓ Strong GPQA
✓ Fast inference
Limitations
⚠ Smaller 16K context window
⚠ Weaker on long-form tasks
⚠ Less training diversity
Ideal Use Cases
Consumer hardware AI
Mathematics tutoring
Edge deployment
Research
Resource-constrained environments
Model Details
Provider: Microsoft
Released: 2024-12-12
Type: Free & Open Source
Multimodal: No
Tier: Efficient
Global rank: #34 / 85
Pricing (USD)
Input tokens: $0.0/M
Output tokens: $0.0/M
Per 1,000 tokens ≈ $0.0000 input / $0.0000 output
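
The per-1,000-token figure above is the per-million price divided by 1,000. The arithmetic can be sketched as follows; the function names `per_thousand` and `request_cost` are hypothetical helpers for illustration and are not part of any Microsoft or leaderboard API. The prices are the ones listed on this page.

```python
def per_thousand(price_per_million: float) -> float:
    """Convert a per-million-token price (USD) to a per-1,000-token price."""
    return price_per_million / 1000.0


def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_million: float,
                 output_price_per_million: float) -> float:
    """Estimate the token cost (USD) of a single request."""
    total = (input_tokens * input_price_per_million
             + output_tokens * output_price_per_million)
    return total / 1_000_000


# Prices as listed on this page (USD per million tokens).
INPUT_PRICE = 0.0
OUTPUT_PRICE = 0.0

print(f"≈ ${per_thousand(INPUT_PRICE):.4f} input / "
      f"${per_thousand(OUTPUT_PRICE):.4f} output")
# prints "≈ $0.0000 input / $0.0000 output"
```

With both listed prices at $0.0, any request evaluates to zero token cost. The same helpers apply unchanged to a model with non-zero prices.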
All Benchmarks
MMLU: 84.8%
HumanEval: 82.6%
MATH: 80.4%
GPQA: 56.1%
MT-Bench: 8.6/10
Arena ELO: 1280