Gemini 1.5 Flash

by Google DeepMind
Efficient Paid API Multimodal 🏆 Ranked #60 of 85
Overall Score 66.0 out of 100
About

Google DeepMind's fast, cost-efficient multimodal model with a 1 million token context window. Ideal for high-volume applications that need capable reasoning at low latency and minimal cost.

Key Metrics
Context Window 1.0M tokens
Avg Response 480 milliseconds
Input Cost $0.075 per million tokens
Output Cost $0.3 per million tokens
Arena ELO 1226 (Chatbot Arena rating)
MT-Bench 8.5 out of 10
Benchmark Scores
MMLU 78.9%
HumanEval 71.5%
MATH 58.5%
GPQA 39.5%
MT-Bench 8.5/10
Strengths & Limitations
Strengths
✓ Very fast ✓ Low cost ✓ Large context window ✓ Multimodal ✓ High throughput
Limitations
⚠ Less capable than Pro ⚠ Shallower reasoning ⚠ Variable quality
Ideal Use Cases
Chatbots, Summarisation, High-volume workflows, Real-time analysis, Document Q&A
Model Details
Provider Google DeepMind
Released 2024-05-14
Type Paid API
Multimodal Yes
Tier Efficient
Global rank #60 / 85
Pricing (USD)
Input tokens $0.075/M
Output tokens $0.3/M
Per 1,000 tokens ≈ $0.000075 input / $0.0003 output
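To make the per-token pricing concrete, here is a minimal Python sketch that turns token counts into an estimated bill. The function name `estimate_cost_usd` and the sample workload are hypothetical, and the sketch assumes the flat per-million rates listed above apply to every request; actual billing may differ.

```python
# Rates copied from the pricing listed above (USD per million tokens).
INPUT_USD_PER_M = 0.075
OUTPUT_USD_PER_M = 0.30


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: linear cost estimate from token counts.

    Assumes flat per-million-token rates with no tiers or discounts.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Cost of 1,000 input tokens and of 1,000 output tokens.
    print(f"1K input:  ${estimate_cost_usd(1_000, 0):.6f}")
    print(f"1K output: ${estimate_cost_usd(0, 1_000):.6f}")

    # Illustrative workload (made-up numbers): 10,000 requests,
    # each with 2,000 input tokens and 500 output tokens.
    requests = 10_000
    cost = estimate_cost_usd(requests * 2_000, requests * 500)
    print(f"Workload:  ${cost:.2f}")
```

Under these assumptions, input and output contribute equally to the sample workload's cost, because output tokens are priced at four times the input rate while the workload uses four times as many input tokens.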
All Benchmarks
MMLU 78.9%
HumanEval 71.5%
MATH 58.5%
GPQA 39.5%
MT-Bench 8.5/10
Arena ELO 1226