Seed 1.6

by ByteDance
Flagship | Paid API | Multimodal | 🏆 Ranked #31 of 85
Overall Score 77.5 out of 100
About

ByteDance's flagship vision-language model with a 262K context window and strong multimodal capabilities across image understanding and complex reasoning tasks.

Key Metrics
Context Window 262K tokens
Avg Response 720 milliseconds
Input Cost $0.9 per million tokens
Output Cost $3.6 per million tokens
Arena ELO 1270 (Chatbot Arena rating)
MT-Bench 8.9 out of 10
Benchmark Scores
MMLU 85.0%
HumanEval 84.0%
MATH 80.0%
GPQA 55.0%
MT-Bench 8.9/10
Strengths & Limitations
Strengths
✓ Vision-language
✓ Large context
✓ Strong multimodal
✓ Fast
✓ ByteDance ecosystem
Limitations
⚠ Less Western ecosystem
⚠ Newer model with limited benchmarks
⚠ Data residency considerations
Ideal Use Cases
Vision analysis
Multimodal workflows
Content moderation
Document understanding
Research
Model Details
Provider ByteDance
Released 2025-04-01
Type Paid API
Multimodal Yes
Tier Flagship
Global rank #31 / 85
Pricing (USD)
Input tokens $0.9/M
Output tokens $3.6/M
Per 1,000 tokens ≈ $0.0009 input / $0.0036 output
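The listed rates turn into a per-request cost by simple proportion. Below is a minimal sketch in Python, assuming the $0.9 and $3.6 per-million-token prices apply linearly, with no tiering, caching discounts, or minimum charges (the page does not say either way). The function name `estimate_cost_usd` and the token counts in the usage line are hypothetical, chosen only for illustration.

```python
from decimal import Decimal

# Rates copied from the pricing table above (USD per million tokens).
INPUT_USD_PER_MILLION = Decimal("0.9")
OUTPUT_USD_PER_MILLION = Decimal("3.6")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request, assuming flat linear pricing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    return input_cost + output_cost


# Hypothetical request: 1,000 input tokens and 1,000 output tokens.
print(estimate_cost_usd(1_000, 1_000))
```

At these rates, 1,000 input tokens plus 1,000 output tokens come to $0.0045, the sum of the per-1,000-token figures listed above. Decimal arithmetic is used so that small per-token amounts are not distorted by binary floating-point rounding.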
All Benchmarks
MMLU 85.0%
HumanEval 84.0%
MATH 80.0%
GPQA 55.0%
MT-Bench 8.9/10
Arena ELO 1270