All Models
Browse all 85 models grouped by provider.
OpenAI
9 models
OpenAI's most powerful reasoning model, using extended chain-of-thought to tackle the hardest problems in mathematics, science, and coding. o3 sets new standards on GPQA and competitive maths at the cost of higher latency and price.
OpenAI's compact reasoning model achieving near-o3 performance at a fraction of the cost. o4-mini uses extended chain-of-thought and achieves exceptional results on mathematics, science, and coding — making advanced reasoning economically accessible.
OpenAI's flagship reasoning model trained with reinforcement learning to think through complex problems step-by-step before responding. Excels at maths, science, and multi-step logic at the cost of higher latency.
OpenAI's coding-focused flagship model with a 1 million token context window and top-tier performance on software engineering tasks. GPT-4.1 was specifically optimised for instruction following and agentic coding workflows.
OpenAI's 120B open-weight language model, making frontier-class performance available for self-hosted and on-premise deployments. The largest member of OpenAI's open-weights family.
OpenAI's flagship multimodal model combining text, vision, and audio capabilities. GPT-4o delivers state-of-the-art performance across reasoning, coding, and creative tasks whilst offering faster response times than its predecessors.
OpenAI's cost-efficient sibling to GPT-4.1 with a 1 million token context window. GPT-4.1 mini makes strong coding and instruction-following capability available at a low price point, ideal for high-volume developer workflows.
OpenAI's lightweight, cost-efficient model that punches well above its weight class. GPT-4o mini makes advanced AI capabilities accessible for high-volume, cost-sensitive applications without sacrificing too much quality.
OpenAI's smallest and fastest GPT-4.1 model with vision support and a 1M token context window. Optimised for latency-critical and cost-sensitive applications.
Anthropic
11 models
Anthropic's most powerful and intelligent model, built for the most demanding tasks where quality outweighs cost. Claude Opus 4 leads on complex multi-step reasoning, graduate-level science, and nuanced long-form writing.
Anthropic's hybrid reasoning flagship with a 1M context window, pushing the frontier for coding and agentic tasks. Combines extended thinking with tool use for complex multi-step workflows.
Anthropic's breakthrough model introducing extended thinking — the ability to reason step-by-step before responding. Claude 3.7 Sonnet achieves best-in-class MATH scores and strong coding, making it Anthropic's strongest release at the Sonnet price point.
The latest and most capable Sonnet model to date. Claude Sonnet 4.6 brings further gains in mathematical reasoning and instruction following, making it Anthropic's most well-rounded model at the Sonnet price point.
Anthropic's multimodal Opus model offering seamless vision-language interactions with a 200K context window. Designed for the most demanding tasks requiring deep analysis and image comprehension.
A refined iteration of Claude Sonnet 4 with improved performance on graduate-level reasoning and coding benchmarks. Claude Sonnet 4.5 delivers notably stronger results on GPQA and competitive maths whilst maintaining the same pricing as its predecessor.
Anthropic's vision-capable Opus model combining visual perception with language reasoning across a 200K context window.
Anthropic's fourth-generation Sonnet model, offering a significant leap in reasoning depth and coding accuracy over the 3.x series. Claude Sonnet 4 introduces refined tool use and improved adherence to complex multi-step instructions.
Anthropic's most intelligent model, excelling at complex reasoning and coding tasks. Claude 3.5 Sonnet sets new benchmarks for intelligence whilst maintaining the safety and harmlessness Anthropic is known for.
Anthropic's fastest and most affordable vision model with a 200K context window. Ideal for high-volume tasks requiring speed and vision capability at minimal cost.
Anthropic's fastest and most compact model, designed for near-instant responsiveness in demanding applications. Claude 3 Haiku delivers excellent value for tasks requiring speed at scale whilst maintaining Anthropic's commitment to safety.
xAI
4 models
xAI's most capable model, trained on a 100,000-GPU cluster and setting new benchmarks in mathematics and scientific reasoning. Grok 3 integrates real-time data from the X platform and leads the Arena Elo leaderboard among commercial models.
xAI's compact reasoning model offering excellent maths and logic at a fraction of Grok 3's cost. Grok 3 Mini uses chain-of-thought reasoning and real-time X platform data to punch above its size class.
xAI's flagship model, built with real-time access to information from the X platform. Grok-2 takes a distinctive approach to AI with a more candid, less filtered personality and strong performance on complex reasoning tasks.
xAI's fast vision-language model with a 2M token context window, combining visual reasoning with near real-time response speeds. Built for high-throughput production workloads.
Google DeepMind
12 models
Google DeepMind's next-generation Pro model with integrated vision understanding and a 1M token context window. Delivers frontier-level reasoning and multimodal analysis.
Google DeepMind's most advanced model, with standout performance in mathematics, science, and long-context reasoning. Gemini 2.5 Pro features a 1 million token context window and an experimental 2 million token mode, alongside strong multimodal capabilities.
Google DeepMind's latest fast multimodal model with strong reasoning and a 1 million token context window. Bridges the gap between Flash speed and Pro capability, with thinking mode for harder tasks.
Google DeepMind's next-generation fast model offering impressive performance at a fraction of the cost. Gemini 2.0 Flash brings multimodal capabilities and a massive context window to real-time applications.
Google DeepMind's flagship open-source model in the Gemma 3 family, capable of running on a single high-end GPU. Gemma 3 27B supports text and images and delivers strong benchmarks across knowledge, reasoning, and instruction following — making it one of the best self-hostable multimodal models available.
Google DeepMind's highly capable multimodal model with a groundbreaking 1 million token context window. Gemini 1.5 Pro excels at long-document analysis, video understanding, and complex cross-modal tasks.
Google DeepMind's capable 12B model from the latest Gemma 3 series. Features multimodal vision capabilities and a 128K context window, outperforming many larger models whilst fitting on a single consumer GPU.
Google DeepMind's fast, cost-efficient multimodal model with a 1 million token context window. Ideal for high-volume applications that need capable reasoning at low latency and minimal cost.
Google DeepMind's 9B open-source model from the Gemma 2 family, using interleaved local and global attention. Gemma 2 9B competes with models twice its size and is one of the best-performing small open-source models available.
Google DeepMind's lightweight 4B model from the Gemma 3 family. Designed to run on phones and laptops, it includes vision understanding and a 128K context window — remarkable capabilities for a sub-5B model.
Google DeepMind's code-specialised 7B model built on Gemma, pre-trained on 500B+ tokens of code. Excels at code completion, generation, and natural language to code conversion.
Google DeepMind's smallest Gemma 3 model at 1B parameters, designed for on-device inference with a 32K context window. Suitable for edge applications where memory is the primary constraint.
DeepSeek
5 models
DeepSeek's open-source reasoning model trained with reinforcement learning to rival OpenAI's o1. DeepSeek R1 achieves exceptional scores on mathematics and scientific reasoning benchmarks, making advanced chain-of-thought reasoning accessible to everyone.
DeepSeek's updated flagship MoE model with 671B total parameters and improved capability over V3. Balances frontier performance with efficient inference through sparse mixture-of-experts architecture.
DeepSeek's breakthrough open-source model that shocked the AI industry with frontier-level performance at a fraction of the training cost. DeepSeek V3 demonstrates that cutting-edge AI is no longer exclusive to the largest technology companies.
A Llama-3.3-70B model distilled from the full DeepSeek R1 reasoning model, inheriting chain-of-thought reasoning capabilities at a fraction of the compute cost. One of the strongest open-source reasoning models available.
DeepSeek's 236B MoE model merging V2 Chat and Coder capabilities. A strong open-source model for combined reasoning and coding tasks at manageable inference cost.
Alibaba
10 models
Alibaba's latest flagship open-source model with a unique dual-mode operation: fast standard responses or an extended thinking mode for harder problems. Qwen3 72B achieves frontier-class mathematics and coding scores as a fully open-source model, rivalling paid APIs.
Alibaba's 122B flagship model from the Qwen3.5 series, offering frontier-level coding and technical performance in a large open-weight package.
Alibaba's latest 35B code-specialised model from the Qwen3.5 series, targeting software development and technical reasoning with expertise across major programming languages.
Alibaba's flagship open-source model demonstrating remarkable capability, especially in mathematics and coding. Qwen 2.5 72B offers an outstanding balance of performance and accessibility, making advanced AI widely available.
Alibaba's latest 14B model from the Qwen3 series featuring hybrid thinking mode. Supports seamless switching between deep reasoning and fast response, with state-of-the-art maths and coding for its size.
Alibaba's code-specialised 30B model from the Qwen3 series, with expertise across major programming languages and agentic coding workflows. Leads open-source coding benchmarks at its parameter class.
Alibaba's capable 32B vision-language model combining visual perception with powerful language reasoning. Strong at document understanding, visual maths, and complex image analysis.
Alibaba's mid-size 14B model from the Qwen 2.5 series. Strikes an excellent balance between capability and compute requirements, outperforming many larger models on maths and coding benchmarks.
Alibaba's compact 7B model from the Qwen 2.5 series, punching well above its weight class in mathematics and coding. Runs comfortably on 8GB VRAM with remarkable benchmark scores for its size.
Alibaba's compact 8B vision-language model from the Qwen3 VL family, supporting visual question answering, image description, and multimodal reasoning on consumer-grade hardware.
Mistral AI
12 models
Mistral AI's largest open-weight model at 675B parameters, offering frontier performance in an open-source package. Engineered for fast, responsive interactions at scale.
Mistral AI's large 123B coding model offering frontier code generation and completion across all major programming languages. Built for production software engineering at scale.
Mistral AI's most powerful model, developed with a focus on efficiency and European AI sovereignty. Mistral Large 2 excels at coding tasks and multilingual applications, particularly for European languages.
Mistral AI's compact 24B coding specialist model, designed for intelligent code completion, debugging, and software engineering workflows. Fast enough for interactive development tooling.
Mistral AI's compact yet capable API model designed for cost-effective deployments. Mistral Small 3.1 delivers strong multilingual performance and instruction following at a fraction of the cost of flagship models.
Mistral AI's 14B model from the Ministral 3 series, offering strong performance at a size that fits comfortably on a single 16GB GPU with quantisation.
Mistral AI's Mixture-of-Experts model activating 2 of 8 expert networks per token, matching GPT-3.5 quality at much lower inference cost. A landmark open-source model for quality-efficiency balance.
Mistral AI's mathematics-specialised 7B model, fine-tuned for step-by-step mathematical reasoning and problem solving. Achieves top scores on maths benchmarks for its parameter class.
A compact yet highly capable 12B model developed jointly by Mistral AI and NVIDIA. Offers a 128K context window and a new tokenizer (Tekken) optimised for multilingual content, balancing size and performance.
Mistral AI's efficient 8B model from the Ministral 3 series, balancing speed and capability for production use. Strong instruction following with a 128K context window.
The model that put open-source LLMs on the map. Mistral 7B outperformed Llama 2 13B on most benchmarks at half the parameters, using grouped-query attention and sliding window attention for efficiency.
Mistral AI's ultra-compact 3B model optimised for speed and cost-effectiveness. Ideal for edge deployment and high-volume applications where minimal latency matters most.
Meta
10 models
Meta's flagship fourth-generation model using a Mixture-of-Experts architecture for efficient high-quality inference. Llama 4 Maverick delivers frontier-class performance as a fully open-source model with a 1 million token context window.
Meta's largest open-source model, competing directly with proprietary frontier models. Llama 3.1 405B can be self-hosted and fine-tuned, offering unmatched flexibility for organisations with data privacy requirements.
Meta's latest 70B model, matching Llama 3.1 405B quality at a fraction of the compute cost. Llama 3.3 70B is the go-to open-source model for users with a single consumer GPU capable of running 70B weights.
Meta's efficient Llama 4 model optimised for speed and cost. Despite being the lighter of the two Llama 4 releases, Scout achieves strong benchmark results and features an extraordinary 10 million token context window, the largest of any model listed here.
Meta's large vision-language model with strong image understanding and text reasoning capabilities. Competes with frontier multimodal models for visual analysis tasks.
Meta's 70B parameter instruction-tuned model from the Llama 3.1 family. A powerful open-source alternative to paid APIs with a large 128K context window and strong multilingual capabilities.
Meta's compact vision-language model supporting image understanding and multimodal conversations. Runs on a single consumer GPU with 12GB VRAM and offers solid visual question answering capabilities.
Meta's lightweight 8B model from the Llama 3.1 family. The most accessible large language model for consumer hardware — runs on a laptop GPU with 8GB VRAM whilst punching well above its weight class.
Meta's ultra-compact 3B model designed for edge and on-device deployment. Llama 3.2 3B runs entirely on CPU or low-end GPUs with surprisingly capable text understanding for its size.
Meta's smallest Llama model, designed for on-device and embedded deployments. Llama 3.2 1B runs entirely on CPU and low-power devices with a 128K context window despite its tiny footprint.
ByteDance
1 model
Microsoft
2 models
Microsoft's 14-billion parameter model that challenges models three times its size. Phi-4 was trained on curated high-quality synthetic data, achieving remarkable mathematics and science benchmark scores and demonstrating that data quality can outperform raw scale.
Microsoft's 3.8B parameter model trained on high-quality synthetic data. Phi-3.5 Mini delivers remarkable performance for its tiny size, competing with models 3x larger on reasoning and coding tasks.
DeepCogito
1 model
NVIDIA
1 model
Zhipu AI
1 model
AllenAI
2 models
Allen Institute for AI's 32B open language model with full training transparency. Delivers excellent results for everyday tasks with the gold standard of open science: open weights, data, and code.
Allen Institute for AI's open 7B language model built with full transparency — open data, open training code, open weights. A benchmark for truly open AI research.
IBM
2 models
IBM's largest Granite code model trained on 116 programming languages. Designed for enterprise software development with a focus on code generation, explanation, and bug fixing.
IBM's compact 8B code model supporting 116 programming languages. Fast and efficient for interactive coding assistance and integration into development pipelines.