Codestral 2 (Mistral)
B Tier · 7.5/10
Mistral's dedicated code model -- Codestral 2 (launched 2026-04-08) relicensed under Apache 2.0, removing the commercial-use restrictions of the original. 22B dense, strong FIM (fill-in-middle), available via Mistral API + Hugging Face
Score Breakdown
The Good and the Bad
What we like
- +Relicensing to Apache 2.0 is the real news -- the original Codestral required a Mistral Non-Production license for any commercial use, which blocked adoption in-product. Codestral 2 is immediately usable in commercial IDEs, coding assistants, and CI tooling
- +FIM (fill-in-middle) performance is class-leading for open models -- purpose-built for IDE autocomplete in a way that general-purpose models (Llama, DeepSeek V3) aren't. Competitive with GitHub Copilot's underlying model for inline completions
- +22B dense (not MoE) means predictable VRAM requirements and throughput -- easier to deploy than DeepSeek's 671B MoE or Qwen's sparse 35B-A3B for teams that want certainty
- +Available via Mistral's EU-hosted API for customers who need GDPR-native inference -- rare combination of 'open weights + EU vendor' in the code-model category
What could be better
- −22B parameters put it behind frontier closed models (Claude Opus 4.7, GPT-5.4, Gemini 3.1 Pro) on complex multi-file reasoning and agentic coding. This is a fast, cheap inline-completion model, not a frontier coding agent
- −No multimodal or tool-use baked in -- if your workflow needs screenshot-to-code or terminal tool execution, Claude Code, Cursor Composer 2, or Devin cover that ground better
- −Benchmark transparency could be stronger -- Mistral publishes MBPP / HumanEval numbers but third-party SWE-bench or LiveCodeBench verification is thinner than for DeepSeek, Qwen Coder, or the frontier models
- −SWE-bench Verified performance trails the top open-weight coding specialists (Qwen Coder 3.5, DeepSeek V3 Coder variants) by several points in independent testing
Pricing
Open weights (Apache 2.0)
- ✓22B dense model on Hugging Face
- ✓Commercial use allowed (new in Codestral 2; original Codestral required Mistral Non-Production license)
- ✓Self-host on your own infrastructure
- ✓Fine-tune without license fees
Mistral La Plateforme (hosted API)
- ✓Pay-as-you-go API access
- ✓FIM (fill-in-middle) endpoint for IDE autocomplete
- ✓Chat + completion endpoints
- ✓Consistent with Mistral Small/Medium tier pricing
Self-hosted (Hardware)
- ✓Min: 48 GB VRAM (1x RTX 6000 Ada or 2x RTX 3090 with tensor parallelism)
- ✓Mid: 1x H100 80GB for production throughput
- ✓Max: 2x H100 for batched serving + low latency
- ✓Quantized (GGUF Q4_K_M) runs on a 24GB card for experimentation
System Requirements
Hardware needed to self-host. Min = smallest viable setup (usually heavy quantization). Max = full-precision / production-grade.
| Model variant | Min | Max |
|---|---|---|
| Codestral 2 22B dense (Apache 2.0)Apache 2.0 commercial use OK. Original Codestral (2024) still under Mistral Non-Production License -- verify you are on Codestral 2. | 48 GB VRAM -- 1x RTX 6000 Ada or 2x RTX 3090 tensor parallel (or quantized GGUF Q4_K_M on a 24GB card) | 1x H100 80GB for production FP16 throughput; 2x H100 for batched serving |
Known Issues
- Codestral 2 is Apache 2.0, but the ORIGINAL Codestral (2024) is still under Mistral Non-Production License -- if you pulled older weights before 2026-04-08, verify you're on Codestral 2 before shipping commercial useSource: Mistral release notes · 2026-04
- EU-hosted API infrastructure can have higher latency than US-based DeepSeek or GitHub Copilot backends for North American developersSource: Developer reports on Mistral Discord · 2026-04
Best for
Developers and teams who want a legally-clean open-weights code model they can self-host OR hit via API, particularly those with EU data-residency requirements. Ideal for building in-house IDE extensions, code-review bots, or CI/CD AI integrations where the Apache 2.0 license removes procurement friction.
Not for
Developers who want frontier-quality agentic coding -- Cursor Composer 2, Claude Code, or Devin will outperform on complex multi-file tasks. Also not ideal if you only need hosted inference and don't care about self-hosting -- DeepSeek V3.2 and Qwen3.6-Plus offer stronger benchmarks at competitive pricing.
Our Verdict
Codestral 2's Apache 2.0 relicensing is the biggest licensing unlock in open-source coding models since Meta released Llama 2 commercially. The model itself is solid-not-frontier (22B dense, fast, predictable), but the license change is what matters -- teams that couldn't touch the original Codestral because of commercial restrictions can now ship it in products. For IDE-style inline autocomplete on owned infrastructure, or for EU-data-residency use cases, this is now a first-tier option. For agentic or frontier coding work, keep using Claude Opus 4.7 via Claude Code or Composer 2 in Cursor.
Sources
- Mistral news (accessed 2026-04-18)
- ReleaseBot: Mistral updates (accessed 2026-04-18)
- fazm.ai: April 2026 open model releases (accessed 2026-04-18)
Explore more Codestral 2 (Mistral) rankings
Deeper leaderboards, benchmarks, task-specific tier lists, and status/pricing pages for Codestral 2 (Mistral).
The Tier List Tuesday
Weekly newsletter: tier movers, new entrants, and the VS of the week. Built from our daily AI-tool sweeps. No spam, unsubscribe anytime.
Alternatives to Codestral 2 (Mistral)
GitHub Copilot
AI code assistant that lives in your editor -- autocomplete on steroids. Usage-based billing went LIVE 2026-06-01: AI Credits + token metering across all plans, code completions still free. New Copilot Max tier added the same day. New signups for Student/Pro/Pro+/Max remain PAUSED. As of 2026-06-02 (Microsoft Build), Microsoft's own MAI-Code-1-Flash is rolling into the VS Code model picker
Cursor
AI-native code editor, agent-first in Cursor 3 -- multi-workspace, cross-platform agents, and Composer 2.5 (shipped 2026-05-18, Cursor's frontier coding model at $0.50/$2.50 per 1M tokens, 2x usage during launch week)
Devin Desktop (formerly Windsurf)
Windsurf is now **Devin Desktop** -- Cognition retired the Windsurf brand via OTA update on June 2, 2026. Same editor, plans, pricing, settings, and extensions; the bundled agent is now 'Devin Local' and Devin Cloud agent access starts on the $20 Pro plan. Agent Command Center, Spaces, and Devin Review all carry over
Tabnine
AI code completion that runs locally and keeps your code private -- the enterprise-friendly alternative to Copilot
Claude Code
Anthropic's terminal-based coding agent that reads your whole repo and makes real changes -- not just suggestions. v2.1.131 (2026-05-06 Code with Claude conf) shipped Code Review GA + Remote Agents + CI Auto-Fix + Routines, plus 2x rate-limit increase from the SpaceX compute deal
Lovable
Describe the app you want in plain English and watch it build itself -- 8M users and $400M+ ARR say it works
Devin
The most autonomous AI coding agent -- now a full product family: Devin Cloud, **Devin Desktop (the renamed Windsurf IDE, June 2 2026)**, and Devin Review. Cognition raised $1B+ at a $26B valuation (May 27). Recent shipments: Claude Fable 5 support day-one (6/9), Auto-Triage (5/18), Windows VMs (5/21), Android Emulator (5/13)
Replit
Cloud IDE with an AI agent that can build full apps from prompts. **Agent 4 shipped May 2026** with parallel task execution (Replit reports automatic merge-conflict resolution ~90% of the time) -- coding optional, but recommended
Codex (OpenAI)
OpenAI's cloud-based coding agent -- runs parallel tasks, proposes PRs, and lives inside ChatGPT
Google Antigravity
Google's agent-first AI IDE -- deploys up to 5 autonomous coding agents in parallel on a VS Code fork. Antigravity 2.0 (I/O 2026) is the runtime substrate for Gemini Spark, and the Antigravity CLI is now the official successor to Gemini CLI, which retires for consumer tiers on 2026-06-18
Roblox Assistant
Roblox Studio's agentic AI that plans, builds, and playtests games. Planning Mode (2026-04-16) + Mesh Generation + Procedural Models brings 3D-native creation to 70M+ daily creators