SWE-bench Verified: 2026 AI Leaderboard
Fix real GitHub issues in 12 open-source Python repos.
What it tests
SWE-bench Verified is a 500-issue subset of SWE-bench that human annotators have validated as solvable. Each task is a real GitHub issue from a Python project: the model is given the repository and the issue text, and must produce a patch that makes the project's tests pass.
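As a rough illustration of the pass/fail rule for a single task, the sketch below treats a task as resolved only when every required test passes after the patch is applied. The `Task` class, its field names, and `is_resolved` are hypothetical names chosen for this example; they are not the benchmark's actual harness API.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical, simplified view of one benchmark task."""
    instance_id: str
    repo: str
    issue_text: str
    # Tests that fail before the fix and must pass afterwards (assumed field).
    fail_to_pass: list[str] = field(default_factory=list)
    # Tests that already pass and must keep passing (assumed field).
    pass_to_pass: list[str] = field(default_factory=list)


def is_resolved(task: Task, passed_tests: set[str]) -> bool:
    """Return True only if every required test is in the set of passed tests."""
    required = set(task.fail_to_pass) | set(task.pass_to_pass)
    return required <= passed_tests


demo = Task(
    instance_id="demo-1",
    repo="example/repo",
    issue_text="Function returns the wrong value for empty input.",
    fail_to_pass=["test_empty_input"],
    pass_to_pass=["test_basic"],
)
print(is_resolved(demo, {"test_empty_input", "test_basic"}))  # True
print(is_resolved(demo, {"test_empty_input"}))                # False
```

The second call shows why a patch that fixes the issue but breaks an existing test would not count as a success under this rule.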
How it is scored
The score is the percentage of issues for which the generated patch passes all hidden tests. This measures end-to-end agentic coding, not just code completion. Scores above 70% are state-of-the-art; a year ago, the state of the art was 30%.
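The arithmetic behind the headline number can be sketched as follows. The function name `resolved_rate` and the `outcomes` mapping are assumptions made for this example, as is the choice to count tasks with no recorded outcome as unresolved.

```python
def resolved_rate(outcomes: dict[str, bool], total_tasks: int = 500) -> float:
    """Percentage of tasks whose patch passed all hidden tests.

    `outcomes` maps a task id to True (resolved) or False (not resolved).
    Tasks missing from `outcomes` are counted as unresolved (an assumption).
    """
    if total_tasks <= 0:
        raise ValueError("total_tasks must be positive")
    if len(outcomes) > total_tasks:
        raise ValueError("more outcomes than tasks")
    resolved = sum(1 for ok in outcomes.values() if ok)
    return 100.0 * resolved / total_tasks


# Hypothetical run: 404 of 500 tasks resolved.
outcomes = {f"task-{i}": i < 404 for i in range(500)}
print(round(resolved_rate(outcomes), 1))  # 80.8
```

With 500 tasks, each resolved issue is worth 0.2 percentage points, so a gap of 0.2 between two entries corresponds to a single issue.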
Why it matters
SWE-bench Verified is the closest industry-standard benchmark to 'can this model actually do my job'. It rewards code-reading, multi-file editing, and test-driven iteration -- not just autocomplete.
Leaderboard (9 models)
Sorted by SWE-bench Verified score. The Tier column shows the tool's overall AIToolTier rank, which blends this benchmark with pricing, features, and real-world usability.
| # | Tool | Model | Tier | SWE-bench Verified score | Variant | Overall |
|---|---|---|---|---|---|---|
| 1 | Claude (Anthropic) | Claude Opus 4.7 (4.6 baseline scores shown; 4.7 announced 13% coding lift, 3x production task completion) | A | 80.8% | SWE-bench Verified | 8.5/10 |
| 2 | Gemini (Google) | Gemini 3.1 Ultra | A | 80.6% | SWE-bench Verified | 8.3/10 |
| 3 | MiniMax M2 / M2.5 | MiniMax M2.5 (230B/10B active MoE) | A | 80.2% | SWE-bench Verified | 8.4/10 |
| 4 | Kimi K2.5 (Moonshot) | Kimi K2.5 (1T/32B active MoE) | A | 78.5% | SWE-bench Verified | 8.1/10 |
| 5 | Codex (OpenAI) | GPT-5.3-Codex | A | 72% | SWE-bench Verified | 8.3/10 |
| 6 | ChatGPT | GPT-5.4 | A | 72% | SWE-bench Verified | 8.8/10 |
| 7 | Qwen (Alibaba) | Qwen3.5-397B MoE | A | 69.4% | SWE-bench Verified | 8.8/10 |
| 8 | DeepSeek | DeepSeek V3.2 | A | 67.8% | SWE-bench Verified | 8.0/10 |
| 9 | GLM / Z.ai (Zhipu AI) | GLM-5.1 (744B MoE / 40B active) | A | 64.2% | SWE-bench Verified | 8.0/10 |
About SWE-bench Verified
- Creator: Princeton & OpenAI, 2023 (Verified subset 2024)
- Unit: % (max 100)
- Official source: https://www.swebench.com/