
MiniMax M2 / M2.5

A Tier · 8.4/10

MiniMax's open-weights frontier model -- the first open model to match Claude Opus 4.6 on SWE-Bench Verified at 10-20× lower cost

Last updated: 2026-04-13
Free tier available

Score Breakdown

  • Ease of Use: 6.5
  • Output Quality: 9.0
  • Value: 9.5
  • Features: 8.5

Benchmark Scores

Benchmarks for MiniMax M2.5 (230B/10B active MoE)

  • MMLU-Pro: 82.1%
  • GPQA Diamond: 76.8%
  • SWE-Bench Verified: 80.2%
  • HumanEval: 91%
  • AIME 2025: 85.3%


The Good and the Bad

What we like

  • First open-weight model to hit 80.2% on SWE-Bench Verified -- matching Claude Opus 4.6
  • ~10B active params during inference (out of 230B total) make it fast and cheap to run
  • MIT license with zero commercial restrictions
  • Native agentic and tool-use training -- not bolted on
  • Per-layer QK-Norm plus full-attention blocks keep long-context behavior stable
  • 10-20× cheaper than closed frontier models at similar quality, per Bytebot's analysis
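On the QK-Norm point: normalizing query and key vectors before the dot product bounds the attention logits no matter how large raw activations grow at long context. A minimal numpy sketch of the idea (an illustration of the general technique, not MiniMax's exact implementation):

```python
import numpy as np

def rms_norm(x, eps=1e-6):
    # RMS-normalize the last dimension (each per-head q/k vector)
    return x / np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)

def qk_norm_logits(q, k, scale=1.0):
    # Normalize queries and keys before the dot product, so each logit
    # is a scaled cosine-like similarity, bounded by scale * head_dim
    # regardless of activation magnitude.
    qn, kn = rms_norm(q), rms_norm(k)
    return scale * (qn @ kn.T)

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 64)) * 50.0   # deliberately huge activations
k = rng.normal(size=(8, 64)) * 50.0
logits = qk_norm_logits(q, k)
print(float(np.abs(logits).max()))    # bounded by head_dim = 64
```

Without the normalization, logits here would scale with the raw activation magnitude (~2500× larger), which is exactly the long-context instability the bullet above refers to.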

What could be better

  • Smaller Western community than Qwen/DeepSeek -- tutorials sparse
  • Ollama support arrived late -- community relied on vLLM for months
  • English writing tone is noticeably less polished than Claude or Mistral
  • PRC content filters apply
  • MiniMax as a lab is less well-known than Alibaba or DeepSeek -- some enterprise buyers hesitate

Pricing

Self-hosted (Free)

$0
  • MIT license on M2 and M2.5
  • Weights on Hugging Face
  • Commercial use permitted

API (OpenRouter / MiniMax)

$0.30 per 1M input tokens
  • M2: $0.30 in / $1.20 out
  • 192K+ context
  • Native agentic + tool-use
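At those rates, per-request cost is simple arithmetic. A quick sketch (rates from the M2 pricing above; the token counts in the example are hypothetical):

```python
M2_INPUT_PER_M = 0.30    # USD per 1M input tokens (M2 rate listed above)
M2_OUTPUT_PER_M = 1.20   # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request at the M2 API rates."""
    return (input_tokens * M2_INPUT_PER_M
            + output_tokens * M2_OUTPUT_PER_M) / 1_000_000

# e.g. one long agentic turn: 120K tokens in, 4K tokens out
print(round(request_cost(120_000, 4_000), 4))  # 0.0408
```

So a heavy 120K-token agentic turn costs about four cents, which is where the 10-20× savings versus closed frontier models comes from.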

System Requirements

Hardware needed to self-host. Min = smallest viable setup (usually heavy quantization). Max = full-precision / production-grade.

  • MiniMax M2 / M2.5 (230B total, ~10B active MoE) -- sparse MoE activates only ~10B params during inference, so tok/s is fast on moderate hardware. Min: 96 GB unified RAM at Q3 (Mac M3 Ultra). Max: 4× A100 80 GB at FP8.
  • MiniMax M1 (hybrid-attention reasoning predecessor). Min: 96 GB unified RAM at Q3. Max: 4× A100 80 GB at FP8.
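The Min/Max figures follow from weight size at a given precision: bytes ≈ total params × bits ÷ 8, before KV cache and runtime overhead. A back-of-the-envelope sketch (230B total params from the spec above; overhead is deliberately ignored):

```python
def weight_gb(total_params: float, bits_per_param: float) -> float:
    # Raw weight memory in GB (10^9 bytes) at a given quantization width.
    # Excludes KV cache and runtime overhead, so real usage is higher.
    return total_params * bits_per_param / 8 / 1e9

PARAMS = 230e9                           # M2/M2.5 total parameters
print(round(weight_gb(PARAMS, 3), 1))    # ~86.2 GB at Q3  -> fits 96 GB unified RAM
print(round(weight_gb(PARAMS, 8), 1))    # 230.0 GB at FP8 -> needs ~4x A100 80 GB (320 GB)
```

Both numbers line up with the table: Q3 weights leave ~10 GB of a 96 GB Mac for KV cache and the OS, and FP8 weights need the pooled 320 GB of four A100s.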

Known Issues

  • M2's initial release required a custom vLLM build -- community quants took 2-3 weeks to stabilize. Source: GitHub MiniMax-AI/MiniMax-M2, Hugging Face discussions · 2026-02
  • Per-layer QK-Norm is non-standard -- some inference backends had subtle bugs at long context. Source: Reddit r/LocalLLaMA · 2026-03

Best for

Agentic coding and tool-use workflows on a budget. Best price-to-SWE-Bench ratio of any open-weights model in 2026.
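Since the API is served through OpenAI-compatible endpoints (OpenRouter / MiniMax), an agentic tool-use call is just a standard chat-completions request body with declared tools. A sketch of building that body -- note the model id and the example tool are illustrative assumptions, not confirmed values:

```python
import json

def build_tool_call_request(user_prompt: str) -> dict:
    # OpenAI-compatible chat-completions body declaring one tool.
    # "minimax/minimax-m2.5" is an assumed OpenRouter-style id; the
    # run_tests tool is a hypothetical example, not a real API.
    return {
        "model": "minimax/minimax-m2.5",
        "messages": [{"role": "user", "content": user_prompt}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "run_tests",
                "description": "Run the project's test suite and report failures",
                "parameters": {
                    "type": "object",
                    "properties": {"path": {"type": "string"}},
                    "required": ["path"],
                },
            },
        }],
    }

body = build_tool_call_request("Fix the failing test in tests/test_auth.py")
print(json.dumps(body, indent=2)[:80])
```

You would POST this body to the provider's chat-completions endpoint with your API key; because M2/M2.5 is trained natively for tool use, no extra prompt scaffolding is needed to get well-formed tool calls back.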

Not for

Teams that prioritize polished English writing (Mistral Large 3 or Claude are better), or anyone who needs the deepest ecosystem support (Llama is still that).

Our Verdict

MiniMax M2/M2.5 is the most cost-efficient frontier-tier open model in 2026. The 80.2% SWE-Bench Verified score is a genuine breakthrough -- matching Claude Opus 4.6 on real coding tasks at a tenth of the price. The sparse 10B-active MoE runs fast on moderate hardware. The main drawback is ecosystem: MiniMax has less Western infrastructure support than Alibaba or DeepSeek. If you're building an agentic product and want maximum value per token, M2.5 is an A-tier pick.

Sources

  • Artificial Analysis MiniMax M2 benchmarks (accessed 2026-04-13)
  • Bytebot MiniMax M2.5 analysis (accessed 2026-04-13)
  • Hugging Face MiniMaxAI collection (accessed 2026-04-13)
  • GitHub MiniMax-AI/MiniMax-M2 (accessed 2026-04-13)
  • OpenRouter pricing (accessed 2026-04-13)

Alternatives to MiniMax M2 / M2.5

Llama 4 (Meta)

Meta's open-weights flagship family -- Scout (10M context), Maverick (multimodal 400B MoE), Behemoth in preview

B Tier · 7.9/10 · Free tier · From $0
  • Llama 4 Scout has a 10M token context wi...
  • Llama 4 Maverick is natively multimodal ...
Updated 2026-04-13

Mistral AI

European AI lab with open and commercial models that punch well above their size

B Tier · 7.5/10 · Free tier · From $0
  • Extremely competitive API pricing -- Mis...
  • Open-weight models (Mistral 7B, Mixtral)...
Updated 2026-03-26

DeepSeek

Near-frontier reasoning for pennies on the dollar -- the open-source LLM that made Silicon Valley nervous

A Tier · 8.0/10 · Free tier · From $0
  • Pricing is absurdly cheap compared to GP...
  • DeepSeek-R1 reasoning model genuinely co...
Updated 2026-03-31

Gemma 4 (Google)

Google DeepMind's open-weights model family -- multimodal, 256K context, runs on edge devices

A Tier · 8.3/10 · Free tier · From $0
  • Apache 2.0 license -- truly permissive, ...
  • Multimodal: handles text + image input (...
Updated 2026-04-08

Qwen (Alibaba)

Alibaba's open-weights family -- Qwen3.5, Qwen3-Coder-Next, Qwen3-VL, Qwen3-Max. Apache 2.0 flagship sizes.

A Tier · 8.8/10 · Free tier · From $0
  • Apache 2.0 license on the open sizes -- ...
  • Qwen3-Coder-Next 80B-A3B runs on 8 GB VR...
Updated 2026-04-13

GLM / Z.ai (Zhipu AI)

Zhipu AI's open-weights family -- GLM-4.6 text flagship and GLM-4.6V multimodal, true MIT licensed

A Tier · 8.0/10 · Free tier · From $0
  • True MIT license -- one of the few front...
  • GLM-4.6 is SOTA among open models for ag...
Updated 2026-04-13

Kimi K2.5 (Moonshot)

Moonshot's 1T-parameter MoE open-weights flagship -- best open-source agentic coder, rivals Claude Opus 4.5

A Tier · 8.1/10 · Free tier · From $0
  • Frontier-tier performance -- Elo 1309 on...
  • Beats Claude Opus 4.5 on several coding ...
Updated 2026-04-13

Nemotron (Nvidia)

Nvidia's open-weights family -- hybrid Mamba-Transformer MoE architecture, optimized for efficient reasoning on Nvidia hardware

B Tier · 7.8/10 · Free tier · From $0
  • Hybrid Mamba-Transformer architecture dr...
  • Nemotron 3 Super activates only 3.6B par...
Updated 2026-04-13

Falcon (TII)

UAE's Technology Innovation Institute open-weights family -- Falcon 3 optimized for efficient sub-10B deployment on consumer hardware

B Tier · 7.1/10 · Free tier · From $0
  • Apache 2.0 license -- fully permissive f...
  • Sub-10B sizes run on any consumer GPU or...
Updated 2026-04-13