
Kimi K2.5 (Moonshot)

A Tier · 8.1/10

Moonshot's 1T-parameter MoE open-weights flagship -- best open-source agentic coder, rivals Claude Opus 4.5

Last updated: 2026-04-13 · Free tier available

Score Breakdown

  • Ease of Use: 6.0
  • Output Quality: 9.0
  • Value: 8.5
  • Features: 9.0

Benchmark Scores

Benchmarks for Kimi K2.5 (1T/32B active MoE)

Chatbot Arena Elo (human preference rating): 1309

  • MMLU-Pro: 84.8%
  • GPQA Diamond: 80.5%
  • AIME 2025: 91.2%
  • SWE-Bench Verified: 78.5%
  • LiveCodeBench: 74.1%

Last updated: 2026-04-13

The Good and the Bad

What we like

  • Frontier-tier performance -- Elo 1309 on GDPval-AA, behind only OpenAI and Anthropic flagships
  • Beats Claude Opus 4.5 on several coding benchmarks per community testing
  • Unified thinking + non-thinking modes in one model (no need to swap)
  • 256K context window handles large codebases for agentic coding
  • Modified MIT license permits commercial use of weights
  • Native tool-use and agentic planning trained in -- not bolted on
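
The native tool-use above is typically exercised through an OpenAI-compatible chat-completions request. The sketch below only assembles such a request body; the model id `kimi-k2.5` and the `get_weather` tool are illustrative placeholders, not confirmed identifiers from Moonshot's docs.

```python
import json

def build_tool_call_request(user_prompt: str) -> dict:
    """Assemble a chat-completions body that advertises one callable tool.
    Model id and tool schema are hypothetical placeholders."""
    return {
        "model": "kimi-k2.5",  # placeholder model id
        "messages": [{"role": "user", "content": user_prompt}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Look up current weather for a city",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
        "tool_choice": "auto",  # let the model decide when to call the tool
    }

payload = build_tool_call_request("What's the weather in Berlin?")
print(json.dumps(payload, indent=2))
```

A model with tool-use "trained in" is expected to emit a structured `tool_calls` response when the prompt warrants it, rather than free-text that merely describes the call.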

What could be better

  • 1T parameter model is impractical to self-host without 4+ H100-class GPUs
  • Moonshot is a smaller lab than DeepSeek/Alibaba -- less Western infrastructure support
  • API pricing ($0.60 in / $3.00 out) is higher than DeepSeek V3.2 ($0.28 in / $0.42 out)
  • PRC content filters apply (Tiananmen, Taiwan, etc.)
  • Documentation is heavily Chinese-first -- English docs trail releases

Pricing

Self-hosted (Free)

$0
  • Modified MIT license -- commercial use allowed
  • Weights on Hugging Face
  • Fine-tuning permitted

API (Moonshot / OpenRouter)

$0.60 per 1M input tokens
  • K2.5-Reasoning: $0.60 in / $3.00 out
  • 256K context
  • Blended cost ~$1.07/M
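
The quoted ~$1.07/M blended figure is consistent with a roughly 4:1 input-to-output token mix; the exact ratio behind the published number is not stated, so the 80% input share below is an assumption:

```python
def blended_cost_per_million(in_price: float, out_price: float,
                             input_share: float) -> float:
    """Weighted average price per 1M tokens for a given input-token share."""
    return in_price * input_share + out_price * (1 - input_share)

# Kimi K2.5-Reasoning: $0.60 in / $3.00 out, assuming 80% input tokens
kimi = blended_cost_per_million(0.60, 3.00, 0.80)      # ~1.08
# DeepSeek V3.2 for comparison: $0.28 in / $0.42 out
deepseek = blended_cost_per_million(0.28, 0.42, 0.80)  # ~0.31
print(f"Kimi blended: ${kimi:.2f}/M · DeepSeek blended: ${deepseek:.2f}/M")
```

Under the same assumed mix, the DeepSeek comparison in "What could be better" works out to roughly a 3.5× blended-price gap.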

System Requirements

Hardware needed to self-host. Min = smallest viable setup (usually heavy quantization). Max = full-precision / production-grade.

Kimi K2.5 (1T total, 32B active MoE) -- practically a hosted-only model for most users; self-hosting requires enterprise hardware.
  • Min: 256 GB unified RAM Mac Studio M3 Ultra (Q2, ~3 tok/s)
  • Max: 8× H200 141 GB (FP8) or 16× H100 (production-grade)
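
The Min/Max figures follow from simple weight-storage arithmetic (total parameters × bits per weight), ignoring KV cache, activations, and runtime overhead, which all add more:

```python
def weight_memory_gb(total_params_b: float, bits_per_weight: float) -> float:
    """Approximate GB needed just to hold the weights (1 GB = 1e9 bytes).
    Excludes KV cache, activations, and runtime overhead."""
    return total_params_b * 1e9 * bits_per_weight / 8 / 1e9

# Kimi K2.5: ~1T total parameters (1000B)
print(weight_memory_gb(1000, 8))  # FP8: 1000 GB -> needs 8x H200 (1128 GB total)
print(weight_memory_gb(1000, 2))  # Q2:   250 GB -> fits 256 GB unified RAM, barely
```

This is why a Q2 quant squeezes into a 256 GB Mac Studio while full-precision serving needs a multi-GPU node.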

Known Issues

  • Self-hosting K2.5 at usable speed requires $30K+ in enterprise GPU hardware -- realistically this is a hosted-API model (Source: Reddit r/LocalLLaMA, llm-stats.com, 2026-03)
  • Early K2.5 releases had inconsistent tool-calling when quantized below Q4 -- community fixes landed March 2026 (Source: Hugging Face discussions, 2026-03)

Best for

Agentic coding workflows, tool-use agents, and teams willing to pay hosted-API prices for frontier-tier quality with open-weights licensing protection.

Not for

Solo developers or hobbyists who want to run models locally -- the 1T parameter size makes that impractical. Use Qwen3-Coder-Next or DeepSeek instead for self-hosting.

Our Verdict

Kimi K2.5 is the best open-weights model in the world right now for agentic coding. It legitimately rivals Claude Opus 4.5 and Gemini 3.1 Pro on practical coding tasks while being nominally 'open.' The catch is that the 1T parameter size makes it hosted-only for 99% of users. If you're picking between hosted APIs and you want maximum quality with open-weights safety, Kimi K2.5 is the top pick. If you need a model that actually runs on your hardware, look at Qwen3-Coder-Next or DeepSeek V3.2 instead.

Sources

  • Moonshot Kimi K2.5 release (accessed 2026-04-13)
  • Artificial Analysis GDPval-AA leaderboard (accessed 2026-04-13)
  • llm-stats.com (accessed 2026-04-13)
  • OpenRouter pricing (accessed 2026-04-13)
  • Reddit r/singularity, r/LocalLLaMA (accessed 2026-04-13)

Alternatives to Kimi K2.5 (Moonshot)

Llama 4 (Meta)

B Tier · 7.9/10

Meta's open-weights flagship family -- Scout (10M context), Maverick (multimodal 400B MoE), Behemoth in preview

Free tier · From $0
Highlights: Llama 4 Scout has a 10M token context wi… · Llama 4 Maverick is natively multimodal …
Updated 2026-04-13

Mistral AI

B Tier · 7.5/10

European AI lab with open and commercial models that punch well above their size

Free tier · From $0
Highlights: Extremely competitive API pricing -- Mis… · Open-weight models (Mistral 7B, Mixtral)…
Updated 2026-03-26

DeepSeek

A Tier · 8.0/10

Near-frontier reasoning for pennies on the dollar -- the open-source LLM that made Silicon Valley nervous

Free tier · From $0
Highlights: Pricing is absurdly cheap compared to GP… · DeepSeek-R1 reasoning model genuinely co…
Updated 2026-03-31

Gemma 4 (Google)

A Tier · 8.3/10

Google DeepMind's open-weights model family -- multimodal, 256K context, runs on edge devices

Free tier · From $0
Highlights: Apache 2.0 license -- truly permissive, … · Multimodal: handles text + image input (…
Updated 2026-04-08

Qwen (Alibaba)

A Tier · 8.8/10

Alibaba's open-weights family -- Qwen3.5, Qwen3-Coder-Next, Qwen3-VL, Qwen3-Max. Apache 2.0 flagship sizes.

Free tier · From $0
Highlights: Apache 2.0 license on the open sizes -- … · Qwen3-Coder-Next 80B-A3B runs on 8 GB VR…
Updated 2026-04-13

GLM / Z.ai (Zhipu AI)

A Tier · 8.0/10

Zhipu AI's open-weights family -- GLM-4.6 text flagship and GLM-4.6V multimodal, true MIT licensed

Free tier · From $0
Highlights: True MIT license -- one of the few front… · GLM-4.6 is SOTA among open models for ag…
Updated 2026-04-13

Nemotron (Nvidia)

B Tier · 7.8/10

Nvidia's open-weights family -- hybrid Mamba-Transformer MoE architecture, optimized for efficient reasoning on Nvidia hardware

Free tier · From $0
Highlights: Hybrid Mamba-Transformer architecture dr… · Nemotron 3 Super activates only 3.6B par…
Updated 2026-04-13

MiniMax M2 / M2.5

A Tier · 8.4/10

MiniMax's open-weights frontier -- first open model to match Claude Opus 4.6 on SWE-Bench at 10-20× lower cost

Free tier · From $0
Highlights: First open-weight model to hit 80.2% on … · ~10B active params during inference (out…
Updated 2026-04-13

Falcon (TII)

B Tier · 7.1/10

UAE's Technology Innovation Institute open-weights family -- Falcon 3 optimized for efficient sub-10B deployment on consumer hardware

Free tier · From $0
Highlights: Apache 2.0 license -- fully permissive f… · Sub-10B sizes run on any consumer GPU or…
Updated 2026-04-13