Kimi K2.5 (Moonshot)
A Tier · 8.1/10
Moonshot's 1T-parameter MoE open-weights flagship -- best open-source agentic coder, rivals Claude Opus 4.5
Score Breakdown
Benchmark Scores
Benchmarks for Kimi K2.5 (1T/32B active MoE)
| Benchmark | Description | Score |
|---|---|---|
| MMLU-Pro | Harder multi-subject reasoning | 84.8% |
| GPQA Diamond | Graduate-level science questions | 80.5% |
| AIME 2025 | Competition mathematics | 91.2% |
| SWE-Bench Verified | Real-world GitHub issue resolution | 78.5% |
| LiveCodeBench | Contamination-resistant coding problems | 74.1% |
Last updated: 2026-04-13
The Good and the Bad
What we like
- Frontier-tier performance -- Elo 1309 on GDPval-AA, behind only OpenAI and Anthropic flagships
- Beats Claude Opus 4.5 on several coding benchmarks per community testing
- Unified thinking + non-thinking modes in one model (no need to swap)
- 256K context window handles large codebases for agentic coding
- Modified MIT license permits commercial use of weights
- Native tool-use and agentic planning trained in -- not bolted on
What could be better
- 1T-parameter model is impractical to self-host without 4+ H100-class GPUs
- Moonshot is a smaller lab than DeepSeek/Alibaba -- less Western infrastructure support
- API pricing ($0.60 in / $3.00 out per 1M tokens) is higher than DeepSeek V3.2 ($0.28 in / $0.42 out)
- PRC content filters apply (Tiananmen, Taiwan, etc.)
- Documentation is heavily Chinese-first -- English docs trail releases
Pricing
Self-hosted (Free)
- Modified MIT license -- commercial use allowed
- Weights on Hugging Face
- Fine-tuning permitted
API (Moonshot / OpenRouter)
- K2.5-Reasoning: $0.60 in / $3.00 out per 1M tokens
- 256K context
- Blended cost ~$1.07 per 1M tokens (see the cost sketch below)
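The blended figure is just a weighted average of the split input/output rates. A minimal sketch, assuming a roughly 4:1 input:output token mix -- the exact mix behind the ~$1.07 figure isn't stated in our sources, so that ratio is an assumption that happens to reproduce it; the more common 3:1 convention lands higher:

```python
# Minimal sketch: blended $/M-token cost from split input/output pricing.
# The 4:1 input:output mix is an ASSUMPTION chosen to roughly reproduce the
# ~$1.07/M figure quoted above; a 3:1 mix gives ~$1.20/M.
IN_PRICE, OUT_PRICE = 0.60, 3.00  # $ per 1M tokens (K2.5-Reasoning)

def blended(in_price: float, out_price: float, in_share: float) -> float:
    """Weighted-average cost per 1M tokens for a given input-token share."""
    return in_price * in_share + out_price * (1.0 - in_share)

print(f"4:1 mix: ${blended(IN_PRICE, OUT_PRICE, 0.80):.2f}/M")  # ~$1.08
print(f"3:1 mix: ${blended(IN_PRICE, OUT_PRICE, 0.75):.2f}/M")  # ~$1.20
```

For agentic coding workloads, which tend to be output-heavy, the effective blended rate drifts toward the $3.00 output price.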
System Requirements
Hardware needed to self-host. Min = smallest viable setup (usually heavy quantization). Max = full-precision / production-grade.
| Model variant | Min | Max |
|---|---|---|
| Kimi K2.5 (1T total, 32B active MoE) -- practically a hosted-only model for most users; self-hosting requires enterprise hardware (memory math sketched below) | 256 GB unified RAM, e.g. Mac Studio M3 Ultra (Q2, ~3 tok/s) | 8× H200 141 GB (FP8) or 16× H100 (production-grade) |
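The Min column is easy to sanity-check from first principles: an MoE must keep all 1T weights resident even though only 32B are active per token, so weight memory ≈ total params × bits-per-weight ÷ 8. A rough sketch -- the bits-per-weight values are approximations, real quant schemes vary, and KV cache plus runtime overhead come on top:

```python
# Back-of-the-envelope weight-memory estimate for a 1T-parameter MoE.
# All 1T parameters must stay resident even though only 32B are active per
# token -- expert routing touches different experts on every token.
TOTAL_PARAMS = 1.0e12

def weight_gb(bits_per_weight: float) -> float:
    """Approximate weight memory in GB (excludes KV cache and activations)."""
    return TOTAL_PARAMS * bits_per_weight / 8 / 1e9

for name, bits in [("FP8", 8.0), ("Q4", 4.5), ("Q2", 2.0)]:
    print(f"{name:>4}: ~{weight_gb(bits):,.0f} GB")

# FP8: ~1,000 GB -- consistent with 8x H200 (8 x 141 GB = 1,128 GB total)
# Q2:  ~250 GB  -- why 256 GB unified RAM is the floor, with little headroom
```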
Known Issues
- Self-hosting K2.5 at usable speed requires $30K+ in enterprise GPU hardware -- realistically this is a hosted-API model (source: Reddit r/LocalLLaMA, llm-stats.com, 2026-03)
- Early K2.5 releases had inconsistent tool-calling when quantized below Q4 -- community fixes landed March 2026; a quick way to sanity-check tool calls is sketched below (source: Hugging Face discussions, 2026-03)
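Given that history, it's worth validating tool-call arguments before dispatching them, whichever endpoint serves the model. A hedged sketch using the OpenAI-compatible OpenRouter API -- the `moonshotai/kimi-k2.5` model slug and the `run_tests` tool are illustrative assumptions, not confirmed identifiers:

```python
# Sketch: exercise tool-calling via an OpenAI-compatible endpoint and verify
# the returned arguments parse as JSON before acting on them. The model slug
# below is an ASSUMPTION -- check your provider's model list.
import json
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_KEY",  # placeholder
)

tools = [{
    "type": "function",
    "function": {
        "name": "run_tests",  # hypothetical tool for illustration
        "description": "Run the project's test suite and return a summary.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]

resp = client.chat.completions.create(
    model="moonshotai/kimi-k2.5",  # assumed slug
    messages=[{"role": "user", "content": "Run the tests under ./src."}],
    tools=tools,
)

# The failure mode reported for sub-Q4 quants was malformed or truncated
# arguments, so parse before dispatching.
for call in resp.choices[0].message.tool_calls or []:
    try:
        args = json.loads(call.function.arguments)
        print("dispatch:", call.function.name, args)
    except json.JSONDecodeError:
        print("malformed tool call, retrying:", call.function.arguments)
```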
Best for
Agentic coding workflows, tool-use agents, and teams willing to pay hosted-API prices for frontier-tier quality with open-weights licensing protection.
Not for
Solo developers or hobbyists who want to run models locally -- the 1T-parameter size makes that impractical. For self-hosting, use Qwen3-Coder-Next or DeepSeek instead.
Our Verdict
Kimi K2.5 is the best open-weights model in the world right now for agentic coding. It legitimately rivals Claude Opus 4.5 and Gemini 3.1 Pro on practical coding tasks while being nominally 'open.' The catch is that the 1T-parameter size makes it hosted-only for 99% of users. If you're picking between hosted APIs and you want maximum quality with the protection of open-weights licensing, Kimi K2.5 is the top pick. If you need a model that actually runs on your hardware, look at Qwen3-Coder-Next or DeepSeek V3.2 instead.
Sources
- Moonshot Kimi K2.5 release (accessed 2026-04-13)
- Artificial Analysis GDPval-AA leaderboard (accessed 2026-04-13)
- llm-stats.com (accessed 2026-04-13)
- OpenRouter pricing (accessed 2026-04-13)
- Reddit r/singularity, r/LocalLLaMA (accessed 2026-04-13)
Alternatives to Kimi K2.5 (Moonshot)
Llama 4 (Meta)
Meta's open-weights flagship family -- Scout (10M context), Maverick (multimodal 400B MoE), Behemoth in preview
Mistral AI
European AI lab with open and commercial models that punch well above their size
DeepSeek
Near-frontier reasoning for pennies on the dollar -- the open-source LLM that made Silicon Valley nervous
Gemma 4 (Google)
Google DeepMind's open-weights model family -- multimodal, 256K context, runs on edge devices
Qwen (Alibaba)
Alibaba's open-weights family -- Qwen3.5, Qwen3-Coder-Next, Qwen3-VL, Qwen3-Max. Apache 2.0 flagship sizes.
GLM / Z.ai (Zhipu AI)
Zhipu AI's open-weights family -- GLM-4.6 text flagship and GLM-4.6V multimodal, true MIT licensed
Nemotron (Nvidia)
Nvidia's open-weights family -- hybrid Mamba-Transformer MoE architecture, optimized for efficient reasoning on Nvidia hardware
MiniMax M2 / M2.5
MiniMax's open-weights frontier -- first open model to match Claude Opus 4.6 on SWE-Bench at 10-20× lower cost
Falcon (TII)
UAE's Technology Innovation Institute open-weights family -- Falcon 3 optimized for efficient sub-10B deployment on consumer hardware