NativeAIHub

K2.5 Architecture, Benchmarks, and Operating Modes


- Total model parameters: 1T
- Active per token (MoE): 32B
- Context window: (counter value not rendered)
- Expert modules: (counter value not rendered)
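
The "active per token" figure comes from Mixture-of-Experts routing: a small gating network scores every expert for each token, and only the top-k scorers actually run. A minimal sketch of top-k gating follows; the expert count (8) and k (2) are illustrative placeholders, not K2.5's actual configuration:

```python
import math

def top_k_gate(logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    # Indices of the k largest router logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over only the selected experts, so the gate weights sum to 1.
    exps = [math.exp(logits[i]) for i in top]
    total = sum(exps)
    return {i: e / total for i, e in zip(top, exps)}

# One token's router scores over 8 illustrative experts; only 2 are activated.
gates = top_k_gate([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3], k=2)
print(gates)  # weights for experts 1 and 4 only, summing to 1.0
```

Every expert's weights live in memory, but per token only the selected experts' parameters contribute compute, which is what keeps inference cheap relative to a dense model of the same total size.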

| Benchmark | K2.5 Score | Context |
|---|---|---|
| SWE-Bench Verified | 76.8 | Outperforms Gemini Pro 3.1 |
| SWE-Bench Multilingual | 73.0 | Outperforms GPT-5.2 and Gemini Pro 3.1 |
| AIME 2025 | 96.1 | Advanced math competition problems |
| GPQA Diamond | 87.6 | Graduate-level science questions |
| HLE (with tools) | 50.2 | At 76% lower cost than Claude Opus 4.6 |
| LiveCodeBench | 85.0 | Real-time coding evaluation |
| MathVision | 84.2 | Mathematical visual reasoning |
| OCRBench | 92.3 | Optical character recognition accuracy |
| VideoMMMU | 86.6 | Video understanding and reasoning |
| WeirdML | 46% | Unusual/edge-case reasoning (lower than competitors) |
| Mode | Speed | Best For |
|---|---|---|
| Instant | Fastest | Quick questions, simple tasks, casual conversation |
| Thinking | Medium | Math, logic, complex analysis, coding problems |
| Agent | Slower | Research, multi-step tasks, file processing |
| Agent Swarm (beta) | Variable | Deep research, large codebase analysis, comprehensive investigations |
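
In practice, mode choice is a per-request trade-off between latency and depth. The sketch below builds request payloads for each mode; the model identifier `"kimi-k2.5"` and the `"mode"` field are hypothetical placeholders for illustration, not documented K2.5 API details:

```python
# Hypothetical request builder illustrating per-request mode selection.
# The "kimi-k2.5" model name and "mode" field are assumptions, not a
# documented API surface.

MODES = {
    "instant":  "quick questions, simple tasks, casual conversation",
    "thinking": "math, logic, complex analysis, coding problems",
    "agent":    "research, multi-step tasks, file processing",
}

def build_request(prompt: str, mode: str) -> dict:
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    return {
        "model": "kimi-k2.5",   # hypothetical model identifier
        "mode": mode,           # hypothetical per-request mode switch
        "messages": [{"role": "user", "content": prompt}],
    }

req = build_request("Prove that sqrt(2) is irrational.", "thinking")
print(req["mode"])  # thinking
```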

Why MoE matters for cost

With only 32 billion parameters active per token out of 1 trillion total, K2.5 achieves frontier performance while using a fraction of the compute per inference. This is why API pricing can be $0.60/$3.00 per million tokens while matching models that cost 5x to 25x more.

K2.5 in third party tools: the Cursor controversy

In early 2026, it was revealed that Cursor uses Kimi K2.5 in its Composer 2 feature. The disclosure sparked debate about model attribution and transparency, with users questioning whether AI coding tools should clearly state which underlying models power their features. At the same time, it validated K2.5's coding capabilities at the frontier level: a leading AI code editor chose it for a core feature.