NativeAIHub

K2.5 Architecture, Benchmarks, and Operating Modes


- Total model parameters: 1T
- Active per token (MoE): 32B
- Context window: (counter value not rendered)
- Expert modules: (counter value not rendered)
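
The "active per token" figure comes from Mixture-of-Experts routing: a small gating network scores every expert for each token, and only the top-k scorers actually run. A minimal sketch of top-k gating follows; the expert count (8) and k (2) are illustrative placeholders, not K2.5's actual configuration:

```python
import math

def top_k_gate(logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    # Indices of the k largest router logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over only the selected experts, so the gate weights sum to 1.
    exps = [math.exp(logits[i]) for i in top]
    total = sum(exps)
    return {i: e / total for i, e in zip(top, exps)}

# One token's router scores over 8 illustrative experts; only 2 are activated.
gates = top_k_gate([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3], k=2)
print(gates)  # weights for experts 1 and 4 only, summing to 1.0
```

Every expert's weights live in memory, but per token only the selected experts' parameters contribute compute, which is what keeps inference cheap relative to a dense model of the same total size.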

| Benchmark | K2.5 Score | Context |
|---|---|---|
| SWE-Bench Verified | 76.8 | Outperforms Gemini Pro 3.1 |
| SWE-Bench Multilingual | 73.0 | Outperforms GPT-5.2 and Gemini Pro 3.1 |
| AIME 2025 | 96.1 | Advanced math competition problems |
| GPQA Diamond | 87.6 | Graduate-level science questions |
| HLE (with tools) | 50.2 | At 76% lower cost than Claude Opus 4.6 |
| LiveCodeBench | 85.0 | Real-time coding evaluation |
| MathVision | 84.2 | Mathematical visual reasoning |
| OCRBench | 92.3 | Optical character recognition accuracy |
| VideoMMMU | 86.6 | Video understanding and reasoning |
| WeirdML | 46% | Unusual/edge-case reasoning (lower than competitors) |
| Mode | Speed | Best For |
|---|---|---|
| Instant | Fastest | Quick questions, simple tasks, casual conversation |
| Thinking | Medium | Math, logic, complex analysis, coding problems |
| Agent | Slower | Research, multi-step tasks, file processing |
| Agent Swarm (beta) | Variable | Deep research, large codebase analysis, comprehensive investigations |
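
In practice, mode choice is a per-request trade-off between latency and depth. The sketch below builds request payloads for each mode; the model identifier `"kimi-k2.5"` and the `"mode"` field are hypothetical placeholders for illustration, not documented K2.5 API details:

```python
# Hypothetical request builder illustrating per-request mode selection.
# The "kimi-k2.5" model name and "mode" field are assumptions, not a
# documented API surface.

MODES = {
    "instant":  "quick questions, simple tasks, casual conversation",
    "thinking": "math, logic, complex analysis, coding problems",
    "agent":    "research, multi-step tasks, file processing",
}

def build_request(prompt: str, mode: str) -> dict:
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    return {
        "model": "kimi-k2.5",   # hypothetical model identifier
        "mode": mode,           # hypothetical per-request mode switch
        "messages": [{"role": "user", "content": prompt}],
    }

req = build_request("Prove that sqrt(2) is irrational.", "thinking")
print(req["mode"])  # thinking
```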

Why MoE matters for cost

With only 32 billion parameters active per token out of 1 trillion total, K2.5 achieves frontier performance while using a fraction of the compute per inference. This is why API pricing can be $0.60/$3.00 per million tokens while matching models that cost 5x to 25x more.

K2.5 in third party tools: the Cursor controversy

In early 2026, it was revealed that Cursor uses Kimi K2.5 in its Composer 2 feature. The disclosure sparked debate about model attribution and transparency, with users questioning whether AI coding tools should clearly state which underlying models power their features. At the same time, it validated K2.5's coding capabilities at the frontier level: a leading AI code editor chose it for a core feature.