K2.5 Architecture, Benchmarks, and Operating Modes


At a glance:

- 1T total model parameters
- 32B active per token (MoE)

| Benchmark | K2.5 Score | Context |
| --- | --- | --- |
| SWE-Bench Verified | 76.8 | Outperforms Gemini Pro 3.1 |
| SWE-Bench Multilingual | 73.0 | Outperforms GPT-5.2 and Gemini Pro 3.1 |
| AIME 2025 | 96.1 | Advanced math competition problems |
| GPQA Diamond | 87.6 | Graduate-level science questions |
| HLE (with tools) | 50.2 | At 76% lower cost than Claude Opus 4.6 |
| LiveCodeBench | 85.0 | Real-time coding evaluation |
| MathVision | 84.2 | Mathematical visual reasoning |
| OCRBench | 92.3 | Optical character recognition accuracy |
| VideoMMMU | 86.6 | Video understanding and reasoning |
| WeirdML | 46% | Unusual/edge-case reasoning (lower than competitors) |
| Mode | Speed | Best For |
| --- | --- | --- |
| Instant | Fastest | Quick questions, simple tasks, casual conversation |
| Thinking | Medium | Math, logic, complex analysis, coding problems |
| Agent | Slower | Research, multi-step tasks, file processing |
| Agent Swarm (beta) | Variable | Deep research, large codebase analysis, comprehensive investigations |

Why MoE matters for cost

With only 32 billion parameters active per token out of 1 trillion total, K2.5 achieves frontier performance while using a fraction of the compute per inference. This is why API pricing can be $0.60/$3.00 per million tokens while matching models that cost 5x to 25x more.
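The arithmetic behind this claim can be sketched in a few lines. The parameter counts and the $0.60/$3.00 pricing come from this section; treating $0.60 as the input rate and $3.00 as the output rate, and using a flat 5x competitor multiple, are assumptions for illustration:

```python
# Back-of-the-envelope MoE cost arithmetic using figures from the text.
# Assumptions: 1T total params, 32B active per token, and the $0.60/$3.00
# per-million-token price interpreted as input/output rates.

TOTAL_PARAMS = 1_000_000_000_000   # 1T total parameters
ACTIVE_PARAMS = 32_000_000_000     # 32B routed to each token

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"Active fraction per token: {active_fraction:.1%}")  # 3.2%

# Price per million tokens (input, output)
k25_input, k25_output = 0.60, 3.00
# Hypothetical competitor at the low end of the article's 5x-25x range
rival_input, rival_output = k25_input * 5, k25_output * 5

def job_cost(input_mtok: float, output_mtok: float,
             in_price: float, out_price: float) -> float:
    """Cost in dollars for a job sized in millions of input/output tokens."""
    return input_mtok * in_price + output_mtok * out_price

# Example workload: 10M input tokens, 2M output tokens
print(job_cost(10, 2, k25_input, k25_output))    # 12.0
print(job_cost(10, 2, rival_input, rival_output))  # 60.0
```

Only about 3% of the network's weights participate in any single forward pass, which is what lets per-token pricing sit well below that of comparably sized dense models.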