Kimi K2 Thinking

Kimi K2 Thinking is an open-source model that operates as a "thinking agent," reasoning step-by-step while using tools to achieve state-of-the-art performance on various benchmarks. It is capable of executing up to 200-300 sequential tool calls without human intervention, allowing it to solve complex problems across a wide range of tasks. The model uses Quantization-Aware Training (QAT) to support INT4 inference, which provides a roughly 2x improvement in generation speed.

View model card in Model Garden

Model ID kimi-k2-thinking-maas
Launch stage GA
Supported inputs & outputs
  • Inputs:
    Text, Documents
  • Outputs:
    Text
Capabilities
Usage types
Versions
  • Kimi K2 Thinking
    • Launch stage: GA
    • Release date: Nov 13, 2025
Supported regions

Model availability

  • United States
    • global

ML processing

  • United States
    • Multi-region
Limits

global:

  • Max output: 262144
  • Context length: 262144

Pricing See Pricing.