Industry · Jun 18, 2026

Kimi K2.7 Code: Moonshot AI's 1T Open-Source Coding Agent That Cuts Reasoning Tokens by 30%

A coding-specialist successor to K2.6 — better benchmarks, ~30% fewer reasoning tokens, and a permissive open-weight license

Douglas Lai

Most AI coding models are still scored on one-shot completions. Kimi K2.7 Code is built for something harder: long-horizon software engineering where correctness and cost are dominated by planning and orchestration, not autocomplete. It's Moonshot AI's latest open-source coding model — a 1-trillion-parameter Mixture-of-Experts (MoE) tuned specifically for autonomous agents — and it posts a reported 21.8% gain on Kimi Code Bench v2 over K2.6 while using roughly 30% fewer reasoning tokens. It ships under a permissive Modified MIT license with weights live on Hugging Face. (MarkTechPost)

This guide breaks down what K2.7 Code actually is, how it improves on K2.6, where it sits in the K2 family, its licensing and pricing, and the patterns builders are using to put it to work inside agent platforms like Eigent.

What Is Kimi K2.7 Code?

Kimi K2.7 Code is an open-weight, coding-specialized large language model from Moonshot AI, built on the K2.6 architecture but tuned explicitly for code generation, software-engineering workflows, and agentic tool use. Moonshot describes it not as a general conversational model but as a long-horizon coding agent designed to power "real work" inside editors, terminals, and multi-tool orchestration runtimes. (Kimi)

Under the hood, K2.7 Code uses a Mixture-of-Experts architecture with 1 trillion total parameters and about 32 billion active parameters per token — the same structural scale as K2.6, but with updated experts and routing tuned for coding. It ships with a 256k-token context window, letting agents keep large codebases, logs, and multi-step plans in context without aggressive chunking. (FAQ.com)
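
To make the 256k figure concrete, here is a rough sketch of a context-budget check an agent might run before packing a prompt. The 4-characters-per-token ratio, the exact 256,000-token limit, and the reserved output budget are all assumptions for illustration, not facts about the Kimi tokenizer or API; `estimate_tokens` and `fits_in_context` are hypothetical helpers.

```python
CONTEXT_WINDOW_TOKENS = 256_000  # "256k" from the article; the exact limit is an assumption
CHARS_PER_TOKEN = 4.0            # rough heuristic, not the real Kimi tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate derived from character count."""
    return int(len(text) / CHARS_PER_TOKEN) + 1


def fits_in_context(documents: dict[str, str],
                    reserved_for_output: int = 32_000) -> tuple[bool, int]:
    """Return (fits, estimated_prompt_tokens) for a set of named documents."""
    total = sum(estimate_tokens(body) for body in documents.values())
    return total <= CONTEXT_WINDOW_TOKENS - reserved_for_output, total


if __name__ == "__main__":
    # Placeholder contents standing in for source files and logs.
    repo = {"service.py": "x" * 400_000, "test.log": "y" * 200_000}
    ok, tokens = fits_in_context(repo)
    print(f"estimated prompt tokens: {tokens}, fits: {ok}")
```

In a real pipeline you would swap the character heuristic for the model's actual tokenizer, since code and logs often tokenize less efficiently than prose.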

The weights are open-sourced under a Modified MIT license, with artifacts on Hugging Face, and K2.7 Code is also reachable via the Kimi API and the Kimi Code product. That combination — frontier-class coding performance plus permissive licensing — makes it immediately relevant to teams building proprietary AI developers, code copilots, and autonomous agent platforms. It's the coding-focused cousin of the other open-weight flagships we've covered, like Zhipu's GLM-5.2 and DeepSeek V4 Pro. (Flowtivity)

Key Improvements Over K2.6 for Coding and Agents

Moonshot's own benchmarks and third-party writeups highlight three headline changes: better coding accuracy, higher agentic success, and lower reasoning-token usage. (Noqta)

+21.8% on Kimi Code Bench v2 vs K2.6 — K2.7 Code improves on Kimi's internal coding benchmark over K2.6, which already led many closed models on difficult public coding tasks. (MarkTechPost)
+11.0% on Program Bench and +31.5% on MLS Bench Lite — in Kimi's release thread, K2.7 Code is reported to outperform K2.6 on both, targeting program synthesis and multi-step reasoning around code. (Moonshot)
~30% fewer reasoning tokens — the model is explicitly optimized for "less overthinking," burning around 30% fewer reasoning tokens than K2.6 on comparable tasks while achieving better results. (FAQ.com)

For teams running deep tree-of-thought or tool-heavy agents, that last point is critical: cutting reasoning tokens by 30% directly lowers cost and latency for the same or better task-success rate. Instead of trading chain depth against cost, K2.7 Code aims to deliver both more efficient reasoning and higher benchmark scores. (Noqta)
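
How much of that 30% reaches the bill depends on how much of a run's output is reasoning in the first place. The sketch below works through the arithmetic, assuming reasoning tokens are billed as output tokens and the reduction applies uniformly; both are assumptions, and the function names are hypothetical.

```python
def output_tokens_after_reduction(reasoning_tokens: int, answer_tokens: int,
                                  reasoning_reduction: float = 0.30) -> float:
    """Total output tokens once reasoning shrinks by `reasoning_reduction`."""
    return reasoning_tokens * (1.0 - reasoning_reduction) + answer_tokens


def relative_saving(reasoning_tokens: int, answer_tokens: int,
                    reasoning_reduction: float = 0.30) -> float:
    """Fraction of total output tokens saved by the reasoning reduction."""
    before = reasoning_tokens + answer_tokens
    after = output_tokens_after_reduction(reasoning_tokens, answer_tokens,
                                          reasoning_reduction)
    return (before - after) / before


if __name__ == "__main__":
    # Hypothetical run where 70% of output tokens are reasoning.
    print(f"{relative_saving(7_000, 3_000):.1%} fewer output tokens")
```

With reasoning at 70% of output, a 30% cut in reasoning works out to about 21% fewer output tokens overall. The closer a workload is to pure reasoning, the closer the saving gets to the full 30%.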

How K2.7 Code Fits Into the K2 Family

It helps to see K2.7 Code as the latest "code agent" specialization on top of a rapidly evolving K2 line.

Kimi K2.5: visual agentic intelligence with swarm execution

K2.5 introduced Kimi's "visual agentic intelligence" story — a native multimodal model mixing language, code, and vision. Trained on around 15 trillion mixed visual and text tokens with a dedicated MoonViT vision encoder (~400M parameters), it excelled at turning screenshots, UI designs, and documents into working interfaces and structured outputs. (Hugging Face)

It also shipped Agent Swarm, a runtime coordinating up to ~100 sub-agents and ~1,500 tool calls per task, delivering about 4.5× faster execution than a single-agent setup on wide search workloads. Across HLE, BrowseComp, MMMU Pro, VideoMMMU, and SWE-Bench Verified, K2.5 hit state-of-the-art numbers among open-weight models. (InfoQ)
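
The speedup on wide search workloads comes from running independent sub-tasks concurrently rather than one after another. As a minimal sketch of that fan-out pattern (not Moonshot's Agent Swarm implementation), the example below caps concurrency with a semaphore. `run_sub_agent` is a hypothetical stand-in that sleeps instead of calling a model or tools.

```python
import asyncio


async def run_sub_agent(task_id: int) -> str:
    """Stand-in for one sub-agent; a real one would call a model and tools."""
    await asyncio.sleep(0.01)  # simulated I/O-bound work
    return f"result-{task_id}"


async def run_swarm(task_ids: list[int], max_parallel: int = 100) -> list[str]:
    """Run sub-agents concurrently, with at most `max_parallel` in flight."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def guarded(task_id: int) -> str:
        async with semaphore:
            return await run_sub_agent(task_id)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(guarded(t) for t in task_ids))


if __name__ == "__main__":
    results = asyncio.run(run_swarm(list(range(20)), max_parallel=5))
    print(len(results), results[0], results[-1])
```

The cap matters in practice: provider rate limits and tool backends usually bound how many sub-agents can usefully run at once.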

Kimi K2.6: 1T open-weights model for long-horizon agent swarms

K2.6 took those ideas further with a 1T-parameter MoE backbone (32B active) and a 256k context window, aimed squarely at long-horizon coding and large-scale agent swarms. It's open-weight under a Modified MIT license, supports multimodal input, and targets repo-scale refactoring, coding-driven design, and multi-hour research automation. (MyAIGuide)

Moonshot and partner analyses report that K2.6:

Outscores GPT-5.4 and Claude Opus-class models on SWE-Bench Pro and Humanity's Last Exam — the first open-weights model to claim the top spot on both simultaneously. (API易)
Can run autonomous jobs of 12 hours or more with up to ~300 parallel sub-agents and ~4,000 coordinated tool calls in a single run, maintaining a coherent plan and state across the swarm. (Halmob)
Is deployable on H100-class hardware with vLLM or SGLang, leveraging native INT4 weights and open licensing for cost-efficient self-hosting. (AllThings.how)
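
For a sense of why INT4 matters for self-hosting, here is a back-of-the-envelope estimate of memory for the weights alone. It assumes every one of the 1 trillion parameters is stored at 4 bits and that each GPU has 80 GB of memory; it ignores KV cache, activations, and any tensors kept at higher precision, so real deployments need more headroom. The helper names are hypothetical.

```python
import math


def weight_memory_gb(total_params: float, bits_per_param: float) -> float:
    """Memory for the weights alone, in decimal gigabytes."""
    return total_params * bits_per_param / 8 / 1e9


def min_gpus_for_weights(total_params: float, bits_per_param: float,
                         gpu_memory_gb: float) -> int:
    """Lower bound on GPU count needed just to hold the weights."""
    return math.ceil(weight_memory_gb(total_params, bits_per_param) / gpu_memory_gb)


if __name__ == "__main__":
    params = 1e12  # 1T total parameters, per the article
    for label, bits in (("INT4", 4), ("16-bit", 16)):
        gb = weight_memory_gb(params, bits)
        gpus = min_gpus_for_weights(params, bits, gpu_memory_gb=80)
        print(f"{label}: {gb:.0f} GB of weights, at least {gpus} x 80 GB GPUs")
```

Under these assumptions the INT4 weights come to roughly 500 GB, about a quarter of the 16-bit footprint, which is the difference between a single multi-GPU node and several.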

Kimi K2.7 Code: coding specialist on top of K2.6

K2.7 Code sits on this foundation as a coding-specialist successor to K2.6, preserving the 1T MoE scale and 256k context but re-optimizing for long-horizon code and agentic reasoning. Instead of trying to be the best general conversationalist, it leans into "coding agent" as its core identity. (MarkTechPost)

Model	Focus	Architecture / context	Agent swarm capability
K2.5	Visual agentic intelligence (vision + code + research)	1T MoE, ~32B active, 256k context, multimodal vision encoder	Swarm to ~100 sub-agents, ~1,500 tool calls
K2.6	Long-horizon coding + agent swarms	1T MoE, 32B active, 256k context, INT4, multimodal	Swarm to ~300 sub-agents, ~4,000 tool calls, 12+ hours
K2.7 Code	Coding-focused open-source agent	1T MoE, 32B active, 256k context, coding-tuned	Improved over K2.6, with fewer reasoning tokens

For teams already exploring K2.5/K2.6 as open-weights alternatives to GPT- and Claude-class APIs, K2.7 Code is less about a new architecture and more about better coding performance-per-token.

Licensing, Pricing, and Deployment Options

A major part of K2.7 Code's appeal is the combination of open weights and relatively permissive licensing.

Modified MIT license — permits large-scale commercial use with attribution as the main constraint. For enterprises and startups alike, that's materially more flexible than many "open" licenses that restrict competition or scale. (Flowtivity)
Open-weight distribution on Hugging Face — weights and deployment guides are available there, mirroring K2.5 and K2.6, making it straightforward to pull into vLLM, SGLang, or custom inference stacks. (Kimi)
API pricing — for hosted access, the Kimi API prices K2.7 Code at around $0.95 per million input tokens and $4.00 per million output tokens, competitive for large-context coding workloads. (FAQ.com)
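
To see what those list prices mean for an agent run, the sketch below totals cost across turns. It uses only the two prices quoted above and assumes no caching discounts or other pricing tiers. The turn sizes are hypothetical placeholders, not measurements.

```python
# Prices as quoted in this article for the Kimi API (USD per million tokens).
INPUT_USD_PER_MILLION = 0.95
OUTPUT_USD_PER_MILLION = 4.00


def turn_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one request at the quoted list prices."""
    return (input_tokens * INPUT_USD_PER_MILLION
            + output_tokens * OUTPUT_USD_PER_MILLION) / 1_000_000


def run_cost_usd(turns: list[tuple[int, int]]) -> float:
    """Cost of a multi-turn run; each turn is (input_tokens, output_tokens)."""
    return sum(turn_cost_usd(i, o) for i, o in turns)


if __name__ == "__main__":
    # Hypothetical run: 50 turns, each resending ~200k tokens of context.
    turns = [(200_000, 2_000)] * 50
    print(f"${run_cost_usd(turns):.2f}")
```

In this hypothetical run the total is about $9.90, and $9.50 of it is input. When an agent resends a large context every turn, input tokens rather than output tokens tend to dominate the bill.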

K2.6 is already hosted by multiple third-party providers (Novita, Baseten, Fireworks, Parasail), and K2.7 Code is expected to follow — reducing friction for experimentation and hybrid self-hosted/hosted setups. For regulated environments, the ability to start on the API and graduate to self-hosting under the same model family is a strong adoption story. (MyAIGuide)

Why K2.7 Code Matters for AI Coding Agents

Most AI coding tools to date have been single-session copilots: autocomplete, inline explanations, occasional refactors. K2.7 Code is explicitly designed for long-running, multi-tool, agent-style workflows, where cost and correctness are dominated by the planning horizon rather than one-shot completions. (Kimi) Three aspects stand out:

Long-horizon planning with 256k context. A 256k window lets agents hold full-project state — codebase snapshots, design docs, logs, test outputs — in a single prompt, instead of chunking aggressively and relying on brittle retrieval heuristics. That makes end-to-end tasks like "port this service from Node to Rust including tests and CI" or "rewrite this mobile app with a new design system" feasible in one orchestrated run. (Halmob)
Efficient reasoning for deep chains. The ~30% reduction in reasoning tokens over K2.6 makes deeper chains less painful — you can afford more tool calls, more introspection, and more branch-and-bound exploration without blowing up cost and latency. For autonomous developers that think in trees rather than lines, that efficiency gain is strategically important. (Moonshot)
Open weights in a frontier-class regime. K2.6 already matched or beat GPT-5-class models on SWE-Bench Pro and HLE while staying open-weight; K2.7 Code inherits that lineage while targeting coding and agent tasks specifically. Teams no longer have to accept a big capability gap to get open-source flexibility. (API易)

In practice, that makes K2.7 Code an attractive backbone for:

Autonomous "AI dev" agents that own a repo over weeks rather than minutes.
CI/CD-integrated agents that triage, fix, and validate issues across multiple services.
Internal platform agents that scaffold new products, perform migrations, or enforce cross-cutting concerns like observability and security.

How to Start Evaluating Kimi K2.7 Code

If you're already experimenting with open-weights coding models, integrating K2.7 Code can be a focused evaluation sprint rather than a full redesign.

Swap K2.6 ↔ K2.7 Code in existing pipelines. If you already use K2.6 for long-horizon coding or agent experiments, drop K2.7 Code into the same flows (multi-step refactors, multi-agent code reviews, docs + tests generation) and measure task success, tool-call count, and token usage. (Noqta)
Test realistic "12-hour agent" scenarios, not toy tasks. Use the swarm-oriented design for what it's built for: multi-hour jobs spanning planning, coding, running, and debugging across services and repos. The more your eval looks like real work, the more the long-horizon and efficiency wins show up. (i-SCOOP)
Decide your hosting strategy early. Start on the Kimi API for quick iteration, then plan a migration path to self-hosting on H100-class hardware if you need strict data residency or cost control at scale. The Modified MIT license and Hugging Face distribution are deliberately aligned with that trajectory. (AllThings.how)
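
The first step above can be sketched as a small comparison harness that aggregates the three suggested metrics per model. Everything here is hypothetical: the `RunRecord` fields, the model labels (which are not API model identifiers), and the sample numbers, which are placeholders rather than measured results.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class RunRecord:
    model: str          # free-form label, not an API model id
    task_id: str
    succeeded: bool
    tool_calls: int
    output_tokens: int


def summarize(records: list[RunRecord]) -> dict[str, dict[str, float]]:
    """Aggregate success rate, tool calls, and output tokens per model."""
    by_model: dict[str, list[RunRecord]] = {}
    for record in records:
        by_model.setdefault(record.model, []).append(record)
    return {
        model: {
            "success_rate": mean(1.0 if r.succeeded else 0.0 for r in runs),
            "avg_tool_calls": mean(r.tool_calls for r in runs),
            "avg_output_tokens": mean(r.output_tokens for r in runs),
        }
        for model, runs in by_model.items()
    }


if __name__ == "__main__":
    # Placeholder data only; replace with records from your own runs.
    records = [
        RunRecord("k2.6", "refactor-1", True, 10, 1_000),
        RunRecord("k2.6", "refactor-2", False, 14, 3_000),
        RunRecord("k2.7-code", "refactor-1", True, 8, 700),
        RunRecord("k2.7-code", "refactor-2", True, 12, 2_100),
    ]
    for model, stats in summarize(records).items():
        print(model, stats)
```

Running both models on the same task IDs keeps the comparison paired, which matters more than sample size when tasks vary widely in difficulty.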

What K2.7 Code Means for Engineering and Product Teams

Kimi K2.7 Code signals that frontier-class coding and agent models are no longer the exclusive domain of closed APIs — open weights with flexible licensing can now credibly compete on both benchmarks and real-world workflows. For engineering and product teams, that opens a new design space: agents deeply embedded into infrastructure and culture without being locked into a single vendor. (MyAIGuide)

If you're building developer platforms, autonomous coding agents, or AI coworkers, K2.7 Code is worth treating not as "another model" but as a candidate default backbone — especially when long-horizon work, multi-tool orchestration, and on-prem or VPC deployment are non-negotiable. (Halmob)

This is exactly the case for model-agnostic, multi-agent infrastructure. The model landscape moves fast, and the platforms that win are the ones that can slot in a coding specialist like K2.7 Code for the work it's best at — without re-architecting the whole stack. If that's the kind of foundation you're building on, explore how the open-source, multi-agent platform Eigent lets you orchestrate specialized models across real-world workflows.

Frequently Asked Questions

What is Kimi K2.7 Code?

Kimi K2.7 Code is Moonshot AI's open-weight, coding-specialized large language model — a 1-trillion-parameter Mixture-of-Experts (32B active per token) with a 256k-token context window, built on the K2.6 architecture but tuned for code generation, software-engineering workflows, and agentic tool use. It's released under a Modified MIT license with weights on Hugging Face.

How is K2.7 Code better than K2.6?

K2.7 Code reports a 21.8% improvement on Kimi Code Bench v2 over K2.6, plus +11.0% on Program Bench and +31.5% on MLS Bench Lite — while using roughly 30% fewer reasoning tokens on comparable tasks. The net effect is better coding accuracy and agentic success at lower cost and latency.

What does "30% fewer reasoning tokens" mean in practice?

K2.7 Code is optimized to "overthink" less, so it reaches better results with shorter reasoning chains. For tool-heavy or tree-of-thought agents, that translates directly into lower token cost and latency for the same — or higher — task-success rate, making deeper chains and more tool calls affordable.

How much does Kimi K2.7 Code cost?

Via the Kimi API, K2.7 Code is priced at roughly $0.95 per million input tokens and $4.00 per million output tokens. Because the weights are open under a Modified MIT license, teams can also self-host on H100-class hardware with vLLM or SGLang for cost control at scale.

Is Kimi K2.7 Code open source?

Yes, in the open-weight sense. The weights are distributed on Hugging Face under a Modified MIT license that permits large-scale commercial use with attribution. The model is also available via the Kimi API and the Kimi Code product, with third-party hosting expected to follow K2.6's providers.

Can I use Kimi K2.7 Code with Eigent?

Yes. Eigent's model-agnostic, multi-agent architecture lets you route coding and long-horizon tasks to K2.7 Code through its MCP tools and Skills framework — using its 256k context and token-efficient reasoning for repo-scale work while keeping other models for routine tasks.
