Skip to content

feat: detect always-thinking models and surface them as always-on in the model selector#613

Draft
7Sageer wants to merge 7 commits into
MoonshotAI:mainfrom
7Sageer:feat-always-thinking-capability
Draft

feat: detect always-thinking models and surface them as always-on in the model selector#613
7Sageer wants to merge 7 commits into
MoonshotAI:mainfrom
7Sageer:feat-always-thinking-capability

Conversation

@7Sageer

@7Sageer 7Sageer commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator

Related Issue

Follow-up to #610 — the problem is explained below.

Problem

Some models cannot run with thinking turned off: claude-fable-5 today (adaptive thinking is always on per the official docs), OpenAI o-series reasoning models, and Gemini 2.5 Pro. For these models the model selector still offers a thinking on/off toggle; selecting "off" silently runs with thinking anyway — the status line reads off while thinking tokens are billed.

The TUI already has the right rendering for this: thinkingAvailability returns 'always-on' (no toggle) when a model resolves to always_thinking. But nothing populated that capability automatically — users would have to hand-declare it per model alias.

What changed

Capabilities are resolved at read time through a single shared resolver, with kosong's ModelCapability as the single source of truth. Config objects stay pure declarations — nothing is materialized into models.<alias>.capabilities, so getConfig→setConfig round-trips persist snapshots verbatim and future corrections to the detection knowledge take effect without migrating persisted state.

kosong

  • ModelCapability gains an optional always_thinking flag, typed true (absent = toggleable; present implies thinking).
  • The Claude id parser moves to claude-version.ts, and isFableModel there is the single predicate shared by the anthropic wire layer (omitting thinking: disabled on Fable) and the capability registry — the advertised capability and the request-building behavior cannot drift. This covers vendor-prefixed ids like us.anthropic.claude-fable-5-v1:0 and version-less ids like claude-fable-latest.
  • OpenAI o-series and Gemini 2.5 Pro are marked always_thinking, matching their wire behavior (withThinking('off') omits reasoning_effort and the server still reasons; Gemini 2.5 Pro rejects thinking_budget: 0).
  • Catalog schema gains an always_reasoning extension field (models.dev-compatible: absent means false), mapped onto always_thinking — the route for any model that declares it through a catalog rather than kosong's built-in knowledge.
  • New getProviderModelCapability(type, model) export: a pure lookup of the same per-provider mapping ChatProvider.getCapability exposes, without constructing a provider.

agent-core

  • New resolveAliasCapabilities(providerType, alias): the single place declared capability strings and kosong detection meet. The merge is additive — declarations can add capabilities on top of detection but never veto it.
  • Session capability resolution (provider-manager) now goes through this resolver, replacing the previous capability probe that constructed a throwaway provider with a placeholder API key.

node-sdk / consumers

  • resolveAliasCapabilities is re-exported for UIs; the TUI model selector and ACP's model catalog call it instead of interpreting raw capabilities strings themselves.
  • ACP's AcpModelEntry carries alwaysThinking and the session config options omit the Off/On thinking select for such models — ACP's select arm has no read-only entry, and an "off" choice would misreport a model that reasons (and bills thinking tokens) regardless. Same dynamic-visibility rule as non-thinking models.
  • catalogModelToAlias and custom api.json registries materialize always_thinking alongside thinking for catalog-declared always-reasoning models, so consumers checking plain 'thinking' membership keep working.

With this, picking claude-fable-5 (or an o-series / Gemini 2.5 Pro alias) shows Thinking as always-on with no off toggle, and catalog-added always-reasoning models behave the same.

Checklist

  • I have read the CONTRIBUTING document.
  • I have linked a related issue, or explained the problem above.
  • I have added tests that prove my feature works.
  • Ran gen-changesets skill, or this PR needs no changeset.
  • Ran gen-docs skill, or this PR needs no doc update.

Review follow-ups (addressed in-place)

Earlier revisions materialized detected capabilities into the runtime config at load time and stripped them back out of write-back patches. Review found two problems with that: the strip logic re-implemented a slice of merge semantics, and its "declarations are preserved verbatim" contract did not actually hold for models kosong knows about. The latest revision removes the enrich/strip pair entirely in favor of read-time resolution — there is no detected state to leak to disk, so no strip is needed. It also replaces the capability probe in provider-manager with the new pure lookup, and normalization of capability strings now lives in exactly one place (the resolver).

A second review pass tightened the result further: the Fable matcher's prefix branch moved into isFableModel itself so the wire layer and the registry use one predicate (previously the registry was wider — a version-less id like claude-fable-latest advertised always-on while --thinking off would still send the explicit disabled config Fable 400s on); always_thinking is typed true so no consumer needs === true narrowing; the always-thinking registry constants derive from their toggleable base via spread; and the resolver's docs now state explicitly that unknown detection flattens to false, which is why permissive-by-default consumers (the TUI media-attachment gate) keep reading declared strings instead.

The same pass caught a regression this PR would have introduced on the ACP surface: deriveThinkingSupported turning true for Fable made ACP advertise an Off/On thinking select for models whose thinking cannot be turned off (previously they had no toggle at all, since nothing classified them as thinking) — and with defaultThinking unset the select even started on "Thinking Off". Fixed by carrying alwaysThinking on AcpModelEntry and omitting the thinking option for such models. The name-regex route (/thinking|reason/i) deliberately never sets alwaysThinking: it cannot tell always-on variants from toggleable ones, so those models keep their toggle as before.

A third pass hardened the resolution boundaries. listModelsFromHarness no longer dereferences config.providers outside its try block: getConfig crosses an RPC/stub boundary where a partial snapshot can omit the field the static type marks required, and the resulting TypeError escaped the catch (the harness mock goes back to omitting providers so that path stays exercised). The model selector's providers prop narrows to Record<string, Pick<ProviderConfig, 'type'>> — the component only reads the wire type, so its signature no longer claims credentials; callers keep passing their full ProviderConfig map via structural subtyping. isFableModel's version-less prefix branch anchors at a separator so ids merely containing the substring (claude-fabled-x) don't classify. And a cross-check test pins getProviderModelCapability to each ChatProvider.getCapability across every provider type, so the pure lookup and the instance path cannot silently drift apart.

Known limitation (intentionally out of scope)

Session-level thinking resolution (resolveThinkingLevel) still does not consult model capabilities: with default_thinking = false (or explicit --thinking off via CLI/ACP), an always-thinking model resolves to thinkingLevel: "off" — the status line reads off while the model runs adaptive thinking anyway (the request itself is correct; kosong omits the thinking field). This is pre-existing behavior from #610, orthogonal to the capability plumbing here, and touches the session resolution path — better handled as a focused follow-up that normalizes thinking level once the model alias is resolved.

@changeset-bot

changeset-bot Bot commented Jun 10, 2026

Copy link
Copy Markdown

🦋 Changeset detected

Latest commit: 8aa99f0

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 7 packages
Name Type
@moonshot-ai/kosong Minor
@moonshot-ai/agent-core Minor
@moonshot-ai/kimi-code-sdk Minor
@moonshot-ai/kimi-code-oauth Minor
@moonshot-ai/kimi-code Minor
@moonshot-ai/acp-adapter Patch
@moonshot-ai/migration-legacy Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@pkg-pr-new

pkg-pr-new Bot commented Jun 10, 2026

Copy link
Copy Markdown
pnpm dlx https://pkg.pr.new/@moonshot-ai/kimi-code@8aa99f0
npx https://pkg.pr.new/@moonshot-ai/kimi-code@8aa99f0

commit: 8aa99f0

7Sageer added 7 commits June 11, 2026 11:10
Some models cannot run with thinking turned off — claude-fable-5 today,
where adaptive thinking is always on. The TUI model selector already
renders an always-on state for aliases whose capabilities include
'always_thinking', but nothing populated it automatically.

- kosong: add ModelCapability.always_thinking; register claude-fable
  prefixes in a dedicated always-thinking capability group; map the
  catalog extension field always_reasoning onto it; export
  getProviderModelCapability for pure (no provider construction)
  capability lookups
- node-sdk: materialize always_thinking when converting catalog models
  to config aliases
- agent-core: enrich runtime model aliases with detected
  always_thinking in loadRuntimeConfig (runtime-only; write-back reads
  the file from disk, so nothing is persisted); pass the field through
  resolveModelCapabilities
Review follow-ups on the always-thinking capability flow:

- capabilities strings are now a refinement, not a replacement:
  always_thinking is always spelled out alongside thinking, in both
  producers (catalog materialization and runtime detection), so
  consumers checking plain 'thinking' membership (e.g. the ACP model
  catalog) keep working
- setKimiConfig strips detected-only capabilities from incoming
  patches before merging onto the on-disk config. Callers commonly
  write back models obtained from getConfig(), whose aliases were
  enriched at load time; persisting those would turn runtime detection
  into a user declaration that outlives future detection fixes.
  Capabilities the user or catalog actually declared on disk are
  preserved verbatim.
Replace the load-time enrich / write-time strip pair with
resolveAliasCapabilities: config objects stay pure declarations, and
detection from kosong's built-in model knowledge merges additively at
read time. Write-back paths persist snapshots verbatim, and future
corrections to the detection knowledge take effect without migrating
materialized state.

- Extract the Claude id parser into claude-version.ts, shared by the
  anthropic wire layer and the capability registry so the advertised
  capability cannot drift from request building; this also covers
  vendor-prefixed Fable ids like us.anthropic.claude-fable-5-v1:0.
- Mark OpenAI o-series and Gemini 2.5 Pro as always_thinking, matching
  their wire behavior (no off switch / minimum thinking budget).
- Let custom api.json registries declare always_reasoning, same
  contract as the models.dev catalog extension.
- Drop the capability probe in provider-manager: session resolution now
  goes through the same resolver instead of constructing a provider.
- Pass undefined directly instead of conditional spread for optional
  always_thinking / providers fields.
- isFableModel now owns the full prefix-or-parser predicate in
  claude-version.ts, so the wire layer (omit thinking: disabled, xhigh
  effort) and the capability registry classify version-less ids like
  claude-fable-latest identically instead of drifting apart.
- ModelCapability.always_thinking is typed `true` instead of boolean:
  absent means toggleable, which removes the === true narrowing at every
  consumer.
- The always-thinking registry constants derive from their toggleable
  base capability via spread, so the only stated difference is the flag.
- resolveAliasCapabilities documents that unknown detection flattens to
  false and that permissive-by-default consumers (the TUI media gate)
  must keep reading declared strings.
- listModelsFromHarness reads config.providers directly (the schema
  defaults it), fixing the one test stub that omitted it.
- Drop unnecessary casts from the model-selector test fixtures.
deriveThinkingSupported turning true for claude-fable-5 (via capability
resolution) had a regression as a side effect: ACP started advertising
an Off/On thinking select for models whose thinking cannot be turned
off — and with defaultThinking unset the toggle even started on
"Thinking Off" while the model reasons (and bills) regardless.

AcpModelEntry now carries alwaysThinking (set only by capability
resolution; the name-regex route cannot tell always-on variants from
toggleable ones and never sets it), and buildSessionConfigOptions omits
the thinking option for such models — the same dynamic-visibility rule
already used for non-thinking models, since ACP's select arm has no
read-only entry.
- listModelsFromHarness no longer dereferences config.providers outside
  its try block: getConfig crosses an RPC/stub boundary where a partial
  snapshot can omit the field the static type marks required, and the
  resulting TypeError escaped the catch. The harness mock goes back to
  omitting providers so the partial-stub path stays exercised.
- The model selector's providers prop narrows to
  Record<string, Pick<ProviderConfig, 'type'>>: the component only reads
  the wire type, so its signature should not claim credentials. Callers
  keep passing their full ProviderConfig map via structural subtyping.
- isFableModel's version-less prefix branch anchors at a separator so
  ids merely containing the substring don't classify as Fable.
- A cross-check test pins getProviderModelCapability to each
  ChatProvider's getCapability for every provider type, so the pure
  lookup and the instance path cannot silently drift apart.
Declared-capability fixtures now use generic provider/model names; which
providers ship which capabilities is the catalog's statement to make,
not the test suite's. Detection-unknown coverage moves to an
uncatalogued model name on a known wire type, which exercises the same
UNKNOWN_CAPABILITY path.
@7Sageer 7Sageer force-pushed the feat-always-thinking-capability branch from c4edb94 to 8aa99f0 Compare June 11, 2026 03:11
@7Sageer 7Sageer marked this pull request as draft June 11, 2026 08:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant