feat: opt-in --selector review mode (in-session classify + adjudicate, zero spawns) by bborbe · Pull Request #51 · bborbe/coding

bborbe · 2026-06-11T17:01:11Z

Summary

Adds an opt-in --selector mode to /coding:pr-review and /coding:code-review that replaces the per-owner Task fanout (Step 4b-ii) with two in-session steps:

Step 4c-sel CLASSIFY — narrows the Step 4b-i glob candidates under a recall-optimized contract (include when uncertain; skip reasons must be justified against the rule's applies_when, never the rule-id name). Hard invariant: applicable ⊆ candidates.
Step 4d-sel ADJUDICATE — reads only the applicable rules' ### RULE blocks, judges the full diff + mechanical findings, emits findings in the existing severity buckets. Batches into 2-3 in-session passes when applicable set > 20.

Default mode is byte-for-byte unchanged — pure insertions; selector is opt-in until validated further (later migration step flips the default).

Spec: specs/selector-mode-classify-adjudicate.md (audited; in this PR). Architecture: AI-Selector Review Redesign (vault).

Validation (live end-to-end, isolated CLAUDE_CONFIG_DIR)

Ran claude -p "/coding:pr-review master selector" against the perpetual fixture bborbe/maintainer#2, diffed against the frozen legacy golden (13 findings from real per-owner dispatch):

Check	Result
All 13 golden rule_ids accounted for	✅ 7 findings reproduced + 3 legacy false positives correctly rejected + 2 validly excluded
Zero sub-agent spawns (Task tool was available)	✅ transcript has no `subagent_type` events
Narrowing invariant + real narrowing	✅ 31 candidates → 9 applicable → 22 skipped with applies_when-grounded reasons
Wall time / cost	✅ 3m24s / $1.18 (legacy bot: 12-15 min on a smaller PR)
`make precommit`	✅ 13/13; scenarios untouched

Round 1 caught a real classify false negative (go-testing/libtime-injection-required skipped as "not a test file" — rule-id-prefix inference); fixed by the skip-justification rule (2642626), round 2 clean.

Incidental discovery: ast-grep-runner.sh scans .git/COMMIT_EDITMSG (274 stale agent-auditor findings) — follow-up, not addressed here.

🤖 Generated with Claude Code

…cate, zero sub-agent spawns)

…never the rule-id name

… released section

…o, resolve validator path via plugin-root chain, mark canonical/sibling selector sections

ben-s-pull-request-reviewer

Review Summary

Step	Status
Worktree setup	ALREADY_AT_HEAD — used cwd as REVIEW_DIR
Diff	5 files / +382 lines: `CHANGELOG.md`, `commands/code-review.md`, `commands/pr-review.md`, `specs/fixtures/golden-legacy-verdict.json`, `specs/selector-mode-classify-adjudicate.md`
Project detection	Plugin repo (no go.mod / pyproject.toml); review covers markdown+specs only
Mechanical funnel (Step 4a)	unavailable in this environment — `scripts/ast-grep-runner.sh` was blocked from execution by the Bash sandbox; all changed files are markdown/specs (no `.go` / `.py` surface), so the YAML funnel would have zero hits anyway. Proceeded to Step 4b with judgment-rule triggers only.
Active judgment rules (Step 4b-i)	`agent-cmd/command-frontmatter` (slash-command-auditor), `agent-cmd/command-thin` (slash-command-auditor), `changelog/conventional-prefix-required` (agent-auditor)
Owners spawned	`coding:slash-command-auditor`, `coding:agent-auditor` (concurrency = 2)
Owner reports	Both: zero findings. Frontmatter is unchanged; new body prose follows the existing dispatcher pattern. The `feat:` prefix matches the changelog rule's allowed set.
Citation validation (Step 4d)	Skipped — no agent findings to validate. Manual check: all 13 `rule_id`s in the golden baseline resolve in `rules/index.json`; the new selectors do not introduce any new `rule_id`s.
precommit (Step 3b)	Skipped (standard mode) — CI covers lint+test

Manual Review Notes (diff-only)

The selector section is duplicated across both command files with explicit CANONICAL / SIBLING HTML comments and a "diff to verify" instruction. Sibling drift is structural, not introduced by this PR.
The HARD INVARIANT and the architecture-tier bypass are well-documented; the LLM-discipline surface is acknowledged in the spec (non-goal: no machine check before adjudication).
The golden baseline is checked in at specs/fixtures/golden-legacy-verdict.json; all 13 rule_ids are valid against rules/index.json (citation-validated by hand).
The 9 plan concerns from the task were reviewed; 4 are addressed by the spec's prose and 5 are explicitly deferred to the default-flip migration step (sibling extraction, schema validation of the golden, tunable batching, etc.).

{
  "verdict": "approve",
  "summary": "Opt-in `--selector` mode is well-bounded: default behavior is byte-for-byte preserved, mechanical funnel and judgment-rule dispatch still run, golden baseline is committed and citation-validates against `rules/index.json`, and both per-owner adjudicators (slash-command-auditor, agent-auditor) returned zero findings. Optional nits flagged below; none rise to blocking.",
  "comments": [
    {
      "file": "commands/code-review.md",
      "line": 189,
      "severity": "nit",
      "message": "SIBLING step 4c-sel Input says \"diff from the current working tree (`git diff HEAD~1`)\" but the prose two lines above also says \"the working-tree diff from `git diff HEAD~1` (or the directory diff as parsed in Step 1)\". The canonical in pr-review.md has no \"or\" ambiguity — it names a single source (the Step 0c diff). Consider dropping the parenthetical so the sibling is unambiguous."
    },
    {
      "file": "commands/pr-review.md",
      "line": 236,
      "severity": "nit",
      "message": "Short-circuit sentinel `selector clean — no adjudication needed` is encoded with a literal em-dash (U+2014). AC6's grep and the spec's verification block both depend on this exact glyph; any editor that normalizes em-dash to ASCII `--` will silently break AC6 while leaving the feature working. Consider switching to ASCII `--` or quoting the literal in a code fence inside the verification block."
    },
    {
      "file": "commands/code-review.md",
      "line": 207,
      "severity": "nit",
      "message": "Same em-dash fragility as the canonical — the SIBLING copy mirrors the literal U+2014 character. Drift between the two files on this character will be invisible to diff tooling and will fail AC6 differently in each file."
    },
    {
      "file": "README.md",
      "line": 0,
      "severity": "nit",
      "message": "`--selector` mode is documented inside the two command files but not surfaced in `README.md` (commands table) or `llms.txt`. The spec does not require it; adding a one-line mention under each command's row would make the opt-in discoverable without reading the dispatcher prose."
    }
  ],
  "concerns_addressed": [
    "correctness: HARD INVARIANT `applicable ⊆ candidates` is documented in the prose at commands/pr-review.md:230 and commands/code-review.md:201 and the spec names it as a structural guarantee; not addressed mechanically (acknowledged non-goal)",
    "correctness: architecture-tier bypass SRP/layering clause is descriptive prose in both files — the LLM must infer from the rule's doc_path heading; spec marks this as the intended contract (acknowledged)",
    "correctness: sibling drift is bounded by CANONICAL/SIBLING HTML comments at commands/pr-review.md:216 and commands/code-review.md:185, plus a 'diff to verify' instruction; single-file extraction is explicitly deferred to the default-flip migration step (addressed in spec, raised as nit on the Input prose)",
    "correctness: mode argument parsing accepts both `--selector` and `selector` in commands/pr-review.md:83 and commands/code-review.md:24 (the spec is silent on partial-token handling, but exact-match is the natural read of the prose)",
    "correctness: em-dash in short-circuit string — raised as nit (no change to the literal)",
    "tests: no new acceptance scenario — addressed by spec's Non-Goals and the AC3 contract that default-mode scenarios 002/004 stay green",
    "tests: golden baseline has no schema validation — addressed by spec's Constraints section and the deterministic severity map; raised as a gap the default-flip step will close",
    "correctness: >20-rule batching split is hand-waved in prose (commands/pr-review.md:265 / commands/code-review.md:236) — spec marks a tunable threshold as a non-goal (deferred)",
    "security: hallucinated file:line risk — the spec's Trust-boundary section already names `validate-citations.sh` as the gating check (same risk as legacy path); spec is silent on a post-validation sanity check (deferred)",
    "performance: 3m24s / $1.18 claim is based on a single fixture run — spec marks further measurement as a follow-up; raised as a gap the default-flip step will close"
  ]
}

ben-s-pull-request-reviewer

I have the agent reports. Let me consolidate the findings and produce the final report.

The findings are:

agent-auditor (8 active rules): All PASS — no violations
slash-command-auditor (2 active rules): 1 Should Fix + 2 Must Fix + 1 Should Fix

The two "Must Fix" findings from slash-command-auditor concern extracting inline selector logic to a new agent. The spec explicitly documents that selector mode runs in-session with zero spawns by design — extracting to a sub-agent would change the semantics. The spec also acknowledges the duplication is a known limitation with a planned cleanup. The argument-hint issue is a real, actionable bug.

Per the procedure's deterministic map, any Must Fix → "request-changes". I'll produce the verdict accordingly.

{
  "verdict": "request-changes",
  "summary": "Documentation-only change adding the opt-in `--selector` mode to the two review commands plus a dark-factory spec and golden fixture. Mechanical funnel was unavailable in this environment, so judgment-only adjudication ran. agent-auditor found zero violations across 8 active rules. slash-command-auditor found a real `argument-hint` documentation drift (the new `--selector` mode token is not listed in either command file's hint) and flagged that Steps 4c-sel/4d-sel inline business logic that `command-thin` would normally push to an agent — this is by design (the spec mandates zero sub-agent spawns in selector mode and acknowledges the duplication with a planned cleanup), but the argument-hint bug is straightforward to fix.",
  "comments": [
    {
      "file": "commands/code-review.md",
      "line": 3,
      "severity": "major",
      "message": "[agent-cmd/command-frontmatter] `argument-hint` is `\"[short|standard|full] [directory]\"` but the new `--selector` / `selector` mode token (parsed at line 23) is missing. Update to `\"[short|standard|full|selector] [directory]\"`. Rule agent-cmd/command-frontmatter (MUST)."
    },
    {
      "file": "commands/pr-review.md",
      "line": 3,
      "severity": "major",
      "message": "[agent-cmd/command-frontmatter] `argument-hint` is `\"<target-branch> [short|standard|full]\"` but the new `--selector` / `selector` mode token (parsed at line 83) is missing. Update to `\"<target-branch> [short|standard|full|selector]\"`. Rule agent-cmd/command-frontmatter (MUST)."
    },
    {
      "file": "commands/code-review.md",
      "line": 183,
      "severity": "critical",
      "message": "[agent-cmd/command-thin] Steps 4c-sel/4d-sel inline six categories of business logic in the command: classify contract (recall-optimized decision policy), skip-justification rule (justify against `applies_when`, never the rule-id name), architecture-tier bypass, adjudication severity rubric, batching threshold (>20 rules splits into 2-3 passes), and short-circuit contract. The spec mandates zero sub-agent spawns in selector mode, so the in-session prose is the design intent — but the rule's intent is to keep commands as thin dispatchers, and the pre-existing files were already over 100 lines before this 70-line addition (now 317). The spec acknowledges the duplication: \"Single-file extraction is planned for the default-flip migration step.\" Either extract the logic to a `coding:selector-classifier` agent (changes the in-session semantic — may be acceptable for the opt-in path) or amend `agent-cmd/command-thin` to carve out an exception for opt-in in-session algorithms. Rule agent-cmd/command-thin (MUST)."
    },
    {
      "file": "commands/pr-review.md",
      "line": 214,
      "severity": "critical",
      "message": "[agent-cmd/command-thin] Same finding as code-review.md:184 — the canonical selector section duplicates six categories of business logic that `command-thin` says belong in an agent. pr-review.md is now 334 lines (was 267); the selector section accounts for ~20%. The spec's Non-goals explicitly preserve this in this migration step; tracked for the default-flip cleanup. Rule agent-cmd/command-thin (MUST)."
    },
    {
      "file": "commands/code-review.md",
      "line": 185,
      "severity": "major",
      "message": "[agent-cmd/single-source-of-truth] The selector section is duplicated across the two command files (~60 lines each), maintained by comment instructions (\"edit there first, mirror here, diff to verify\"). The HTML comment is a workaround that does not actually enforce the invariant — drift would only be caught at PR review. A single-source solution (extracted agent, or a single include) would remove the maintenance burden. Rule agent-cmd/single-source-of-truth (SHOULD)."
    }
  ],
  "concerns_addressed": [
    "correctness: in-session classify + adjudicate path preserves dedup/confidence semantics — addressed structurally by the narrowing invariant (`applicable ⊆ candidates`); cited in spec lines 23, 49, and in commands/pr-review.md:228 / commands/code-review.md:198",
    "correctness: --selector option short-circuits spawns only when explicitly enabled — addressed: parser at commands/pr-review.md:83 and commands/code-review.md:23 recognizes the token; default path is byte-for-byte unchanged per the spec's 'MUST remain byte-for-byte as today' constraint",
    "correctness: new types/constants for selector mode are additive — no JSON/markdown report schema breakage (the selector path reuses the existing Must/Should/Nice buckets per the spec's severity vocabulary constraint)",
    "tests: regression coverage on the existing spawn path — addressed via spec AC3 (default-mode scenarios 002/004 unchanged, `make check-acceptance` exits 0)",
    "tests: flag-parsing test for --selector — NO TEST WAS ADDED. The spec mentions manual fixture walk (AC1) and `make precommit` (AC5) but the diff adds no automated test that asserts zero-spawn behavior on --selector. AC2 is verifiable from transcript alone, not from a test fixture. This is a gap relative to the concern.",
    "performance: zero-spawn path actually avoids subprocess/agent calls — addressed structurally: spec lines 18-19, commands/pr-review.md:218 and commands/code-review.md:187 both state 'zero sub-agent spawns', and the citation validation explicitly bypasses coding:simple-bash-runner in favor of a plain bash call (commands/pr-review.md:267-272, commands/code-review.md:238-243)"
  ]
}

…er v0.19.0

….md — commands carry thin pointers (single source of truth)

…ING echo + Must Fix stop on missing)

bborbe added 4 commits June 11, 2026 18:43

feat: add opt-in --selector review mode (in-session classify + adjudi…

f367f41

…cate, zero sub-agent spawns)

fix: classify skip decisions must be justified against applies_when, …

2642626

…never the rule-id name

Merge origin/master (v0.18.0) — keep Unreleased selector bullet above…

d390e5d

… released section

fix: address local pre-review findings — commit golden fixture in-rep…

81e3021

…o, resolve validator path via plugin-root chain, mark canonical/sibling selector sections

bborbe marked this pull request as ready for review June 11, 2026 17:07

ben-s-pull-request-reviewer Bot approved these changes Jun 11, 2026

View reviewed changes

ben-s-pull-request-reviewer Bot requested changes Jun 11, 2026

View reviewed changes

bborbe added 2 commits June 11, 2026 19:50

fix: add selector token to argument-hint in both commands; merge mast…

6e27795

…er v0.19.0

refactor: extract selector-mode procedure to docs/selector-mode-guide…

c99cfff

….md — commands carry thin pointers (single source of truth)

bborbe marked this pull request as draft June 11, 2026 18:01

fix: fail-fast guide resolution in selector mode (GUIDE_OK/GUIDE_MISS…

acd75d2

…ING echo + Must Fix stop on missing)

bborbe mentioned this pull request Jun 11, 2026

feat: MiniMax-M2.7-highspeed for agents + selector review mode bborbe/maintainer#59

Merged

bborbe marked this pull request as ready for review June 11, 2026 18:53

bborbe merged commit 6c4d9a3 into master Jun 11, 2026
1 check passed

bborbe deleted the feature/selector-review branch June 11, 2026 18:53

bborbe mentioned this pull request Jun 11, 2026

feat: selector-mode acceptance contracts (scenarios 005/006, acceptance section 5, runner .git fix) #53

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: opt-in --selector review mode (in-session classify + adjudicate, zero spawns)#51

feat: opt-in --selector review mode (in-session classify + adjudicate, zero spawns)#51
bborbe merged 7 commits into
masterfrom
feature/selector-review

bborbe commented Jun 11, 2026

Uh oh!

ben-s-pull-request-reviewer Bot left a comment

Uh oh!

ben-s-pull-request-reviewer Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

bborbe commented Jun 11, 2026

Summary

Validation (live end-to-end, isolated CLAUDE_CONFIG_DIR)

Uh oh!

ben-s-pull-request-reviewer Bot left a comment

Choose a reason for hiding this comment

Review Summary

Manual Review Notes (diff-only)

Uh oh!

ben-s-pull-request-reviewer Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant