Skip to content

Add a no-worktree workspace mode#2

Open
nbashaw wants to merge 1 commit intodanshapiro:mainfrom
nbashaw:feat/no-worktree-flag
Open

Add a no-worktree workspace mode#2
nbashaw wants to merge 1 commit intodanshapiro:mainfrom
nbashaw:feat/no-worktree-flag

Conversation

@nbashaw
Copy link
Copy Markdown

@nbashaw nbashaw commented Mar 12, 2026

Summary

  • add a --no-worktree path to the top-level trycycle skill so it can reuse the current checkout or a Conductor workspace instead of always creating a nested git worktree
  • update prompt templates, docs, and explorer samples to describe the implementation workspace consistently in both modes
  • make the finishing flow workspace-aware and make maintenance/import-skills.sh portable on macOS Bash 3.2

Verification

  • bash -n maintenance/import-skills.sh
  • rendered prompt templates successfully with /opt/homebrew/bin/python3.13 orchestrator/prompt_builder/build.py

Notes

  • bash maintenance/import-skills.sh is still blocked by an expired Claude OAuth token (401 authentication_error), so I manually synchronized the generated planning/finishing subskills in this branch after updating the maintenance inputs
  • the existing repo test file was not runnable here: pytest is not installed in the available interpreters, and python3.13 -m unittest discover -s tests -v fails because tests/test_validate_rendered_prompt.py hardcodes /home/user/code/trycycle

christianbauer-deltaorbit pushed a commit to christianbauer-deltaorbit/trycycle that referenced this pull request Apr 26, 2026
Sets explicit `# trycycle-step: timeout-seconds: N` headers on the 10
subagent templates listed in the spec. Together with the new lifesigns
subprocess probe (de68540) and the streaming runner (b2fa00b), the
default-everywhere posture is replaced with empirically-justified
per-template caps. Long healthy steps now have the headroom they
legitimately need, and short steps fail fast when they over-explore.

Values (matching the spec's recommendations and rationale):

  template                                       seconds  category
  ----------------------------------------------------------------
  prompt-executing-finalize.md                   14400    long-iter
  prompt-executing-next-task.md                  10800    long-iter
  prompt-test-strategy.md                         3600    reasoning
  prompt-test-plan.md                             3600    reasoning
  prompt-post-impl-review.md                      3600    reasoning
  prompt-planning-initial-scaffold.md             1800    creative
  prompt-planning-initial-detail.md               1800    creative
  prompt-planning-initial-commit.md               1800    creative
  prompt-planning-edit.md                         1800    review
  prompt-planning-initial-survey.md                600    deterministic

Templates left without a header:
- prompt-executing-load.md (load is fast checklist work; not in spec
  scope but a candidate for a follow-up tight-cap of ~600 s).
- prompt-executing.md and prompt-planning-initial.md (legacy single-
  shot templates used via --single-shot; fall through to the runner's
  EXECUTING_TIMEOUT_SECONDS / DEFAULT_TIMEOUT_SECONDS).

Tests:
- tests/test_step_timeout_header.py:
  - replaces the narrower test_executing_next_task_template_declares_
    long_timeout (which asserted both at 7200) with a comprehensive
    test_shipped_templates_declare_expected_per_step_timeouts that
    audits all 10 headered templates against an inline expected-value
    table. Updating the map deliberately requires touching this test.
  - adds test_header_wins_over_explicit_cli_timeout_seconds: dispatches
    a fixture with header=1800, asserts header wins both with and
    without --timeout-seconds 600 on the CLI; also asserts a
    headerless template + --timeout-seconds 600 uses 600. The test's
    failure-message names the resolver behaviour so a future flip is
    a deliberate, documented change.

SKILL.md adds a "Per-step timeouts (template headers)" subsection
under "Sequenced phases" documenting the convention and the
how-to-choose-a-value note (deterministic vs creative vs reasoning vs
long-iterative).

Spec discrepancy worth flagging:
- The spec's acceptance criterion danshapiro#2 asserts "the CLI override takes
  precedence over the header" and the spec's regression danshapiro#3 asserts
  CLI=600 + header=1800 → 600 (CLI wins). That contradicts the
  resolver shipped in commit 1987c89, where the HEADER wins and CLI
  is fallback. Per the spec's primary directive ("Pure template edits
  — no orchestrator code changes") I have NOT flipped the precedence.
  The new test instead codifies the actual current behaviour with a
  failure message that flags the discrepancy. If the spec's intended
  precedence (CLI wins) is the true target, that's a 1-line resolver
  change in _resolve_step_timeout_seconds plus updating two existing
  tests' assertions — happy to do that as a follow-up.

Regression: 205 -> 206 tests passing; 5 skipped unchanged.

https://claude.ai/code/session_01XV4wwck5YZ9fxabteSeSk2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant