Expand provider session replay ownership gate #7008

Open
brennanb2025 wants to merge 3 commits into brennanb2025/fix-terminal-reliability from brennanb2025/reliability-provider-ownership

Conversation

brennanb2025 (Contributor) commented Jul 1, 2026

Summary

  • Thread provider-session identity through queued resume startup commands, pending launch-config registration, effective spawn launch configs, and cold-restore resume registration so staged resumes claim ownership until hooks confirm or correct them.
  • Expand agent-session.provider-ownership coverage with queued startup, pending registration, delayed/wrong hook, live hook, display-only non-claim, duplicate selection, launch-config, and WSL quoting cases.
  • Update the reliability gate oracle, command list, red/green note, and perf note while keeping maturity at soak.

Stack Fit / Relationship to Other PRs

This PR is a child of #7001 and depends on the reliability-gate manifest and validator added there. It narrows in on the agent-session.provider-ownership gate: the parent PR creates the reliability proof process, while this PR strengthens the provider-session ownership invariant and its targeted renderer-state evidence.

It is parallel to #7004, #7005, #7006, and #7007. Those PRs cover other reliability surfaces; this one covers only the provider-session replay ownership surface and should not broaden product scope beyond duplicate resume prevention.

Anti-Regression Guarantees

This PR does not claim an absolute no-regression guarantee. The evidence-backed invariant is narrower: workspace activation, launch, restore, sleep, hibernate, dedupe, clearing, and reconnect code must not replay or resume a provider session id already owned, queued, pending, retained by a preserved pane, hidden/inactive but still owned by the workspace, or live in that workspace.

The targeted gate proves provider-session claim keys are owned by preserved active tabs, inactive split leaves, visible non-focused split groups, live hook records, queued startup commands, pending launch-config registrations, quit records, and worktree-sleep records. It also proves display-only or replay-looking evidence without provider-session identity does not claim ownership, and duplicates clear without launching a second resume command.

Red/green evidence is partial but targeted: removing the queued/pending claim scan or failing to thread provider-session identity through staged resumes allows a second resume tab for an already staged provider session. With this PR, queued, pending, retained/preserved, hidden-or-inactive owned, and live ownership evidence block replay, while wrong-session or display-only evidence does not.
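The claim rule above can be sketched in a few lines. This is a minimal illustration, not the code in this PR: `ProviderSession`, `OwnershipEvidence`, `claimKey`, `collectClaimKeys`, and `mayLaunchResume` are hypothetical names, and the key format is an assumption. It shows the three cases the gate cares about: a matching queued claim blocks the resume, a claim for a different session does not, and display-only evidence without provider-session identity does not.

```typescript
// Hypothetical shapes for illustration; the real Orca types may differ.
interface ProviderSession {
  provider: string;
  sessionId: string;
}

type EvidenceSource = "queued" | "pending" | "preserved" | "live" | "sleep-record";

interface OwnershipEvidence {
  source: EvidenceSource;
  // Absent for display-only evidence such as a title or a replay-looking command string.
  providerSession?: ProviderSession;
}

// One key per provider session, so claims from different sources compare equal.
// The "provider:sessionId" format is an assumption made for this sketch.
function claimKey(session: ProviderSession): string {
  return `${session.provider}:${session.sessionId}`;
}

// Only evidence that carries provider-session identity contributes a claim.
function collectClaimKeys(evidence: OwnershipEvidence[]): Set<string> {
  const keys = new Set<string>();
  for (const item of evidence) {
    if (item.providerSession) {
      keys.add(claimKey(item.providerSession));
    }
  }
  return keys;
}

// A resume may launch only when no existing evidence claims the same session.
function mayLaunchResume(candidate: ProviderSession, evidence: OwnershipEvidence[]): boolean {
  return !collectClaimKeys(evidence).has(claimKey(candidate));
}

const candidate: ProviderSession = { provider: "example-provider", sessionId: "s-1" };

// Same session already queued: the resume must not launch.
const queuedSame = mayLaunchResume(candidate, [
  { source: "queued", providerSession: { provider: "example-provider", sessionId: "s-1" } },
]);

// A live hook for a different session does not block this candidate.
const wrongSession = mayLaunchResume(candidate, [
  { source: "live", providerSession: { provider: "example-provider", sessionId: "s-2" } },
]);

// Display-only evidence carries no identity and therefore claims nothing.
const displayOnly = mayLaunchResume(candidate, [{ source: "preserved" }]);

console.log({ queuedSame, wrongSession, displayOnly });
```

Keying on identity rather than on titles or command strings is what keeps replay-looking evidence from acting as authority in this sketch.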

ELI5

An agent session is like a named conversation the provider already knows about. If Orca already has that conversation open, queued to open, or in the middle of opening, workspace activation should not start that same conversation a second time. This PR teaches the resume path to count those in-between states as ownership and adds tests so an already-owned provider session is not replayed.

Performance Proof

Perf risk was reviewed separately from code review. The review ruled out the risk that answering ownership during activation adds provider polling, remote filesystem probing, subprocess churn, hidden-pane wake loops, startup awaits, or repeated provider listing.

The implementation stays state-driven and bounded: once per activation it scans the target worktree sleeping records plus the renderer maps pendingStartupByTabId, agentLaunchConfigByPaneKey, and agentStatusByPaneKey. It does not add provider I/O, background renderer output work, retry loops, fixed sleeps, or hidden-pane wakeups. The manifest keeps a performance budget on this gate so any future PR that adds ownership scans must keep proving bounded work before promotion.
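As a rough picture of what "state-driven and bounded" means here, the sketch below does one synchronous pass over an in-memory snapshot. Only the three map names (`pendingStartupByTabId`, `agentLaunchConfigByPaneKey`, `agentStatusByPaneKey`) come from this description; the entry shape, the `sleepingRecords` field, `RendererSnapshot`, and `scanClaimKeysOnce` are assumptions made for illustration.

```typescript
// Hypothetical record shapes; only the three map names come from the PR description.
interface ProviderSession {
  provider: string;
  sessionId: string;
}

interface StagedEntry {
  providerSession?: ProviderSession;
}

interface RendererSnapshot {
  pendingStartupByTabId: Record<string, StagedEntry>;
  agentLaunchConfigByPaneKey: Record<string, StagedEntry>;
  agentStatusByPaneKey: Record<string, StagedEntry>;
  // Assumed field: sleeping records for the target worktree only.
  sleepingRecords: StagedEntry[];
}

// One synchronous pass: no awaits, no provider I/O, no retries.
// Work is linear in the number of entries already held in state.
function scanClaimKeysOnce(snapshot: RendererSnapshot): Set<string> {
  const keys = new Set<string>();
  const add = (entry: StagedEntry): void => {
    if (entry.providerSession) {
      keys.add(`${entry.providerSession.provider}:${entry.providerSession.sessionId}`);
    }
  };
  for (const entry of snapshot.sleepingRecords) add(entry);
  for (const entry of Object.values(snapshot.pendingStartupByTabId)) add(entry);
  for (const entry of Object.values(snapshot.agentLaunchConfigByPaneKey)) add(entry);
  for (const entry of Object.values(snapshot.agentStatusByPaneKey)) add(entry);
  return keys;
}

const snapshot: RendererSnapshot = {
  pendingStartupByTabId: {
    "tab-1": { providerSession: { provider: "example-provider", sessionId: "s-1" } },
  },
  agentLaunchConfigByPaneKey: {
    // The same session seen in a second map still yields a single claim key.
    "tab-1:pane-1": { providerSession: { provider: "example-provider", sessionId: "s-1" } },
  },
  agentStatusByPaneKey: {
    "tab-2:pane-1": {}, // no identity, so no claim
  },
  sleepingRecords: [{ providerSession: { provider: "example-provider", sessionId: "s-3" } }],
};

const claimed = scanClaimKeysOnce(snapshot);
console.log(Array.from(claimed).sort());
```

Because the function takes a snapshot and returns a set, a caller can run it once per activation and reuse the result for every candidate session instead of rescanning.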

Usefulness / Material Impact

This reduces the escaped provider-session replay class where one provider session can be represented by an active tab, inactive tab, hidden split pane, queued startup, pending launch config, live status hook, quit record, or worktree-sleep record. Without recognizing those ownership states, workspace activation can create a duplicate resume tab for a session Orca already owns or is already staging.

The practical impact is fewer duplicate agent resumes after sleep, hibernate, activation, or reconnect, without treating title text, replay-looking command strings, or display-only evidence as authority.

Validation

  • PASS: git fetch origin main
  • PASS: inspected the current PR (#7008) body, diff, and checks with gh pr view 7008 --json ..., gh pr checks 7008, gh pr diff 7008 --name-only, and git diff origin/brennanb2025/fix-terminal-reliability..HEAD
  • PASS: PR checks are green: verify and wayland terminal input
  • PASS: pnpm exec vitest run --config config/vitest.config.ts src/renderer/src/lib/resume-sleeping-agent-session.test.ts src/renderer/src/lib/resume-sleeping-agent-session-provider-claims.test.ts (37 tests)
  • PASS: pnpm run check:reliability-gates
  • PASS: pnpm exec oxlint config/reliability-gates.jsonc config/scripts/check-reliability-gates.mjs config/scripts/check-reliability-gates.test.mjs docs/reference/reliability-gates-implementation-plan.md src/renderer/src/components/terminal-pane/pty-connection-types.ts src/renderer/src/components/terminal-pane/pty-connection.test.ts src/renderer/src/components/terminal-pane/pty-connection.ts src/renderer/src/lib/resume-sleeping-agent-session-provider-claims.test.ts src/renderer/src/lib/resume-sleeping-agent-session.test.ts src/renderer/src/lib/resume-sleeping-agent-session.ts src/renderer/src/lib/sleeping-agent-pane-ownership.ts src/renderer/src/store/slices/terminals.ts
  • FAIL (unrelated): pnpm run typecheck:tsc stops at src/renderer/src/components/terminal-pane/useSessionRestoredBannerDismiss.test.tsx:27 with TS2740: Type Element is missing HTMLDivElement properties.
  • FAIL (unrelated): pnpm run lint reaches localization verification and reports missing ephemeral-VM localization keys in files outside this PR, including src/renderer/src/components/NewWorkspaceComposerCard.tsx, src/renderer/src/components/settings/EphemeralVmRecipeRow.tsx, src/renderer/src/components/settings/EphemeralVmRuntimesSection.tsx, src/renderer/src/components/settings/EphemeralVmsPane.tsx, src/renderer/src/components/settings/Settings.tsx, src/renderer/src/components/settings/ephemeral-vms-search.ts, src/renderer/src/components/worktree-creation/WorktreeCreationPanel.tsx, src/renderer/src/hooks/useSettingsNavigationMetadata.ts, and src/renderer/src/lib/sidebar-worktree-activation.ts.

Residual Gaps

  • The gate remains renderer-state coverage rather than a broad Electron repeat-activation loop; the manifest keeps that as a known gap.
  • Red/green evidence is still partial; there is no complete saved CI red/green history.
  • Manifest maturity did not change (soak), so this PR strengthens coverage but does not promote the gate to blocking.
  • Live SSH, WSL, remote-runtime, mobile/relay, macOS, Linux, and Windows provider paths are covered here by state/identity contracts and WSL quoting tests, not by a live provider/platform matrix run.
  • Local broad typecheck:tsc and lint still have unrelated repository blockers listed above, even though PR CI checks are passing.

Merge Order

Requires #7001 first because this PR depends on the reliability gate manifest and validator added there.

After #7001 lands, this PR can merge independently of #7004, #7005, #7006, and #7007. No other child PR must merge before this one. Later child PRs may need a normal manifest rebase if this one lands first.

Post-PR Stabilization

  • CI checks pass for the current PR head after the post-PR wait.
  • PR comments and review-comment surfaces were checked; no actionable automated review comments remain.

Made with Orca



Co-authored-by: Orca <help@stably.ai>
stage-review (Bot) commented Jul 1, 2026

Ready to review this PR? Stage has broken it down into 6 individual chapters for you:

Title
1 Add provider session identity to terminal state
2 Thread provider identity through PTY connections
3 Implement queued and pending ownership scans
4 Expand resume gate with ownership claims
5 Refactor and expand provider claim tests
6 Update reliability gate manifest

Chapters generated by Stage for commit fc3299f on Jul 1, 2026 10:45pm UTC.

brennanb2025 (Contributor, Author) commented

Hard-review follow-up: independent ownership review found provider-session ownership gaps. Pushed c37c49a to preserve providerSession through effective/cold-restore launch-config registration and to ignore stale orphan live-status claims while still trusting existing live tabs. Validation: focused resume/session provider Vitest passed (37 tests), targeted pty-connection provider/cold-restore tests passed, reliability gate manifest check passed, touched-file oxlint passed, and git diff --check passed.
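One way to picture the stale-orphan rule from the follow-up, as a sketch only: a live-status record contributes a claim when the tab it points at still exists, and is ignored otherwise. The criterion used here (tab no longer present) is an assumption about what "stale orphan" means, and `LiveStatus`, its `tabId` field, and `liveClaimKeys` are hypothetical names rather than the code in c37c49a.

```typescript
// Hypothetical shapes; the real live-status records may carry different fields.
interface ProviderSession {
  provider: string;
  sessionId: string;
}

interface LiveStatus {
  tabId: string;
  providerSession?: ProviderSession;
}

// Assumption: a status record is a stale orphan when its tab no longer exists.
// Records for existing tabs are still trusted as ownership claims.
function liveClaimKeys(
  agentStatusByPaneKey: Record<string, LiveStatus>,
  existingTabIds: ReadonlySet<string>,
): Set<string> {
  const keys = new Set<string>();
  for (const status of Object.values(agentStatusByPaneKey)) {
    if (!status.providerSession) continue; // no identity, no claim
    if (!existingTabIds.has(status.tabId)) continue; // stale orphan, ignored
    keys.add(`${status.providerSession.provider}:${status.providerSession.sessionId}`);
  }
  return keys;
}

const statusByPaneKey: Record<string, LiveStatus> = {
  "tab-1:pane-1": {
    tabId: "tab-1",
    providerSession: { provider: "example-provider", sessionId: "s-1" },
  },
  // This record points at a tab that has since been closed.
  "tab-9:pane-1": {
    tabId: "tab-9",
    providerSession: { provider: "example-provider", sessionId: "s-9" },
  },
};

const liveKeys = liveClaimKeys(statusByPaneKey, new Set(["tab-1"]));
console.log(Array.from(liveKeys));
```

Under this assumption, the orphaned record for `s-9` no longer blocks a legitimate resume, while the live tab's claim on `s-1` still prevents a duplicate.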
