Bound PTY output queues under backpressure #7089

Open
brennanb2025 wants to merge 2 commits into brennanb2025/fix-terminal-reliability from brennanb2025/reliability-pty-output-perf-pr

brennanb2025 commented Jul 2, 2026

Contributor

Summary

This PR hardens PTY output delivery across daemon stream sockets, main-to-renderer IPC, renderer replay, and runtime/mobile terminal streams. It adds bounded backpressure queues, explicit pending-output caps, active-output priority lanes, and deterministic reliability-gate coverage.

Headline behavior status: implemented for the focused queue/backpressure slice. This does not replace broader terminal perf review or soak testing.

Stack Fit / Relationship to Other PRs

This is the high-value bounded-output/backpressure slice in the terminal reliability stack. It fits alongside #7001, #7008, #7005, #7004, #7006, and #7007 as the queue/IPC/runtime-stream hardening layer: those PRs cover adjacent terminal/session/runtime reliability work, while this PR focuses on stalled consumers, pending live output, and active-output starvation.

Perf review remains separate because these tests prove explicit byte/char/line/count contracts, not real-world keypress-to-paint latency across every TUI/provider/platform combination.

Anti-Regression Guarantees

This is not an absolute no-regression guarantee. The evidence proves:

  • daemon stream writes stop after socket.write() returns false and resume on drain;
  • daemon backpressure queues are bounded by encoded bytes and line count;
  • flush-immediate output can move ahead of background backlog without unbounded growth;
  • main-to-renderer pending output is capped per PTY and globally in chars;
  • renderer ACK/in-flight counters remain the pressure boundary, including exit flushes;
  • replay snapshots are bounded while preserving FIFO for accepted work and newest-tail behavior under floods;
  • runtime/mobile stream snapshot and pending live-output byte caps remain covered by existing deterministic tests.
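
For readers unfamiliar with the drain contract, the first two guarantees can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: `Sink`, `BoundedDrainQueue`, `utf8Length`, and the drop-oldest trimming policy are all hypothetical names and assumptions chosen for the example.

```typescript
// Hypothetical sink: mirrors the boolean returned by Node's socket.write().
interface Sink {
  write(chunk: string): boolean; // false means "wait for drain before writing more"
}

interface QueuedChunk {
  text: string;
  bytes: number;
  lines: number;
}

// UTF-8 length computed by hand so the sketch needs no runtime-specific typings.
function utf8Length(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x80) bytes += 1;
    else if (code < 0x800) bytes += 2;
    else if (code >= 0xd800 && code <= 0xdbff) {
      const next = text.charCodeAt(i + 1); // NaN past the end of the string
      if (next >= 0xdc00 && next <= 0xdfff) {
        bytes += 4; // valid surrogate pair encodes as four bytes
        i++;
      } else bytes += 3;
    } else bytes += 3;
  }
  return bytes;
}

class BoundedDrainQueue {
  private sink: Sink;
  private maxBytes: number;
  private maxLines: number;
  private queue: QueuedChunk[] = [];
  private queuedBytes = 0;
  private queuedLines = 0;
  private blocked = false;
  readonly stats = { droppedChunks: 0, droppedBytes: 0 };

  constructor(sink: Sink, maxBytes: number, maxLines: number) {
    this.sink = sink;
    this.maxBytes = maxBytes;
    this.maxLines = maxLines;
  }

  push(text: string): void {
    if (!this.blocked && this.queue.length === 0) {
      // The chunk is accepted either way; a false return only means that
      // later chunks must be queued until the sink drains.
      if (!this.sink.write(text)) this.blocked = true;
      return;
    }
    const chunk = { text, bytes: utf8Length(text), lines: text.split("\n").length - 1 };
    this.queue.push(chunk);
    this.queuedBytes += chunk.bytes;
    this.queuedLines += chunk.lines;
    this.trim();
  }

  // Call this from the sink's drain notification.
  onDrain(): void {
    this.blocked = false;
    while (!this.blocked && this.queue.length > 0) {
      const next = this.queue.shift()!;
      this.queuedBytes -= next.bytes;
      this.queuedLines -= next.lines;
      if (!this.sink.write(next.text)) this.blocked = true;
    }
  }

  // Single pass from the oldest end: drop whole chunks until both caps hold.
  private trim(): void {
    let drop = 0;
    while (
      drop < this.queue.length &&
      (this.queuedBytes > this.maxBytes || this.queuedLines > this.maxLines)
    ) {
      const old = this.queue[drop++];
      this.queuedBytes -= old.bytes;
      this.queuedLines -= old.lines;
      this.stats.droppedChunks += 1;
      this.stats.droppedBytes += old.bytes;
    }
    if (drop > 0) this.queue.splice(0, drop);
  }
}
```

In this sketch the caller would wire `onDrain` to the socket's `drain` event. Dropping whole chunks from the oldest end is one possible policy; the PR's actual trimming granularity may differ.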

ELI5

When terminals print too much text and the receiver is slow, Orca now stops piling up unlimited output. It keeps a bounded recent tail, gives the active terminal a small priority lane, and records exactly when output was dropped because a queue hit its cap.
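
The idea above (bounded recent tail, a small priority lane, recorded drops) might look something like the sketch below. Everything here is hypothetical and only illustrates the shape of the behavior: `TwoLaneOutput`, the lane names, and the char-based caps are assumptions for the example, not Orca's actual types.

```typescript
type Lane = "active" | "background";

interface DropRecord {
  lane: Lane;
  droppedChars: number;
}

class TwoLaneOutput {
  private lanes: Record<Lane, string> = { active: "", background: "" };
  private caps: Record<Lane, number>;
  readonly drops: DropRecord[] = [];

  constructor(activeCapChars: number, backgroundCapChars: number) {
    this.caps = { active: activeCapChars, background: backgroundCapChars };
  }

  enqueue(lane: Lane, text: string): void {
    const combined = this.lanes[lane] + text;
    const overflow = combined.length - this.caps[lane];
    if (overflow > 0) {
      // Over the cap: keep only the newest tail and record what was lost.
      // Slicing by UTF-16 units can split a surrogate pair; a real
      // implementation would need to guard against that.
      this.lanes[lane] = combined.slice(overflow);
      this.drops.push({ lane, droppedChars: overflow });
    } else {
      this.lanes[lane] = combined;
    }
  }

  // Hand out at most budgetChars per call, serving the active lane first.
  take(budgetChars: number): string {
    let out = "";
    const order: Lane[] = ["active", "background"];
    for (const lane of order) {
      const room = budgetChars - out.length;
      if (room <= 0) break;
      out += this.lanes[lane].slice(0, room);
      this.lanes[lane] = this.lanes[lane].slice(room);
    }
    return out;
  }
}
```

Because each lane has its own cap, a flood on the background lane cannot grow memory without bound, and it cannot delay output that is already waiting in the active lane.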

Performance Proof

The daemon socket queue now honors drain, has both byte and line caps, and trims in one pass over a bounded queue. Main IPC uses rolling pending-output accounting instead of summing every pending PTY on each output event. Exit-time output is capped to one normal slice when the renderer lane has room; otherwise the pending tail is dropped and counted instead of bypassing ACK backpressure.
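
To illustrate what rolling pending-output accounting means here, the following sketch keeps a running total that is adjusted on every append and take, so no output event has to sum over all pending PTYs. It is a sketch under stated assumptions: `PendingOutputLedger`, its method names, and the choice to trim the incoming PTY's oldest chars when the global cap is hit are hypothetical, not taken from the PR's code.

```typescript
class PendingOutputLedger {
  private perPty = new Map<string, string>();
  private totalChars = 0; // rolling total, never recomputed by scanning
  private maxPerPtyChars: number;
  private maxTotalChars: number;
  droppedChars = 0;

  constructor(maxPerPtyChars: number, maxTotalChars: number) {
    this.maxPerPtyChars = maxPerPtyChars;
    this.maxTotalChars = maxTotalChars;
  }

  append(ptyId: string, text: string): void {
    const prev = this.perPty.get(ptyId) ?? "";
    let next = prev + text;
    // Per-PTY cap: keep the newest tail.
    if (next.length > this.maxPerPtyChars) {
      next = next.slice(next.length - this.maxPerPtyChars);
    }
    // Global cap: room left once this PTY's previous share is excluded.
    const room = Math.max(this.maxTotalChars - (this.totalChars - prev.length), 0);
    if (next.length > room) {
      next = next.slice(next.length - room);
    }
    this.droppedChars += prev.length + text.length - next.length;
    this.totalChars += next.length - prev.length; // O(1) update
    this.perPty.set(ptyId, next);
  }

  // Remove and return everything pending for one PTY.
  take(ptyId: string): string {
    const text = this.perPty.get(ptyId) ?? "";
    this.perPty.delete(ptyId);
    this.totalChars -= text.length;
    return text;
  }

  get total(): number {
    return this.totalChars;
  }
}
```

The point of the design is that the cost of each output event depends only on the size of the text being appended, not on how many PTYs currently have pending output.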

Perf audit result: initial findings on daemon tiny-line complexity, main pending total scans, and exit flush accounting were fixed and re-reviewed clean.

Usefulness / Material Impact

This makes hidden/background terminal floods less able to starve active terminal redraws or grow memory without bound. It also gives future regressions a focused gate with byte/char/line/drop evidence instead of relying on broad terminal flows.

Validation

  • pnpm exec vitest run --config config/vitest.config.ts src/main/daemon/daemon-stream-data-batcher.test.ts
  • pnpm exec vitest run --config config/vitest.config.ts src/main/ipc/pty.test.ts --testNamePattern "renderer backpressure|pending renderer output|interactive output|active PTY|in-flight output|PTY exit"
  • pnpm exec vitest run --config config/vitest.config.ts src/renderer/src/components/terminal-pane/pty-connection.test.ts --testNamePattern "remote replay"
  • pnpm exec vitest run --config config/vitest.config.ts src/main/runtime/rpc/terminal-output-batching.test.ts src/main/runtime/rpc/terminal-subscribe-buffer.test.ts src/main/runtime/rpc/terminal-multiplex.test.ts
  • pnpm run check:reliability-gates
  • pnpm run typecheck:node
  • pnpm run typecheck:web
  • pnpm run typecheck:cli
  • git diff --check

Design review: completed; doc wording was corrected to distinguish main IPC char caps from runtime byte caps. Completeness/code review: completed; code review found no issues. Perf review: completed after fixes and re-review.

Residual Gaps

  • The new gate is experimental, not blocking.
  • This is deterministic unit/integration evidence, not real SSH, WSL, relay, or physical mobile soak.
  • Hidden-output renderer wake counts, scheduler queue depth, event-loop delay, and real TUI keypress-to-paint latency remain perf-soak follow-ups.
  • No Playwright test was added because the requested oracles are lower-level queue/ACK contracts and no visible UI changed.

Merge Order

Open against brennanb2025/fix-terminal-reliability. This can merge with the terminal reliability stack once CI/review are clean, but it should not be treated as a substitute for the separate perf review/soak work.

Post-PR Stabilization

Pending automated checks and review feedback after PR creation.

Made with Orca 🐋

brennanb2025 and others added 2 commits July 2, 2026 01:31
Co-authored-by: Orca <help@stably.ai>
Co-authored-by: Orca <help@stably.ai>