Skip to content

fix: agentcore dev orphaned processes#1438

Merged
jariy17 merged 3 commits into
aws:mainfrom
avi-alpert:aalpert/fix-dev-orphan-process
Jun 2, 2026
Merged

fix: agentcore dev orphaned processes#1438
jariy17 merged 3 commits into
aws:mainfrom
avi-alpert:aalpert/fix-dev-orphan-process

Conversation

@avi-alpert
Copy link
Copy Markdown
Contributor

@avi-alpert avi-alpert commented Jun 2, 2026

Description

When agentcore dev is killed programmatically (SIGTERM/SIGKILL), the underlying uvicorn reloader and worker processes are not terminated. They get reparented to PID 1 and continue holding the port. Subsequent agentcore dev runs then fall back to a different port, leaking processes with each run.

Interactive Ctrl+C (SIGINT) was unaffected since it hits the entire process group and was already handled. This fix addresses only the programmatic kill $PID (SIGTERM) path used by scripts, CI, and agent tooling.

Root cause: Two issues combined:

  1. Only SIGINT was handled — SIGTERM (from kill $PID) exited without triggering cleanup.
  2. child.kill('SIGTERM') only signals the direct child PID. uvicorn --reload forks a reloader that spawns a worker, so grandchild processes survived.

Fix:

  • Spawn the child with detached: true (POSIX) so it gets its own process group.
  • Kill the entire process tree: process.kill(-pid, signal) on POSIX, taskkill /pid <pid> /T /F on Windows.
  • Add SIGTERM handling to all three dev server code paths: --logs and browser/web-UI use onShutdownSignal() (SIGINT + SIGTERM), TUI adds SIGTERM only (Ink handles SIGINT).
  • Extract onShutdownSignal() helper to avoid duplicating signal registration.

Related Issue

Closes #1440

Documentation PR

N/A

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation update
  • Other (please describe):

Testing

Manually verified on macOS and Windows:

# macOS/Linux
agentcore dev --logs &
DEV_PID=$!
sleep 5
kill $DEV_PID
wait $DEV_PID 2>/dev/null
sleep 1
lsof -ti :8080  # No orphaned processes

Manually tested on windows by running agentcore dev and then Stop-Process on the node process, making sure no child processes remained.

  • I ran npm run test:unit and npm run test:integ
  • I ran npm run typecheck
  • I ran npm run lint
  • If I modified src/assets/, I ran npm run test:update-snapshots and committed the updated snapshots

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the
terms of your choice.

@avi-alpert avi-alpert requested a review from a team June 2, 2026 17:56
@github-actions github-actions Bot added the size/s PR size: S label Jun 2, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 2, 2026
@github-actions github-actions Bot added the agentcore-harness-reviewing AgentCore Harness review in progress label Jun 2, 2026
@agentcore-devx-automation
Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label Jun 2, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 2, 2026

Package Tarball

aws-agentcore-0.16.0.tgz

How to install

gh release download pr-1438-tarball --repo aws/agentcore-cli --pattern "*.tgz" --dir /tmp/pr-tarball
npm install -g /tmp/pr-tarball/aws-agentcore-0.16.0.tgz

Copy link
Copy Markdown

@agentcore-cli-automation agentcore-cli-automation left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling the orphaned process bug. The fix on Linux/macOS is sound — detached: true puts the child in its own process group and process.kill(-pid, signal) signals the whole tree, which is exactly what's needed for uvicorn --reload's reloader/worker pair.

However, this approach is broken on Windows, which src/lib/utils/platform.ts and AGENTS.md indicate is a supported platform. Two issues called out inline:

  1. detached: true on Windows spawns the child in a new console window (visible flash, separate console).
  2. Windows has no POSIX process groups — process.kill(-pid, ...) will throw EINVAL/ESRCH, the catch falls back to child.kill(signal), and child.kill on Windows does not kill the process tree either. So on Windows we'd be back to orphaning subprocesses — same bug this PR is fixing.

The usual fix is to gate the new behavior on !isWindows and use taskkill /pid <pid> /T /F (or the tree-kill package) for Windows process tree termination.

Comment thread src/cli/operations/dev/dev-server.ts Outdated
Comment thread src/cli/operations/dev/dev-server.ts
Comment thread src/cli/commands/dev/command.tsx
@github-actions github-actions Bot removed the agentcore-harness-reviewing AgentCore Harness review in progress label Jun 2, 2026
@avi-alpert avi-alpert marked this pull request as draft June 2, 2026 18:26
@github-actions github-actions Bot added size/s PR size: S and removed size/s PR size: S labels Jun 2, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 2, 2026
@avi-alpert avi-alpert marked this pull request as ready for review June 2, 2026 18:51
@agentcore-devx-automation
Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label Jun 2, 2026
@github-actions github-actions Bot added the agentcore-harness-reviewing AgentCore Harness review in progress label Jun 2, 2026
Copy link
Copy Markdown

@agentcore-cli-automation agentcore-cli-automation left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice fix — the POSIX-side approach (detached: true + process.kill(-pid, sig)) is exactly right for uvicorn --reload's reloader/worker pair, and the Windows path via taskkill /T /F covers the cross-platform gap raised in earlier review threads.

Most of my concerns were already covered in those threads (Windows handling, the this.child! footgun, the SIGTERM-only handler in TUI mode). One additional test-correctness item below — non-blocking for CI (since CI only runs on ubuntu-latest) but worth fixing for parity with codezip-dev-server.test.ts, which already mocks isWindows.

Comment thread src/cli/operations/dev/__tests__/dev-server.test.ts
@github-actions github-actions Bot added size/s PR size: S and removed agentcore-harness-reviewing AgentCore Harness review in progress size/s PR size: S labels Jun 2, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 2, 2026
@agentcore-devx-automation
Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label Jun 2, 2026
@avi-alpert avi-alpert force-pushed the aalpert/fix-dev-orphan-process branch from 71ef549 to 5b4a8f4 Compare June 2, 2026 20:29
@github-actions github-actions Bot added size/s PR size: S and removed size/s PR size: S labels Jun 2, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 2, 2026
@agentcore-devx-automation
Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label Jun 2, 2026
Copy link
Copy Markdown
Contributor

@jariy17 jariy17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Contributor

@jariy17 jariy17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jariy17 jariy17 merged commit c9d78ea into aws:main Jun 2, 2026
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/s PR size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

agentcore dev can leave orphaned processes occupying ports

3 participants