fix(shared-runtime): fork recovery, shutdown, runtime_handle#1986
fix(shared-runtime): fork recovery, shutdown, runtime_handle#1986iunanua wants to merge 2 commits into
Conversation
Three issues: - after_fork_parent / after_fork_child used `?` inside the restart loop, so a single InvalidState worker silenced every other component in the fork child. Now log-and-continue, matching before_fork. - spawn_worker after shutdown silently registered a dead worker: there was no shutdown state distinct from "no runtime yet". Add `shutdown: AtomicBool`, set it in shutdown / shutdown_async, and check it under the workers lock in spawn_worker so the during-shutdown race is also covered. New SharedRuntimeError::AlreadyShutdown + FFI mapping. - runtime_handle() handed callers an owned tokio Handle that bypassed every fork-safety guarantee the crate exists to provide. Replace with with_runtime_context(closure), which scopes the entered runtime to a synchronous closure; migrate the lone production caller in the trace exporter builder. Regression tests for the first two are un-ignored and assert the correct behavior. The spawn_worker/before_fork stress test for the previously-fixed TOCTOU stays #[ignore]'d for regression hunting. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
📚 Documentation Check Results📦
|
Clippy Allow Annotation ReportComparing clippy allow annotations between branches:
Summary by Rule
Annotation Counts by File
Annotation Stats by Crate
About This ReportThis report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1e0908e64b
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| /// The runtime is not available or in an invalid state. | ||
| RuntimeUnavailable, | ||
| /// Operation rejected because the runtime has already been shut down. | ||
| AlreadyShutdown, |
There was a problem hiding this comment.
Preserve existing FFI error-code discriminants
Because this #[repr(C)] enum is exposed through the C ABI, inserting AlreadyShutdown here renumbers LockFailed and every following error code. In deployments where a caller is compiled against the previous generated header (or compares/stores the numeric code values), the same library errors will be misclassified after upgrading the shared library. Please append the new variant or assign explicit discriminants to keep the existing values stable.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1e0908e64b
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
💡 Codex ReviewBecause this ℹ️ About Codex in GitHubCodex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback". |
1 similar comment
💡 Codex ReviewBecause this ℹ️ About Codex in GitHubCodex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback". |
🔒 Cargo Deny Results📦
|
🎉 All green!❄️ No new flaky tests detected 🎯 Code Coverage (details) 🔗 Commit SHA: 6f55346 | Docs | Datadog PR Page | Give us feedback! |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1986 +/- ##
==========================================
- Coverage 72.63% 72.62% -0.01%
==========================================
Files 451 451
Lines 74687 74734 +47
==========================================
+ Hits 54249 54277 +28
- Misses 20438 20457 +19
🚀 New features to boost your workflow:
|
Artifact Size Benchmark Reportaarch64-alpine-linux-musl
aarch64-unknown-linux-gnu
libdatadog-x64-windows
libdatadog-x86-windows
x86_64-alpine-linux-musl
x86_64-unknown-linux-gnu
|
The previous commit inserted AlreadyShutdown between RuntimeUnavailable and LockFailed, which silently renumbered every subsequent variant of the #[repr(C)] enum. Callers compiled against the older generated header would then misclassify error codes after upgrading the shared library (e.g. interpret AlreadyShutdown as LockFailed). Pin every discriminant explicitly and append AlreadyShutdown with value 7, restoring the original numeric values for all pre-existing variants. Add a doc note so future variants are appended with a fresh value rather than inserted. Caught by Codex on PR #1986. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
after_fork_parent/after_fork_childused?inside the worker restart loop. A singlePausableWorker::InvalidStateaborted the entire batch, silencing every other component — critical in the fork child. Now log-and-continue, mirroringbefore_fork.spawn_workeraftershutdownsilently registered a worker that would never run. Added ashutdown: AtomicBoolflag, set inshutdown/shutdown_async, checked inspawn_worker(native + wasm) under the workers lock so the during-shutdown race is also covered. NewSharedRuntimeError::AlreadyShutdownvariant + FFI mapping.runtime_handle()(API design): The method handed callers an owned tokioHandle, bypassing every fork-safety guarantee the crate exists to provide (tasks spawned via the handle aren't paused before fork, restarted after fork, or shut down). Replaced withwith_runtime_context(closure), which scopes the entered runtime to a synchronous closure. The lone production caller inlibdd-data-pipelineis migrated.🤖 Generated with Claude Code