fix: separate stderr from stdout in benchmark to prevent invalid JSON #1764
Conversation
The benchmark run step used `2>&1`, which mixed stderr progress logs into `benchmark-results.json` and made downstream `jq` parsing silently fail. Now stderr goes to `benchmark-progress.log`, and a JSON validation step catches any corruption early.

Closes #1757

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
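The failure mode can be reproduced in isolation. The sketch below uses a hypothetical `bench` stand-in for the real benchmark script: with `2>&1` the progress lines land inside the JSON file, while the PR's `2> benchmark-progress.log` form keeps stdout pure.

```shell
# Hypothetical stand-in for the benchmark: JSON on stdout, progress on stderr.
bench() {
  echo "iteration 1/2 done" >&2
  echo "iteration 2/2 done" >&2
  echo '{"p50_ms": 12.3, "p95_ms": 40.1}'
}

# Before the fix: 2>&1 interleaves progress lines into the results file,
# so it is no longer valid JSON.
bench > before.json 2>&1

# After the fix: stderr goes to its own log; stdout stays pure JSON.
bench > after.json 2> progress.log
```

Any JSON consumer (such as the workflow's `jq` steps) parses `after.json` cleanly, while `before.json` fails to parse.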
✅ Coverage Check Passed
Overall Coverage
📁 Per-file Coverage Changes (1 file)
Coverage comparison generated by
Pull request overview
This PR fixes the performance benchmark CI workflow so that JSON results written to stdout are not corrupted by progress/diagnostic output written to stderr, addressing issue #1757.
Changes:
- Redirect benchmark stderr to `benchmark-progress.log` while keeping stdout as `benchmark-results.json`.
- Add a `jq empty` validation step to fail fast when the results file is not valid JSON.
- Upload both the JSON results and progress log as CI artifacts; document the stdout/stderr contract in the benchmark script.
| File | Description |
|---|---|
| `scripts/ci/benchmark-performance.ts` | Documents the required stdout (JSON only) vs stderr (progress/diagnostics) contract relied on by CI. |
| `.github/workflows/performance-monitor.yml` | Separates stdout/stderr capture for the benchmark, validates JSON output, and uploads the progress log artifact. |
Copilot's findings
- Files reviewed: 2/2 changed files
- Comments generated: 1
```diff
+      - name: Validate benchmark JSON
+        run: |
+          if ! jq empty benchmark-results.json 2>/dev/null; then
+            echo "ERROR: benchmark-results.json is not valid JSON"
+            cat benchmark-results.json
+            exit 1
+          fi
+
       - name: Upload results
         uses: actions/upload-artifact@v4
         with:
           name: benchmark-results-${{ github.sha }}
-          path: benchmark-results.json
+          path: |
+            benchmark-results.json
+            benchmark-progress.log
           retention-days: 90
```
If JSON validation fails, the job exits before the artifact upload step, so `benchmark-progress.log`/`benchmark-results.json` won't be available as CI artifacts when you most need them for debugging. Consider making the upload step run with `if: always()` (or move the upload before validation) so artifacts are preserved even when the validation step fails.
Done. Moved the upload step before validation and added `if: always()` so artifacts are preserved even when validation fails.
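A minimal sketch of what that reordering looks like in the workflow (illustrative fragment, assuming the step names from this PR; not the exact final file):

```yaml
# Upload runs first and is forced with `if: always()`, so the artifacts
# survive even when validation (or an earlier step) fails.
      - name: Upload results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: benchmark-results-${{ github.sha }}
          path: |
            benchmark-results.json
            benchmark-progress.log
          retention-days: 90

      - name: Validate benchmark JSON
        run: |
          if ! jq empty benchmark-results.json 2>/dev/null; then
            echo "ERROR: benchmark-results.json is not valid JSON"
            cat benchmark-results.json
            exit 1
          fi
```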
Mossaka left a comment:
Security Review - PR #1764
Reviewed the full diff and the complete `performance-monitor.yml` workflow plus `benchmark-performance.ts`.
Summary
This PR separates stderr from stdout (`2>benchmark-progress.log` instead of `2>&1`) to prevent benchmark progress messages from corrupting the JSON output. The change is narrow and well-scoped.
Findings
No security issues found in the changes introduced by this PR.
The stderr-to-file redirect and JSON validation step are safe. The new `benchmark-progress.log` artifact contains only `console.error` output from the benchmark script (iteration counters, progress messages) -- no secrets or sensitive data.
Pre-existing observations (not introduced by this PR)
These are informational notes about the existing workflow, not blockers for this PR:
- **[Info] Actions pinned to major version, not SHA:** `actions/checkout@v4`, `actions/setup-node@v4`, `actions/upload-artifact@v4`, and `actions/github-script@v7` are pinned to major versions rather than commit SHAs. This is standard practice for first-party GitHub actions and acceptable here, though SHA pinning would be stronger for supply-chain hardening.
- **[Info] `github.workspace` in heredoc:** Line 43 uses `${{ github.workspace }}` inside a heredoc written to `/usr/local/bin/awf`. This is safe because `github.workspace` is a runner-controlled path (e.g., `/home/runner/work/repo/repo`) that cannot be influenced by external actors. The heredoc delimiter is quoted (`<<'WRAPPER'`), preventing shell expansion of the content -- only the GitHub expression is interpolated by Actions before the shell runs.
- **[Info] `|| true` on benchmark step:** The `|| true` means benchmark failures won't fail the step. This is intentional (regressions are handled by the separate "Check for regressions" step and issue creation), but worth noting that a completely broken benchmark silently produces invalid JSON. The new JSON validation step in this PR actually mitigates this by explicitly failing if the output is not valid JSON -- a nice improvement.
- **[Info] Benchmark data in issue body:** The `actions/github-script` step at line 93 reads `benchmark-results.json` and interpolates it into an issue body. Since the JSON is generated by the benchmark script (not from external input) and the workflow only triggers on `schedule` and `workflow_dispatch` (not on PRs from forks), there is no injection vector here.
- **[Info] Workflow permissions are properly scoped:** `contents: read` and `issues: write` -- minimal permissions for what the workflow does. Good.
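The `|| true` observation can be demonstrated directly. This sketch uses a hypothetical `broken_bench` stand-in that crashes mid-write (and assumes `jq` is installed, as on GitHub-hosted runners): the step still exits 0, but `jq empty` catches the truncated output.

```shell
# Hypothetical benchmark that crashes after writing only partial JSON.
broken_bench() {
  echo '{"p50_ms":'              # truncated: crash before the value is written
  echo "FATAL: out of memory" >&2
  return 1
}

# `|| true` keeps the step green even though the results are corrupt.
broken_bench > benchmark-results.json 2> benchmark-progress.log || true

# The validation added by this PR turns the silent corruption into a visible error.
if ! jq empty benchmark-results.json 2>/dev/null; then
  echo "ERROR: benchmark-results.json is not valid JSON"
fi
```

Note that validation has to check parseability, not just exit status: the benchmark step's exit code is deliberately discarded, so the results file is the only signal left.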
Verdict
Clean. The changes are safe. The stderr separation and JSON validation are both improvements.
Security Review Agent
Address Copilot review: ensure artifacts are uploaded even when validation fails, so logs are available for debugging.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Smoke Test Results
✅ GitHub MCP: "test: add CLI proxy sidecar integration tests" / "chore: upgrade workflows to gh-aw-actions v0.67.2". Overall: PASS

🔥 Smoke Test Results
Overall: PASS — PR by @Mossaka

Smoke test (Codex) results

Smoke Test: GitHub Actions Services Connectivity
✅ All checks passed.

🏗️ Build Test Suite Results
Overall: 8/8 ecosystems passed — ✅ PASS
Summary
- Redirect benchmark stderr to `benchmark-progress.log` instead of mixing it into `benchmark-results.json` via `2>&1`, which was corrupting the JSON output
- Add a JSON validation step (`jq empty`) after the benchmark run to fail fast if the output is not valid JSON
- Upload `benchmark-progress.log` alongside results as a CI artifact for debugging
- Document the stdout/stderr contract in `benchmark-performance.ts` (stdout = JSON only, stderr = progress)

Closes #1757
Test plan
- Workflow YAML parses (`npx js-yaml`)
- `npm run build` succeeds
- `npm test` passes (1311/1311)

🤖 Generated with Claude Code