feat(ci): add e2e duration charts #2361
Open
universal-itengineer wants to merge 20 commits into
4576709 to 7f07b36
Add per-spec timing propagation, Chart.js report rendering, Loop file uploads, and local report artifacts so E2E failures can include duration context. Signed-off-by: Nikita Korolev <[email protected]>
- Propagate core.warning from messenger-report.js into buildThreadMessages so duration-chart rendering failures are visible in CI logs instead of being silently masked by the "Charts unavailable." placeholder.
- Cache ChartJSNodeCanvas as a module-level singleton; chart rendering is reused across clusters instead of allocating a fresh canvas each time.
- Sanitize cluster name when composing chart file names so unexpected characters cannot leak into uploaded Loop attachment names.
- Switch e2e-matrix workflow to npm ci with cache and document why node-version stays pinned to 20 (actions/github-script@v7 ABI).
- Add jest case asserting core.warning is emitted when chart rendering rejects.
Signed-off-by: Nikita Korolev <[email protected]>
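The file-name sanitizing step mentioned above could look roughly like the sketch below. The names `sanitizeClusterName` and `chartFileName`, the allowed character set, and the `cluster` fallback are assumptions made for illustration; the PR text does not show the actual helper.

```javascript
// Hypothetical helper: the real sanitizer in the PR is not shown in this page.
function sanitizeClusterName(name) {
  const cleaned = String(name)
    .replace(/[^A-Za-z0-9._-]+/g, '-') // collapse each run of unexpected characters into one dash
    .replace(/^-+|-+$/g, '');          // trim leading and trailing dashes
  return cleaned || 'cluster';          // assumed fallback so the name part is never empty
}

// Compose an attachment name from a cluster name and a chart name.
function chartFileName(clusterName, chartName) {
  return `${sanitizeClusterName(clusterName)}-${chartName}.png`;
}
```

For example, `chartFileName('prod/eu west#1', 'slowest-specs')` yields `prod-eu-west-1-slowest-specs.png` under these assumptions.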
If the Loop bot token lacks the upload_file permission, /api/v4/files returns HTTP 403 and the per-file uploadFileToLoop call throws. The exception used to propagate out of makeThreadedReportInLoop, aborting the whole thread and dropping the failed-tests reply that we delivered before charts were introduced. Wrap the per-reply upload in try/catch, log a warning explaining that attachments are skipped, and still post the reply text. The behaviour is now strictly additive to the pre-charts implementation: chart attachments are best-effort, the failed-tests table is guaranteed. Add a jest case that mocks /api/v4/files with a 403 and asserts the thread reply is posted without file_ids. Signed-off-by: Nikita Korolev <[email protected]>
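A minimal sketch of the best-effort upload pattern this commit describes, assuming injected stand-ins: `uploadFile`, `postReply`, and `warn` are hypothetical parameters, not the PR's real `uploadFileToLoop` or Loop client. Dropping partially uploaded file ids after a failure is also an assumption of this sketch.

```javascript
// Sketch only: uploadFile, postReply, and warn are injected stand-ins,
// not the actual helpers from the PR.
async function postReplyWithOptionalFiles({ text, files, uploadFile, postReply, warn }) {
  const fileIds = [];
  try {
    for (const file of files) {
      fileIds.push(await uploadFile(file)); // may throw, e.g. on HTTP 403
    }
  } catch (err) {
    warn(`Chart attachments skipped: ${err.message}`);
    fileIds.length = 0; // assumed policy: drop partial uploads, post text only
  }
  const reply = { message: text };
  if (fileIds.length > 0) {
    reply.file_ids = fileIds; // attach files only when every upload succeeded
  }
  return postReply(reply); // the reply text is posted in every case
}
```

With an uploader that rejects, the reply is still posted and carries no `file_ids`, which is the behaviour the jest case in this commit is described as asserting.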
…s only
Per user feedback after a local run review: the main message has to match the pre-charts main-branch layout, and the per-cluster thread reply should attach only the chart PNGs without the extra "### Test durations / Attached charts: ..." caption block.
- markdown.js: drop renderTopSlowestSection (and the formatDuration/renderDurationBar helpers it owned) and the call from buildMainMessage so "Top slowest tests" no longer appears in the main message.
- markdown.js: shrink renderChartCaption to "Charts unavailable." only on chart-render error and an empty string otherwise, so a cluster with successful charts gets a clean **[cluster](url)** header plus file attachments.
- messenger-report.test.js: refresh the chart and Loop scenarios to assert the new minimal thread layout and the absence of the Top-3 section in the main message.
Failed-tests thread reply behavior is preserved: clusters with failures still get the "### Failed tests" table, just without the chart caption block underneath it.
Signed-off-by: Nikita Korolev <[email protected]>
Signed-off-by: Nikita Korolev <[email protected]>
6aac4a5 to 63d3cc7
…visualizations
Replaces the four ad-hoc cluster charts (top-slowest, duration-histogram, feature-totals, status-stacked) with a curated set of five:
1. status-doughnut - passed/failed/errors/skipped counts at a glance
2. pareto-slowest - top-N slowest specs + cumulative % of suite time
3. pass-rate-per-feature - 100% stacked horizontal bar, most-broken on top
4. quantiles-per-feature - p50/p90/max duration per feature
5. feature-totals - total duration per feature
Also simplifies chart-config.js: a single pass aggregator feeds every chart, shared status and palette constants live at the top of the module, and only buildClusterChartConfigs(specTimings) is exported. The renderer now drives chart names from the builder list instead of hardcoding them.
Signed-off-by: Nikita Korolev <[email protected]>
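The "single pass aggregator" idea can be illustrated with the sketch below. The input shape `{ feature, status, durationSeconds }` and the function name `aggregateByFeature` are assumptions for illustration; the actual structure of `specTimings` in chart-config.js is not shown on this page.

```javascript
// Assumed input shape: [{ feature, status, durationSeconds }]; not taken from the PR.
function aggregateByFeature(specTimings) {
  const byFeature = new Map();
  for (const { feature, status, durationSeconds } of specTimings) {
    let entry = byFeature.get(feature);
    if (!entry) {
      entry = { totalSeconds: 0, counts: {} };
      byFeature.set(feature, entry);
    }
    entry.totalSeconds += durationSeconds;                  // feeds a duration-totals chart
    entry.counts[status] = (entry.counts[status] || 0) + 1; // feeds a status or pass-rate chart
  }
  return byFeature;
}
```

Each chart builder can then read from the same map instead of re-scanning the spec list, which is one way to keep several charts consistent with each other.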
Reworks the experimental five-chart set into four triage-oriented charts: feature duration by status, slowest specs, duration buckets, and failed plus slow specs. The previous doughnut, quantile chart, and pareto percentage line are removed because they added noise without improving failure diagnosis. Adds local value labels to the chart configs so generated PNGs show concrete seconds/counts without extra dependencies. Signed-off-by: Nikita Korolev <[email protected]>
Drops the failed-and-slow-specs chart because it overlaps with slowest-specs when a run has few failures. The messenger report now renders three charts per cluster: feature duration by status, slowest specs, and duration buckets. Signed-off-by: Nikita Korolev <[email protected]>
…chart Changes slowest-specs from status-filled bars to a duration bucket gradient: cyan for fast specs, blue for medium specs, and purple for slow specs. Failed and errored specs remain distinct via red/amber borders and value-label suffixes. Signed-off-by: Nikita Korolev <[email protected]>
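As a rough sketch of the bucket-gradient styling described here: the thresholds (60 and 300 seconds), the hex colours, the failed=red and errored=amber mapping, and the name `barStyle` are all assumptions; the commit message names only the colour families.

```javascript
// Hypothetical thresholds and colours; the commit names only cyan/blue/purple.
const BUCKETS = [
  { name: 'fast', maxSeconds: 60, fill: '#22d3ee' },         // cyan
  { name: 'medium', maxSeconds: 300, fill: '#3b82f6' },      // blue
  { name: 'slow', maxSeconds: Infinity, fill: '#a855f7' },   // purple
];

// Assumed mapping of status to border colour (red for failed, amber for errored).
const STATUS_BORDERS = { failed: '#ef4444', errored: '#f59e0b' };

function barStyle({ durationSeconds, status }) {
  // First bucket whose upper bound is above the duration.
  const bucket = BUCKETS.find((b) => durationSeconds < b.maxSeconds);
  return {
    bucket: bucket.name,
    backgroundColor: bucket.fill,
    // Passed specs keep a border matching the fill; failures stand out.
    borderColor: STATUS_BORDERS[status] || bucket.fill,
  };
}
```

Keeping the fill tied to duration and the border tied to status lets one bar encode both dimensions without a second chart.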
Aligns chart titles with Ginkgo hierarchy terms, adds a drill-down legend and minute-based duration ticks to slowest-specs, and trims the chart canvas height. Duration bucket bars are made thinner so the three bucket rows read less heavy. Signed-off-by: Nikita Korolev <[email protected]>
Adds per-chart canvas sizing and renders slowest-specs at 1920x720 so long It/Entry drill-down labels fit without clipping. Other E2E report charts keep the default 1280x640 size. Signed-off-by: Nikita Korolev <[email protected]>
Widens the slowest-specs output from 1920px to 2048px so long drill-down labels and value labels have a little more space from the image edge. Signed-off-by: Nikita Korolev <[email protected]>
Draws count labels for tiny stacked duration-bucket segments as offset callouts instead of hiding them or placing overlapping text on the white background. Signed-off-by: Nikita Korolev <[email protected]>
Keep the Loop report focused on feature duration status while moving slowest-specs into local and CI artifact renderers for deeper triage. Signed-off-by: Nikita Korolev <[email protected]>
Use tmp/charts as the local and CI output directory for generated slowest-specs PNG artifacts. Signed-off-by: Nikita Korolev <[email protected]>
9935f11 to 7a20dd1
Signed-off-by: Nikita Korolev <[email protected]>
fd73274 to d8f3f5a
diafour reviewed May 22, 2026
- uses: actions/checkout@v4

- name: Download E2E report artifacts
  # v8 exists upstream and keeps report artifact downloads on the current action major.
Member
I think it is safe to remove this comment. It says nothing specific for actions/download-artifact.
Align Python chart rendering with the JS report contract, make chart output directories explicit, harden top chart rendering, and expand coverage for parser and CLI behavior. Signed-off-by: Nikita Korolev <[email protected]>
a8a71dc to fa2486a
Signed-off-by: Nikita Korolev <[email protected]>
fa2486a to 25ac64c
Signed-off-by: Nikita Korolev <[email protected]>
Signed-off-by: Nikita Korolev <[email protected]>
Description
Move E2E report chart generation from the previous JavaScript/Chart.js implementation to a Python renderer based on `matplotlib`.
The E2E report pipeline now renders chart PNG files in Python before the Loop messenger report is built. Python writes a manifest that maps each cluster to generated chart files, and the JavaScript messenger path only reads that manifest and attaches existing PNG files to Loop thread replies.
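The manifest hand-off between the Python renderer and the JavaScript messenger path might be consumed along the lines of the sketch below. The manifest shape (`{ "clusters": { "<cluster>": ["<file>.png", ...] } }`), the resolution of paths relative to the manifest, and the name `collectChartFiles` are assumptions; the description does not specify the schema.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Assumed manifest shape: { "clusters": { "<cluster>": ["<file>.png", ...] } }.
// Returns { cluster: [absolute paths of PNG files that actually exist] }.
function collectChartFiles(manifestPath) {
  if (!fs.existsSync(manifestPath)) {
    return {}; // no manifest: the report is sent without attachments
  }
  const manifest = JSON.parse(fs.readFileSync(manifestPath, 'utf8'));
  const baseDir = path.dirname(manifestPath);
  const result = {};
  for (const [cluster, files] of Object.entries(manifest.clusters || {})) {
    result[cluster] = files
      .map((file) => path.resolve(baseDir, file)) // paths assumed relative to the manifest
      .filter((file) => fs.existsSync(file));     // attach only files already rendered
  }
  return result;
}
```

Because the JavaScript side only reads the manifest and checks for files on disk, it needs no chart-rendering dependency of its own.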
Main changes:
- `feature-duration-status` chart in Python with a stacked per-Describe duration/status layout, 60-second x-axis ticks, value labels, and compact chart margins.
- `slowest-specs` charts in Python for local triage and CI workflow artifacts.
- `chartjs-node-canvas` dependency.
- `report:render:slowest` and `report:render:top-slowest` writing PNG output under `tmp/charts/`.
- `tmp/test-ci/report/run.sh` runner to create a Python venv, install chart dependencies there, generate messenger chart manifests, and attach generated files locally.
Why do we need it, and what problem does it solve?
E2E failures in the matrix workflow were hard to triage from the messenger summary alone. The text report showed pass/fail counts and failed tests, but did not show where the suite spent time or which feature areas were close to timeout/flakiness.
This change keeps the Loop thread focused while adding visual duration context. The messenger report shows the high-level feature duration/status chart, and deeper slowest-specs drill-downs remain available as local output and workflow artifacts.
Moving chart generation to Python also removes the native `canvas` dependency from the JavaScript report path and makes the messenger JS code responsible only for message assembly and file attachment.
What is the expected result?
- `report-to-channel` job.
- `tmp/messenger-charts/manifest.json` and `feature-duration-status` PNG files before the Loop report is sent.
- `feature-duration-status` PNG attachments for clusters with report data.
- `e2e-report-slowest-by-describe` artifact with top-5 slowest Describe charts per cluster from `tmp/charts/`.
Checklist
Changelog entries