Migrate to Runt #252
Merged
…determinism remove all the .out files and .err files from turnt
serialize all runt tests
090d682 to 5316520
and remove unnecessary runt targets
Collaborator
Can you point me to these tests? That seems a little sketch.
ekiwi
reviewed
Jun 24, 2026
Runt-based snapshot testing: catalog-driven workflow
TL;DR for contributors
- To add a test: edit scripts/test_catalog.py, run python3 scripts/generate_runt_configs.py, then runt -s runt/<suite> to save the golden .expect file. Commit the catalog, the regenerated runt.toml, and the new .expect.
- To run the tests: just runt (or runt runt/interp, runt runt/monitor, runt runt/graph_interp).
- runt.toml is generated from the catalog; regenerate it instead of editing it by hand.
How it works
The catalog holds only the facts that can't be derived from the .prot/.tx/.v files themselves (how a test wires up to its protocol/RTL and its expected outcome). Everything else (commands, expect-file names, which protocol features a test uses) is computed by the generator.
The catalog (scripts/test_catalog.py)
TODO: Rename TX to Interp
I previously presented a design in which protocols, interp cases, and monitor cases were all first-class objects in the catalog. Now only the interp and monitor cases are first-class, because protocols are not really tests. Interp and monitor cases just point to protocols, and when we need data about the protocol used (i.e. which constructs it uses), we generate it later, when we generate the runt configs. This minimizes the amount of duplicated test metadata you need to maintain and check in.
- TX_CASES: interpreter / graph-interpreter cases, keyed by .tx path.
  expect is "pass", or a failure-class string for expected failures (comb_dependency, assertion_mismatch, assignment_conflict, fork_protocol_error, static_type_error, static_well_formedness, max_steps).
- MONITOR_CASES: monitor cases, keyed by a unique id.
  Monitors are keyed by id (not path) because many cases can share one .prot (e.g. the antmicro cases differ only by waveform).
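As a rough illustration of the two tables described above, the catalog might look something like the sketch below. Only TX_CASES, MONITOR_CASES, expect, and the failure-class strings come from this description; the paths, the prot and waveform field names, the case ids, and the validate helper are hypothetical.

```python
# Hypothetical sketch of the catalog layout; paths, ids and the "prot" /
# "waveform" field names are assumptions made for illustration only.

# Failure classes listed in the description above.
FAILURE_CLASSES = {
    "comb_dependency", "assertion_mismatch", "assignment_conflict",
    "fork_protocol_error", "static_type_error", "static_well_formedness",
    "max_steps",
}

# Interpreter / graph-interpreter cases, keyed by .tx path.
TX_CASES = {
    "tests/adders/add_combinational.tx": {
        "prot": "tests/adders/add.prot",
        "expect": "pass",
    },
    "tests/adders/add_conflict.tx": {
        "prot": "tests/adders/add.prot",
        "expect": "assignment_conflict",
    },
}

# Monitor cases, keyed by a unique id because several cases share one .prot.
MONITOR_CASES = {
    "bus_trace_a": {
        "prot": "tests/bus/bus.prot",
        "waveform": "tests/bus/trace_a.fst",
        "expect": "pass",
    },
    "bus_trace_b": {
        "prot": "tests/bus/bus.prot",
        "waveform": "tests/bus/trace_b.fst",
        "expect": "pass",
    },
}


def validate(cases):
    """Return the keys whose expect is neither "pass" nor a known failure class."""
    return [
        key for key, case in cases.items()
        if case["expect"] != "pass" and case["expect"] not in FAILURE_CLASSES
    ]
```

A check like validate is one cheap way to catch a misspelled failure class before any configs are generated.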
Programmatically generating cases
The large antmicro family is generated from a list of trace stems via a small
helper at the bottom of the file. I recommend you do something like this whenever we have to procedurally generate a number of cases.
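A helper of this kind could be as small as the sketch below. It is not the helper in the PR: the function name, the default paths, the .fst extension, and the case-id prefix are all assumptions chosen for illustration.

```python
# Hypothetical helper: one monitor case per trace stem, all sharing one .prot.
def cases_from_stems(stems, prot, wave_dir, prefix):
    """Build a dict of monitor cases keyed by a unique id derived from each stem."""
    return {
        f"{prefix}_{stem}": {
            "prot": prot,
            "waveform": f"{wave_dir}/{stem}.fst",
            "expect": "pass",
        }
        for stem in stems
    }


MONITOR_CASES = {}
MONITOR_CASES.update(
    cases_from_stems(
        ["read_burst", "write_burst"],        # made-up trace stems
        prot="examples/antmicro/bus.prot",     # made-up path
        wave_dir="examples/antmicro/traces",   # made-up path
        prefix="antmicro",
    )
)
```

Because the ids are derived from the stems, adding a trace is a one-line change to the stem list.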
The suites
Exactly three, and their union covers every test:
- interp: every TX_CASES entry
- monitor: every MONITOR_CASES entry
- graph_interp: every passing TX_CASES entry whose protocol uses no for/repeat loop
Each suite is defined using a series of filters over the test cases. Since this is all Python, you can do arbitrary filtering.
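The filters could be expressed roughly as below. This is a sketch under assumptions: the catalog entries, the prot field, and the CONSTRUCTS mapping (standing in for whatever the protocol compiler reports) are hypothetical; only the suite names and the for_in_loop / repeat_loop construct names are taken from this description.

```python
# Hypothetical catalog data, just enough to exercise the filters.
TX_CASES = {
    "t/ok.tx": {"prot": "t/plain.prot", "expect": "pass"},
    "t/loop.tx": {"prot": "t/loopy.prot", "expect": "pass"},
    "t/bad.tx": {"prot": "t/plain.prot", "expect": "assertion_mismatch"},
}
MONITOR_CASES = {"mon_a": {"prot": "t/plain.prot", "expect": "pass"}}

# Hypothetical stand-in for the per-protocol constructs reported by the compiler.
CONSTRUCTS = {
    "t/plain.prot": {"assign", "step"},
    "t/loopy.prot": {"assign", "for_in_loop"},
}
LOOP_CONSTRUCTS = {"for_in_loop", "repeat_loop"}


def interp_suite():
    """Every TX_CASES entry."""
    return sorted(TX_CASES)


def monitor_suite():
    """Every MONITOR_CASES entry."""
    return sorted(MONITOR_CASES)


def graph_interp_suite():
    """Passing tx cases whose protocol uses no loop construct."""
    return sorted(
        path for path, case in TX_CASES.items()
        if case["expect"] == "pass"
        and not (CONSTRUCTS[case["prot"]] & LOOP_CONSTRUCTS)
    )
```

With this data, graph_interp keeps only the passing, loop-free case, while interp still runs all three tx cases.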
How features are detected (graph_interp selection)
The graph interpreter can't yet handle for-in/repeat loops. Instead of tracking an allowlist, the generator asks the protocol compiler which constructs each protocol uses.
This prints the AST-derived constructs per protocol definition. A passing tx is included in graph_interp iff its protocol uses no for_in_loop/repeat_loop. This reads the real AST using an EnumDiscriminant macro, which means that if you add new Stmt types to the AST or add new test cases, there is no maintenance overhead.
Expected failures and timeouts
- Expected failures (expect != "pass"): the runner prints its diagnostic to stdout and exits non-zero; Runt captures both the message and the exit code in the .expect snapshot, so failure output is diff-tested like everything else.
- Timeouts: cases that need a time limit set timeout_secs. The generated command wraps them in a small timeout script (on timeout it kills the process group and exits 124), which we can expect in Runt.
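The timeout behavior described above (kill the process group, exit 124) can be sketched in a few lines of Python. This is not the script from the PR; the function name is an assumption, and the sketch is POSIX-only because it relies on process groups.

```python
import os
import signal
import subprocess
import sys

TIMEOUT_EXIT = 124  # exit code named in the description above


def run_with_timeout(cmd, timeout_secs):
    """Run cmd in its own process group; on timeout kill the group and return 124.

    Otherwise return the command's own exit code.
    """
    # start_new_session=True makes the child a session (and group) leader,
    # so its pid doubles as the process-group id.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_secs)
    except subprocess.TimeoutExpired:
        os.killpg(proc.pid, signal.SIGKILL)  # take down the child and its descendants
        proc.wait()                          # reap the killed child
        return TIMEOUT_EXIT
```

A wrapper script would pass the returned value to sys.exit, so Runt sees 124 for a timed-out case and the runner's own exit code otherwise.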
Auto-generated naming of .expect files
- Expect files are named <test_dir>/expects/<stem>.<runner>.expect (e.g. add_combinational.interp.expect). The <runner> keeps a .tx's interp and graph_interp goldens from colliding; monitor antmicro cases are named by their waveform stem.
- Two cases that resolve to the same .expect path with different commands are rejected, so collisions can't silently clobber a golden.
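The naming rule and the collision guard can be sketched as follows. The function names, the example paths, and the command strings are hypothetical; only the <test_dir>/expects/<stem>.<runner>.expect pattern comes from this description.

```python
from pathlib import PurePosixPath


def expect_path(test_file, runner):
    """Map a test file and runner name to <test_dir>/expects/<stem>.<runner>.expect."""
    p = PurePosixPath(test_file)
    return str(p.parent / "expects" / f"{p.stem}.{runner}.expect")


def assign_expect_paths(tests):
    """tests: iterable of (test_file, runner, command) tuples.

    Returns {expect_path: command}. Raises ValueError if two tests map to the
    same expect path but would run different commands.
    """
    seen = {}
    for test_file, runner, command in tests:
        path = expect_path(test_file, runner)
        if seen.setdefault(path, command) != command:
            raise ValueError(f"{path} is claimed by two different commands")
    return seen
```

For a monitor case named by its waveform, the waveform path would be passed as test_file so that the stem comes from the trace rather than from the shared .prot.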
CI
In .github/workflows/test.yml, the tests job:
- checks that the generated configs are fresh (python3 scripts/generate_runt_configs.py + git diff), so a stale runt.toml can't slip through,
- runs cargo test,
- runs the runt/interp, runt/graph_interp, and runt/monitor suites.
Runt is installed in CI from our fork (cargo install --git https://github.com/Nikil-Shyamsunder/runt.git), which adds the custom expect file naming behavior these suites rely on.
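CI does the freshness check by regenerating and running git diff. The same idea can also be expressed directly in Python, which may be handy for a local pre-commit check. The sketch below is an alternative formulation, not what the workflow runs; the function name and the shape of the generated mapping are assumptions.

```python
from pathlib import Path


def stale_configs(root, generated):
    """Return the relative paths whose on-disk content differs from the generated text.

    generated: dict mapping a relative path (e.g. "runt/interp/runt.toml")
    to the text the generator would write there.
    """
    stale = []
    for rel, text in generated.items():
        target = Path(root) / rel
        # A missing file counts as stale, just like one with outdated content.
        if not target.is_file() or target.read_text() != text:
            stale.append(rel)
    return sorted(stale)
```

An empty result means the checked-in configs match the catalog; anything else names the files that need regenerating.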
Other repo-level organization changes
Tests live in the tests/ (and examples/) tree instead of being scattered per-crate, with one catalog describing them. Expect files for test families are under the expect/ directory for each
Files that should be reviewed by hand
- scripts/test_catalog.py - hand-maintained catalog (the only thing you edit to add tests)
- scripts/generate_runt_configs.py - generates runt/*/runt.toml from the catalog when you want to make a new suite
- .github/workflows/test.yml - CI: config-freshness check + the three suites
- ast.rs and cli/main.rs - a few changes that allow us to print all the constructs in a .prot file