
Migrate to Runt #252

Merged
Nikil-Shyamsunder merged 13 commits into main from runt
Jun 24, 2026

Conversation

@Nikil-Shyamsunder Nikil-Shyamsunder commented Jun 23, 2026

Collaborator

Runt-based snapshot testing: catalog-driven workflow

TL;DR for contributors

  • Add/edit a test: edit scripts/test_catalog.py, run
    python3 scripts/generate_runt_configs.py, then runt -s runt/<suite> to
    save the golden .expect file. Commit the catalog, the regenerated
    runt.toml, and the new .expect.
  • Run tests: just runt (or runt runt/interp, runt runt/monitor,
    runt runt/graph_interp).
  • Never hand-write runt.toml; it is generated from the catalog.

How it works

scripts/test_catalog.py -> scripts/generate_runt_configs.py
-> runt/<suite>/runt.toml -> runt  
->  <test_dir>/expects/<stem>.<runner>.expect 

The catalog holds only the facts that can't be derived from the .prot/.tx/
.v files themselves (how a test wires up to its protocol/RTL and its expected
outcome). Everything else — commands, expect-file names, which protocol features
a test uses — is computed by the generator.

The catalog (scripts/test_catalog.py)

TODO: Rename TX to Interp
I previously presented a design in which protocols, interp cases, and monitor cases were all first-class objects in the catalog. Now only the interp and monitor cases are first-class, because protocols are not really tests. Interp and monitor cases just point to protocols, and when we need data about the protocol used (i.e. which constructs it uses), we derive it later, when we actually generate the Runt configs. This minimizes the amount of duplicated test metadata that has to be checked in and maintained.

TX_CASES — interpreter / graph-interpreter cases, keyed by .tx path

"tests/adders/adder_d0/add_combinational.tx": {
    "protocol": "tests/adders/adder_d0/add_d0.prot",   # the .prot it runs against
    "verilog": ("tests/adders/adder_d0/add_d0.v",),     # RTL (optional)
    "top": "picorv32_pcpi_mul",                          # top module (optional)
    "expect": "pass",                                    # "pass" or a failure class
    # "max_steps": 8,        (optional)
    # "extra_args": ("--skip-static-step-fork-checks",),  (optional)
},

expect is "pass", or a failure-class string for expected failures
(comb_dependency, assertion_mismatch, assignment_conflict,
fork_protocol_error, static_type_error, static_well_formedness,
max_steps).
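
A minimal sketch of how the expect field could be checked before any config is generated. The failure-class names are taken from the list above; the helper name validate_expect and the idea of validating at this point are assumptions, not the actual generator code.

```python
# Failure classes listed in the description above; "pass" marks a passing case.
FAILURE_CLASSES = frozenset({
    "comb_dependency", "assertion_mismatch", "assignment_conflict",
    "fork_protocol_error", "static_type_error", "static_well_formedness",
    "max_steps",
})


def validate_expect(case_key: str, case: dict) -> str:
    """Return the case's `expect` value, or raise if it is missing or unknown.

    Hypothetical helper: the real generator may validate differently.
    """
    expect = case.get("expect")
    if expect != "pass" and expect not in FAILURE_CLASSES:
        raise ValueError(f"{case_key}: unknown expect value {expect!r}")
    return expect
```

Failing early on a misspelled failure class keeps a typo in the catalog from ending up baked into a golden file.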

MONITOR_CASES — monitor cases, keyed by a unique id

Monitors are keyed by id (not path) because many cases can share one .prot
(e.g. the antmicro cases differ only by waveform).

"tests.wishbone.wishbone.monitor": {
    "protocol": "tests/wishbone/wishbone.monitor.prot",
    "wave": "tests/wishbone/reqwalker.vcd",
    "instances": ("TOP.reqwalker:WBSubordinate",),
    "expect": "pass",
    "extra_args": ("--sample-posedge", "TOP.reqwalker.i_clk"),
    # "max_steps" / "timeout_secs": (optional)
},

Programmatically generating cases

The large antmicro family is generated from a list of trace stems via a small
helper at the bottom of the file. I recommend doing something similar whenever we need to generate many cases procedurally.
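
As a rough illustration of that pattern, a helper might look like the sketch below. The function name, its parameters, and every path and stem in the example call are hypothetical; only the entry fields (protocol, wave, instances, expect) come from the MONITOR_CASES example above.

```python
def make_wave_cases(family: str, protocol: str, wave_dir: str,
                    instances: tuple, stems: list) -> dict:
    """Build one MONITOR_CASES-style entry per waveform stem.

    All parameters are illustrative; the real helper in test_catalog.py
    may be shaped differently.
    """
    cases = {}
    for stem in stems:
        case_id = f"{family}.{stem}"
        if case_id in cases:
            raise ValueError(f"duplicate case id: {case_id}")
        cases[case_id] = {
            "protocol": protocol,              # shared by the whole family
            "wave": f"{wave_dir}/{stem}.vcd",  # the only per-case difference
            "instances": instances,
            "expect": "pass",
        }
    return cases


# Hypothetical family, paths, and stems, for illustration only.
EXAMPLE_CASES = make_wave_cases(
    family="tests.example_family",
    protocol="tests/example_family/example.monitor.prot",
    wave_dir="tests/example_family/waves",
    instances=("TOP.dut:ExampleProtocol",),
    stems=["trace_a", "trace_b"],
)
```

The result can then be merged into the hand-written dictionary, for example with MONITOR_CASES.update(EXAMPLE_CASES).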

The suites

Exactly three, and their union covers every test:

suite         runner             contents
interp        interpreter        every TX_CASES entry
monitor       monitor            every MONITOR_CASES entry
graph_interp  graph interpreter  the subset of passing tx whose protocol has no for/repeat loop

Each suite is defined by a series of filters over the test cases. Since this is all Python, you can do arbitrary filtering.
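
To make the filter idea concrete, here is a small sketch. The select helper and the two-entry catalog are made up for illustration; the generator's real filter code may be organized differently.

```python
from typing import Callable, Dict


def select(cases: Dict[str, dict],
           keep: Callable[[str, dict], bool]) -> Dict[str, dict]:
    """Return the sub-catalog of cases for which `keep(key, case)` is true."""
    return {key: case for key, case in cases.items() if keep(key, case)}


# Hypothetical miniature catalog (paths are not real tests).
TX_CASES = {
    "tests/demo/ok.tx": {"protocol": "tests/demo/demo.prot",
                         "expect": "pass"},
    "tests/demo/bad.tx": {"protocol": "tests/demo/demo.prot",
                          "expect": "assertion_mismatch"},
}

# A suite that takes every entry, and one that keeps only passing cases.
interp_suite = select(TX_CASES, lambda key, case: True)
passing_only = select(TX_CASES, lambda key, case: case["expect"] == "pass")
```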

How features are detected (graph_interp selection)

The graph interpreter can't yet handle for-in/repeat loops. Instead of
tracking an allowlist, the generator asks the protocol compiler which constructs
each protocol uses:

cargo run --bin protocols-cli -- -p <file>.prot constructs

This prints the AST-derived constructs per protocol definition. A passing tx is
included in graph_interp iff its protocol uses no for_in_loop/repeat_loop.
This reads the real AST using an EnumDiscriminant macro, so adding new Stmt types to the AST or adding new test cases creates no maintenance overhead.
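
The selection rule can be sketched as follows. The command line is the one shown above and the two construct names come from this description; the helper names are assumptions, and parsing of the command's output is left out because its exact format is not specified here.

```python
# Constructs the graph interpreter cannot handle yet (named in the text above).
UNSUPPORTED = frozenset({"for_in_loop", "repeat_loop"})


def constructs_command(prot_path: str) -> list:
    """argv for the constructs query shown above (hypothetical wrapper)."""
    return ["cargo", "run", "--bin", "protocols-cli", "--",
            "-p", prot_path, "constructs"]


def graph_interp_eligible(case: dict, constructs) -> bool:
    """A case qualifies iff it is expected to pass and uses no unsupported loop.

    `constructs` is the set of construct names already extracted from the
    command's output for the case's protocol.
    """
    return case["expect"] == "pass" and not (UNSUPPORTED & set(constructs))
```

Keeping the predicate separate from the subprocess call means the rule can be unit-tested without building the Rust workspace.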

Expected failures and timeouts

  • Expected failures (expect != "pass"): the runner prints its diagnostic
    to stdout and exits non-zero; Runt captures both the message and the exit code
    in the .expect snapshot, so failure output is diff-tested like everything
    else.
  • Expected timeouts: unfortunately, Runt cannot handle an expected timeout
    itself. A few monitor cases are non-terminating by design and set
    timeout_secs. The generated command wraps them in a small timeout script
    (it kills the process group and exits with 124), and that exit code is
    something we can expect in Runt.
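
For readers unfamiliar with the pattern, a wrapper of this kind could look roughly like the sketch below. This is not the script from this PR: the function name is made up, and the only details taken from the description are that the process group is killed and that the wrapper exits with 124. It is POSIX-only.

```python
import os
import signal
import subprocess

TIMEOUT_EXIT_CODE = 124  # exit code the description says the wrapper uses


def run_with_timeout(argv: list, timeout_secs: float) -> int:
    """Run argv in its own process group; kill the whole group on timeout.

    Returns the child's exit code, or 124 if the timeout fired.
    """
    # start_new_session=True makes the child a session and group leader,
    # so its pid doubles as the process-group id.
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_secs)
    except subprocess.TimeoutExpired:
        try:
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group exited between the timeout and the kill
        proc.wait()  # reap the child so it does not linger as a zombie
        return TIMEOUT_EXIT_CODE
```

Killing the group rather than the single process matters when the test command spawns children of its own, which would otherwise outlive the timeout.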

Auto-generated naming of .expect files

  • Expect file: <test_dir>/expects/<stem>.<runner>.expect
    (e.g. add_combinational.interp.expect). The <runner> keeps a .tx's
    interp and graph_interp goldens from colliding; monitor antmicro cases are
    named by their waveform stem.
  • The generator errors out if two cases would ever map to the same expect file
    with different commands, so collisions can't silently clobber a golden.
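
The naming scheme and the collision check can be sketched like this. The path layout follows the pattern given above; the function names and the shape of the input dictionary are assumptions, and the special naming of antmicro monitor cases by waveform stem is not modeled.

```python
from pathlib import PurePosixPath


def expect_path(test_file: str, runner: str) -> str:
    """<test_dir>/expects/<stem>.<runner>.expect for a given test file."""
    p = PurePosixPath(test_file)
    return str(p.parent / "expects" / f"{p.stem}.{runner}.expect")


def check_collisions(commands_by_case: dict) -> dict:
    """Map each expect file to its command; raise on conflicting commands.

    `commands_by_case` maps (test_file, runner) -> command string
    (a hypothetical input shape chosen for this sketch).
    """
    seen = {}
    for (test_file, runner), cmd in commands_by_case.items():
        path = expect_path(test_file, runner)
        if path in seen and seen[path] != cmd:
            raise ValueError(f"{path} is claimed by two different commands")
        seen[path] = cmd
    return seen
```

For example, expect_path("tests/adders/adder_d0/add_combinational.tx", "interp") gives tests/adders/adder_d0/expects/add_combinational.interp.expect.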

CI

In .github/workflows/test.yml, the tests job:

  1. builds the workspace,
  2. regenerates the configs and fails if they differ from what's checked in
    (python3 scripts/generate_runt_configs.py + git diff) — so a stale
    runt.toml can't slip through,
  3. runs cargo test,
  4. runs runt/interp, runt/graph_interp, runt/monitor.

Runt is installed in CI from our fork
(cargo install --git https://github.com/Nikil-Shyamsunder/runt.git), which adds
the custom expect file naming behavior these suites rely on.

Other repo-level organization changes

  • All snapshot tests now live under a common tests/ (and examples/) tree instead of being scattered per crate, with one catalog describing them. Expect files for each test family live in the expects/ directory of its test directory.
  • All Turnt files have been deleted.

Files that should be reviewed by hand

scripts/test_catalog.py - hand-maintained catalog (the only thing you edit to add tests)

scripts/generate_runt_configs.py - generates runt/*/runt.toml from the catalog; edit it when you want to make a new suite

.github/workflows/test.yml - CI: config-freshness check + the three suites

ast.rs and cli/main.rs - a few changes that allow us to print all the constructs in a .prot file

@Nikil-Shyamsunder Nikil-Shyamsunder changed the title from "Monstor Runt Migration" to "Monster Runt Migration" Jun 23, 2026
serialize all runt tests
@Nikil-Shyamsunder Nikil-Shyamsunder force-pushed the runt branch 3 times, most recently from 090d682 to 5316520 Compare June 23, 2026 21:19
and remove unecessary runt targets
@Nikil-Shyamsunder Nikil-Shyamsunder marked this pull request as ready for review June 23, 2026 22:17

ekiwi commented Jun 24, 2026

Collaborator
  • unfortunately runt can not handle an expected timeout. a few monitor cases are non-terminating by design

Can you point me to these tests? That seems a little sketch.

Comment thread .github/workflows/test.yml Outdated
Comment thread scripts/roundtrip_case.py Outdated
@Nikil-Shyamsunder Nikil-Shyamsunder changed the title from "Monster Runt Migration" to "Migrate to Runt" Jun 24, 2026
@Nikil-Shyamsunder Nikil-Shyamsunder merged commit 0c9ac13 into main Jun 24, 2026
22 checks passed

2 participants