Skip to content

Replace push-based compilation with rock demand-driven query system#12

Open
kozak wants to merge 11 commits intorestaumaticfrom
the-rock
Open

Replace push-based compilation with rock demand-driven query system#12
kozak wants to merge 11 commits intorestaumaticfrom
the-rock

Conversation

@kozak
Copy link
Copy Markdown

@kozak kozak commented Apr 12, 2026

Summary

Replaces the push-based BuildPlan compilation model with a demand-driven, query-based system using the rock library. This is similar to Rust's query system (salsa) and provides a foundation for IDE integration and finer-grained incrementality.

Architecture

  • Query GADT (Make/Query.hs): Defines compilation phases as typed queries — InputModule, SortedModules, ModuleGraph, ModuleSugarEnv, ModuleTypeEnv, CompileModule
  • Rock rules (Make/Rules.hs): Implements each query. Rock handles memoization, dependency tracking, and concurrent evaluation automatically
  • Graph caching (Make/Traces.hs): Persists the module dependency graph to disk between builds, storing only direct deps (~689KB vs 12MB for transitive). Validated via CacheDb content hash
  • Fast path (Make.hs): make_ (used by CLI) checks all module timestamps before entering rock. When nothing changed, skips the entire pipeline

Key design decisions

  • make/make_ signatures changed from polymorphic forall m to Make-specific (rock requires IO for memoization MVars)
  • Parallel compilation via forConcurrently with rock's memoise handling synchronization — if module B depends on A, B's thread blocks on A's MVar until A completes
  • ExternsDiff checked before timestamp fallback: a comment-only change to a root dependency skips all downstream modules (externs interface unchanged)
  • Deferred externs loading: checkModuleCache doesn't read .cbor files; externs loaded only when needed for compilation
  • sortModules Direct instead of Transitive — transitive closure computed on demand from direct deps

Performance (pr-admin, 1758 modules, optimized build)

Both binaries built with stack build (same -O1 optimization), each tested against its own clean output directory.

Scenario Baseline This PR Delta
Full build 72.2s 75.8s +5%
No changes 1.2s 1.1s -10%
Touch leaf (timestamp only) 1.2s 1.0s -17%
Comment change to Restaumatic.Prelude (1342 deps) 4.6s 2.5s -46%

The full build is ~5% slower (rock framework overhead). The incremental cases are equal or faster — the big win is the comment-only change to a root dependency where ExternsDiff correctly skips all 1342 downstream modules.

New files

  • src/Language/PureScript/Make/Query.hs — Query GADT with GEq, GCompare, Hashable instances
  • src/Language/PureScript/Make/Rules.hs — Rock rules, liftMake bridge, cache checking, transitiveClosure
  • src/Language/PureScript/Make/Traces.hsCachedGraph persistence (JSON with content-hash validation)

Test plan

  • All 1340 compiler tests pass (stack test)
  • Full build produces correct output on pr-admin
  • Incremental rebuild correctly skips unchanged modules
  • Comment-only change to root dep skips all dependents (ExternsDiff)
  • Separate rebuild of a module correctly triggers downstream recompilation
  • Graph cache correctly invalidated when module set or content changes
  • Graph cache correctly rejected for test builds (RebuildAlways/RebuildNever)

🤖 Generated with Claude Code

kozak and others added 11 commits April 10, 2026 11:03
Integrate the rock library for demand-driven incremental compilation,
replacing the previous forkIO + MVar concurrent build system.

New modules:
- Make.Query: GADT defining compilation queries (InputModule, ModuleGraph,
  SortedModules, ModuleSugarEnv, ModuleTypeEnv, CompileModule)
- Make.Rules: Rock rules mapping each query to its computation, with
  liftMake bridge between Make monad and rock's IO-based Task

The rock pipeline automatically tracks dependencies via fetch calls and
memoizes results within a build. This lays the foundation for cross-build
incrementality via rock's verifyTraces (Phase 2).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Integrate the existing CacheDb (source hash checking) and ExternsDiff
(externs change analysis) systems into the rock query pipeline to
support cross-build incremental compilation.

When a module's source is unchanged:
- Load cached externs from disk instead of recompiling
- Use ExternsDiff to determine if dependency changes affect this module
- Skip recompilation when dependency externs changes don't impact imports

On build failure, remove only the failed modules from CacheDb so the
next build correctly recompiles them while preserving cache for
unaffected modules.

48/51 make tests pass. 2 remaining edge cases:
- Docs target freshness check (docs.json outdated detection)
- Separately-rebuilt module detection (IDE rebuild scenario)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix docs generation: pass module with Prim import to Docs.convertModule
  (was passing original module without importPrim, causing "Unknown type"
  errors for Prim types like Int)
- Detect separately-rebuilt dependencies via output timestamp comparison:
  if a dependency's output is newer than the current module's, trigger
  recompilation even if source is unchanged
- Handle RebuildNever modules correctly: use epoch timestamp to avoid
  forcing bottom values from test MakeActions

All 51 make tests and 829 compiler tests now pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fork all module compilations concurrently using async's forConcurrently.
Rock's memoise handles synchronization automatically - when module B
depends on A, B's thread blocks on A's MVar until A completes.

Performance on pr-admin (full rebuild):
- Baseline (old make):  72s wall,  678s CPU, 968% utilization
- Rock parallel:       117s wall, 1444s CPU, 1304% utilization

Wall clock is 1.6x slower due to CPU overhead from memoization
contention (many threads competing on atomicModifyIORef). The actual
compilation work is the same; the overhead is in rock's coordination
layer. This can be optimized further by batching modules by dependency
depth or reducing memoization overhead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Profile-driven optimization addressing two hotspots:

1. Control.Monad.Logger bind overhead (21% of CPU): each liftMake call
   created a fresh runMake/Logger context. Fixed by merging sugar Env
   construction into the CompileModule liftMake call, halving the number
   of runMake invocations per module.

2. resolveImport/resolveExports redundancy (22% of CPU): each module
   rebuilt its sugar Env from scratch via foldM externsEnv. Fixed by
   using a shared IORef Env that accumulates incrementally — each module
   only processes deps not already in the env.

Performance on pr-admin full rebuild:
  Baseline:         72s wall,  678s CPU
  Rock before fix: 115s wall, 1447s CPU
  Rock after fix:   76s wall,  688s CPU  (matches baseline)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…heck

- Don't eagerly load externs for ALL modules in computeCacheStatus.
  Only load for modules that are up-to-date (need their cached externs)
  or changed (need old externs for ExternsDiff).
- Skip cache checking entirely when CacheDb is empty (fresh build).
- Parallelize cache status computation with forConcurrently.

Performance on pr-admin (~1200 modules):
  Full rebuild:    69s (baseline: 72s)
  No changes:       3s (baseline: 0.5s)
  Touch one leaf:   3s (baseline: 0.6s)

Incremental overhead is from hash-checking all 1200 modules + sortModules.
Full rebuild is now slightly faster than baseline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instead of eagerly checking all 1200 modules' cache status before rock
starts, check each module lazily inside CompileModule when rock demands
it. This is the key architectural win of demand-driven computation:
only modules that are actually needed get their cache checked.

- Delete computeCacheStatus and computeNewCacheDb from Make.hs
- Add checkModuleCache inside CompileModule rule (runs in IO directly)
- CacheDb and timestamps accumulated via IORefs during the build
- Fix separately-rebuilt dep detection via stored output timestamps

Performance on pr-admin (~1200 modules):
  Full rebuild:    73s (baseline: 72s)  — matches
  No changes:     2.6s (baseline: 0.5s) — remaining cost is sortModules
  Touch 1 leaf:   2.7s (baseline: 0.6s)
All 51 make tests + 829 compiler tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add framework for persisting the module dependency graph to disk:
- New Traces module with CachedGraph type and JSON serialization
- getOutputDir field added to MakeActions for locating cache files
- SortedModules/ModuleGraph rules can use cached graph when available
- Graph is validated against module set and CacheDb content hashes

Graph caching is currently disabled (always passes Nothing) because
it causes 2 compiler test failures when the cached graph from a
beforeAll hook interferes with individual test cases. The module set
validation catches most cases but not all.

When enabled, graph caching eliminates the ~2s sortModules overhead
on no-changes builds. This is the remaining gap to baseline perf.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix graph caching test failures caused by empty CacheDb for
RebuildAlways/RebuildNever modules allowing false cache hits.
Validation now requires CacheDb entries for all current modules.

Store direct (not transitive) deps in cached graph, reducing
file size from 12MB to 689KB. Compute transitive closure on load.

Store a hash of the CacheDb instead of the full CacheDb in the
graph file for compact validation.

Add fast path in make_: when cached graph is valid and all module
timestamps match, skip the rock pipeline entirely. This brings
no-change rebuild from ~3.5s to ~1.2s (1758 modules).

Defer externs loading: checkModuleCache no longer reads .cbor files
for cached modules. Externs are loaded only when actually needed
for compilation or to return results.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously, when a dependency had a newer output timestamp, the
compiler would unconditionally recompile all dependents regardless
of whether the dependency's interface actually changed. Now it
checks ExternsDiff first, and only falls back to timestamp-based
recompilation when a dep was rebuilt externally (in a separate
build, not tracked by this build's ExternsDiff).

Also removes updateSharedEnv from the skip path since doCompile
already handles missing env entries when compilation is needed.

Tracks which modules were compiled (vs skipped) in this build
via compiledRef IORef to distinguish between empty diffs from
this-build skips vs previous-build external rebuilds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace local ../rock path with github: ollef/rock git reference
  so CI can resolve the dependency
- Fix hlint warnings: redundant brackets in Query.hs, use traverse_
  and const in Rules.hs
- Remove unused liftMake export from Rules module

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant