Improve Benchmarking#2798
Conversation
🛡️ AI Review — Skeptic (security review)VERDICT: VULNERABLE Baseline scrutiny: established OpenTensor-associated contributor with repo write permission and substantial subtensor history; branch apply-benchmark-to-only-bad-benches -> devnet-ready. The PR does not modify Findings
Prior-comment reconciliation
ConclusionThe PR appears legitimate, but the prior 📜 Previous run (superseded)
🔍 AI Review — Auditor (domain review)VERDICT: 👎 Gittensor UNKNOWN; author has repo write permission, @opentensor profile context, and substantial recent subtensor history, so calibrated as established-contributor domain review. PR body is substantive and matches the benchmark/tooling scope, so I would leave it unchanged. Validation performed: Findings
Prior-comment reconciliation
ConclusionBlocking on the |
|
🔄 AI review updated — Skeptic: SAFE Auditor: 👍 |
|
🔄 AI review updated — Skeptic: SAFE Auditor: 👍 |
|
🔄 AI review updated — Skeptic: SAFE Auditor: 👍 |
|
🔄 AI review updated — Skeptic: SAFE Auditor: 👍 |
|
🔄 AI review updated — Skeptic: SAFE Auditor: 👍 |
…tensor/subtensor into apply-benchmark-to-only-bad-benches
|
🔄 AI review updated — Skeptic: SAFE Auditor: 👎 |
|
🔄 AI review updated — Skeptic: SAFE Auditor: 👍 |
|
🔄 AI review updated — Skeptic: SAFE Auditor: 👍 |
|
🔄 AI review updated — Skeptic: SAFE Auditor: 👎 |
|
🔄 AI review updated — Skeptic: VULNERABLE |
|
🔄 AI review updated — Skeptic: VULNERABLE |
| let hotkey: T::AccountId = account("register_limit_hot", 0, 1); | ||
|
|
||
| Subtensor::<T>::init_new_network(netuid, 1); | ||
| Subtensor::<T>::set_max_allowed_uids(netuid, 4096); |
There was a problem hiding this comment.
[HIGH] Benchmark the full-subnet registration path
register_limit now supplies the production dispatch weight, but this benchmark only initializes an empty subnet, sets MaxAllowedUids to 4096, and then registers one neuron. That measures the cheap current_subnetwork_n < max_allowed append path. In production, once the subnet is full, do_register_limit -> do_register -> register_neuron takes the prune/replace path, including replace_neuron and clear_neuron, which clears existing weights/bonds and iterates stored weight rows. A signed caller can hit that path, so the generated weight can undercharge the expensive registration case. Populate the subnet to MaxAllowedUids with a prune candidate before the extrinsic call, or otherwise benchmark the worst-case full-subnet replacement path.
|
🔄 AI review updated — Skeptic: VULNERABLE |
| let hotkey: T::AccountId = account("register_limit_hot", 0, 1); | ||
|
|
||
| Subtensor::<T>::init_new_network(netuid, 1); | ||
| Subtensor::<T>::set_max_allowed_uids(netuid, 4096); |
There was a problem hiding this comment.
[HIGH] Benchmark the full-subnet registration path
This still sets MaxAllowedUids to 4096 but does not populate the subnet before measuring register_limit. In production, do_register_limit delegates into do_register, and register_neuron only appends while SubnetworkN < MaxAllowedUids; once full, it calls get_neuron_to_prune and replace_neuron, which clears/rebuilds substantially more storage. The benchmark therefore keeps producing an append-path weight for an extrinsic that users can invoke on the heavier full-subnet prune/replace path. Seed the subnet to MaxAllowedUids with a pruneable candidate before the measured call, or benchmark the worst reachable path explicitly.
|
🔄 AI review updated — Skeptic: VULNERABLE |
Description
This PR
Modifies the benchmark flow so that only extrinsics that have drifted passed the threshold or have incorrect read/writes are updated rather than replacing the entire file and creating conflicts everywhere.
Adds a lint to check for extrinsics with missing benchmarks. This can be hidden with
#[allow(unknown_lints, benchmarked_weight_not_plugged)]Adds 48 missing benchmarks
Fixes a bug where benchmarks could fail on noisy internal component drift even when the actual benchmark-level weight was still within the configured threshold.
Adds
block_stepto the benchmark flow so theon_initializeweight is generated instead of hardcoded. The benchmark simulates the current worst case with 2 epochs in a block.Updates the drift check to include proof size and to compare the actual benchmark-level weight instead of failing on noisy internal component drift.