fix(validator): serialize hotkey contract writes to prevent nonce collisions#465

Closed
jaso0n0818 wants to merge 1 commit into entrius:test from jaso0n0818:fix/validator-write-lock-nonce-serialization

@jaso0n0818

Problem

The validator signs contract writes with the same hotkey over two separate substrate connections: the forward loop (contract_client on self.subtensor) and the axon handlers (axon_contract_client on axon_subtensor).

Both call create_signed_extrinsic, which auto-fetches the nonce via AccountNonceApi.account_nonce. That call returns the best-block nonce, which does not count transactions still pending in the pool. When both clients compose a write within the same block window, both fetch nonce N; one tx lands and the other is banned (error 1012 Transaction is temporarily banned), starving the forward loop and causing delivered swaps to be slashed. Closes #457.
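The race described above can be reproduced with a toy model. This is only a sketch: `ToyChain` and its methods are hypothetical names invented for illustration, and the rejection rule is a simplification of what a real transaction pool does.

```python
class ToyChain:
    """Toy stand-in for a node: an included nonce plus a pending pool."""

    def __init__(self):
        self.best_block_nonce = 0  # nonce as of the best block
        self.pool = set()          # nonces of pending, not yet included txs

    def account_nonce(self):
        # Simplified model of the behaviour described above:
        # pending pool txs are not counted.
        return self.best_block_nonce

    def submit(self, nonce):
        # Simplification: a stale or already-pending nonce is rejected.
        if nonce < self.best_block_nonce or nonce in self.pool:
            return "rejected"
        self.pool.add(nonce)
        return "accepted"

    def produce_block(self):
        # Include pending txs in nonce order and advance the best-block nonce.
        while self.best_block_nonce in self.pool:
            self.pool.remove(self.best_block_nonce)
            self.best_block_nonce += 1


chain = ToyChain()

# Unserialized: both clients read the nonce before either tx is included.
forward_nonce = chain.account_nonce()
axon_nonce = chain.account_nonce()
racing = [chain.submit(forward_nonce), chain.submit(axon_nonce)]

# Serialized: the second client reads the nonce only after inclusion.
chain.produce_block()
serialized = chain.submit(chain.account_nonce())
```

In this model the second racing submit is rejected because both clients computed the same nonce, while the serialized submit succeeds because the best-block nonce has already advanced.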

Fix

allways/contract_client.py

Add an optional write_lock: threading.Lock parameter to AllwaysContractClient.__init__. Inside exec_contract_raw, hold the lock across nonce-fetch + submit + inclusion. The pre-flight balance read is intentionally left outside the lock so reads remain parallel.

A _NullContext no-op is used when no write_lock is passed (backward-compatible).
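A minimal sketch of the optional-lock pattern, under stated assumptions: `SketchContractClient` is a hypothetical, heavily simplified stand-in for AllwaysContractClient, the two callables replace the real substrate calls, and `contextlib.nullcontext` is swapped in for the PR's own _NullContext helper.

```python
import threading
from contextlib import nullcontext


class SketchContractClient:
    """Hypothetical, simplified stand-in for AllwaysContractClient."""

    def __init__(self, read_balance, submit, write_lock=None):
        self._read_balance = read_balance  # callable: pre-flight balance read
        self._submit = submit              # callable: nonce fetch + submit + inclusion
        self.write_lock = write_lock       # optional shared threading.Lock

    def exec_contract_raw(self, call):
        # The balance read stays outside the lock so reads remain parallel.
        self._read_balance()
        # Without a lock, fall back to a no-op context manager.
        guard = self.write_lock if self.write_lock is not None else nullcontext()
        with guard:
            return self._submit(call)


def fake_submit(call):
    # Placeholder for the real nonce-fetch + submit + wait-for-inclusion step.
    return f"included:{call}"


shared = threading.Lock()
forward_client = SketchContractClient(lambda: 100, fake_submit, write_lock=shared)
axon_client = SketchContractClient(lambda: 100, fake_submit, write_lock=shared)
legacy_client = SketchContractClient(lambda: 100, fake_submit)  # no lock passed
```

Defaulting `write_lock` to `None` keeps existing call sites working unchanged, at the cost of leaving them unserialized.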

neurons/validator.py

Create one shared self._write_lock = threading.Lock() and pass it to both contract_client and axon_contract_client at construction. Lock order: axon_lock → write_lock (axon writes) and write_lock → substrate_lock (forward writes) — no cycle.
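The no-cycle claim can be checked mechanically by treating each "held while acquiring" pair as a directed edge and searching for a cycle. The sketch below covers only the two orderings named above; `has_cycle` is a hypothetical helper, and any other acquisition path in the validator would have to be added as an edge for the check to be meaningful.

```python
def has_cycle(edges):
    """Return True if the directed lock-order graph contains a cycle."""
    graph = {}
    for held, acquired in edges:
        graph.setdefault(held, []).append(acquired)
        graph.setdefault(acquired, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:  # back edge: a cycle exists
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)


# Orderings stated in the PR description.
pr_order = [
    ("axon_lock", "write_lock"),       # axon writes
    ("write_lock", "substrate_lock"),  # forward writes
]
pr_order_is_cyclic = has_cycle(pr_order)

# A path taking write_lock and then axon_lock would introduce a cycle.
bad_order_is_cyclic = has_cycle(pr_order + [("write_lock", "axon_lock")])
```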

Tests

New tests/test_write_lock_serialization.py (6 tests):

  • Two clients sharing the same lock object store the same reference
  • Client without write_lock completes exec_contract_raw normally
  • write_lock is held when substrate_call(submit_extrinsic) runs
  • Balance read (substrate_call for account info) fires before the write_lock is acquired
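One way the last two checks could be written is with `unittest.mock` side effects that record whether the lock is held at the moment each call runs. This is a sketch, not the PR's test file: `run_write` is a hypothetical stand-in for exec_contract_raw, and `lock_states_during_write` is an illustrative helper.

```python
import threading
from contextlib import nullcontext
from unittest.mock import Mock


def run_write(read_balance, submit, write_lock=None):
    """Hypothetical stand-in for exec_contract_raw: read, then guarded submit."""
    read_balance()
    with (write_lock if write_lock is not None else nullcontext()):
        return submit()


def lock_states_during_write(write_lock):
    """Return (locked_during_balance_read, locked_during_submit)."""
    seen = {}
    # Each side effect snapshots the lock state when the mock is called.
    read_balance = Mock(side_effect=lambda: seen.setdefault("balance", write_lock.locked()))
    submit = Mock(side_effect=lambda: seen.setdefault("submit", write_lock.locked()))
    run_write(read_balance, submit, write_lock)
    return seen["balance"], seen["submit"]


states = lock_states_during_write(threading.Lock())
```

Here `states` should show the lock free during the balance read and held during the submit.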

Closes #457

…lisions

The validator signs contract writes with the same hotkey over two separate
substrate connections: the forward loop (contract_client / self.subtensor)
and the axon handlers (axon_contract_client / axon_subtensor). Both call
create_signed_extrinsic which auto-fetches the nonce via AccountNonceApi —
the best-block nonce, which does not count pending pool txs. When both
clients race within the same block window they fetch the same nonce N, one
tx lands and the other is banned (1012 Transaction is temporarily banned),
starving the forward loop and causing delivered swaps to be slashed.

Add an optional write_lock parameter to AllwaysContractClient.__init__.
exec_contract_raw acquires the lock across nonce-fetch + submit + inclusion,
so the best-block nonce is guaranteed to advance before the sibling client
composes its next extrinsic. The pre-flight balance read is intentionally
left outside the lock so reads remain parallel.

In neurons/validator.py, create one threading.Lock as self._write_lock and
pass it to both contract_client and axon_contract_client at construction.

Lock ordering: axon_lock -> write_lock (axon handlers) and write_lock ->
substrate_lock (forward loop). No path takes write_lock -> axon_lock, so
no deadlock cycle.

Backward compat: write_lock defaults to None; omitting it produces a
_NullContext no-op so existing call sites and tests are unaffected.

New tests/test_write_lock_serialization.py (6 tests) verifies wiring
(shared lock stored on both clients), lock held during submit, no error
without write_lock, and balance read fires before write_lock is acquired.

Closes entrius#457
xiao-xiao-mao (bot) added the bug label on Jun 9, 2026
@jaso0n0818

Author

Rebased on latest test. Serializes hotkey contract writes to prevent nonce collisions. Ready for review.

anderdc closed this on Jun 9, 2026