Skip to content

Security audit fixes (2026-06-12): QUIC key-update, DTLS spoof-DoS, XMSS^MT exhaustion, CMAC verify, + hardening#35

Merged
MagicalTux merged 25 commits into
masterfrom
security-fixes-2026-06-12
Jun 11, 2026
Merged

Security audit fixes (2026-06-12): QUIC key-update, DTLS spoof-DoS, XMSS^MT exhaustion, CMAC verify, + hardening#35
MagicalTux merged 25 commits into
masterfrom
security-fixes-2026-06-12

Conversation

@MagicalTux

Copy link
Copy Markdown
Member

Sixth security-audit pass. Eight parallel auditors scoped to disjoint module groups; fixes implemented in isolated worktrees and integrated here. Every commit passed fmt / clippy -D warnings / rustdoc -D warnings / cargo test / cargo-semver-checks individually, and the combined tree passes all of them plus default and no-default-features builds (1725 tests, 0 failures; semver: no update required).

High

  • QUIC self-initiated key-update desync + replay-window reset (29f9374). one_rtt_phase conflated the TX key-phase bit with the RX update reference, so after a locally-initiated key update a normal in-flight old-phase peer packet was misread as a peer update — breaking the connection, falsely clearing the pending-confirm flag, and resetting the PN replay window while old-phase RX keys stayed live (RFC 9001 §9.5 replay bypass). Fix splits RX phase (CryptoState::rx_phase) from TX phase; a flip commits only when a packet opens under the next-generation RX keys. Regression test fails when reverted.

Medium

  • DTLS spoofed plaintext handshake records (5289106). The "silently discard spoofable records" doctrine wasn't applied to handshake dispatch on unauthenticated epoch-0 paths, so one off-path datagram (notably a forged HelloVerifyRequest) could abort or permanently wedge an in-flight handshake. Gated dispatch faults on the authenticated flag (state-scoped so a genuine cert-mismatch rejection stays fatal), added an hvr_consumed guard and a reassembler seq-rewind so a well-framed spoof can't pin past the genuine message.
  • XMSS^MT h=40 leaf-index wrap → OTS reuse (7a74bc9). For the three h=40 sets the 40-bit index field can't hold the 2^40 exhaustion sentinel; it truncated to 0 and silently re-armed leaf 0's already-used WOTS+ key (forgery). Now sacrifices the last leaf to keep the sentinel representable; enforced in both sign and validate_raw_sk.
  • Cmac::verify empty/truncated-tag auth bypass (a9d3dc8). The inherent verify compared only expected.len() bytes; verify(b"") accepted unconditionally. Now length-strict (full 16-byte tag only).
  • TLS 1.2 / legacy unbounded handshake reassembly (0ca6e0c). 1.3 caps hs_pending at 128 KiB; the 1.2/legacy engines did not (~16 MiB pinnable pre-auth). Now share the same RecordOverflow cap.
  • pc_quic_stream_read plaintext not wiped (27e321e).

Low / hardening

QUIC ACK-range cap, retire-prior-to DCID rotation, STOP_SENDING flow-control credit; DTLS overlap-fragment conflict handling; TLS EE O(n²) extension-scan cap and post-EOED app-data rejection; ML-KEM m + LMS seed zeroization; Choice::from branchless normalization; x509 strict-DER finish() parity + typed-verify algid-consistency; FFI alloc-abort caps + ALPN validation + *consumed panic-path + LMS/XMSS NULL-out OTS-burn guard; CLI secret wipes, world-readable-key warnings, and -len output caps.

Deferred / flagged

  • FFI cipher.rs per-call stack key wipes — ~20 short-lived copies across two large match arms; clean fix needs restructuring with regression risk, left for a focused follow-up.
  • Clean (no actionable findings): RSA/EC/DH/bignum and x509/DER core verification logic.

Auditor scopes: TLS · QUIC · DTLS · x509+DER · RSA/EC/DH/bignum · PQC(ML-KEM/ML-DSA/SLH-DSA/LMS/XMSS) · symmetric+RNG+KDF+HPKE · FFI+CLI.

🤖 Generated with Claude Code

MagicalTux and others added 25 commits June 12, 2026 08:22
Track the 1-RTT rx key phase (`CryptoState::rx_phase`) separately from
the tx key phase (`one_rtt_phase`). The receive path previously used the
tx-derived `one_rtt_phase` as the reference for deciding whether an
inbound packet represented a peer key update. After a self-initiated
update (`initiate_key_update`/`flip_tx_key_phase`) the tx phase advances
while the peer keeps sending at the old phase; a normal in-flight
old-phase peer packet — which still decrypts under the unchanged
old-phase rx keys — was misread as a peer key update and ran a spurious
`commit_rx_key_phase_flip`. That stamped subsequent ciphertext with the
wrong phase bit (connection break), wrongly cleared
`tx_phase_pending_confirm`, and reset `rx_pn_window` while the same
old-phase rx keys stayed live (RFC 9001 §9.5 per-key replay bypass).

The receive path now commits a phase flip only when a packet was opened
with the NEXT-generation rx keys (pkt_phase != rx_phase && !opened_with_prev),
so a packet that decrypts under the keys already installed for the
current rx phase never triggers a commit. `flip_tx_key_phase` advances
only the tx phase; `commit_rx_key_phase_flip` advances `rx_phase`. The
tx-confirm branch now fires only when the peer's packet is at our new tx
phase (pkt_phase == one_rtt_phase), so an in-flight old-phase packet can
no longer clear the pending-confirm flag.

Adds `key_update_initiator_ignores_in_flight_old_phase_packet`, which
feeds an old-phase peer packet to the side that INITIATED the update and
asserts the tx phase, pending-confirm flag, and replay window are all
unaffected (and that the update still confirms normally afterwards).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
`AckRanges::insert` created an unbounded number of disjoint ranges. An
on-path attacker who synthesizes AEAD-valid packets (e.g. Initial) with
gapped packet numbers could add one range per packet, turning the
O(n) scan-per-insert into O(n²) CPU and O(n) memory before the PN space
is discarded.

Cap the set at `MAX_ACK_RANGES` (32) disjoint ranges. The ranges are
stored descending, so when the cap is exceeded the lowest/oldest range
(smallest PNs, least useful to keep ACKing) is evicted; a brand-new
range below all existing ones at capacity is dropped outright. Dropping
an ACK range is always safe — ACKs are informational and the peer
retransmits as needed. ACK encoding (`build_ack_ranges_raw`) iterates
the now-bounded set, so the encoded frame is bounded too.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
`note_retire_prior_to` removed every entry below the new
`retire_prior_to` unconditionally — including the active outbound DCID
(typically seq 0) that `endpoint.cids.peer` still writes into every
short header — with no rotation, leaving the connection sending a CID
the peer has asked us to retire.

When `retire_prior_to` covers the active CID, rotate `active_seq` to the
lowest surviving entry whose sequence is >= the new value before
removing the old ones. If no such replacement exists yet (the peer
advanced `retire_prior_to` past every CID it has issued), RETAIN the
active entry so a usable outbound DCID survives until a replacement
NEW_CONNECTION_ID arrives — degrading gracefully rather than breaking
the connection. RFC 9000 §5.1.2-consistent.

The NEW_CONNECTION_ID handler now re-syncs `endpoint.cids.peer` to the
pool's (possibly rotated) active CID after processing the frame, so the
outbound DCID always tracks the active entry.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ENDING

When STOP_SENDING has been sent on a stream, `RecvStream::on_data`
silently drops subsequently-received payload (returns Ok(0)). But
`Streams::on_stream` had already charged those high-water bytes to
`conn_recv_used`; they were never added to `conn_consumed`, so
`maybe_replenish_conn` never re-granted them. A peer that keeps sending
after STOP_SENDING without ever resetting could thereby exhaust the
connection-level flow-control window.

Mirror `on_reset`: count the discarded post-STOP_SENDING bytes as
consumed so connection-level credit is returned. A new per-stream
`conn_fc_credited` counter tracks the bytes already credited (reads +
discards) so a later RESET_STREAM credits only the not-yet-credited
remainder — no double-count, no leak. The reset path now anchors its
discard credit on `conn_fc_credited` instead of `read_off` (equal when
nothing was discarded out-of-band).

Adds tests for the post-STOP_SENDING flood (credit returned per byte,
retransmits not double-counted) and for a following RESET_STREAM with a
larger final size.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
MEDIUM — an off-path attacker could abort or derail an in-flight DTLS
handshake with a single spoofed epoch-0 plaintext handshake record. The
`authenticated` flag only silenced `read_fragment` framing errors; the
subsequent `reassembler.feed` / `dispatch_one` / `pop_ready` calls used
`?` and propagated fatally regardless of authentication.

Fix, mirroring the DTLS 1.3 server pre-cookie wrapper (errors on the
unauthenticated path become silent drops, only the local
`InappropriateState` misconfig stays fatal):

- client12 (`WaitServerHelloOrHvr`/`WaitServerHello`): a spoofed
  ServerHello that fails handshake-layer validation is now silently
  dropped. The silent-drop is scoped to the pre-ServerHello states so a
  genuine post-SH plaintext flight fault (e.g. a certificate that does
  not match the requested host) stays fatal.
- client12 HVR path: a malformed/unexpected HVR is a silent drop, and an
  HVR that arrives once one has already been consumed is ignored without
  mutating transcript/state/reassembler (new `hvr_consumed` flag). All
  HVR-typed fragments are captured here so a spoof never enters the
  reassembler.
- client13 (`WaitServerHello`): spoofed ServerHello/HRR faults become
  silent drops on the unauthenticated path.
- server12 (`WaitClientFlight`): a spoofed plaintext ClientKeyExchange
  with a bad EC point is silently dropped; the encrypted client Finished
  (epoch >= 1) keeps every error fatal.

`feed`/`pop_ready` advance `expected_msg_seq` before dispatch, so on a
silent drop the reassembler is also rewound (new
`rewind_expected_msg_seq`): otherwise a spoofed but well-framed
ServerHello would pin `expected_msg_seq` past the genuine SH (same
message_seq), which would then be rejected as stale — a derail.

The authenticated (epoch >= 1) path is unchanged: every error stays
fatal. Adds tests forging a plaintext HVR / ServerHello at the DTLS 1.2
and 1.3 clients mid-handshake and asserting the handshake still
completes with the genuine peer.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…bytes

LOW (hardening) — pre-cookie ClientHello reassembly used "first writer
wins" for overlapping fragment bytes, so a spoofed fragment that arrived
first could pin corrupt content and cause the genuine overlapping bytes
to be silently discarded, wedging the CH decode (DoS-only).

`feed` now pre-scans the incoming fragment: if any byte it covers was
already received but DISAGREES with the buffered content, the whole
fragment is dropped without mutating state. Idempotent duplicate
fragments (identical overlapping bytes) still agree and remain accepted,
so legitimate retransmission/overlap is unaffected. Adds a bounds
pre-check so no bytes are written if any index is out of range.

Adds tests: a conflicting spoof over already-held genuine bytes is
dropped (genuine content survives and completes the message), and an
agreeing idempotent overlap is still accepted.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The TLS 1.3 core caps its handshake-message reassembly buffer at
MAX_HANDSHAKE_REASSEMBLY (128 KiB) via append_handshake_bytes, but the
TLS 1.2 and legacy (SSLv3/1.0/1.1) engines reassembled into their own
hs_pending with a bare extend_from_slice and no ceiling. pop_handshake
only drains once a complete `4 + u24_len` message is present, so a peer
could send a handshake header claiming up to ~16 MiB and slow-drip
fragments that never complete, pinning ~16 MiB per connection
pre-authentication (memory DoS).

Gate both hs_pending appends (the decrypted-plain and plaintext-fragment
paths) in each engine's next_message behind a shared append_handshake_bytes
helper that returns RecordOverflow once the buffer would exceed the
existing MAX_HANDSHAKE_REASSEMBLY constant (now reused from common.rs
rather than redefined). The legacy engine shares these code paths, so it
is covered by the same fix. Adds a test per engine that a giant
length-claim plus dribbled fragments is rejected with RecordOverflow.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
on_encrypted_extensions walked the EE extensions by hand with a
`seen: Vec<u16>` + `seen.contains(&ty)` duplicate check per extension and
no count cap, while the shared codec parse_extensions caps at
MAX_EXTENSIONS = 64 precisely to keep that linear scan from going
quadratic. The EE body is bounded only by the 128 KiB reassembly cap, so
a server could stuff ~32k minimal extensions into it and force ~5e8
comparisons (O(n^2)) on the client.

Add the same MAX_EXTENSIONS ceiling to the manual EE loop (reusing the
existing constant, now pub(crate) and re-exported from the codec module),
rejecting the over-limit case with Decode before the duplicate scan runs.
Duplicate detection and per-extension semantics are unchanged. Adds a
test that exactly MAX_EXTENSIONS parses while one more is rejected.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The server's pre-Connected ApplicationData accept gate keyed off the
sticky early_data_accepted flag:

    if self.state != State::Connected && !self.early_data_accepted { ... }

After EndOfEarlyData rotates the read key onto the client-handshake key
and sets early_data_remaining = None, early_data_accepted stays true, so
an ApplicationData record in WaitClientFinished passed the guard and the
max_early_data_size budget check was skipped (remaining is None) —
plaintext landed in the regular app_in before the client Finished was
verified. Reachable only by the legitimate client (it needs the
client-handshake key), so this is state-machine laxness rather than an
attacker forge, but it is fixed regardless.

Gate the pre-Connected accept on the live early-data signal
(early_data_remaining.is_some(), set when 0-RTT is accepted and cleared
at EndOfEarlyData) instead of the sticky flag, and tear the connection
down with a fatal unexpected_message alert on violation (matching the
adjacent budget-overflow branch). Genuine 0-RTT early data before EOED is
still accepted and budgeted. Adds a loopback test that forges an
ApplicationData record under the client-handshake key after EOED and
confirms the server rejects it.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The three h=40 XMSS^MT sets (Sha2_40_2/4/8_256) carry index_bytes =
ceil(40/8) = 5, a 40-bit index field. The post-increment exhaustion
sentinel 2^40 needs 41 bits, so storing it via idx_to_bytes truncated to
0: index() then read 0, remaining() reported a full key, and the next
sign() re-signed leaf 0 whose WOTS+ one-time key was already spent —
one-time-key reuse and forgery. The wrapped state also survived
to_bytes()/from_bytes() and passed validate_raw_sk.

Treat the key as exhausted one leaf early whenever index_bytes*8 ==
full_height: the highest signable index becomes 2^h - 2, leaving the
all-ones 2^h - 1 as the representable exhausted sentinel (matching the
xmss reference). Centralized in Params::exhausted_index(), applied in
both XmssMtPrivateKey::sign and validate_raw_sk so a wrapped/edge state
cannot be reloaded as usable. Wire format (ceil(h/8)) is unchanged.

Added xmssmt_h40_index_does_not_wrap_to_zero, which injects a
near-exhaustion raw SK on Sha2_40_8_256 (cheap d=8/tree_height=5 keygen),
signs the last permitted leaf, and asserts the next sign returns
KeyExhausted and never wraps the index back to 0.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
ML-KEM encapsulate filled a stack m: [u8; 32] from the RNG, passed it to
encapsulate_deterministic, but never wiped it. m (with the public H(ek))
fully determines the shared secret. Wipe it with a 0-fill routed through
core::hint::black_box after encaps returns, matching the d/z wipe in
generate.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
LmsPrivateKey::generate filled local i_id/seed arrays, passed them by
reference to from_seed, then dropped them unwiped. HssPrivateKey::generate
accumulated each level's (i_id, seed) in a heap Vec that freed with the
master seeds still resident. Each seed reconstructs that level's signing
capability — sibling modules (slhdsa/xmss/mlkem/mldsa) all wipe the
equivalent material.

Wipe i_id/seed in LmsPrivateKey::generate before returning, and zero each
HSS level tuple's i_id/seed in place after from_levels copies them and
before the Vec drops, using the module's existing wipe() (0-fill +
black_box) idiom.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…/empty tags)

The inherent `Cmac::verify` compared only `expected.len()` bytes of the
recomputed tag. An empty `expected` made `tag[..0].ct_eq(&[])` return true,
turning `cmac.verify(b"")` into an unconditional accept; any short length `n`
dropped forgery resistance to 2^(8n).

Require a full 16-byte tag (`expected.len() != 16` is rejected) before the
constant-time comparison, matching the already-strict `Mac`-trait path
(OUTPUT_LEN = Some(16)). The full-length comparison stays constant time.

No in-tree caller relied on truncation. Updated `verify_roundtrip` to assert
that truncated (`tag[..8]`) and empty (`b""`) tags now return false while the
full 16-byte tag still verifies.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
`From<u8> for Choice` masked `value & 1` in release builds, so a full
0xFF/0xFE mask byte (the natural output of constant-time mask computations)
was silently truncated to its low bit — e.g. 0xFE became Choice(0).

Normalize with `!= 0` semantics so any nonzero byte maps to Choice(1) and
only 0 maps to Choice(0), using the branchless byte-nonzero trick
`((x | -x) >> 7) & 1` to keep it constant time (no branch on the secret byte).
The debug_assert for clean 0/1 inputs is kept.

All current call sites feed clean 0/1 bits, so behavior is unchanged for them
(verified by grep). Added a test exercising the normalization formula across
all 256 byte values.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The scratch `tmp: Vec<u8>` holding decrypted QUIC stream app data was
copied to the caller and dropped without scrubbing, unlike every sibling
path (pc_tls_recv, pc_quic_recv_datagram, pc_aead_decrypt). Scrub it via
wipe_vec after the copy_nonoverlapping and on the read-error early
return.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
pc_tls_cfg_set_alpn, pc_quic_cfg_set_alpn and pc_csr_create_rsa called
Vec::with_capacity(n) on the raw caller-supplied count before validating
anything; a garbage `n` drives handle_alloc_error, which ABORTS the
process — the panic guard only catches unwinds. Cap the count first
(ALPN <= 256, SAN <= 1024) and return a clean code (Unsupported / NULL)
instead of reserving.

Additionally validate ALPN protocol names are 1..=255 bytes at set
time (RFC 7301): empty is illegal and >255 later trips a release
assert in the ClientHello encoder. Both reject cleanly now.

Also fold pc_tls_feed's documented `*consumed` contract onto the panic
path: initialise *consumed = 0 before touching the engine, so an unwind
out of conn.feed (caught by the guard as Internal) can no longer leave
the caller reading an uninitialised size_t.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
pc_lms_sign, pc_hss_sign, pc_xmss_sign and pc_xmssmt_sign only checked
*out_len in the pre-sign capacity gate. A caller passing out == NULL
with a large *out_len passed the gate, sign() irreversibly advanced the
OTS index, and only then did out_write return NullPointer — silently
consuming a one-time leaf with no signature returned. Reject a NULL
output buffer (the expected length is always > 0) BEFORE calling sign(),
so no state is burned on a null-output call.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
kex: parse_sec1_scalar / parse_raw_or_hex_32/56 left the decoded EC
private scalar (and intermediate hex-decoded copies) unwiped. Scrub the
scalar after the key takes its own copy, and the hex-decoded Vec inside
the parse helpers.

s_server / quic_cli: the 32-byte DTLS cookie secret and QUIC retry
secret are passed to the config by value (Copy), leaving the local stack
copy live. Wipe it after last use, matching the FFI fix to pc_quic_new
(commit 316e8a2).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
pkey `-in key.pem` and pkeyutl `verify -inkey` (raw LMS/HSS/XMSS and
SM2/SEC1 private key files) read key material with no permission check,
while every other secret-key reader warns. Route both through
warn_if_world_readable_key before reading.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
hash (ascon-xof128/cxof128 -len) and kdf (hkdf/pbkdf2/scrypt/argon2/kbkdf
-len) did vec![0u8; len] with no bound; a garbage -len OOM-aborts.
Apply a 1 GiB cap mirroring the `rand` subcommand and die cleanly on
oversize. For HKDF additionally bound -len to the RFC 5869 255*HashLen
ceiling so an over-long request is a clean error instead of the raw
hkdf_expand assert panic.

enc: reject an empty -nonce before touching the cipher (AES-GCM would
otherwise accept it; the FFI already rejects it).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
`parse_ip_addresses` lacked the `reader.finish()?` that its dNSName sibling
`parse_dns_names` uses to reject bytes after the SAN GeneralNames SEQUENCE.
Reads were already bounded by the SAN sub-reader and the SAN sits inside the
signed TBSCertificate, so this was a strict-DER consistency gap rather than
an exploit. Add the matching finish() for parity; legitimate single- and
multi-IP SANs still parse (covered by san_ip_parser_accepts_v4_and_v6).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The legacy RSA-only `Certificate::verify_signature::<LIMBS>` entry point
skipped the RFC 5280 §4.1.1.2 inner/outer AlgorithmIdentifier consistency
check that `verify_signature_with` performs. It is not exploitable here —
acceptance is hardcoded to sha256WithRSAEncryption and verification always
uses SHA-256, so a tampered outer algid yields UnsupportedAlgorithm without
changing the verified bytes — but the two verify paths now behave uniformly,
giving external callers of this typed API the same defense in depth.

Public signature unchanged.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…and SingleResponse

ResponseData (`responses()`) and SingleResponse (`read_single_response`)
previously tolerated trailing bytes after their SEQUENCEs. Both are covered
by the BasicOCSPResponse signature so this was not malleable, but the gap
diverged from the strict-DER finish() discipline used elsewhere in the
parser.

Skip the optional responseExtensions [1] / singleExtensions [1] (which we
do not surface) before calling finish(), so responses that legitimately
carry those fields still parse while genuine trailing garbage is rejected.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Formatting drift from the FFI/CLI security-fix worktree (rustfmt version
skew); reformatted with the canonical toolchain so cargo fmt --check passes.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Two CLI guard blocks from the FFI/CLI security-fix worktree tripped
clippy::collapsible_if under the canonical toolchain (1.96.0); rewritten
as let-chains. No behavior change.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@MagicalTux MagicalTux merged commit 061971a into master Jun 11, 2026
8 checks passed
@MagicalTux MagicalTux deleted the security-fixes-2026-06-12 branch June 11, 2026 23:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant