Security audit fixes (2026-06-12): QUIC key-update, DTLS spoof-DoS, XMSS^MT exhaustion, CMAC verify, + hardening by MagicalTux · Pull Request #35 · KarpelesLab/purecrypto

MagicalTux · 2026-06-11T23:28:18Z

Sixth security-audit pass. Eight parallel auditors scoped to disjoint module groups; fixes implemented in isolated worktrees and integrated here. Every commit passed fmt / clippy -D warnings / rustdoc -D warnings / cargo test / cargo-semver-checks individually, and the combined tree passes all of them plus default and no-default-features builds (1725 tests, 0 failures; semver: no update required).

High

QUIC self-initiated key-update desync + replay-window reset (29f9374). one_rtt_phase conflated the TX key-phase bit with the RX update reference, so after a locally-initiated key update a normal in-flight old-phase peer packet was misread as a peer update — breaking the connection, falsely clearing the pending-confirm flag, and resetting the PN replay window while old-phase RX keys stayed live (RFC 9001 §9.5 replay bypass). Fix splits RX phase (CryptoState::rx_phase) from TX phase; a flip commits only when a packet opens under the next-generation RX keys. Regression test fails when reverted.

Medium

DTLS spoofed plaintext handshake records (5289106). The "silently discard spoofable records" doctrine wasn't applied to handshake dispatch on unauthenticated epoch-0 paths, so one off-path datagram (notably a forged HelloVerifyRequest) could abort or permanently wedge an in-flight handshake. Gated dispatch faults on the authenticated flag (state-scoped so a genuine cert-mismatch rejection stays fatal), added an hvr_consumed guard and a reassembler seq-rewind so a well-framed spoof can't pin past the genuine message.
XMSS^MT h=40 leaf-index wrap → OTS reuse (7a74bc9). For the three h=40 sets the 40-bit index field can't hold the 2^40 exhaustion sentinel; it truncated to 0 and silently re-armed leaf 0's already-used WOTS+ key (forgery). Now sacrifices the last leaf to keep the sentinel representable; enforced in both sign and validate_raw_sk.
Cmac::verify empty/truncated-tag auth bypass (a9d3dc8). The inherent verify compared only expected.len() bytes; verify(b"") accepted unconditionally. Now length-strict (full 16-byte tag only).
TLS 1.2 / legacy unbounded handshake reassembly (0ca6e0c). 1.3 caps hs_pending at 128 KiB; the 1.2/legacy engines did not (~16 MiB pinnable pre-auth). Now share the same RecordOverflow cap.
pc_quic_stream_read plaintext not wiped (27e321e).

Low / hardening

QUIC ACK-range cap, retire-prior-to DCID rotation, STOP_SENDING flow-control credit; DTLS overlap-fragment conflict handling; TLS EE O(n²) extension-scan cap and post-EOED app-data rejection; ML-KEM m + LMS seed zeroization; Choice::from branchless normalization; x509 strict-DER finish() parity + typed-verify algid-consistency; FFI alloc-abort caps + ALPN validation + *consumed panic-path + LMS/XMSS NULL-out OTS-burn guard; CLI secret wipes, world-readable-key warnings, and -len output caps.

Deferred / flagged

FFI cipher.rs per-call stack key wipes — ~20 short-lived copies across two large match arms; clean fix needs restructuring with regression risk, left for a focused follow-up.
Clean (no actionable findings): RSA/EC/DH/bignum and x509/DER core verification logic.

Auditor scopes: TLS · QUIC · DTLS · x509+DER · RSA/EC/DH/bignum · PQC(ML-KEM/ML-DSA/SLH-DSA/LMS/XMSS) · symmetric+RNG+KDF+HPKE · FFI+CLI.

🤖 Generated with Claude Code

Track the 1-RTT rx key phase (`CryptoState::rx_phase`) separately from the tx key phase (`one_rtt_phase`). The receive path previously used the tx-derived `one_rtt_phase` as the reference for deciding whether an inbound packet represented a peer key update. After a self-initiated update (`initiate_key_update`/`flip_tx_key_phase`) the tx phase advances while the peer keeps sending at the old phase; a normal in-flight old-phase peer packet — which still decrypts under the unchanged old-phase rx keys — was misread as a peer key update and ran a spurious `commit_rx_key_phase_flip`. That stamped subsequent ciphertext with the wrong phase bit (connection break), wrongly cleared `tx_phase_pending_confirm`, and reset `rx_pn_window` while the same old-phase rx keys stayed live (RFC 9001 §9.5 per-key replay bypass). The receive path now commits a phase flip only when a packet was opened with the NEXT-generation rx keys (pkt_phase != rx_phase && !opened_with_prev), so a packet that decrypts under the keys already installed for the current rx phase never triggers a commit. `flip_tx_key_phase` advances only the tx phase; `commit_rx_key_phase_flip` advances `rx_phase`. The tx-confirm branch now fires only when the peer's packet is at our new tx phase (pkt_phase == one_rtt_phase), so an in-flight old-phase packet can no longer clear the pending-confirm flag. Adds `key_update_initiator_ignores_in_flight_old_phase_packet`, which feeds an old-phase peer packet to the side that INITIATED the update and asserts the tx phase, pending-confirm flag, and replay window are all unaffected (and that the update still confirms normally afterwards). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

`AckRanges::insert` created an unbounded number of disjoint ranges. An on-path attacker who synthesizes AEAD-valid packets (e.g. Initial) with gapped packet numbers could add one range per packet, turning the O(n) scan-per-insert into O(n²) CPU and O(n) memory before the PN space is discarded. Cap the set at `MAX_ACK_RANGES` (32) disjoint ranges. The ranges are stored descending, so when the cap is exceeded the lowest/oldest range (smallest PNs, least useful to keep ACKing) is evicted; a brand-new range below all existing ones at capacity is dropped outright. Dropping an ACK range is always safe — ACKs are informational and the peer retransmits as needed. ACK encoding (`build_ack_ranges_raw`) iterates the now-bounded set, so the encoded frame is bounded too. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

`note_retire_prior_to` removed every entry below the new `retire_prior_to` unconditionally — including the active outbound DCID (typically seq 0) that `endpoint.cids.peer` still writes into every short header — with no rotation, leaving the connection sending a CID the peer has asked us to retire. When `retire_prior_to` covers the active CID, rotate `active_seq` to the lowest surviving entry whose sequence is >= the new value before removing the old ones. If no such replacement exists yet (the peer advanced `retire_prior_to` past every CID it has issued), RETAIN the active entry so a usable outbound DCID survives until a replacement NEW_CONNECTION_ID arrives — degrading gracefully rather than breaking the connection. RFC 9000 §5.1.2-consistent. The NEW_CONNECTION_ID handler now re-syncs `endpoint.cids.peer` to the pool's (possibly rotated) active CID after processing the frame, so the outbound DCID always tracks the active entry. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…ENDING When STOP_SENDING has been sent on a stream, `RecvStream::on_data` silently drops subsequently-received payload (returns Ok(0)). But `Streams::on_stream` had already charged those high-water bytes to `conn_recv_used`; they were never added to `conn_consumed`, so `maybe_replenish_conn` never re-granted them. A peer that keeps sending after STOP_SENDING without ever resetting could thereby exhaust the connection-level flow-control window. Mirror `on_reset`: count the discarded post-STOP_SENDING bytes as consumed so connection-level credit is returned. A new per-stream `conn_fc_credited` counter tracks the bytes already credited (reads + discards) so a later RESET_STREAM credits only the not-yet-credited remainder — no double-count, no leak. The reset path now anchors its discard credit on `conn_fc_credited` instead of `read_off` (equal when nothing was discarded out-of-band). Adds tests for the post-STOP_SENDING flood (credit returned per byte, retransmits not double-counted) and for a following RESET_STREAM with a larger final size. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

MEDIUM — an off-path attacker could abort or derail an in-flight DTLS handshake with a single spoofed epoch-0 plaintext handshake record. The `authenticated` flag only silenced `read_fragment` framing errors; the subsequent `reassembler.feed` / `dispatch_one` / `pop_ready` calls used `?` and propagated fatally regardless of authentication. Fix, mirroring the DTLS 1.3 server pre-cookie wrapper (errors on the unauthenticated path become silent drops, only the local `InappropriateState` misconfig stays fatal): - client12 (`WaitServerHelloOrHvr`/`WaitServerHello`): a spoofed ServerHello that fails handshake-layer validation is now silently dropped. The silent-drop is scoped to the pre-ServerHello states so a genuine post-SH plaintext flight fault (e.g. a certificate that does not match the requested host) stays fatal. - client12 HVR path: a malformed/unexpected HVR is a silent drop, and an HVR that arrives once one has already been consumed is ignored without mutating transcript/state/reassembler (new `hvr_consumed` flag). All HVR-typed fragments are captured here so a spoof never enters the reassembler. - client13 (`WaitServerHello`): spoofed ServerHello/HRR faults become silent drops on the unauthenticated path. - server12 (`WaitClientFlight`): a spoofed plaintext ClientKeyExchange with a bad EC point is silently dropped; the encrypted client Finished (epoch >= 1) keeps every error fatal. `feed`/`pop_ready` advance `expected_msg_seq` before dispatch, so on a silent drop the reassembler is also rewound (new `rewind_expected_msg_seq`): otherwise a spoofed but well-framed ServerHello would pin `expected_msg_seq` past the genuine SH (same message_seq), which would then be rejected as stale — a derail. The authenticated (epoch >= 1) path is unchanged: every error stays fatal. Adds tests forging a plaintext HVR / ServerHello at the DTLS 1.2 and 1.3 clients mid-handshake and asserting the handshake still completes with the genuine peer. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…bytes LOW (hardening) — pre-cookie ClientHello reassembly used "first writer wins" for overlapping fragment bytes, so a spoofed fragment that arrived first could pin corrupt content and cause the genuine overlapping bytes to be silently discarded, wedging the CH decode (DoS-only). `feed` now pre-scans the incoming fragment: if any byte it covers was already received but DISAGREES with the buffered content, the whole fragment is dropped without mutating state. Idempotent duplicate fragments (identical overlapping bytes) still agree and remain accepted, so legitimate retransmission/overlap is unaffected. Adds a bounds pre-check so no bytes are written if any index is out of range. Adds tests: a conflicting spoof over already-held genuine bytes is dropped (genuine content survives and completes the message), and an agreeing idempotent overlap is still accepted. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

The TLS 1.3 core caps its handshake-message reassembly buffer at MAX_HANDSHAKE_REASSEMBLY (128 KiB) via append_handshake_bytes, but the TLS 1.2 and legacy (SSLv3/1.0/1.1) engines reassembled into their own hs_pending with a bare extend_from_slice and no ceiling. pop_handshake only drains once a complete `4 + u24_len` message is present, so a peer could send a handshake header claiming up to ~16 MiB and slow-drip fragments that never complete, pinning ~16 MiB per connection pre-authentication (memory DoS). Gate both hs_pending appends (the decrypted-plain and plaintext-fragment paths) in each engine's next_message behind a shared append_handshake_bytes helper that returns RecordOverflow once the buffer would exceed the existing MAX_HANDSHAKE_REASSEMBLY constant (now reused from common.rs rather than redefined). The legacy engine shares these code paths, so it is covered by the same fix. Adds a test per engine that a giant length-claim plus dribbled fragments is rejected with RecordOverflow. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

on_encrypted_extensions walked the EE extensions by hand with a `seen: Vec<u16>` + `seen.contains(&ty)` duplicate check per extension and no count cap, while the shared codec parse_extensions caps at MAX_EXTENSIONS = 64 precisely to keep that linear scan from going quadratic. The EE body is bounded only by the 128 KiB reassembly cap, so a server could stuff ~32k minimal extensions into it and force ~5e8 comparisons (O(n^2)) on the client. Add the same MAX_EXTENSIONS ceiling to the manual EE loop (reusing the existing constant, now pub(crate) and re-exported from the codec module), rejecting the over-limit case with Decode before the duplicate scan runs. Duplicate detection and per-extension semantics are unchanged. Adds a test that exactly MAX_EXTENSIONS parses while one more is rejected. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

The server's pre-Connected ApplicationData accept gate keyed off the sticky early_data_accepted flag: if self.state != State::Connected && !self.early_data_accepted { ... } After EndOfEarlyData rotates the read key onto the client-handshake key and sets early_data_remaining = None, early_data_accepted stays true, so an ApplicationData record in WaitClientFinished passed the guard and the max_early_data_size budget check was skipped (remaining is None) — plaintext landed in the regular app_in before the client Finished was verified. Reachable only by the legitimate client (it needs the client-handshake key), so this is state-machine laxness rather than an attacker forge, but it is fixed regardless. Gate the pre-Connected accept on the live early-data signal (early_data_remaining.is_some(), set when 0-RTT is accepted and cleared at EndOfEarlyData) instead of the sticky flag, and tear the connection down with a fatal unexpected_message alert on violation (matching the adjacent budget-overflow branch). Genuine 0-RTT early data before EOED is still accepted and budgeted. Adds a loopback test that forges an ApplicationData record under the client-handshake key after EOED and confirms the server rejects it. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

The three h=40 XMSS^MT sets (Sha2_40_2/4/8_256) carry index_bytes = ceil(40/8) = 5, a 40-bit index field. The post-increment exhaustion sentinel 2^40 needs 41 bits, so storing it via idx_to_bytes truncated to 0: index() then read 0, remaining() reported a full key, and the next sign() re-signed leaf 0 whose WOTS+ one-time key was already spent — one-time-key reuse and forgery. The wrapped state also survived to_bytes()/from_bytes() and passed validate_raw_sk. Treat the key as exhausted one leaf early whenever index_bytes*8 == full_height: the highest signable index becomes 2^h - 2, leaving the all-ones 2^h - 1 as the representable exhausted sentinel (matching the xmss reference). Centralized in Params::exhausted_index(), applied in both XmssMtPrivateKey::sign and validate_raw_sk so a wrapped/edge state cannot be reloaded as usable. Wire format (ceil(h/8)) is unchanged. Added xmssmt_h40_index_does_not_wrap_to_zero, which injects a near-exhaustion raw SK on Sha2_40_8_256 (cheap d=8/tree_height=5 keygen), signs the last permitted leaf, and asserts the next sign returns KeyExhausted and never wraps the index back to 0. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

ML-KEM encapsulate filled a stack m: [u8; 32] from the RNG, passed it to encapsulate_deterministic, but never wiped it. m (with the public H(ek)) fully determines the shared secret. Wipe it with a 0-fill routed through core::hint::black_box after encaps returns, matching the d/z wipe in generate. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

LmsPrivateKey::generate filled local i_id/seed arrays, passed them by reference to from_seed, then dropped them unwiped. HssPrivateKey::generate accumulated each level's (i_id, seed) in a heap Vec that freed with the master seeds still resident. Each seed reconstructs that level's signing capability — sibling modules (slhdsa/xmss/mlkem/mldsa) all wipe the equivalent material. Wipe i_id/seed in LmsPrivateKey::generate before returning, and zero each HSS level tuple's i_id/seed in place after from_levels copies them and before the Vec drops, using the module's existing wipe() (0-fill + black_box) idiom. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…/empty tags) The inherent `Cmac::verify` compared only `expected.len()` bytes of the recomputed tag. An empty `expected` made `tag[..0].ct_eq(&[])` return true, turning `cmac.verify(b"")` into an unconditional accept; any short length `n` dropped forgery resistance to 2^(8n). Require a full 16-byte tag (`expected.len() != 16` is rejected) before the constant-time comparison, matching the already-strict `Mac`-trait path (OUTPUT_LEN = Some(16)). The full-length comparison stays constant time. No in-tree caller relied on truncation. Updated `verify_roundtrip` to assert that truncated (`tag[..8]`) and empty (`b""`) tags now return false while the full 16-byte tag still verifies. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

`From<u8> for Choice` masked `value & 1` in release builds, so a full 0xFF/0xFE mask byte (the natural output of constant-time mask computations) was silently truncated to its low bit — e.g. 0xFE became Choice(0). Normalize with `!= 0` semantics so any nonzero byte maps to Choice(1) and only 0 maps to Choice(0), using the branchless byte-nonzero trick `((x | -x) >> 7) & 1` to keep it constant time (no branch on the secret byte). The debug_assert for clean 0/1 inputs is kept. All current call sites feed clean 0/1 bits, so behavior is unchanged for them (verified by grep). Added a test exercising the normalization formula across all 256 byte values. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

The scratch `tmp: Vec<u8>` holding decrypted QUIC stream app data was copied to the caller and dropped without scrubbing, unlike every sibling path (pc_tls_recv, pc_quic_recv_datagram, pc_aead_decrypt). Scrub it via wipe_vec after the copy_nonoverlapping and on the read-error early return. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

pc_tls_cfg_set_alpn, pc_quic_cfg_set_alpn and pc_csr_create_rsa called Vec::with_capacity(n) on the raw caller-supplied count before validating anything; a garbage `n` drives handle_alloc_error, which ABORTS the process — the panic guard only catches unwinds. Cap the count first (ALPN <= 256, SAN <= 1024) and return a clean code (Unsupported / NULL) instead of reserving. Additionally validate ALPN protocol names are 1..=255 bytes at set time (RFC 7301): empty is illegal and >255 later trips a release assert in the ClientHello encoder. Both reject cleanly now. Also fold pc_tls_feed's documented `*consumed` contract onto the panic path: initialise *consumed = 0 before touching the engine, so an unwind out of conn.feed (caught by the guard as Internal) can no longer leave the caller reading an uninitialised size_t. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

pc_lms_sign, pc_hss_sign, pc_xmss_sign and pc_xmssmt_sign only checked *out_len in the pre-sign capacity gate. A caller passing out == NULL with a large *out_len passed the gate, sign() irreversibly advanced the OTS index, and only then did out_write return NullPointer — silently consuming a one-time leaf with no signature returned. Reject a NULL output buffer (the expected length is always > 0) BEFORE calling sign(), so no state is burned on a null-output call. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

kex: parse_sec1_scalar / parse_raw_or_hex_32/56 left the decoded EC private scalar (and intermediate hex-decoded copies) unwiped. Scrub the scalar after the key takes its own copy, and the hex-decoded Vec inside the parse helpers. s_server / quic_cli: the 32-byte DTLS cookie secret and QUIC retry secret are passed to the config by value (Copy), leaving the local stack copy live. Wipe it after last use, matching the FFI fix to pc_quic_new (commit 316e8a2). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

pkey `-in key.pem` and pkeyutl `verify -inkey` (raw LMS/HSS/XMSS and SM2/SEC1 private key files) read key material with no permission check, while every other secret-key reader warns. Route both through warn_if_world_readable_key before reading. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

hash (ascon-xof128/cxof128 -len) and kdf (hkdf/pbkdf2/scrypt/argon2/kbkdf -len) did vec![0u8; len] with no bound; a garbage -len OOM-aborts. Apply a 1 GiB cap mirroring the `rand` subcommand and die cleanly on oversize. For HKDF additionally bound -len to the RFC 5869 255*HashLen ceiling so an over-long request is a clean error instead of the raw hkdf_expand assert panic. enc: reject an empty -nonce before touching the cipher (AES-GCM would otherwise accept it; the FFI already rejects it). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

`parse_ip_addresses` lacked the `reader.finish()?` that its dNSName sibling `parse_dns_names` uses to reject bytes after the SAN GeneralNames SEQUENCE. Reads were already bounded by the SAN sub-reader and the SAN sits inside the signed TBSCertificate, so this was a strict-DER consistency gap rather than an exploit. Add the matching finish() for parity; legitimate single- and multi-IP SANs still parse (covered by san_ip_parser_accepts_v4_and_v6). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

The legacy RSA-only `Certificate::verify_signature::<LIMBS>` entry point skipped the RFC 5280 §4.1.1.2 inner/outer AlgorithmIdentifier consistency check that `verify_signature_with` performs. It is not exploitable here — acceptance is hardcoded to sha256WithRSAEncryption and verification always uses SHA-256, so a tampered outer algid yields UnsupportedAlgorithm without changing the verified bytes — but the two verify paths now behave uniformly, giving external callers of this typed API the same defense in depth. Public signature unchanged. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…and SingleResponse ResponseData (`responses()`) and SingleResponse (`read_single_response`) previously tolerated trailing bytes after their SEQUENCEs. Both are covered by the BasicOCSPResponse signature so this was not malleable, but the gap diverged from the strict-DER finish() discipline used elsewhere in the parser. Skip the optional responseExtensions [1] / singleExtensions [1] (which we do not surface) before calling finish(), so responses that legitimately carry those fields still parse while genuine trailing garbage is rejected. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Formatting drift from the FFI/CLI security-fix worktree (rustfmt version skew); reformatted with the canonical toolchain so cargo fmt --check passes. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Two CLI guard blocks from the FFI/CLI security-fix worktree tripped clippy::collapsible_if under the canonical toolchain (1.96.0); rewritten as let-chains. No behavior change. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

MagicalTux and others added 25 commits June 12, 2026 08:22

style: normalize rustfmt formatting on CLI -len cap paths

98cec2b

Formatting drift from the FFI/CLI security-fix worktree (rustfmt version skew); reformatted with the canonical toolchain so cargo fmt --check passes. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

MagicalTux merged commit 061971a into master Jun 11, 2026
8 checks passed

MagicalTux deleted the security-fixes-2026-06-12 branch June 11, 2026 23:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Security audit fixes (2026-06-12): QUIC key-update, DTLS spoof-DoS, XMSS^MT exhaustion, CMAC verify, + hardening#35

Security audit fixes (2026-06-12): QUIC key-update, DTLS spoof-DoS, XMSS^MT exhaustion, CMAC verify, + hardening#35
MagicalTux merged 25 commits into
masterfrom
security-fixes-2026-06-12

MagicalTux commented Jun 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

MagicalTux commented Jun 11, 2026

High

Medium

Low / hardening

Deferred / flagged

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant