Simplify decompose C reference to a single high multiplication #1177

Draft
mkannwischer wants to merge 1 commit into main from mjk/issue-652
@mkannwischer

Contributor

Replace the two-step Barrett division (compute ceil(a/128), then
Barrett-divide by 2*GAMMA2/128) with a direct high multiplication by
floor(2^N / (2*GAMMA2)), mirroring the AArch64 backend. For ML-DSA-44
(N = 48) this is (a * 1477838209 + 2^47) >> 48; for ML-DSA-65/87
(N = 49) it is (a * 1074791425 + 2^48) >> 49. Both constants, once
scaled by 2^-N, strictly under-approximate 1/(2*GAMMA2), so half-points
round down, matching the original round-half-down semantics, and the
result is exact for all 0 <= a < Q.
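
The exactness claim above can be checked by brute force. The following is a minimal sketch, not the code from this PR: the function names are hypothetical, it assumes Q = 8380417 with 2*GAMMA2 = 190464 for ML-DSA-44 and 523776 for ML-DSA-65/87, and it covers only the rounding division, not the wrap-around of the top bucket or the computation of the low part that the full decompose performs.

```c
#include <assert.h>
#include <stdint.h>

#define MLD_Q 8380417 /* ML-DSA modulus (assumed here, not taken from the diff) */

/* Hypothetical name. ML-DSA-44: 1477838209 = floor(2^48 / 190464).
 * Adding 2^47 before the shift turns the floor into round-to-nearest. */
static int32_t highmul_round_44(int32_t a)
{
  assert(a >= 0 && a < MLD_Q);
  return (int32_t)(((int64_t)a * 1477838209 + ((int64_t)1 << 47)) >> 48);
}

/* Hypothetical name. ML-DSA-65/87: 1074791425 = floor(2^49 / 523776). */
static int32_t highmul_round_65_87(int32_t a)
{
  assert(a >= 0 && a < MLD_Q);
  return (int32_t)(((int64_t)a * 1074791425 + ((int64_t)1 << 48)) >> 49);
}

/* Exact round-half-down of a / two_gamma2 using plain integer division:
 * a half-point a = k*two_gamma2 + two_gamma2/2 maps to k, not k + 1. */
static int32_t exact_round_half_down(int32_t a, int32_t two_gamma2)
{
  return (a + two_gamma2 / 2 - 1) / two_gamma2;
}

/* Compare both high multiplications against the exact division for every
 * 0 <= a < Q and return the number of disagreements. */
static int32_t count_mismatches(void)
{
  int32_t a, bad = 0;
  for (a = 0; a < MLD_Q; a++)
  {
    bad += highmul_round_44(a) != exact_round_half_down(a, 190464);
    bad += highmul_round_65_87(a) != exact_round_half_down(a, 523776);
  }
  return bad;
}
```

Calling `count_mismatches()` from a small test driver should return 0 if the constants behave as described. The half-points a = 95232 (ML-DSA-44) and a = 261888 (ML-DSA-65/87) are the cases where an over-approximating constant would round the wrong way.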

Update the Isabelle attribution in compress/ML-DSA_Compress.thy and
neon_ntt/Barrett_Division_Even.thy.

@oqs-bot

oqs-bot commented Jun 14, 2026

Contributor

CBMC Results (ML-DSA-65, REDUCE-RAM)

Full Results (204 proofs)
Proof Status Current Previous Change
**TOTAL** 1582s 1504s +5.2%
mld_invntt_layer 178s 163s +9%
poly_pointwise_montgomery_c 143s 126s +13%
rej_uniform_native 129s 119s +8%
polyvec_matrix_pointwise_montgomery_yvec 86s 81s +6%
mld_ct_memcmp 71s 62s +15%
mld_ntt_layer 46s 42s +10%
fqmul 45s 37s +22%
polyveck_chknorm 40s 39s +3%
keccakf1600x4_permute_native 24s 22s +9%
mld_attempt_signature_generation 23s 24s -4%
mld_ntt_butterfly_block 23s 21s +10%
poly_chknorm_c 19s 21s -10%
polyt0_unpack 19s 17s +12%
sign_verify_internal 19s 13s +46%
polyveck_decompose 17s 20s -15%
rej_uniform_c 16s 17s -6%
mld_check_pct 15s 13s +15%
polyvecl_chknorm 15s 13s +15%
poly_uniform_eta_4x 12s 11s +9%
keccak_absorb_once_x4 11s 10s +10%
poly_add 11s 12s -8%
rej_uniform 10s 8s +25%
compute_pack_t0_t1 9s 8s +12%
poly_invntt_tomont_c 9s 6s +50%
polyvec_matrix_pointwise_montgomery_row 9s 7s +29%
pointwise_acc_native_x86_64 8s 6s +33%
sign 8s 8s +0%
mld_compute_pack_z 7s 6s +17%
poly_ntt_c 7s 4s +75%
polyeta_unpack 7s 4s +75%
polyveck_caddq 7s 9s -22%
polyveck_reduce 7s 4s +75%
polyvecl_pointwise_acc_montgomery_native 7s 3s +133%
keccak_absorb 6s 7s -14%
mld_keccakf1600_permute_c 6s 7s -14%
nttunpack_native_x86_64 6s 2s +200%
pointwise_acc_native_aarch64 6s 6s +0%
polyt0_pack 6s 4s +50%
polyvecl_ntt 6s 7s -14%
sign_pk_from_sk 6s 6s +0%
pack_sk_rho_key_tr_s2 5s 4s +25%
poly_caddq_c 5s 4s +25%
poly_challenge 5s 3s +67%
poly_shiftl 5s 6s -17%
poly_uniform 5s 5s +0%
polyveck_invntt_tomont 5s 6s -17%
polyvecl_pack_eta 5s 4s +25%
polyz_unpack_c 5s 6s -17%
sign_keypair 5s 4s +25%
sign_keypair_internal 5s 3s +67%
sign_signature_extmu 5s 3s +67%
sign_signature_pre_hash_internal 5s 4s +25%
intt_native_x86_64 4s 4s +0%
keccak_f1600_x4_native_aarch64_v84a 4s 3s +33%
keccak_squeezeblocks_x4 4s 5s -20%
keccakf1600x4_extract_bytes 4s 5s -20%
mld_ct_cmask_nonzero_u32 4s 3s +33%
mld_prepare_domain_separation_prefix 4s 3s +33%
mld_sample_s1_s2 4s 4s +0%
mld_value_barrier_u32 4s 2s +100%
ntt_native_aarch64 4s 3s +33%
pack_sig_h 4s 5s -20%
pack_sk_s1 4s 4s +0%
poly_caddq 4s 2s +100%
poly_chknorm_native_x86_64 4s 2s +100%
poly_decompose_c 4s 2s +100%
poly_decompose_native 4s 4s +0%
poly_ntt_native 4s 3s +33%
poly_power2round 4s 7s -43%
poly_reduce 4s 5s -20%
poly_sub 4s 3s +33%
poly_uniform_gamma1 4s 2s +100%
poly_uniform_gamma1_4x 4s 2s +100%
polyeta_pack 4s 6s -33%
polyveck_pack_w1 4s 6s -33%
polyveck_unpack_eta 4s 4s +0%
polyvecl_uniform_gamma1 4s 2s +100%
polyvecl_uniform_gamma1_serial 4s 3s +33%
polyw1_pack_32 4s 4s +0%
polyw1_pack_88 4s 4s +0%
polyz_pack 4s 3s +33%
polyz_unpack_native 4s 2s +100%
rej_eta_native 4s 2s +100%
rej_uniform_eta_native_aarch64 4s 4s +0%
sign_open 4s 2s +100%
sign_signature_internal 4s 3s +33%
sign_signature_pre_hash_shake256 4s 6s -33%
sign_verify_extmu 4s 3s +33%
sk_s2hat_get_poly 4s 3s +33%
unpack_sk_s2hat 4s 3s +33%
yvec_init 4s 3s +33%
decompose 3s 2s +50%
intt_native_aarch64 3s 2s +50%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 3s 3s +0%
keccak_f1600_x4_native_avx2 3s 2s +50%
keccak_init 3s 3s +0%
keccakf1600_extract_bytes (big endian) 3s 3s +0%
keccakf1600_xor_bytes (big endian) 3s 2s +50%
keccakf1600x4_permute 3s 2s +50%
keccakf1600x4_xor_bytes 3s 3s +0%
mld_ct_cmask_nonzero_u8 3s 5s -40%
mld_h 3s 4s -25%
mld_keccakf1600x4_extract_bytes_c 3s 2s +50%
mld_keccakf1600x4_xor_bytes_c 3s 3s +0%
mld_polymat_expand_entry 3s 3s +0%
mld_sample_s1_s2_serial 3s 2s +50%
mld_value_barrier_i64 3s 4s -25%
mld_value_barrier_u8 3s 2s +50%
montgomery_reduce 3s 3s +0%
ntt_native_x86_64 3s 3s +0%
pack_sig_c 3s 5s -40%
pointwise_native_aarch64 3s 4s -25%
pointwise_native_x86_64 3s 3s +0%
poly_caddq_native 3s 3s +0%
poly_caddq_native_aarch64 3s 2s +50%
poly_caddq_native_x86_64 3s 2s +50%
poly_chknorm_native 3s 3s +0%
poly_decompose_32_native_aarch64 3s 3s +0%
poly_decompose_88_native_aarch64 3s 1s +200%
poly_invntt_tomont_native 3s 4s -25%
poly_ntt 3s 2s +50%
poly_pointwise_montgomery 3s 3s +0%
poly_uniform_eta 3s 3s +0%
poly_use_hint_native_aarch64 3s 2s +50%
polyt1_pack 3s 2s +50%
polyt1_unpack 3s 4s -25%
polyvecl_unpack_eta 3s 2s +50%
polyvecl_unpack_z 3s 2s +50%
polyw1_pack 3s 7s -57%
polyz_unpack_17_native_aarch64 3s 1s +200%
polyz_unpack_19_native_aarch64 3s 2s +50%
rej_eta 3s 3s +0%
rej_uniform_native_aarch64 3s 2s +50%
shake128_absorb 3s 4s -25%
shake128_finalize 3s 2s +50%
shake128_squeeze 3s 4s -25%
shake256_init 3s 2s +50%
shake256_squeeze 3s 3s +0%
shake256x4_squeezeblocks 3s 3s +0%
sig_unpack_hints 3s 3s +0%
sign_verify 3s 3s +0%
sign_verify_pre_hash_internal 3s 5s -40%
sign_verify_pre_hash_shake256 3s 5s -40%
sk_s1hat_get_poly 3s 1s +200%
sys_check_capability 3s 4s -25%
unpack_sk_s1hat 3s 3s +0%
yvec_get_poly 3s 1s +200%
caddq 2s 4s -50%
fqscale 2s 2s +0%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 2s 2s +0%
keccak_finalize 2s 5s -60%
keccak_squeeze 2s 2s +0%
keccakf1600_permute 2s 2s +0%
keccakf1600_xor_bytes 2s 1s +100%
keccakf1600x4_xor_bytes_native 2s 2s +0%
make_hint 2s 2s +0%
mld_ct_abs_i32 2s 3s -33%
mld_ct_get_optblocker_i64 2s 2s +0%
mld_ct_get_optblocker_u32 2s 3s -33%
mld_ct_get_optblocker_u8 2s 3s -33%
mld_ct_sel_int32 2s 1s +100%
mld_keccakf1600_extract_bytes 2s 5s -60%
pack_sig_z 2s 5s -60%
poly_chknorm 2s 1s +100%
poly_chknorm_native_aarch64 2s 1s +100%
poly_decompose 2s 3s -33%
poly_permute_bitrev_to_custom_optional 2s 2s +0%
poly_permute_bitrev_to_custom_optional_native 2s 2s +0%
poly_pointwise_montgomery_native 2s 4s -50%
poly_use_hint 2s 2s +0%
poly_use_hint_c 2s 5s -60%
poly_use_hint_native 2s 2s +0%
polyvec_matrix_expand 2s 4s -50%
polyvec_matrix_expand_serial 2s 2s +0%
polyveck_ntt 2s 2s +0%
polyveck_pack_eta 2s 3s -33%
polyvecl_pointwise_acc_montgomery 2s 2s +0%
polyvecl_pointwise_acc_montgomery_c 2s 4s -50%
polyz_unpack 2s 3s -33%
power2round 2s 3s -33%
rej_eta_c 2s 2s +0%
shake128x4_absorb_once 2s 6s -67%
shake128x4_squeezeblocks 2s 3s -33%
shake256_release 2s 3s -33%
shake256x4_absorb_once 2s 4s -50%
sign_signature 2s 4s -50%
sk_t0hat_get_poly 2s 2s +0%
unpack_pk_t1 2s 4s -50%
unpack_sk 2s 2s +0%
keccak_f1600_x1_native_aarch64 1s 3s -67%
keccak_f1600_x1_native_aarch64_v84a 1s 3s -67%
keccakf1600_permute_native 1s 1s +0%
keccakf1600x4_extract_bytes_native 1s 3s -67%
mld_ct_cmask_neg_i32 1s 2s -50%
poly_invntt_tomont 1s 2s -50%
poly_uniform_4x 1s 2s -50%
reduce32 1s 4s -75%
shake128_init 1s 4s -75%
shake128_release 1s 1s +0%
shake256 1s 3s -67%
shake256_absorb 1s 5s -80%
shake256_finalize 1s 2s -50%
unpack_sk_t0hat 1s 3s -67%
use_hint 1s 3s -67%

@oqs-bot

oqs-bot commented Jun 14, 2026

Contributor

CBMC Results (ML-DSA-44, REDUCE-RAM)

Full Results (204 proofs)
Proof Status Current Previous Change
**TOTAL** 1560s 1453s +7.4%
mld_invntt_layer 169s 147s +15%
poly_pointwise_montgomery_c 134s 125s +7%
rej_uniform_native 125s 119s +5%
polyvec_matrix_pointwise_montgomery_yvec 113s 112s +1%
mld_ct_memcmp 70s 65s +8%
mld_ntt_layer 44s 42s +5%
fqmul 40s 40s +0%
mld_attempt_signature_generation 28s 25s +12%
sign_verify_internal 25s 20s +25%
keccakf1600x4_permute_native 22s 23s -4%
mld_ntt_butterfly_block 22s 21s +5%
poly_chknorm_c 18s 18s +0%
polyt0_unpack 17s 16s +6%
mld_check_pct 16s 12s +33%
poly_uniform_eta_4x 16s 14s +14%
rej_uniform_c 15s 15s +0%
polyeta_unpack 14s 16s -12%
polyz_unpack_c 14s 11s +27%
poly_add 12s 9s +33%
compute_pack_t0_t1 9s 9s +0%
keccak_absorb_once_x4 8s 10s -20%
poly_invntt_tomont_c 8s 8s +0%
polyveck_chknorm 8s 6s +33%
rej_uniform 8s 10s -20%
keccak_absorb 7s 6s +17%
mld_h 7s 3s +133%
mld_keccakf1600_permute_c 7s 8s -12%
poly_decompose_c 7s 7s +0%
poly_reduce 7s 4s +75%
polyvec_matrix_pointwise_montgomery_row 7s 8s -12%
sign_verify_pre_hash_internal 7s 3s +133%
poly_ntt 6s 3s +100%
poly_ntt_native 6s 2s +200%
polyveck_decompose 6s 6s +0%
polyvecl_chknorm 6s 9s -33%
sign 6s 7s -14%
sign_pk_from_sk 6s 7s -14%
sign_signature 6s 3s +100%
keccak_squeezeblocks_x4 5s 4s +25%
mld_compute_pack_z 5s 5s +0%
mld_ct_sel_int32 5s 3s +67%
mld_keccakf1600x4_extract_bytes_c 5s 1s +400%
montgomery_reduce 5s 2s +150%
pointwise_acc_native_aarch64 5s 4s +25%
pointwise_native_x86_64 5s 3s +67%
poly_caddq_c 5s 4s +25%
poly_challenge 5s 5s +0%
poly_pointwise_montgomery 5s 3s +67%
poly_power2round 5s 6s -17%
poly_uniform_gamma1 5s 3s +67%
polyt0_pack 5s 4s +25%
polyveck_reduce 5s 6s -17%
polyvecl_unpack_eta 5s 2s +150%
sign_open 5s 5s +0%
sign_signature_internal 5s 4s +25%
keccak_f1600_x1_native_aarch64 4s 3s +33%
make_hint 4s 4s +0%
mld_ct_cmask_nonzero_u8 4s 1s +300%
mld_polymat_expand_entry 4s 2s +100%
mld_sample_s1_s2 4s 8s -50%
mld_value_barrier_i64 4s 2s +100%
nttunpack_native_x86_64 4s 2s +100%
pack_sig_c 4s 3s +33%
pack_sig_h 4s 5s -20%
pack_sig_z 4s 3s +33%
pointwise_acc_native_x86_64 4s 5s -20%
poly_caddq 4s 3s +33%
poly_chknorm_native 4s 4s +0%
poly_decompose_32_native_aarch64 4s 1s +300%
poly_invntt_tomont_native 4s 2s +100%
poly_ntt_c 4s 3s +33%
poly_shiftl 4s 4s +0%
poly_use_hint_native_aarch64 4s 4s +0%
polyt1_unpack 4s 2s +100%
polyveck_caddq 4s 6s -33%
polyveck_ntt 4s 4s +0%
polyveck_pack_eta 4s 3s +33%
polyveck_pack_w1 4s 2s +100%
polyvecl_ntt 4s 3s +33%
polyvecl_pack_eta 4s 2s +100%
polyvecl_pointwise_acc_montgomery_native 4s 1s +300%
polyvecl_uniform_gamma1 4s 2s +100%
polyvecl_uniform_gamma1_serial 4s 4s +0%
polyz_unpack 4s 3s +33%
rej_eta 4s 4s +0%
rej_eta_native 4s 3s +33%
shake256_absorb 4s 3s +33%
shake256_release 4s 1s +300%
shake256x4_absorb_once 4s 2s +100%
sign_keypair 4s 4s +0%
sign_signature_extmu 4s 4s +0%
sign_signature_pre_hash_internal 4s 4s +0%
sk_t0hat_get_poly 4s 2s +100%
sys_check_capability 4s 3s +33%
unpack_pk_t1 4s 2s +100%
caddq 3s 1s +200%
decompose 3s 1s +200%
intt_native_aarch64 3s 2s +50%
intt_native_x86_64 3s 2s +50%
keccak_f1600_x1_native_aarch64_v84a 3s 3s +0%
keccak_f1600_x4_native_aarch64_v84a 3s 2s +50%
keccak_init 3s 4s -25%
keccak_squeeze 3s 1s +200%
keccakf1600_xor_bytes (big endian) 3s 2s +50%
keccakf1600x4_permute 3s 2s +50%
keccakf1600x4_xor_bytes_native 3s 3s +0%
mld_ct_abs_i32 3s 2s +50%
mld_ct_cmask_neg_i32 3s 1s +200%
mld_ct_cmask_nonzero_u32 3s 1s +200%
mld_keccakf1600_extract_bytes 3s 2s +50%
mld_prepare_domain_separation_prefix 3s 4s -25%
mld_sample_s1_s2_serial 3s 3s +0%
ntt_native_aarch64 3s 3s +0%
ntt_native_x86_64 3s 1s +200%
pack_sk_rho_key_tr_s2 3s 4s -25%
pack_sk_s1 3s 3s +0%
pointwise_native_aarch64 3s 3s +0%
poly_caddq_native 3s 2s +50%
poly_caddq_native_aarch64 3s 1s +200%
poly_chknorm_native_aarch64 3s 5s -40%
poly_chknorm_native_x86_64 3s 4s -25%
poly_decompose 3s 3s +0%
poly_decompose_native 3s 5s -40%
poly_permute_bitrev_to_custom_optional 3s 2s +50%
poly_permute_bitrev_to_custom_optional_native 3s 5s -40%
poly_sub 3s 4s -25%
poly_uniform 3s 5s -40%
poly_uniform_4x 3s 3s +0%
poly_uniform_eta 3s 5s -40%
poly_uniform_gamma1_4x 3s 3s +0%
polyeta_pack 3s 2s +50%
polyveck_invntt_tomont 3s 3s +0%
polyvecl_pointwise_acc_montgomery_c 3s 3s +0%
polyvecl_unpack_z 3s 2s +50%
polyw1_pack_32 3s 3s +0%
polyw1_pack_88 3s 3s +0%
polyz_pack 3s 2s +50%
polyz_unpack_17_native_aarch64 3s 2s +50%
power2round 3s 3s +0%
rej_eta_c 3s 3s +0%
rej_uniform_eta_native_aarch64 3s 2s +50%
shake128_init 3s 6s -50%
shake128_squeeze 3s 2s +50%
shake128x4_squeezeblocks 3s 3s +0%
shake256 3s 2s +50%
shake256_squeeze 3s 4s -25%
shake256x4_squeezeblocks 3s 1s +200%
sig_unpack_hints 3s 2s +50%
sign_keypair_internal 3s 3s +0%
sign_signature_pre_hash_shake256 3s 3s +0%
sign_verify 3s 3s +0%
sign_verify_extmu 3s 2s +50%
sign_verify_pre_hash_shake256 3s 4s -25%
sk_s2hat_get_poly 3s 4s -25%
unpack_sk_t0hat 3s 1s +200%
yvec_init 3s 4s -25%
fqscale 2s 2s +0%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 3s -33%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 2s 1s +100%
keccak_f1600_x4_native_avx2 2s 4s -50%
keccakf1600_permute 2s 4s -50%
keccakf1600_permute_native 2s 1s +100%
keccakf1600x4_extract_bytes 2s 4s -50%
keccakf1600x4_extract_bytes_native 2s 1s +100%
keccakf1600x4_xor_bytes 2s 2s +0%
mld_ct_get_optblocker_i64 2s 2s +0%
mld_ct_get_optblocker_u32 2s 2s +0%
mld_keccakf1600x4_xor_bytes_c 2s 1s +100%
mld_value_barrier_u32 2s 1s +100%
mld_value_barrier_u8 2s 2s +0%
poly_chknorm 2s 2s +0%
poly_decompose_88_native_aarch64 2s 4s -50%
poly_invntt_tomont 2s 3s -33%
poly_pointwise_montgomery_native 2s 2s +0%
poly_use_hint 2s 4s -50%
poly_use_hint_c 2s 3s -33%
poly_use_hint_native 2s 4s -50%
polyt1_pack 2s 5s -60%
polyvec_matrix_expand 2s 4s -50%
polyvec_matrix_expand_serial 2s 5s -60%
polyveck_unpack_eta 2s 2s +0%
polyvecl_pointwise_acc_montgomery 2s 3s -33%
polyw1_pack 2s 1s +100%
polyz_unpack_19_native_aarch64 2s 2s +0%
polyz_unpack_native 2s 3s -33%
reduce32 2s 3s -33%
rej_uniform_native_aarch64 2s 5s -60%
shake128_finalize 2s 2s +0%
shake128_release 2s 2s +0%
shake128x4_absorb_once 2s 3s -33%
shake256_init 2s 5s -60%
unpack_sk_s1hat 2s 2s +0%
unpack_sk_s2hat 2s 3s -33%
yvec_get_poly 2s 1s +100%
keccak_finalize 1s 3s -67%
keccakf1600_extract_bytes (big endian) 1s 4s -75%
keccakf1600_xor_bytes 1s 3s -67%
mld_ct_get_optblocker_u8 1s 3s -67%
poly_caddq_native_x86_64 1s 1s +0%
shake128_absorb 1s 3s -67%
shake256_finalize 1s 2s -50%
sk_s1hat_get_poly 1s 3s -67%
unpack_sk 1s 2s -50%
use_hint 1s 1s +0%

@oqs-bot

oqs-bot commented Jun 14, 2026

Contributor

CBMC Results (ML-DSA-87, REDUCE-RAM)

Full Results (204 proofs)
Proof Status Current Previous Change
**TOTAL** 1599s 1624s -1.5%
mld_invntt_layer 171s 172s -1%
polyvec_matrix_pointwise_montgomery_yvec 160s 155s +3%
poly_pointwise_montgomery_c 128s 136s -6%
rej_uniform_native 125s 125s +0%
mld_ct_memcmp 67s 65s +3%
mld_ntt_layer 43s 43s +0%
fqmul 40s 43s -7%
mld_attempt_signature_generation 33s 37s -11%
keccakf1600x4_permute_native 22s 26s -15%
mld_ntt_butterfly_block 22s 23s -4%
sign_verify_internal 21s 22s -5%
poly_chknorm_c 18s 18s +0%
rej_uniform_c 18s 16s +12%
polyveck_decompose 17s 18s -6%
mld_check_pct 16s 13s +23%
polyeta_unpack 16s 14s +14%
polyt0_unpack 16s 17s -6%
poly_uniform_eta_4x 14s 13s +8%
compute_pack_t0_t1 12s 11s +9%
poly_add 12s 11s +9%
poly_invntt_tomont_c 11s 9s +22%
polyvecl_chknorm 10s 10s +0%
keccak_absorb_once_x4 9s 10s -10%
polyvec_matrix_pointwise_montgomery_row 8s 8s +0%
polyvecl_ntt 8s 9s -11%
polyz_unpack_c 8s 6s +33%
rej_uniform 8s 8s +0%
mld_sample_s1_s2 7s 4s +75%
pointwise_acc_native_aarch64 7s 8s -12%
sign 7s 8s -12%
sign_signature_pre_hash_shake256 7s 5s +40%
sign_verify_pre_hash_internal 7s 4s +75%
keccak_absorb 6s 7s -14%
keccakf1600_permute_native 6s 3s +100%
mld_compute_pack_z 6s 3s +100%
mld_keccakf1600_permute_c 6s 6s +0%
ntt_native_aarch64 6s 8s -25%
pointwise_acc_native_x86_64 6s 6s +0%
poly_shiftl 6s 3s +100%
sign_signature_internal 6s 5s +20%
intt_native_aarch64 5s 3s +67%
keccak_squeezeblocks_x4 5s 4s +25%
keccakf1600x4_extract_bytes_native 5s 2s +150%
nttunpack_native_x86_64 5s 4s +25%
pack_sk_rho_key_tr_s2 5s 3s +67%
poly_challenge 5s 6s -17%
poly_chknorm_native_x86_64 5s 4s +25%
poly_invntt_tomont 5s 6s -17%
poly_power2round 5s 5s +0%
polyveck_caddq 5s 5s +0%
polyveck_invntt_tomont 5s 7s -29%
polyveck_ntt 5s 2s +150%
polyveck_reduce 5s 6s -17%
polyvecl_uniform_gamma1 5s 3s +67%
polyz_unpack_native 5s 3s +67%
rej_eta_c 5s 4s +25%
sk_t0hat_get_poly 5s 2s +150%
unpack_sk 5s 4s +25%
fqscale 4s 3s +33%
keccak_init 4s 3s +33%
mld_ct_cmask_nonzero_u32 4s 5s -20%
mld_sample_s1_s2_serial 4s 5s -20%
montgomery_reduce 4s 2s +100%
pack_sig_c 4s 3s +33%
poly_decompose 4s 2s +100%
poly_decompose_c 4s 5s -20%
poly_ntt_c 4s 3s +33%
poly_permute_bitrev_to_custom_optional_native 4s 4s +0%
poly_pointwise_montgomery 4s 3s +33%
poly_uniform 4s 4s +0%
poly_uniform_eta 4s 4s +0%
poly_use_hint_native_aarch64 4s 2s +100%
polyvecl_unpack_eta 4s 2s +100%
polyw1_pack 4s 2s +100%
polyz_pack 4s 1s +300%
polyz_unpack_17_native_aarch64 4s 2s +100%
polyz_unpack_19_native_aarch64 4s 3s +33%
rej_eta_native 4s 3s +33%
rej_uniform_eta_native_aarch64 4s 5s -20%
shake128_absorb 4s 2s +100%
shake128x4_absorb_once 4s 2s +100%
shake256_init 4s 4s +0%
sig_unpack_hints 4s 3s +33%
sign_pk_from_sk 4s 6s -33%
sign_verify 4s 6s -33%
sign_verify_extmu 4s 3s +33%
sk_s2hat_get_poly 4s 3s +33%
unpack_pk_t1 4s 3s +33%
unpack_sk_s2hat 4s 3s +33%
keccak_f1600_x4_native_aarch64_v84a 3s 2s +50%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 3s 3s +0%
keccak_squeeze 3s 3s +0%
keccakf1600_extract_bytes (big endian) 3s 4s -25%
keccakf1600_xor_bytes 3s 3s +0%
keccakf1600x4_extract_bytes 3s 2s +50%
keccakf1600x4_xor_bytes_native 3s 1s +200%
mld_ct_cmask_neg_i32 3s 2s +50%
mld_h 3s 4s -25%
mld_keccakf1600_extract_bytes 3s 3s +0%
mld_prepare_domain_separation_prefix 3s 3s +0%
mld_value_barrier_i64 3s 3s +0%
ntt_native_x86_64 3s 6s -50%
pointwise_native_aarch64 3s 3s +0%
poly_caddq 3s 4s -25%
poly_caddq_c 3s 3s +0%
poly_chknorm_native 3s 4s -25%
poly_chknorm_native_aarch64 3s 4s -25%
poly_decompose_native 3s 2s +50%
poly_ntt 3s 4s -25%
poly_permute_bitrev_to_custom_optional 3s 2s +50%
poly_reduce 3s 4s -25%
poly_uniform_gamma1 3s 3s +0%
poly_uniform_gamma1_4x 3s 5s -40%
poly_use_hint_c 3s 2s +50%
poly_use_hint_native 3s 3s +0%
polyeta_pack 3s 4s -25%
polyt0_pack 3s 4s -25%
polyt1_pack 3s 4s -25%
polyt1_unpack 3s 3s +0%
polyvec_matrix_expand 3s 2s +50%
polyveck_chknorm 3s 3s +0%
polyveck_pack_eta 3s 2s +50%
polyvecl_pointwise_acc_montgomery 3s 4s -25%
polyvecl_pointwise_acc_montgomery_c 3s 3s +0%
power2round 3s 3s +0%
rej_uniform_native_aarch64 3s 4s -25%
shake128_finalize 3s 2s +50%
shake128_squeeze 3s 1s +200%
shake256 3s 4s -25%
shake256_absorb 3s 2s +50%
shake256_release 3s 1s +200%
shake256_squeeze 3s 3s +0%
sign_keypair 3s 3s +0%
sign_keypair_internal 3s 7s -57%
sign_signature 3s 3s +0%
sign_signature_extmu 3s 4s -25%
sign_signature_pre_hash_internal 3s 3s +0%
sign_verify_pre_hash_shake256 3s 3s +0%
sk_s1hat_get_poly 3s 2s +50%
sys_check_capability 3s 4s -25%
yvec_get_poly 3s 2s +50%
yvec_init 3s 2s +50%
caddq 2s 2s +0%
decompose 2s 3s -33%
intt_native_x86_64 2s 4s -50%
keccak_f1600_x1_native_aarch64 2s 2s +0%
keccak_f1600_x1_native_aarch64_v84a 2s 3s -33%
keccak_finalize 2s 2s +0%
keccakf1600_xor_bytes (big endian) 2s 2s +0%
keccakf1600x4_permute 2s 2s +0%
keccakf1600x4_xor_bytes 2s 2s +0%
make_hint 2s 2s +0%
mld_ct_abs_i32 2s 3s -33%
mld_ct_get_optblocker_u32 2s 2s +0%
mld_ct_get_optblocker_u8 2s 2s +0%
mld_ct_sel_int32 2s 2s +0%
mld_polymat_expand_entry 2s 5s -60%
mld_value_barrier_u32 2s 2s +0%
mld_value_barrier_u8 2s 2s +0%
pack_sig_h 2s 3s -33%
pack_sig_z 2s 3s -33%
pointwise_native_x86_64 2s 4s -50%
poly_caddq_native 2s 4s -50%
poly_caddq_native_aarch64 2s 3s -33%
poly_caddq_native_x86_64 2s 3s -33%
poly_chknorm 2s 2s +0%
poly_decompose_32_native_aarch64 2s 2s +0%
poly_decompose_88_native_aarch64 2s 4s -50%
poly_invntt_tomont_native 2s 2s +0%
poly_ntt_native 2s 4s -50%
poly_sub 2s 1s +100%
poly_uniform_4x 2s 3s -33%
poly_use_hint 2s 3s -33%
polyvec_matrix_expand_serial 2s 4s -50%
polyveck_pack_w1 2s 3s -33%
polyveck_unpack_eta 2s 4s -50%
polyvecl_unpack_z 2s 3s -33%
polyw1_pack_88 2s 2s +0%
polyz_unpack 2s 3s -33%
reduce32 2s 5s -60%
rej_eta 2s 3s -33%
shake128_init 2s 2s +0%
shake128_release 2s 2s +0%
shake128x4_squeezeblocks 2s 2s +0%
shake256_finalize 2s 1s +100%
shake256x4_absorb_once 2s 2s +0%
shake256x4_squeezeblocks 2s 3s -33%
sign_open 2s 5s -60%
unpack_sk_s1hat 2s 3s -33%
unpack_sk_t0hat 2s 7s -71%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 1s 2s -50%
keccak_f1600_x4_native_avx2 1s 4s -75%
keccakf1600_permute 1s 2s -50%
mld_ct_cmask_nonzero_u8 1s 3s -67%
mld_ct_get_optblocker_i64 1s 1s +0%
mld_keccakf1600x4_extract_bytes_c 1s 2s -50%
mld_keccakf1600x4_xor_bytes_c 1s 3s -67%
pack_sk_s1 1s 2s -50%
poly_pointwise_montgomery_native 1s 5s -80%
polyvecl_pack_eta 1s 2s -50%
polyvecl_pointwise_acc_montgomery_native 1s 3s -67%
polyvecl_uniform_gamma1_serial 1s 5s -80%
polyw1_pack_32 1s 3s -67%
use_hint 1s 4s -75%

@oqs-bot

oqs-bot commented Jun 14, 2026

Contributor

CBMC Results (ML-DSA-44)

Full Results (204 proofs)
Proof Status Current Previous Change
**TOTAL** 1727s 1752s -1.4%
mld_invntt_layer 284s 284s +0%
rej_uniform_native 144s 143s +1%
polyvecl_pointwise_acc_montgomery_c 116s 123s -6%
poly_pointwise_montgomery_c 91s 88s +3%
mld_ct_memcmp 67s 67s +0%
mld_attempt_signature_generation 58s 62s -6%
mld_ntt_layer 42s 47s -11%
fqmul 41s 43s -5%
sign_verify_internal 28s 27s +4%
polyvec_matrix_expand 27s 26s +4%
keccakf1600x4_permute_native 22s 24s -8%
mld_ntt_butterfly_block 21s 22s -5%
rej_uniform 21s 23s -9%
sign_signature_internal 18s 18s +0%
poly_chknorm_c 17s 19s -11%
polyt0_unpack 17s 15s +13%
mld_check_pct 15s 14s +7%
polyeta_unpack 15s 15s +0%
compute_pack_t0_t1 13s 15s -13%
poly_uniform_eta_4x 13s 12s +8%
rej_uniform_c 13s 12s +8%
polyz_unpack_c 12s 10s +20%
poly_invntt_tomont_c 10s 7s +43%
poly_uniform_4x 10s 10s +0%
polyveck_chknorm 10s 11s -9%
sign 10s 7s +43%
keccak_absorb_once_x4 9s 14s -36%
mld_compute_pack_z 9s 11s -18%
poly_add 9s 10s -10%
polyvec_matrix_pointwise_montgomery_yvec 9s 10s -10%
polyvec_matrix_expand_serial 8s 6s +33%
pointwise_acc_native_aarch64 7s 8s -12%
poly_use_hint_c 7s 6s +17%
mld_h 6s 6s +0%
mld_keccakf1600_permute_c 6s 8s -25%
pointwise_acc_native_x86_64 6s 4s +50%
poly_chknorm_native 6s 2s +200%
poly_uniform_gamma1_4x 6s 4s +50%
polyt0_pack 6s 6s +0%
polyveck_decompose 6s 10s -40%
polyveck_invntt_tomont 6s 6s +0%
sign_keypair_internal 6s 4s +50%
fqscale 5s 3s +67%
keccak_absorb 5s 6s -17%
keccakf1600x4_extract_bytes_native 5s 3s +67%
mld_prepare_domain_separation_prefix 5s 4s +25%
poly_caddq_c 5s 4s +25%
poly_ntt 5s 4s +25%
poly_uniform 5s 7s -29%
polyt1_pack 5s 4s +25%
polyvecl_uniform_gamma1_serial 5s 3s +67%
polyw1_pack 5s 4s +25%
sign_keypair 5s 4s +25%
sign_verify_pre_hash_internal 5s 6s -17%
sign_verify_pre_hash_shake256 5s 4s +25%
unpack_sk_s1hat 5s 3s +67%
use_hint 5s 2s +150%
intt_native_x86_64 4s 4s +0%
keccak_f1600_x1_native_aarch64_v84a 4s 2s +100%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 4s 2s +100%
keccakf1600_xor_bytes (big endian) 4s 3s +33%
keccakf1600x4_permute 4s 3s +33%
make_hint 4s 4s +0%
mld_ct_get_optblocker_i64 4s 2s +100%
mld_value_barrier_i64 4s 1s +300%
nttunpack_native_x86_64 4s 5s -20%
pack_sig_c 4s 2s +100%
pointwise_native_x86_64 4s 2s +100%
poly_caddq_native 4s 4s +0%
poly_challenge 4s 4s +0%
poly_decompose_88_native_aarch64 4s 2s +100%
poly_decompose_c 4s 4s +0%
poly_uniform_gamma1 4s 4s +0%
poly_use_hint_native_aarch64 4s 4s +0%
polyeta_pack 4s 4s +0%
polyveck_ntt 4s 3s +33%
polyvecl_chknorm 4s 5s -20%
polyvecl_pointwise_acc_montgomery_native 4s 2s +100%
rej_eta_native 4s 4s +0%
shake256 4s 1s +300%
sign_pk_from_sk 4s 6s -33%
sign_signature 4s 4s +0%
sign_signature_extmu 4s 3s +33%
sign_signature_pre_hash_internal 4s 4s +0%
sign_signature_pre_hash_shake256 4s 5s -20%
sign_verify_extmu 4s 4s +0%
sk_t0hat_get_poly 4s 4s +0%
keccak_f1600_x1_native_aarch64 3s 3s +0%
keccak_f1600_x4_native_aarch64_v84a 3s 3s +0%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 3s 2s +50%
keccak_finalize 3s 5s -40%
keccakf1600_extract_bytes (big endian) 3s 2s +50%
keccakf1600_permute_native 3s 6s -50%
keccakf1600_xor_bytes 3s 1s +200%
keccakf1600x4_extract_bytes 3s 4s -25%
keccakf1600x4_xor_bytes 3s 2s +50%
keccakf1600x4_xor_bytes_native 3s 2s +50%
mld_ct_abs_i32 3s 2s +50%
mld_keccakf1600x4_xor_bytes_c 3s 4s -25%
mld_polymat_expand_entry 3s 2s +50%
mld_sample_s1_s2 3s 3s +0%
mld_sample_s1_s2_serial 3s 3s +0%
ntt_native_aarch64 3s 4s -25%
ntt_native_x86_64 3s 3s +0%
pack_sig_z 3s 3s +0%
pack_sk_rho_key_tr_s2 3s 2s +50%
pack_sk_s1 3s 2s +50%
poly_caddq 3s 3s +0%
poly_chknorm 3s 4s -25%
poly_chknorm_native_aarch64 3s 4s -25%
poly_chknorm_native_x86_64 3s 3s +0%
poly_invntt_tomont 3s 3s +0%
poly_invntt_tomont_native 3s 5s -40%
poly_pointwise_montgomery 3s 2s +50%
poly_power2round 3s 5s -40%
poly_reduce 3s 4s -25%
poly_shiftl 3s 2s +50%
poly_sub 3s 2s +50%
poly_uniform_eta 3s 3s +0%
poly_use_hint_native 3s 2s +50%
polyvec_matrix_pointwise_montgomery_row 3s 3s +0%
polyveck_caddq 3s 2s +50%
polyveck_reduce 3s 4s -25%
polyveck_unpack_eta 3s 2s +50%
polyvecl_ntt 3s 5s -40%
polyvecl_pack_eta 3s 3s +0%
polyvecl_pointwise_acc_montgomery 3s 3s +0%
polyvecl_uniform_gamma1 3s 3s +0%
polyvecl_unpack_z 3s 3s +0%
polyw1_pack_88 3s 4s -25%
polyz_unpack_native 3s 4s -25%
reduce32 3s 1s +200%
rej_uniform_eta_native_aarch64 3s 2s +50%
rej_uniform_native_aarch64 3s 2s +50%
shake128_absorb 3s 2s +50%
shake128_finalize 3s 2s +50%
shake128_init 3s 3s +0%
shake128_release 3s 3s +0%
shake128x4_squeezeblocks 3s 3s +0%
shake256_release 3s 5s -40%
shake256_squeeze 3s 1s +200%
shake256x4_squeezeblocks 3s 1s +200%
sig_unpack_hints 3s 3s +0%
sign_verify 3s 6s -50%
unpack_pk_t1 3s 2s +50%
unpack_sk 3s 3s +0%
unpack_sk_s2hat 3s 1s +200%
unpack_sk_t0hat 3s 4s -25%
yvec_init 3s 3s +0%
caddq 2s 2s +0%
decompose 2s 3s -33%
intt_native_aarch64 2s 3s -33%
keccak_f1600_x4_native_avx2 2s 2s +0%
keccak_init 2s 1s +100%
keccak_squeeze 2s 2s +0%
keccak_squeezeblocks_x4 2s 5s -60%
mld_ct_cmask_neg_i32 2s 4s -50%
mld_ct_cmask_nonzero_u8 2s 2s +0%
mld_ct_get_optblocker_u32 2s 2s +0%
mld_ct_get_optblocker_u8 2s 2s +0%
mld_ct_sel_int32 2s 2s +0%
mld_keccakf1600_extract_bytes 2s 2s +0%
mld_keccakf1600x4_extract_bytes_c 2s 2s +0%
mld_value_barrier_u32 2s 3s -33%
mld_value_barrier_u8 2s 4s -50%
montgomery_reduce 2s 2s +0%
pointwise_native_aarch64 2s 3s -33%
poly_caddq_native_aarch64 2s 2s +0%
poly_caddq_native_x86_64 2s 4s -50%
poly_decompose 2s 4s -50%
poly_decompose_32_native_aarch64 2s 1s +100%
poly_decompose_native 2s 2s +0%
poly_permute_bitrev_to_custom_optional_native 2s 6s -67%
poly_pointwise_montgomery_native 2s 1s +100%
poly_use_hint 2s 3s -33%
polyt1_unpack 2s 2s +0%
polyveck_pack_eta 2s 2s +0%
polyveck_pack_w1 2s 4s -50%
polyvecl_unpack_eta 2s 4s -50%
polyw1_pack_32 2s 4s -50%
polyz_pack 2s 4s -50%
polyz_unpack 2s 2s +0%
polyz_unpack_17_native_aarch64 2s 2s +0%
polyz_unpack_19_native_aarch64 2s 4s -50%
power2round 2s 2s +0%
rej_eta 2s 1s +100%
rej_eta_c 2s 5s -60%
shake256_absorb 2s 2s +0%
shake256_init 2s 1s +100%
sign_open 2s 6s -67%
sk_s1hat_get_poly 2s 1s +100%
sk_s2hat_get_poly 2s 3s -33%
yvec_get_poly 2s 2s +0%
keccakf1600_permute 1s 1s +0%
mld_ct_cmask_nonzero_u32 1s 2s -50%
pack_sig_h 1s 1s +0%
poly_ntt_c 1s 3s -67%
poly_ntt_native 1s 1s +0%
poly_permute_bitrev_to_custom_optional 1s 6s -83%
shake128_squeeze 1s 2s -50%
shake128x4_absorb_once 1s 3s -67%
shake256_finalize 1s 3s -67%
shake256x4_absorb_once 1s 4s -75%
sys_check_capability 1s 2s -50%

@oqs-bot

oqs-bot commented Jun 14, 2026

Contributor

CBMC Results (ML-DSA-65)

Full Results (204 proofs)
Proof Status Current Previous Change
**TOTAL** 2190s 2040s +7.4%
mld_invntt_layer 311s 286s +9%
polyvecl_pointwise_acc_montgomery_c 232s 193s +20%
rej_uniform_native 154s 149s +3%
polyvec_matrix_expand 137s 130s +5%
poly_pointwise_montgomery_c 109s 99s +10%
mld_ct_memcmp 76s 67s +13%
mld_attempt_signature_generation 67s 66s +2%
sign_verify_internal 64s 64s +0%
sign_signature_internal 48s 48s +0%
mld_ntt_layer 47s 43s +9%
fqmul 43s 43s +0%
polyvec_matrix_expand_serial 28s 25s +12%
mld_ntt_butterfly_block 24s 22s +9%
keccakf1600x4_permute_native 22s 23s -4%
poly_chknorm_c 22s 19s +16%
rej_uniform 22s 22s +0%
mld_check_pct 19s 13s +46%
compute_pack_t0_t1 18s 16s +12%
polyvecl_chknorm 18s 16s +12%
polyt0_unpack 16s 18s -11%
polyveck_decompose 15s 15s +0%
rej_uniform_c 14s 12s +17%
poly_add 13s 10s +30%
polyveck_chknorm 13s 11s +18%
keccak_absorb_once_x4 12s 12s +0%
poly_uniform_4x 12s 11s +9%
poly_uniform_eta_4x 12s 12s +0%
sign 10s 7s +43%
mld_compute_pack_z 9s 10s -10%
polyveck_caddq 9s 7s +29%
polyveck_invntt_tomont 9s 8s +12%
poly_invntt_tomont_c 8s 8s +0%
polyvec_matrix_pointwise_montgomery_yvec 8s 10s -20%
polyveck_ntt 8s 9s -11%
sign_pk_from_sk 8s 7s +14%
unpack_sk_t0hat 8s 6s +33%
mld_keccakf1600_permute_c 7s 7s +0%
mld_sample_s1_s2_serial 7s 9s -22%
polyvecl_pointwise_acc_montgomery_native 7s 3s +133%
keccak_absorb 6s 5s +20%
mld_prepare_domain_separation_prefix 6s 3s +100%
pointwise_acc_native_aarch64 6s 6s +0%
pointwise_acc_native_x86_64 6s 6s +0%
poly_caddq_c 6s 7s -14%
poly_challenge 6s 6s +0%
poly_uniform 6s 3s +100%
polyvecl_ntt 6s 6s +0%
sign_keypair_internal 6s 2s +200%
sign_open 6s 3s +100%
sign_signature_pre_hash_internal 6s 7s -14%
sign_verify_pre_hash_internal 6s 5s +20%
unpack_sk 6s 1s +500%
keccakf1600_extract_bytes (big endian) 5s 2s +150%
keccakf1600x4_extract_bytes_native 5s 7s -29%
keccakf1600x4_xor_bytes_native 5s 2s +150%
poly_sub 5s 3s +67%
poly_uniform_gamma1_4x 5s 3s +67%
poly_use_hint_native_aarch64 5s 2s +150%
polyt0_pack 5s 6s -17%
polyz_unpack_c 5s 7s -29%
sig_unpack_hints 5s 3s +67%
sign_keypair 5s 3s +67%
sk_s1hat_get_poly 5s 2s +150%
intt_native_aarch64 4s 2s +100%
keccak_f1600_x4_native_aarch64_v84a 4s 2s +100%
keccak_squeezeblocks_x4 4s 5s -20%
keccakf1600x4_xor_bytes 4s 3s +33%
mld_ct_get_optblocker_u32 4s 4s +0%
mld_h 4s 3s +33%
montgomery_reduce 4s 2s +100%
ntt_native_aarch64 4s 5s -20%
ntt_native_x86_64 4s 3s +33%
pack_sig_c 4s 4s +0%
pack_sig_h 4s 4s +0%
poly_caddq_native_aarch64 4s 2s +100%
poly_chknorm_native 4s 4s +0%
poly_chknorm_native_aarch64 4s 4s +0%
poly_decompose_88_native_aarch64 4s 2s +100%
poly_ntt_native 4s 2s +100%
poly_permute_bitrev_to_custom_optional 4s 3s +33%
poly_permute_bitrev_to_custom_optional_native 4s 3s +33%
poly_uniform_gamma1 4s 2s +100%
polyveck_pack_eta 4s 1s +300%
polyveck_pack_w1 4s 3s +33%
polyveck_reduce 4s 3s +33%
polyveck_unpack_eta 4s 4s +0%
polyvecl_pack_eta 4s 5s -20%
polyvecl_uniform_gamma1_serial 4s 3s +33%
polyw1_pack_88 4s 3s +33%
polyz_unpack_17_native_aarch64 4s 3s +33%
rej_eta_c 4s 2s +100%
rej_eta_native 4s 4s +0%
rej_uniform_eta_native_aarch64 4s 4s +0%
rej_uniform_native_aarch64 4s 4s +0%
shake256_release 4s 2s +100%
sign_signature_extmu 4s 2s +100%
sign_signature_pre_hash_shake256 4s 4s +0%
sign_verify_pre_hash_shake256 4s 4s +0%
yvec_get_poly 4s 4s +0%
caddq 3s 6s -50%
decompose 3s 4s -25%
fqscale 3s 5s -40%
intt_native_x86_64 3s 3s +0%
keccak_f1600_x4_native_avx2 3s 2s +50%
keccak_squeeze 3s 1s +200%
keccakf1600_xor_bytes 3s 2s +50%
make_hint 3s 3s +0%
mld_ct_cmask_nonzero_u32 3s 4s -25%
mld_ct_cmask_nonzero_u8 3s 2s +50%
mld_keccakf1600x4_xor_bytes_c 3s 3s +0%
mld_polymat_expand_entry 3s 2s +50%
nttunpack_native_x86_64 3s 6s -50%
pack_sk_rho_key_tr_s2 3s 5s -40%
pointwise_native_aarch64 3s 3s +0%
pointwise_native_x86_64 3s 3s +0%
poly_chknorm 3s 2s +50%
poly_decompose_32_native_aarch64 3s 3s +0%
poly_decompose_c 3s 5s -40%
poly_decompose_native 3s 2s +50%
poly_invntt_tomont 3s 4s -25%
poly_invntt_tomont_native 3s 2s +50%
poly_ntt 3s 3s +0%
poly_ntt_c 3s 2s +50%
poly_pointwise_montgomery_native 3s 3s +0%
poly_power2round 3s 4s -25%
poly_reduce 3s 2s +50%
poly_shiftl 3s 7s -57%
poly_uniform_eta 3s 3s +0%
poly_use_hint 3s 3s +0%
poly_use_hint_c 3s 3s +0%
poly_use_hint_native 3s 3s +0%
polyeta_pack 3s 3s +0%
polyt1_pack 3s 2s +50%
polyt1_unpack 3s 4s -25%
polyvecl_pointwise_acc_montgomery 3s 2s +50%
polyvecl_uniform_gamma1 3s 4s -25%
polyvecl_unpack_eta 3s 3s +0%
polyw1_pack_32 3s 3s +0%
polyz_pack 3s 4s -25%
polyz_unpack_19_native_aarch64 3s 3s +0%
reduce32 3s 3s +0%
shake128_init 3s 5s -40%
shake128_release 3s 2s +50%
shake256 3s 3s +0%
shake256_finalize 3s 3s +0%
sign_signature 3s 3s +0%
sign_verify_extmu 3s 4s -25%
unpack_pk_t1 3s 3s +0%
unpack_sk_s1hat 3s 3s +0%
keccak_f1600_x1_native_aarch64 2s 1s +100%
keccak_f1600_x1_native_aarch64_v84a 2s 2s +0%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 1s +100%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 2s 2s +0%
keccak_finalize 2s 2s +0%
keccak_init 2s 3s -33%
keccakf1600_permute 2s 3s -33%
keccakf1600_permute_native 2s 3s -33%
keccakf1600x4_extract_bytes 2s 2s +0%
keccakf1600x4_permute 2s 2s +0%
mld_ct_cmask_neg_i32 2s 3s -33%
mld_ct_get_optblocker_u8 2s 1s +100%
mld_ct_sel_int32 2s 1s +100%
mld_keccakf1600x4_extract_bytes_c 2s 2s +0%
mld_sample_s1_s2 2s 4s -50%
mld_value_barrier_u8 2s 2s +0%
pack_sig_z 2s 2s +0%
pack_sk_s1 2s 3s -33%
poly_caddq 2s 2s +0%
poly_caddq_native 2s 3s -33%
poly_caddq_native_x86_64 2s 2s +0%
poly_chknorm_native_x86_64 2s 6s -67%
poly_decompose 2s 2s +0%
poly_pointwise_montgomery 2s 2s +0%
polyvec_matrix_pointwise_montgomery_row 2s 2s +0%
polyvecl_unpack_z 2s 3s -33%
polyw1_pack 2s 3s -33%
polyz_unpack 2s 2s +0%
polyz_unpack_native 2s 2s +0%
rej_eta 2s 3s -33%
shake128_finalize 2s 3s -33%
shake128_squeeze 2s 1s +100%
shake128x4_absorb_once 2s 2s +0%
shake128x4_squeezeblocks 2s 2s +0%
shake256_absorb 2s 2s +0%
shake256_init 2s 1s +100%
shake256_squeeze 2s 3s -33%
shake256x4_absorb_once 2s 2s +0%
shake256x4_squeezeblocks 2s 3s -33%
sign_verify 2s 5s -60%
unpack_sk_s2hat 2s 4s -50%
yvec_init 2s 1s +100%
keccakf1600_xor_bytes (big endian) 1s 3s -67%
mld_ct_abs_i32 1s 2s -50%
mld_ct_get_optblocker_i64 1s 1s +0%
mld_keccakf1600_extract_bytes 1s 2s -50%
mld_value_barrier_i64 1s 3s -67%
mld_value_barrier_u32 1s 2s -50%
polyeta_unpack 1s 1s +0%
power2round 1s 2s -50%
shake128_absorb 1s 3s -67%
sk_s2hat_get_poly 1s 2s -50%
sk_t0hat_get_poly 1s 3s -67%
sys_check_capability 1s 2s -50%
use_hint 1s 2s -50%

@oqs-bot

oqs-bot commented Jun 14, 2026

Contributor

CBMC Results (ML-DSA-87)

Full Results (204 proofs)
Proof Status Current Previous Change
**TOTAL** 2379s 2406s -1.1%
polyvecl_pointwise_acc_montgomery_c 307s 319s -4%
mld_invntt_layer 285s 291s -2%
polyvec_matrix_expand 219s 223s -2%
rej_uniform_native 149s 153s -3%
mld_attempt_signature_generation 100s 105s -5%
poly_pointwise_montgomery_c 99s 94s +5%
mld_ct_memcmp 67s 64s +5%
sign_signature_internal 64s 68s -6%
sign_verify_internal 60s 61s -2%
polyvec_matrix_expand_serial 49s 48s +2%
mld_ntt_layer 45s 44s +2%
fqmul 40s 40s +0%
compute_pack_t0_t1 32s 30s +7%
polyvec_matrix_pointwise_montgomery_yvec 28s 30s -7%
mld_ntt_butterfly_block 24s 22s +9%
keccakf1600x4_permute_native 22s 25s -12%
rej_uniform 21s 23s -9%
poly_chknorm_c 20s 17s +18%
mld_check_pct 16s 15s +7%
polyeta_unpack 16s 14s +14%
polyt0_unpack 16s 19s -16%
poly_uniform_eta_4x 13s 12s +8%
rej_uniform_c 11s 13s -15%
poly_add 10s 11s -9%
keccak_absorb_once_x4 9s 10s -10%
poly_invntt_tomont_c 9s 8s +12%
poly_uniform_4x 9s 12s -25%
polyveck_decompose 9s 8s +12%
polyveck_invntt_tomont 9s 9s +0%
polyvecl_ntt 9s 8s +12%
mld_keccakf1600_permute_c 8s 7s +14%
polyveck_caddq 8s 8s +0%
keccak_absorb 7s 6s +17%
mld_sample_s1_s2 7s 8s -12%
pointwise_acc_native_aarch64 7s 7s +0%
pointwise_acc_native_x86_64 7s 8s -12%
poly_challenge 7s 5s +40%
polyveck_ntt 7s 7s +0%
sign_pk_from_sk 7s 5s +40%
sign_signature_extmu 7s 4s +75%
sign_verify_pre_hash_shake256 7s 4s +75%
mld_compute_pack_z 6s 7s -14%
ntt_native_x86_64 6s 3s +100%
poly_caddq_c 6s 7s -14%
poly_decompose_88_native_aarch64 6s 4s +50%
poly_decompose_c 6s 8s -25%
poly_use_hint_native 6s 2s +200%
polyt0_pack 6s 3s +100%
polyveck_chknorm 6s 5s +20%
sign 6s 11s -45%
sign_verify_pre_hash_internal 6s 5s +20%
use_hint 6s 2s +200%
intt_native_aarch64 5s 3s +67%
keccak_finalize 5s 1s +400%
pointwise_native_x86_64 5s 3s +67%
poly_chknorm 5s 3s +67%
poly_decompose_32_native_aarch64 5s 6s -17%
polyveck_pack_w1 5s 3s +67%
polyz_unpack_c 5s 4s +25%
reduce32 5s 3s +67%
rej_eta_c 5s 2s +150%
sign_keypair_internal 5s 5s +0%
sign_signature_pre_hash_internal 5s 5s +0%
sign_signature_pre_hash_shake256 5s 7s -29%
sign_verify_extmu 5s 5s +0%
unpack_sk_t0hat 5s 5s +0%
intt_native_x86_64 4s 4s +0%
keccak_init 4s 2s +100%
keccakf1600_permute 4s 2s +100%
keccakf1600_xor_bytes 4s 3s +33%
keccakf1600_xor_bytes (big endian) 4s 3s +33%
keccakf1600x4_extract_bytes 4s 2s +100%
keccakf1600x4_xor_bytes 4s 2s +100%
make_hint 4s 4s +0%
mld_h 4s 2s +100%
mld_polymat_expand_entry 4s 2s +100%
mld_sample_s1_s2_serial 4s 5s -20%
mld_value_barrier_u32 4s 3s +33%
poly_caddq 4s 2s +100%
poly_caddq_native 4s 5s -20%
poly_caddq_native_aarch64 4s 4s +0%
poly_decompose_native 4s 5s -20%
poly_invntt_tomont_native 4s 4s +0%
poly_permute_bitrev_to_custom_optional_native 4s 4s +0%
poly_power2round 4s 4s +0%
poly_shiftl 4s 3s +33%
polyvec_matrix_pointwise_montgomery_row 4s 7s -43%
polyveck_pack_eta 4s 2s +100%
polyvecl_uniform_gamma1 4s 3s +33%
polyvecl_uniform_gamma1_serial 4s 2s +100%
polyw1_pack_88 4s 3s +33%
polyz_unpack_17_native_aarch64 4s 1s +300%
polyz_unpack_19_native_aarch64 4s 3s +33%
rej_eta 4s 3s +33%
rej_eta_native 4s 5s -20%
shake128_absorb 4s 3s +33%
shake128_release 4s 3s +33%
shake256 4s 2s +100%
sign_keypair 4s 5s -20%
sign_signature 4s 4s +0%
sign_verify 4s 3s +33%
sk_s2hat_get_poly 4s 4s +0%
unpack_pk_t1 4s 3s +33%
yvec_get_poly 4s 4s +0%
caddq 3s 3s +0%
decompose 3s 2s +50%
fqscale 3s 3s +0%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 3s 1s +200%
keccak_f1600_x4_native_avx2 3s 3s +0%
keccak_squeezeblocks_x4 3s 5s -40%
keccakf1600_extract_bytes (big endian) 3s 2s +50%
keccakf1600_permute_native 3s 5s -40%
keccakf1600x4_extract_bytes_native 3s 2s +50%
mld_ct_abs_i32 3s 3s +0%
mld_ct_cmask_nonzero_u32 3s 4s -25%
mld_ct_cmask_nonzero_u8 3s 2s +50%
mld_keccakf1600_extract_bytes 3s 5s -40%
mld_prepare_domain_separation_prefix 3s 4s -25%
montgomery_reduce 3s 2s +50%
ntt_native_aarch64 3s 3s +0%
nttunpack_native_x86_64 3s 5s -40%
pack_sig_c 3s 5s -40%
pack_sig_h 3s 3s +0%
pack_sig_z 3s 2s +50%
pack_sk_s1 3s 3s +0%
pointwise_native_aarch64 3s 4s -25%
poly_caddq_native_x86_64 3s 2s +50%
poly_chknorm_native 3s 4s -25%
poly_chknorm_native_aarch64 3s 4s -25%
poly_decompose 3s 3s +0%
poly_pointwise_montgomery 3s 2s +50%
poly_pointwise_montgomery_native 3s 3s +0%
poly_reduce 3s 3s +0%
poly_uniform 3s 4s -25%
poly_uniform_eta 3s 4s -25%
poly_uniform_gamma1 3s 4s -25%
poly_uniform_gamma1_4x 3s 4s -25%
poly_use_hint_c 3s 3s +0%
polyveck_reduce 3s 4s -25%
polyvecl_chknorm 3s 4s -25%
polyvecl_pack_eta 3s 3s +0%
polyvecl_pointwise_acc_montgomery 3s 3s +0%
polyvecl_pointwise_acc_montgomery_native 3s 4s -25%
polyvecl_unpack_z 3s 4s -25%
polyz_pack 3s 3s +0%
power2round 3s 2s +50%
rej_uniform_eta_native_aarch64 3s 2s +50%
rej_uniform_native_aarch64 3s 3s +0%
shake128_squeeze 3s 3s +0%
shake256_absorb 3s 3s +0%
shake256_squeeze 3s 4s -25%
shake256x4_absorb_once 3s 2s +50%
shake256x4_squeezeblocks 3s 3s +0%
sign_open 3s 3s +0%
sk_t0hat_get_poly 3s 4s -25%
sys_check_capability 3s 3s +0%
yvec_init 3s 3s +0%
keccak_f1600_x1_native_aarch64 2s 3s -33%
keccak_f1600_x1_native_aarch64_v84a 2s 3s -33%
keccak_f1600_x4_native_aarch64_v84a 2s 3s -33%
keccak_squeeze 2s 5s -60%
keccakf1600x4_xor_bytes_native 2s 3s -33%
mld_ct_cmask_neg_i32 2s 3s -33%
mld_ct_get_optblocker_u32 2s 1s +100%
mld_ct_get_optblocker_u8 2s 3s -33%
mld_keccakf1600x4_extract_bytes_c 2s 2s +0%
mld_value_barrier_i64 2s 4s -50%
pack_sk_rho_key_tr_s2 2s 1s +100%
poly_invntt_tomont 2s 3s -33%
poly_ntt 2s 5s -60%
poly_ntt_c 2s 2s +0%
poly_ntt_native 2s 3s -33%
poly_permute_bitrev_to_custom_optional 2s 2s +0%
poly_sub 2s 4s -50%
poly_use_hint 2s 2s +0%
poly_use_hint_native_aarch64 2s 2s +0%
polyeta_pack 2s 3s -33%
polyt1_pack 2s 3s -33%
polyt1_unpack 2s 2s +0%
polyveck_unpack_eta 2s 5s -60%
polyvecl_unpack_eta 2s 3s -33%
polyw1_pack 2s 1s +100%
polyw1_pack_32 2s 3s -33%
polyz_unpack 2s 3s -33%
polyz_unpack_native 2s 2s +0%
shake128_finalize 2s 4s -50%
shake128_init 2s 3s -33%
shake128x4_absorb_once 2s 1s +100%
shake128x4_squeezeblocks 2s 3s -33%
shake256_init 2s 3s -33%
shake256_release 2s 2s +0%
sig_unpack_hints 2s 3s -33%
sk_s1hat_get_poly 2s 3s -33%
unpack_sk 2s 4s -50%
unpack_sk_s1hat 2s 3s -33%
unpack_sk_s2hat 2s 3s -33%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 1s 1s +0%
keccakf1600x4_permute 1s 3s -67%
mld_ct_get_optblocker_i64 1s 3s -67%
mld_ct_sel_int32 1s 5s -80%
mld_keccakf1600x4_xor_bytes_c 1s 3s -67%
mld_value_barrier_u8 1s 2s -50%
poly_chknorm_native_x86_64 1s 3s -67%
shake256_finalize 1s 2s -50%

@github-actions github-actions Bot left a comment

Mac Mini (M1, 2020) benchmarks (opt)

Details
Benchmark suite Current: 9bb8f8d Previous: ec15c59 Ratio
ML-DSA-44 keypair 46485 cycles 46485 cycles 1
ML-DSA-44 sign 131073 cycles 131058 cycles 1.00
ML-DSA-44 verify 47306 cycles 47305 cycles 1.00
ML-DSA-65 keypair 81684 cycles 81689 cycles 1.00
ML-DSA-65 sign 215307 cycles 215323 cycles 1.00
ML-DSA-65 verify 79296 cycles 79302 cycles 1.00
ML-DSA-87 keypair 132405 cycles 132467 cycles 1.00
ML-DSA-87 sign 277644 cycles 277549 cycles 1.00
ML-DSA-87 verify 134050 cycles 134119 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

Mac Mini (M1, 2020) benchmarks (no-opt)

Details
Benchmark suite Current: 9bb8f8d Previous: ec15c59 Ratio
ML-DSA-44 keypair 113139 cycles 112736 cycles 1.00
ML-DSA-44 sign 399645 cycles 400878 cycles 1.00
ML-DSA-44 verify 119028 cycles 119432 cycles 1.00
ML-DSA-65 keypair 193658 cycles 192975 cycles 1.00
ML-DSA-65 sign 644542 cycles 649981 cycles 0.99
ML-DSA-65 verify 191939 cycles 192870 cycles 1.00
ML-DSA-87 keypair 318942 cycles 318797 cycles 1.00
ML-DSA-87 sign 822796 cycles 828723 cycles 0.99
ML-DSA-87 verify 325445 cycles 326796 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Contributor

Intel Xeon 4th gen (c7i)

Details
Benchmark suite Current: 9bb8f8d Previous: ec15c59 Ratio
ML-DSA-44 keypair 43417 cycles 43553 cycles 1.00
ML-DSA-44 sign 130691 cycles 131114 cycles 1.00
ML-DSA-44 verify 45128 cycles 45587 cycles 0.99
ML-DSA-65 keypair 75814 cycles 75908 cycles 1.00
ML-DSA-65 sign 215254 cycles 215310 cycles 1.00
ML-DSA-65 verify 74377 cycles 74636 cycles 1.00
ML-DSA-87 keypair 123621 cycles 123588 cycles 1.00
ML-DSA-87 sign 271463 cycles 272387 cycles 1.00
ML-DSA-87 verify 120973 cycles 120775 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Contributor

Intel Xeon 4th gen (c7i) (no-opt)

Details
Benchmark suite Current: 9bb8f8d Previous: ec15c59 Ratio
ML-DSA-44 keypair 91331 cycles 91530 cycles 1.00
ML-DSA-44 sign 351183 cycles 352430 cycles 1.00
ML-DSA-44 verify 99090 cycles 99915 cycles 0.99
ML-DSA-65 keypair 154160 cycles 153947 cycles 1.00
ML-DSA-65 sign 570281 cycles 572172 cycles 1.00
ML-DSA-65 verify 159498 cycles 159760 cycles 1.00
ML-DSA-87 keypair 255196 cycles 254922 cycles 1.00
ML-DSA-87 sign 722193 cycles 726518 cycles 0.99
ML-DSA-87 verify 263128 cycles 263741 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Contributor

AMD EPYC 3rd gen (c6a)

Details
Benchmark suite Current: 9bb8f8d Previous: ec15c59 Ratio
ML-DSA-44 keypair 55431 cycles 55164 cycles 1.00
ML-DSA-44 sign 159071 cycles 159123 cycles 1.00
ML-DSA-44 verify 57563 cycles 57907 cycles 0.99
ML-DSA-65 keypair 96226 cycles 95536 cycles 1.01
ML-DSA-65 sign 263817 cycles 263540 cycles 1.00
ML-DSA-65 verify 96247 cycles 96187 cycles 1.00
ML-DSA-87 keypair 154986 cycles 154719 cycles 1.00
ML-DSA-87 sign 322492 cycles 322467 cycles 1.00
ML-DSA-87 verify 151333 cycles 150859 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Contributor

AMD EPYC 3rd gen (c6a) (no-opt)

Details
Benchmark suite Current: 9bb8f8d Previous: ec15c59 Ratio
ML-DSA-44 keypair 133161 cycles 133031 cycles 1.00
ML-DSA-44 sign 516727 cycles 518961 cycles 1.00
ML-DSA-44 verify 146251 cycles 146356 cycles 1.00
ML-DSA-65 keypair 224163 cycles 223780 cycles 1.00
ML-DSA-65 sign 844780 cycles 843408 cycles 1.00
ML-DSA-65 verify 233848 cycles 234043 cycles 1.00
ML-DSA-87 keypair 370291 cycles 367594 cycles 1.01
ML-DSA-87 sign 1065740 cycles 1061107 cycles 1.00
ML-DSA-87 verify 382861 cycles 380814 cycles 1.01

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Contributor

Graviton2

Details
Benchmark suite Current: 9bb8f8d Previous: ec15c59 Ratio
ML-DSA-44 keypair 112523 cycles 112521 cycles 1.00
ML-DSA-44 sign 354964 cycles 354071 cycles 1.00
ML-DSA-44 verify 117553 cycles 117392 cycles 1.00
ML-DSA-65 keypair 194393 cycles 194710 cycles 1.00
ML-DSA-65 sign 584208 cycles 584562 cycles 1.00
ML-DSA-65 verify 193325 cycles 193311 cycles 1.00
ML-DSA-87 keypair 321448 cycles 321242 cycles 1.00
ML-DSA-87 sign 749057 cycles 747207 cycles 1.00
ML-DSA-87 verify 318617 cycles 318958 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Contributor

AMD EPYC 4th gen (c7a)

Details
Benchmark suite Current: 9bb8f8d Previous: ec15c59 Ratio
ML-DSA-44 keypair 47023 cycles 46737 cycles 1.01
ML-DSA-44 sign 140001 cycles 139162 cycles 1.01
ML-DSA-44 verify 49133 cycles 49579 cycles 0.99
ML-DSA-65 keypair 82397 cycles 82474 cycles 1.00
ML-DSA-65 sign 228366 cycles 228196 cycles 1.00
ML-DSA-65 verify 82016 cycles 81813 cycles 1.00
ML-DSA-87 keypair 130805 cycles 129675 cycles 1.01
ML-DSA-87 sign 281569 cycles 279317 cycles 1.01
ML-DSA-87 verify 130616 cycles 128340 cycles 1.02

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Contributor

Graviton4

Details
Benchmark suite Current: 9bb8f8d Previous: ec15c59 Ratio
ML-DSA-44 keypair 67406 cycles 67286 cycles 1.00
ML-DSA-44 sign 198364 cycles 198339 cycles 1.00
ML-DSA-44 verify 70204 cycles 70253 cycles 1.00
ML-DSA-65 keypair 119285 cycles 119300 cycles 1.00
ML-DSA-65 sign 326202 cycles 325686 cycles 1.00
ML-DSA-65 verify 116868 cycles 116842 cycles 1.00
ML-DSA-87 keypair 196485 cycles 196569 cycles 1.00
ML-DSA-87 sign 421245 cycles 421883 cycles 1.00
ML-DSA-87 verify 193229 cycles 193336 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Contributor

AMD EPYC 4th gen (c7a) (no-opt)

Details
Benchmark suite Current: 9bb8f8d Previous: ec15c59 Ratio
ML-DSA-44 keypair 118231 cycles 118077 cycles 1.00
ML-DSA-44 sign 455256 cycles 458854 cycles 0.99
ML-DSA-44 verify 130717 cycles 130666 cycles 1.00
ML-DSA-65 keypair 200754 cycles 201340 cycles 1.00
ML-DSA-65 sign 742358 cycles 741817 cycles 1.00
ML-DSA-65 verify 208879 cycles 209500 cycles 1.00
ML-DSA-87 keypair 330201 cycles 330842 cycles 1.00
ML-DSA-87 sign 938760 cycles 936928 cycles 1.00
ML-DSA-87 verify 342173 cycles 343616 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Contributor

Graviton2 (no-opt)

Details
Benchmark suite Current: 9bb8f8d Previous: ec15c59 Ratio
ML-DSA-44 keypair 212090 cycles 211907 cycles 1.00
ML-DSA-44 sign 753861 cycles 760155 cycles 0.99
ML-DSA-44 verify 228257 cycles 229379 cycles 1.00
ML-DSA-65 keypair 376614 cycles 378157 cycles 1.00
ML-DSA-65 sign 1232908 cycles 1250904 cycles 0.99
ML-DSA-65 verify 370936 cycles 372722 cycles 1.00
ML-DSA-87 keypair 603139 cycles 600977 cycles 1.00
ML-DSA-87 sign 1572054 cycles 1584763 cycles 0.99
ML-DSA-87 verify 616641 cycles 616480 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Contributor

Graviton4 (no-opt)

Details
Benchmark suite Current: 9bb8f8d Previous: ec15c59 Ratio
ML-DSA-44 keypair 127569 cycles 127637 cycles 1.00
ML-DSA-44 sign 436957 cycles 441211 cycles 0.99
ML-DSA-44 verify 135453 cycles 136398 cycles 0.99
ML-DSA-65 keypair 220890 cycles 220759 cycles 1.00
ML-DSA-65 sign 707720 cycles 713650 cycles 0.99
ML-DSA-65 verify 220274 cycles 220740 cycles 1.00
ML-DSA-87 keypair 364568 cycles 365093 cycles 1.00
ML-DSA-87 sign 908294 cycles 921276 cycles 0.99
ML-DSA-87 verify 369293 cycles 370783 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Contributor

Intel Xeon 3rd gen (c6i)

Details
Benchmark suite Current: 9bb8f8d Previous: ec15c59 Ratio
ML-DSA-44 keypair 61928 cycles 61599 cycles 1.01
ML-DSA-44 sign 189843 cycles 189177 cycles 1.00
ML-DSA-44 verify 66558 cycles 66462 cycles 1.00
ML-DSA-65 keypair 111312 cycles 110803 cycles 1.00
ML-DSA-65 sign 316194 cycles 315558 cycles 1.00
ML-DSA-65 verify 110980 cycles 111490 cycles 1.00
ML-DSA-87 keypair 170913 cycles 171547 cycles 1.00
ML-DSA-87 sign 378782 cycles 378591 cycles 1.00
ML-DSA-87 verify 169723 cycles 169379 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Contributor

Intel Xeon 3rd gen (c6i) (no-opt)

Details
Benchmark suite Current: 9bb8f8d Previous: ec15c59 Ratio
ML-DSA-44 keypair 154749 cycles 154402 cycles 1.00
ML-DSA-44 sign 587983 cycles 589538 cycles 1.00
ML-DSA-44 verify 168810 cycles 169828 cycles 0.99
ML-DSA-65 keypair 262287 cycles 263339 cycles 1.00
ML-DSA-65 sign 957036 cycles 966082 cycles 0.99
ML-DSA-65 verify 271611 cycles 272946 cycles 1.00
ML-DSA-87 keypair 432882 cycles 432716 cycles 1.00
ML-DSA-87 sign 1209381 cycles 1216186 cycles 0.99
ML-DSA-87 verify 447652 cycles 447711 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Contributor

Graviton3

Details
Benchmark suite Current: 9bb8f8d Previous: ec15c59 Ratio
ML-DSA-44 keypair 71369 cycles 71552 cycles 1.00
ML-DSA-44 sign 208953 cycles 208994 cycles 1.00
ML-DSA-44 verify 74780 cycles 74736 cycles 1.00
ML-DSA-65 keypair 125947 cycles 125905 cycles 1.00
ML-DSA-65 sign 345638 cycles 345393 cycles 1.00
ML-DSA-65 verify 124113 cycles 124212 cycles 1.00
ML-DSA-87 keypair 207065 cycles 206527 cycles 1.00
ML-DSA-87 sign 444028 cycles 439858 cycles 1.01
ML-DSA-87 verify 204117 cycles 204429 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Contributor

Graviton3 (no-opt)

Details
Benchmark suite Current: 9bb8f8d Previous: ec15c59 Ratio
ML-DSA-44 keypair 137911 cycles 138034 cycles 1.00
ML-DSA-44 sign 482215 cycles 486060 cycles 0.99
ML-DSA-44 verify 148228 cycles 149072 cycles 0.99
ML-DSA-65 keypair 240961 cycles 241789 cycles 1.00
ML-DSA-65 sign 784814 cycles 791605 cycles 0.99
ML-DSA-65 verify 240936 cycles 241325 cycles 1.00
ML-DSA-87 keypair 395872 cycles 396314 cycles 1.00
ML-DSA-87 sign 1006622 cycles 1019348 cycles 0.99
ML-DSA-87 verify 402277 cycles 403755 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

Arm Cortex-A76 (Raspberry Pi 5) benchmarks (opt)

Details
Benchmark suite Current: 9bb8f8d Previous: ec15c59 Ratio
ML-DSA-44 keypair 112170 cycles 112136 cycles 1.00
ML-DSA-44 sign 353500 cycles 353794 cycles 1.00
ML-DSA-44 verify 117008 cycles 117198 cycles 1.00
ML-DSA-65 keypair 194777 cycles 194374 cycles 1.00
ML-DSA-65 sign 583911 cycles 583728 cycles 1.00
ML-DSA-65 verify 192722 cycles 193104 cycles 1.00
ML-DSA-87 keypair 320921 cycles 320068 cycles 1.00
ML-DSA-87 sign 747350 cycles 747194 cycles 1.00
ML-DSA-87 verify 318786 cycles 317899 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

Arm Cortex-A76 (Raspberry Pi 5) benchmarks (no-opt)

Details
Benchmark suite Current: 9bb8f8d Previous: ec15c59 Ratio
ML-DSA-44 keypair 211603 cycles 211623 cycles 1.00
ML-DSA-44 sign 751832 cycles 760075 cycles 0.99
ML-DSA-44 verify 227491 cycles 229458 cycles 0.99
ML-DSA-65 keypair 375371 cycles 378162 cycles 0.99
ML-DSA-65 sign 1232745 cycles 1247139 cycles 0.99
ML-DSA-65 verify 369289 cycles 371939 cycles 0.99
ML-DSA-87 keypair 600293 cycles 601521 cycles 1.00
ML-DSA-87 sign 1568166 cycles 1582092 cycles 0.99
ML-DSA-87 verify 613423 cycles 617517 cycles 0.99

This comment was automatically generated by workflow using github-action-benchmark.

@hanno-becker hanno-becker left a comment

Contributor

This looks right to me.

@github-actions github-actions Bot left a comment

Arm Cortex-A55 (Snapdragon 888) benchmarks (opt)

Details
Benchmark suite Current: 9bb8f8d Previous: 0f9bd57 Ratio
ML-DSA-44 keypair 268375 cycles 267651 cycles 1.00
ML-DSA-44 sign 809831 cycles 811144 cycles 1.00
ML-DSA-44 verify 269969 cycles 270460 cycles 1.00
ML-DSA-65 keypair 459702 cycles 459568 cycles 1.00
ML-DSA-65 sign 1315458 cycles 1315519 cycles 1.00
ML-DSA-65 verify 445789 cycles 445438 cycles 1.00
ML-DSA-87 keypair 796724 cycles 786888 cycles 1.01
ML-DSA-87 sign 1805086 cycles 1790709 cycles 1.01
ML-DSA-87 verify 777883 cycles 767214 cycles 1.01

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

Arm Cortex-A55 (Snapdragon 888) benchmarks (no-opt)

Details
Benchmark suite Current: 9bb8f8d Previous: 0f9bd57 Ratio
ML-DSA-44 keypair 463089 cycles 462914 cycles 1.00
ML-DSA-44 sign 2118538 cycles 2132089 cycles 0.99
ML-DSA-44 verify 550906 cycles 554284 cycles 0.99
ML-DSA-65 keypair 781134 cycles 780479 cycles 1.00
ML-DSA-65 sign 3454267 cycles 3479571 cycles 0.99
ML-DSA-65 verify 860012 cycles 864154 cycles 1.00
ML-DSA-87 keypair 1261699 cycles 1265342 cycles 1.00
ML-DSA-87 sign 4274189 cycles 4320477 cycles 0.99
ML-DSA-87 verify 1380932 cycles 1388614 cycles 0.99

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

SpacemiT K1 8 (Banana Pi F3) benchmarks (no-opt)

Details
Benchmark suite Current: 9bb8f8d Previous: ec15c59 Ratio
ML-DSA-44 keypair 760536 cycles 760114 cycles 1.00
ML-DSA-44 sign 3115896 cycles 3141235 cycles 0.99
ML-DSA-44 verify 855663 cycles 859284 cycles 1.00
ML-DSA-65 keypair 1285920 cycles 1285249 cycles 1.00
ML-DSA-65 sign 5085189 cycles 5074895 cycles 1.00
ML-DSA-65 verify 1364204 cycles 1363443 cycles 1.00
ML-DSA-87 keypair 2113049 cycles 2114103 cycles 1.00
ML-DSA-87 sign 6359782 cycles 6362879 cycles 1.00
ML-DSA-87 verify 2229223 cycles 2230277 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

Arm Cortex-A72 (Raspberry Pi 4) benchmarks (opt)

Details
Benchmark suite Current: 9bb8f8d Previous: ec15c59 Ratio
ML-DSA-44 keypair 223726 cycles 221766 cycles 1.01
ML-DSA-44 sign 613726 cycles 614083 cycles 1.00
ML-DSA-44 verify 218800 cycles 233383 cycles 0.94
ML-DSA-65 keypair 384926 cycles 393803 cycles 0.98
ML-DSA-65 sign 998355 cycles 1023539 cycles 0.98
ML-DSA-65 verify 368420 cycles 377414 cycles 0.98
ML-DSA-87 keypair 663434 cycles 669482 cycles 0.99
ML-DSA-87 sign 1356972 cycles 1392850 cycles 0.97
ML-DSA-87 verify 644004 cycles 643114 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

Arm Cortex-A72 (Raspberry Pi 4) benchmarks (no-opt)

Details
Benchmark suite Current: 9bb8f8d Previous: 0f9bd57 Ratio
ML-DSA-44 keypair 302274 cycles 301193 cycles 1.00
ML-DSA-44 sign 1139196 cycles 1138793 cycles 1.00
ML-DSA-44 verify 322598 cycles 328764 cycles 0.98
ML-DSA-65 keypair 558400 cycles 549022 cycles 1.02
ML-DSA-65 sign 1865495 cycles 1904832 cycles 0.98
ML-DSA-65 verify 547034 cycles 529988 cycles 1.03
ML-DSA-87 keypair 890507 cycles 850322 cycles 1.05
ML-DSA-87 sign 2420330 cycles 2373299 cycles 1.02
ML-DSA-87 verify 899250 cycles 877505 cycles 1.02

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Arm Cortex-A72 (Raspberry Pi 4) benchmarks (no-opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: 9bb8f8d Previous: 0f9bd57 Ratio
ML-DSA-65 verify 547034 cycles 529988 cycles 1.03
ML-DSA-87 keypair 890507 cycles 850322 cycles 1.05

This comment was automatically generated by workflow using github-action-benchmark.

Replace the two-step Barrett division (ceil(a/128), then a Barrett
division by 2*GAMMA2/128) with a direct high multiplication by
floor(2^N / (2*GAMMA2)), mirroring the AArch64 backend. For ML-DSA-44
(N = 48) this is `(a * 1477838209 + 2^47) >> 48`; for ML-DSA-65/87
(N = 49) it is `(a * 1074791425 + 2^48) >> 49`. Both constants, scaled
by 2^-N, strictly under-approximate 1/(2*GAMMA2), so half-points round
down, matching the original round-half-down semantics, and the result
is exact for all 0 <= a < Q.
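
The exactness claim can be checked by brute force. The following is a minimal sketch, not the code from this PR: the function names are hypothetical, only the a1 quotient is computed (the later wrap-around of a1 and the a0 step are left out), and the round-half-down reference is written as `(a + GAMMA2 - 1) / (2*GAMMA2)`, which is an assumption about how to express the intended rounding for non-negative inputs.

```c
#include <stdint.h>

#define Q 8380417

/* ML-DSA-44: GAMMA2 = (Q-1)/88, constant = floor(2^48 / (2*GAMMA2)). */
int32_t a1_highmul_88(int32_t a)
{
  return (int32_t)(((int64_t)a * 1477838209 + ((int64_t)1 << 47)) >> 48);
}

/* ML-DSA-65/87: GAMMA2 = (Q-1)/32, constant = floor(2^49 / (2*GAMMA2)). */
int32_t a1_highmul_32(int32_t a)
{
  return (int32_t)(((int64_t)a * 1074791425 + ((int64_t)1 << 48)) >> 49);
}

/* Hypothetical reference: a / (2*GAMMA2) rounded to nearest, with exact
 * half-points rounded down. Valid for a >= 0. */
int32_t a1_round_half_down(int32_t a, int32_t gamma2)
{
  return (a + gamma2 - 1) / (2 * gamma2);
}

/* Count the inputs in [0, Q) where the high multiplication and the
 * reference disagree. gamma2 must be (Q-1)/88 or (Q-1)/32. */
int32_t count_mismatches(int32_t gamma2)
{
  int32_t a, bad = 0;
  for (a = 0; a < Q; a++)
  {
    int32_t fast = (gamma2 == (Q - 1) / 88) ? a1_highmul_88(a)
                                            : a1_highmul_32(a);
    bad += (fast != a1_round_half_down(a, gamma2));
  }
  return bad;
}
```

Calling `count_mismatches((Q - 1) / 88)` and `count_mismatches((Q - 1) / 32)` walks every a in [0, Q). The product `a * constant` stays below 2^54, so the 64-bit intermediate cannot overflow.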

This collapses a five-op dependency chain (add, asr, mul, add, asr)
into a single signed multiply-add and one shift. On Graviton2 the
scalar a1 step is ~31% faster (1.39 ns -> 0.96 ns per call) for both
parameter sets; the loop is scalar in practice because the ct
value-barriers in mld_decompose block auto-vectorization. AArch64 and
x86_64 builds use their native asm backends and are unaffected.
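
To make the five-op versus two-op comparison concrete, here is a hedged sketch that puts both forms side by side. The two-step constants 11275 and 1025, with shifts 24 and 22, are an assumption taken from the widely used pq-crystals Dilithium reference code and are not quoted in this PR; all function names are hypothetical.

```c
#include <stdint.h>

#define Q 8380417

/* Assumed two-step form, GAMMA2 = (Q-1)/88: ceil(a/128), then a Barrett
 * division by 2*GAMMA2/128 = 1488 using floor(2^24 / 1488) = 11275. */
int32_t a1_two_step_88(int32_t a)
{
  int32_t t = (a + 127) >> 7;           /* add, asr */
  return (t * 11275 + (1 << 23)) >> 24; /* mul, add, asr */
}

/* Assumed two-step form, GAMMA2 = (Q-1)/32: divisor 2*GAMMA2/128 = 4092,
 * using floor(2^22 / 4092) = 1025. */
int32_t a1_two_step_32(int32_t a)
{
  int32_t t = (a + 127) >> 7;
  return (t * 1025 + (1 << 21)) >> 22;
}

/* Single high multiplication as described in the commit message. */
int32_t a1_single_88(int32_t a)
{
  return (int32_t)(((int64_t)a * 1477838209 + ((int64_t)1 << 47)) >> 48);
}

int32_t a1_single_32(int32_t a)
{
  return (int32_t)(((int64_t)a * 1074791425 + ((int64_t)1 << 48)) >> 49);
}

/* Count the inputs in [0, Q) where the two forms give a different a1. */
int32_t count_differences(void)
{
  int32_t a, diff = 0;
  for (a = 0; a < Q; a++)
  {
    diff += (a1_two_step_88(a) != a1_single_88(a));
    diff += (a1_two_step_32(a) != a1_single_32(a));
  }
  return diff;
}
```

The two-step form fits in 32-bit arithmetic, while the single multiplication needs a 64-bit product, so the actual gain on a given target depends on how cheap a widening multiply is there.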

Update the Isabelle attribution in compress/ML-DSA_Compress.thy and
neon_ntt/Barrett_Division_Even.thy to group the C reference with the
AArch64 backend (direct Barrett division) rather than with AVX2
(divide-by-128 first). The corollary names keep their `aarch64`
suffix for stability; a note in the prose explains they cover the C
reference as well.

Signed-off-by: Matthias J. Kannwischer <matthias@zerorisc.com>

Development

Successfully merging this pull request may close these issues.

Consider simplifying decompose C implementation

3 participants