minor improvements #328

Merged
Smattr merged 6 commits into main from smattr/33dfc24a-feed-42c2-8758-9dac9774b357
May 21, 2026
@Smattr Smattr commented May 20, 2026

No description provided.

Smattr added 6 commits May 19, 2026 18:43
When an enqueue fails due to racing with a dequeue, it needs to undo the
insertion of a state into the upcoming queue tail. The CAS that performs
this undo is guaranteed to succeed, so there is no need to pay the
overhead of a `CMPXCHG` on this path. We can instead rephrase this code
to preserve the assertion but weaken the operations involved into
non-atomic `MOV`s.

Profiling with some medium-sized models indicates this has no
significant effect on performance. But it may help models that have a
different pattern where threads often contend on the same queue.
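
A minimal sketch of the idea in this commit message, not the code from the PR: the names `struct node`, `undo_with_cas`, and `undo_with_store` are hypothetical, and the GCC/Clang `__atomic` builtins are an assumption about the toolchain. The relaxed load and store stand in for the "non-atomic `MOV`s"; on x86-64 they typically compile to plain `MOV` instructions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical queue element; the real state type is not shown in the PR. */
struct node {
  int value;
};

/* Before: undo the tentative insertion with a full compare-and-swap.
 * On x86-64 this typically compiles to a LOCK CMPXCHG. */
static void undo_with_cas(struct node **slot, struct node *mine) {
  struct node *expected = mine;
  int ok = __atomic_compare_exchange_n(slot, &expected, (struct node *)NULL, 0,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
  assert(ok && "undo of tentative insertion failed");
  (void)ok;
}

/* After: the swap is known to succeed, so keep the check as an assertion
 * and write the slot back with a relaxed store. */
static void undo_with_store(struct node **slot, struct node *mine) {
  struct node *seen = __atomic_load_n(slot, __ATOMIC_RELAXED);
  assert(seen == mine && "undo of tentative insertion failed");
  (void)seen;
  __atomic_store_n(slot, (struct node *)NULL, __ATOMIC_RELAXED);
}
```

Both functions leave the slot empty. The second is only valid under the invariant the commit message states, namely that no other thread can change the slot between the failed enqueue and the undo.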

The style this code is written in is intended to prioritise:
  1. White space independence (e.g. no // comments). This code is
     currently copied as-is into the output but in future we might do
     some sort of white space coalescing.
  2. Density/readability. Hence this change removes bracing on
     single-line ifs, gaining back a line of display.
I think eventually I will probably decide this does not matter and
accept all clang-format suggestions. But I am still in some phase of
cope.

On x86-64, users should be compiling with `-mcx16`. But if they do not,
double-word CAS is implemented by a libatomic call. In this case it is
unsafe to use AVX stores/loads for double-word write/read (respectively)
because, although they are atomic, they do not synchronise with the
(possibly lock acquiring) double-word CAS.
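
One way to express the constraint above is a compile-time guard. This is a sketch under assumptions, not the check used in the PR: `dword_fast_path_allowed` and `check_dword_policy` are hypothetical names, and it relies on GCC and Clang defining `__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16` when `-mcx16` is in effect and `__AVX__` when AVX is enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical policy helper: may double-word reads and writes bypass the
 * CAS path and use 16-byte vector loads/stores directly?
 *
 * Only when the compiler emits a native CMPXCHG16B for double-word CAS.
 * Without -mcx16 the CAS may be a libatomic call that takes a lock, and a
 * plain vector access would not synchronise with that lock. */
static bool dword_fast_path_allowed(void) {
#if defined(__x86_64__) && defined(__AVX__) &&                                 \
    defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16)
  return true;
#else
  return false;
#endif
}

/* Sanity check: the fast path must never be enabled without native CAS. */
static void check_dword_policy(void) {
#if !defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16)
  assert(!dword_fast_path_allowed());
#endif
}
```

When the guard returns false, the conservative choice is to route double-word reads and writes through the same atomic primitives as the CAS, so that all accesses agree on one synchronisation mechanism.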
@Smattr Smattr merged commit 9fa73ba into main May 21, 2026
23 checks passed
@Smattr Smattr deleted the smattr/33dfc24a-feed-42c2-8758-9dac9774b357 branch May 21, 2026 00:28