JIT: Transform 'CONST - x' to 'xor x, CONST' when possible by BoyBaykiller · Pull Request #126529 · dotnet/runtime

BoyBaykiller · 2026-04-04T00:48:24Z

Given any CONST - x and range info for x obtained from IntegralRange::ForNode see if we can transform it to x ^ CONST. These changes motivate improvements in IntegralRange::ForNode and RangeOps (#126553). It also made me find a missing GT_NOT in asserprop to enable VN based folding.

Found by looking at diffs from #126442

Example:

int Byte255(byte x)
{
    return 255 - x;
}

;; ------ BASE
G_M64250_IG02:  ;; offset=0x0000
       movzx    rax, dl
       neg      eax
       add      eax, 255
						;; size=10 bbWeight=1 PerfScore 0.75
G_M64250_IG03:  ;; offset=0x000A
       ret      
						;; size=1 bbWeight=1 PerfScore 1.00

;; ------ DIFF
G_M64250_IG02:  ;; offset=0x0000
       movzx    rax, dl
       xor      eax, 255
						;; size=8 bbWeight=1 PerfScore 0.50
G_M64250_IG03:  ;; offset=0x0008
       ret      
						;; size=1 bbWeight=1 PerfScore 1.00

int IntMin1(int x)
{
    return -1 - x;
}

;; ------ BASE
G_M50835_IG02:  ;; offset=0x0000
       mov      eax, edx
       neg      eax
       dec      eax
						;; size=6 bbWeight=1 PerfScore 0.75
G_M50835_IG03:  ;; offset=0x0006
       ret      
						;; size=1 bbWeight=1 PerfScore 1.00
;; ------ DIFF
G_M50835_IG02:  ;; offset=0x0000
       mov      eax, edx
       not      eax
						;; size=4 bbWeight=1 PerfScore 0.50
G_M50835_IG03:  ;; offset=0x0004
       ret      
						;; size=1 bbWeight=1 PerfScore 1.00

dotnet-policy-service · 2026-04-04T00:49:26Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

…loss of data'

…_no_tiered_compilation regression

BoyBaykiller · 2026-04-04T10:35:22Z

There is one type of regression where GetRangeFromAssertions doesn't handle XOR and we fail to remove useless JTRUE. Will fix that in a different PR.

saucecontrol · 2026-04-05T17:55:08Z

As a generalization, the rule is that you can implement subtraction as xor any time you can be sure there won't be a borrow, because binary subtraction is defined as an xor followed by borrow.

This is always true when the left operand is a bit pattern of all ones and the right operand is less than or equal to the left. So naturally, it works for subtraction of any integral type if left has all bits set.

It also happens to work when left has all but the MSB set, because in that case, the only borrow could come from beyond the MSB. This is why the optimization works for int.MaxValue even if subtraction is of type uint (meaning right could be larger).

This also means, for example, that with a byte subtraction, 127 - x can always be implemented as xor, provided you then zero-extend the byte, because any borrow will propagate to all the high bits of the register.

The reason I mention this is that if you have a range assertion that can prove right is smaller for any reason, it can apply there too. We have examples where this optimization is done manually in the libraries code:

runtime/src/libraries/System.Private.CoreLib/src/System/Numerics/BitOperations.cs

Lines 201 to 204 in 937dec6

    
           // LZCNT returns index starting from MSB, whereas BSR gives the index from LSB. 
        
           // 31 ^ BSR here is equivalent to 31 - BSR since the BSR result is always between 0 and 31. 
        
           // This saves an instruction, as subtraction from constant requires either MOV/SUB or NEG/ADD. 
        
           return 31 ^ (int)X86Base.BitScanReverse(value);

BoyBaykiller · 2026-04-05T18:59:08Z

Thanks for the explanation! I was just about to look for the more general case to use in #126553.
Clang seems to catch it: https://godbolt.org/z/e8PjefYKx

Unfortunately I am not sure there is a standard way to get general range information (or LE assertion) in morph. And it's only run if fgGlobalMorph...

As for handling smaller types: I skipped that because I still need to see what's the proper & easy way to see through this IR that we create for small types and get the actual type as defined in the C# source:

CAST      int <- ubyte <- int
*  LCL_VAR   int    V01 arg1          (last use)

saucecontrol · 2026-04-06T06:06:39Z

what's the proper & easy way to see through this IR that we create for small types

In IL, you'll always have a cast back to the small type after an arithmetic operation, because IL doesn't define those operations for anything smaller than int32. Within JIT, small types may be flagged as normalize on load or normalize on store, so JIT will force zero/sign extension as appropriate there.

Unfortunately I am not sure there is a standard way to get general range information (or LE assertion) in morph.

I'm not terribly familiar with that part of the code, but you can fetch assertions from morph. Here's an example that might lead you to the right parts.

runtime/src/coreclr/jit/morph.cpp

Lines 3226 to 3239 in 3f14c31

    
           // Small-typed arguments and aliased locals are normalized on load. Other small-typed locals are 
        
           // normalized on store. If this is one of the former, insert a narrowing cast on the load. 
        
           //         ie. Convert: var-short --> cast-short(var-int) 
        
           // 
        
           if (fgGlobalMorph && lclNode->OperIs(GT_LCL_VAR) && varDsc->lvNormalizeOnLoad() && 
        
               /* TODO-ASG: delete this zero-diff quirk */ lclNode->CanCSE()) 
        
           { 
        
               var_types lclVarType = varDsc->TypeGet(); 
        
               // Assertion prop can tell us to omit adding a cast here. This is useful when the local is a small-typed 
        
               // parameter that is passed in a register: in that case, the ABI specifies that the upper bits might be 
        
               // invalid, but the assertion guarantees us that we have normalized when we wrote it. 
        
               if (optLocalAssertionProp && 
        
                   optAssertionIsSubrange(lclNode, IntegralRange::ForType(lclVarType), apLocal) != NO_ASSERTION_INDEX)

And you can see some example assertions that could be interesting for this optimization if only they were more specific:

runtime/src/coreclr/jit/assertionprop.cpp

Lines 306 to 314 in 3f14c31

    
           case NI_AVX2_LeadingZeroCount: 
        
           case NI_AVX2_TrailingZeroCount: 
        
           case NI_AVX2_X64_LeadingZeroCount: 
        
           case NI_AVX2_X64_TrailingZeroCount: 
        
           case NI_X86Base_PopCount: 
        
           case NI_X86Base_X64_PopCount: 
        
               // Note: No advantage in using a precise range for IntegralRange. 
        
               // Example: IntCns = 42 gives [0..127] with a non -precise range, [42,42] with a precise range. 
        
               return {SymbolicIntegerValue::Zero, SymbolicIntegerValue::ByteMax};

I'm not following exactly why the comment says there's no advantage to a more specific range, though.

* use 'IntegralRange::ForNode' for somewhat better range info (could be improved still) * move to fgOptimizeAddition

BoyBaykiller · 2026-04-07T23:36:43Z

I've generalized the transformation to catch all cases, given CNS - x and range info for x.
I've switched to IntegralRange::ForNode to attain this range info, but as you said its somewhat coarse. Apparently there was a TODO once to make it more precise: #106982. I think this should be revisited.

In theory the case you showed would be naturally handled now if range for BitScanReverse was proper.

Also in order to handle this byte subtract case you mentioned:

byte Byte127(byte x)
{
    return (byte)(127 - x);
}

I need to detect the outer byte cast so I know the range is in [0, 255] which I don't know how to properly do in fgOptimizeAddition. However 255 - x works.

BoyBaykiller · 2026-04-08T02:51:02Z

@EgorBo PTAL

If this gets merged we can justify to add general RangeOps:Xor to rangecheck #126553 as this fixes some regressions from this PR.

I think it would also be nice to revisit making IntegralRange able to handle arbitrary ranges instead of just "symbolic constants". With a precise range it would automatically use XOR here and the code could be simplified to simple subtract:

runtime/src/libraries/System.Private.CoreLib/src/System/Numerics/BitOperations.cs

Lines 201 to 204 in 937dec6

    
           // LZCNT returns index starting from MSB, whereas BSR gives the index from LSB. 
        
           // 31 ^ BSR here is equivalent to 31 - BSR since the BSR result is always between 0 and 31. 
        
           // This saves an instruction, as subtraction from constant requires either MOV/SUB or NEG/ADD. 
        
           return 31 ^ (int)X86Base.BitScanReverse(value);

Copilot

Pull request overview

This PR improves CoreCLR JIT morph optimizations by recognizing when CONST - x (canonicalized as ADD(NEG(x), CONST)) can be safely rewritten as XOR(x, CONST) using IntegralRange::ForNode range information, and it enables additional VN-based folding in assertion propagation.

Changes:

Add an ADD(NEG(x), CONST) => XOR(x, CONST) transform in fgOptimizeAddition guarded by an IntegralRange-based validity check.
Update optVNBasedFoldCurStmt to include GT_NOT among the node kinds eligible for VN-based folding.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
`src/coreclr/jit/morph.cpp`	Adds a range-checked transform to rewrite certain constant subtractions into XOR for better codegen.
`src/coreclr/jit/assertionprop.cpp`	Enables VN-based folding for `GT_NOT` nodes by adding it to the foldable opcode set.

Copilot · 2026-04-08T04:18:30Z

src/coreclr/jit/morph.cpp

+                    uint64_t mask = UINT64_MAX >> BitOperations::LeadingZeroCount(lo ^ hi);
+                    mask          = lo | mask;


mask computation can invoke undefined behavior when lo == hi (i.e., lo ^ hi == 0): BitOperations::LeadingZeroCount(0) returns 64, so UINT64_MAX >> 64 is an invalid shift in C++. This can miscompile the JIT itself and may incorrectly allow/deny the transform for singleton ranges. Handle the lo == hi case explicitly (e.g., treat the range mask as just lo, or otherwise ensure the shift count is < 64).

Suggested change

uint64_t mask = UINT64_MAX >> BitOperations::LeadingZeroCount(lo ^ hi);

mask = lo | mask;

uint64_t diff = lo ^ hi;

uint64_t mask = lo;

if (diff != 0)

{

mask |= UINT64_MAX >> BitOperations::LeadingZeroCount(diff);

}

* Add 'CONST - x' -> 'xor x, CONST' transformation

0d5fb4a

github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 4, 2026

dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Apr 4, 2026

BoyBaykiller added 3 commits April 4, 2026 02:52

* use std::swap

35d21ee

try fix CI failure 'conversion from '__int64' to 'ssize_t', possible …

26c96f3

…loss of data'

* add missing GT_NOT to VN based folding to fix 29009 libraries_tests…

b8a3a15

…_no_tiered_compilation regression

BoyBaykiller changed the title ~~JIT: Transform 'CONST - x' -> 'xor x, CONST' when possible~~ JIT: Transform 'CONST - x' to 'xor x, CONST' when possible Apr 4, 2026

BoyBaykiller marked this pull request as ready for review April 4, 2026 10:27

BoyBaykiller mentioned this pull request Apr 4, 2026

JIT: Add and use RangeOps::Xor #126553

Draft

* generalize to catch all cases

04f1758

* use 'IntegralRange::ForNode' for somewhat better range info (could be improved still) * move to fgOptimizeAddition

build-analysis bot mentioned this pull request Apr 8, 2026

Unable to pull image from mcr.microsoft.com #117164

Open

EgorBo requested a review from Copilot April 8, 2026 04:14

Copilot started reviewing on behalf of EgorBo April 8, 2026 04:15 View session

Copilot AI reviewed Apr 8, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JIT: Transform 'CONST - x' to 'xor x, CONST' when possible#126529

JIT: Transform 'CONST - x' to 'xor x, CONST' when possible#126529
BoyBaykiller wants to merge 5 commits intodotnet:mainfrom
BoyBaykiller:transform-const-x-to-xor

BoyBaykiller commented Apr 4, 2026 •

edited

Loading

Uh oh!

dotnet-policy-service bot commented Apr 4, 2026

Uh oh!

BoyBaykiller commented Apr 4, 2026

Uh oh!

saucecontrol commented Apr 5, 2026 •

edited

Loading

Uh oh!

BoyBaykiller commented Apr 5, 2026 •

edited

Loading

Uh oh!

saucecontrol commented Apr 6, 2026 •

edited

Loading

Uh oh!

BoyBaykiller commented Apr 7, 2026

Uh oh!

BoyBaykiller commented Apr 8, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		uint64_t mask = UINT64_MAX >> BitOperations::LeadingZeroCount(lo ^ hi);
		mask = lo \| mask;

-                    uint64_t mask = UINT64_MAX >> BitOperations::LeadingZeroCount(lo ^ hi);
-                    mask          = lo | mask;
+                    uint64_t diff = lo ^ hi;
+                    uint64_t mask = lo;
+                    if (diff != 0)
+                    {
+                        mask |= UINT64_MAX >> BitOperations::LeadingZeroCount(diff);
+                    }

Conversation

BoyBaykiller commented Apr 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dotnet-policy-service bot commented Apr 4, 2026

Uh oh!

BoyBaykiller commented Apr 4, 2026

Uh oh!

saucecontrol commented Apr 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

BoyBaykiller commented Apr 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

saucecontrol commented Apr 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

BoyBaykiller commented Apr 7, 2026

Uh oh!

BoyBaykiller commented Apr 8, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

BoyBaykiller commented Apr 4, 2026 •

edited

Loading

saucecontrol commented Apr 5, 2026 •

edited

Loading

BoyBaykiller commented Apr 5, 2026 •

edited

Loading

saucecontrol commented Apr 6, 2026 •

edited

Loading