[GPU][NVL-P] Use upconversion for unsupported scales on NVL-P #4960

Merged: kealan-barbieri merged 2 commits into main from kealanba/nvf4_nvlp_fixup on Apr 10, 2026

@kealan-barbieri commented Apr 6, 2026

Description

Fix NVFP4 support on NVL-P: use upconversion to avoid the group size limitations of late scaling, which conflict with NVFP gs16.

Fixes MFDNN-14876
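
For readers unfamiliar with the two strategies named in the description, here is a minimal Python sketch of the arithmetic only. It is not the kernel code from this PR. It assumes "upconversion" means applying the per-group scale while converting each FP4 weight to a wider type, and "late scaling" means accumulating an unscaled partial sum per group and scaling it afterwards. The function names and the 4-bit code layout (sign in bit 3, magnitude index in bits 0-2) are illustrative assumptions.

```python
# Non-negative magnitudes representable in FP4 E2M1.
FP4_E2M1 = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
GROUP_SIZE = 16  # NVFP4 carries one scale per 16 elements ("gs16")


def decode_fp4(code):
    """Decode a 4-bit code (assumed layout: bit 3 = sign, bits 0-2 = magnitude index)."""
    mag = FP4_E2M1[code & 0x7]
    return -mag if code & 0x8 else mag


def dot_upconvert(a, w_codes, scales, group_size=GROUP_SIZE):
    """Scale each weight as it is upconverted, then run an ordinary dot product."""
    acc = 0.0
    for k, code in enumerate(w_codes):
        w = decode_fp4(code) * scales[k // group_size]  # scale applied up front
        acc += a[k] * w
    return acc


def dot_late_scale(a, w_codes, scales, group_size=GROUP_SIZE):
    """Accumulate an unscaled partial sum per group, then apply that group's scale."""
    acc = 0.0
    for g, s in enumerate(scales):
        lo = g * group_size
        partial = sum(a[k] * decode_fp4(w_codes[k]) for k in range(lo, lo + group_size))
        acc += partial * s  # scale applied after the group's accumulation
    return acc


# Two groups of 16 along K: codes 0x2 (+1.0) and 0xA (-1.0), scales 0.5 and 2.0.
a = [1.0] * 32
w_codes = [0x2] * 16 + [0xA] * 16
scales = [0.5, 2.0]
print(dot_upconvert(a, w_codes, scales), dot_late_scale(a, w_codes, scales))
```

Both functions compute the same sum. The difference is that the late-scaling form ties the accumulation boundaries to the scale group size, whereas the upconversion form leaves the accumulation free of any group constraint.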

EDIT: added fix for last remaining layer failing with OOR in jit:

--matmul --engine=gpu --allow-enum-tags-only=false --check-ref-impl=true --stag=acb --wtag=acb --dtag=abc --attr-fpmath=tf32 16x64x64:16x64x15000_npointnet.tr.tf32.pt.mb16*1

Checklist

General

  • Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?
  • Have you formatted the code using clang-format?

Bug fixes

  • Have you included information on how to reproduce the issue (either in a github issue or in this PR)?
  • Have you added relevant regression tests?

@kealan-barbieri requested a review from a team as a code owner on April 6, 2026 23:42
@github-actions (bot) added the platform:gpu-intel label (Codeowner: @oneapi-src/onednn-gpu-intel) on Apr 6, 2026
@kealan-barbieri
Contributor Author

make test
set test_scope=NIGHTLY
disable test_device_cpu
disable benchdnn_all
enable benchdnn_matmul
enable benchdnn_ip
enable arch_gpu_xe3p-lpg

@kealan-barbieri
Contributor Author

make test
set test_scope=NIGHTLY
disable test_device_cpu
disable benchdnn_all
enable benchdnn_matmul
enable benchdnn_ip
enable arch_gpu_xe3p-lpg

@kealan-barbieri merged commit ec0b16e into main on Apr 10, 2026
13 of 14 checks passed
@kealan-barbieri deleted the kealanba/nvf4_nvlp_fixup branch on April 10, 2026 16:16