fep(sig-operator): add FlagGems-vllm high-performance fused operator library proposal by huangyiqun · Pull Request #20 · flagos-ai/community

huangyiqun · 2026-05-27T08:28:48Z

FEP: FlagGems-vllm

Adds a FEP document for FlagGems-vllm, a high-performance fused operator library for vLLM inference workloads in the FlagOS ecosystem.

SIG: sig-operator
Status: Provisional
Target: FlagOS 2.1

FlagGems-vllm provides Triton-based fused kernels and vLLM-facing operator implementations for performance-critical paths such as MoE routing, cache update, rotary embedding, FP8 quantization, sequence pack/unpack, and DeepSeek V4 attention helper kernels.

The FEP defines the repository scope, fused operator coverage, packaging approach, test plan, and migration process for keeping vLLM-related fused kernels in sync with FlagGems while exposing them through the standalone flaggems_vllm package.

Repository: https://github.com/flagos-ai/FlagGems-vllm

zckzck · 2026-05-28T06:12:43Z

+
+This design allows the same operator API to be used across supported hardware backends as implementations become available.
+
+### Testing and Benchmarking


Testing: Does FlagGems-vllm support only NVIDIA hardware, or does it also work with hardware from other vendors? If it does, please list the supported vendors.

Multi-backend adaptation and verification are currently underway.
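For readers following this thread, the idea in the quoted FEP line (one operator API, with backend implementations added over time) can be illustrated with a small registry. This is only a sketch of the general pattern, not FlagGems-vllm's actual dispatch mechanism: `register`, `dispatch`, the `"scale"` operator, and the `"reference"` backend name are all hypothetical.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry keyed by (operator name, backend name).
_REGISTRY: Dict[Tuple[str, str], Callable] = {}


def register(op_name: str, backend: str) -> Callable:
    """Decorator that records an implementation for one backend."""
    def decorator(fn: Callable) -> Callable:
        _REGISTRY[(op_name, backend)] = fn
        return fn
    return decorator


def dispatch(op_name: str, backend: str, *args, **kwargs):
    """Call the implementation registered for `backend`, if there is one."""
    try:
        impl = _REGISTRY[(op_name, backend)]
    except KeyError:
        raise NotImplementedError(
            f"{op_name!r} has no implementation for backend {backend!r}"
        ) from None
    return impl(*args, **kwargs)


@register("scale", "reference")
def _scale_reference(values, factor):
    # Pure-Python stand-in for a real kernel, used only for illustration.
    return [v * factor for v in values]
```

With this layout, callers keep using `dispatch("scale", backend, ...)` unchanged, and a backend that is not yet covered fails with an explicit `NotImplementedError` rather than silently running on the wrong device.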

zckzck · 2026-05-28T06:52:10Z

+| Dedicated package import | Run `python -c "import flaggems_vllm; import flaggems_vllm.ops"` after installation. |
+| Fused operator API availability | Verify exported symbols from `flaggems_vllm.ops.__all__` include the migrated vLLM-facing fused operators. |
+| Accuracy coverage | Run `pytest -q tests --collect-only` and targeted tests such as `pytest -q tests/test_moe_align_block_size.py --quick`. |
+| DeepSeek V4 helper coverage | Run the DeepSeek V4 attention helper tests when the matching CUDA/vLLM reference environment is available. |


Which specific test methods and test procedures will be used?

The specific testing methods have been added.
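The first two rows of the quoted table (package import and `__all__` coverage) could be automated along the following lines. This is a minimal sketch, assuming `flaggems_vllm.ops` defines `__all__` as the table states; the helper name `missing_exports` is hypothetical, and the FEP excerpt does not list the operator names to check, so the caller has to supply them.

```python
import importlib
from typing import Iterable, List


def missing_exports(module_name: str, expected: Iterable[str]) -> List[str]:
    """Return the names in `expected` that are absent from the module's __all__.

    Raises ImportError if the module (or its parent package) is not installed,
    which covers the "dedicated package import" check as a side effect.
    """
    module = importlib.import_module(module_name)
    exported = set(getattr(module, "__all__", ()))
    return [name for name in expected if name not in exported]
```

After installation, a call such as `missing_exports("flaggems_vllm.ops", expected_ops)` should return an empty list, where `expected_ops` is the list of migrated operator names taken from the FEP itself.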

huangyiqun added 2 commits May 27, 2026 08:24

add FlagGems-vllm fep

5706a2a

update

020465b

huangyiqun changed the title ~~fep(sig-operator): add FlagGems-vllm operator library proposal~~ fep(sig-operator): add FlagGems-vllm high-performance fused operator library proposal May 27, 2026

update fep

3c93d3a

zckzck reviewed May 28, 2026


huangyiqun and others added 2 commits June 5, 2026 17:29

Merge branch 'flagos-ai:main' into add_flaggems-vllm_fep

8d8e0e8

update

0d84fd5

huangyiqun wants to merge 5 commits into flagos-ai:main from huangyiqun:add_flaggems-vllm_fep

2 participants

