Skip to content

Add MPS support to the gemma node#345

Open
Zac (zboyles) wants to merge 2 commits into
Lightricks:masterfrom
zboyles:gemma-mps-support
Open

Add MPS support to the gemma node#345
Zac (zboyles) wants to merge 2 commits into
Lightricks:masterfrom
zboyles:gemma-mps-support

Conversation

@zboyles

@zboyles Zac (zboyles) commented Jan 8, 2026

Copy link
Copy Markdown

Adds MPS (Apple Silicon) support to the Gemma text-encoder node.

Related to #302 (gemma text-encoder portion of MPS support)

Changes

  • fork_rng device guardtorch.random.fork_rng(devices=[...]) only
    accepts CUDA devices; on MPS it raises. The device list is now passed only
    when the model is on CUDA. Seeded generation on CUDA is unchanged.
  • Respect caller-provided dtype_LTXVGemmaTextEncoderModel.__init__
    previously hardcoded torch.bfloat16, overwriting the dtype argument;
    it's now only the default. With this, clip_dtype flows through
    ltxv_gemma_clip into the model and the load_text_embeddings_pipeline
    loaders.
  • float16 on MPS — MPS has incomplete bfloat16 support, so clip_dtype
    falls back to float16 when the torch device is MPS (CUDA keeps
    bfloat16). This also keeps the new torch.autocast(dtype=model.dtype)
    context in _enhance on a dtype MPS actually supports.

Rebased onto current master (single commit, one file, +11/−2). An earlier
revision also propagated dtype into the embeddings connectors; that's no
longer needed — the load_text_embeddings_pipeline refactor on master
already threads it through.

Testing status

  • Logic verified against PyTorch's fork_rng device constraints; the CUDA
    path is preserved unchanged.
  • I don't have the models locally for an end-to-end run — verification on
    Apple Silicon welcome.

@josephlugo

José Lugo (josephlugo) commented Jan 15, 2026

Copy link
Copy Markdown

Can anyone with permissions review this please? M3 Ultra Mac Studio support is needed!

@k06a

Copy link
Copy Markdown

Implemented similar fix here: Comfy-Org/ComfyUI#12809

@zboyles Zac (zboyles) changed the title Untested PR to add MPS support to the gemma node Add MPS support to the gemma node Jun 7, 2026
@zboyles Zac (zboyles) marked this pull request as ready for review June 7, 2026 10:31
Copilot AI review requested due to automatic review settings June 7, 2026 10:31

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR improves device/dtype handling for the Gemma encoder path, aiming to make generation RNG forking safer across backends and to avoid unsupported dtypes on MPS.

Changes:

  • Adjusted torch.random.fork_rng(...) usage to avoid passing a CUDA device on non-CUDA backends.
  • Made dtype actually configurable (instead of always forcing bfloat16) and propagated it into embeddings connector loaders.
  • Added MPS-specific dtype selection (fallback to float16) when loading the model.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread gemma_encoder.py Outdated
Comment thread gemma_encoder.py Outdated
@zboyles

Copy link
Copy Markdown
Author

Implemented similar fix here: Comfy-Org/ComfyUI#12809

Thanks, #12809 is complementary rather than overlapping, it fixes device placement (text encoders now land on MPS instead of CPU). This PR fixes the node itself once it's there (fork_rng raises on non-CUDA devices, and MPS needs float16 over bfloat16). With #12809 merged, Mac users now hit these paths by default, so I've rebased this onto current master and trimmed it to the minimal functional diff.

@zboyles

Copy link
Copy Markdown
Author

Can anyone with permissions review this please? M3 Ultra Mac Studio support is needed!

José Lugo (@josephlugo) I've rebased the PR, would you be able to run it on your M3 Ultra and report back?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants