[skyrl][tinker] Multi-modal Tinker Sampling #1484
Open
nithinvc wants to merge 6 commits into NovaSky-AI:main from
Conversation
pcmoritz pushed a commit that referenced this pull request on Apr 17, 2026
## Summary

Integrates the VLLMRenderer (landed in #1464) into the SkyRL train backend so that VLM training batches include image placeholder tokens and decoded vision tensors (`pixel_values`, `image_grid_thw`).

- When using new inference (`_SKYRL_USE_NEW_INFERENCE`), `_to_training_batch` lazily creates a `VLLMRenderer` and renders all `ModelInput`s through it.
- Extracts `pixel_values` and `image_grid_thw` from rendered outputs and adds them to the `TrainingInputBatch` as `TensorList` entries (one tensor per batch element, since patch counts vary per image).
- Extends `_pad_batch` to handle `TensorList` fields by cycling and cloning entries, matching the existing padding strategy for regular tensors.
- Reorders `forward_backward` and `forward` to call `_to_training_batch` before `_sleep_inference_engines`, since the renderer needs the inference servers to be initialized. Note that this does not wake the KV cache or model GPU memory, since that is done explicitly in `save_weights_for_sampler` via the dispatcher.

## E2E Tinker VLM Classifier Curves

With #1484, we can now run Tinker vision-language recipes against the SkyRL backend.
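The `TensorList` padding strategy described above (cycle through existing entries and clone them until the batch reaches its target size) can be sketched as follows. This is an illustrative sketch, not the actual SkyRL `_pad_batch` implementation; the function name `pad_tensor_list` is hypothetical.

```python
import torch

def pad_tensor_list(tensors: list[torch.Tensor], target_size: int) -> list[torch.Tensor]:
    """Pad a per-element tensor list (e.g. per-image pixel_values, where
    patch counts vary) up to target_size by cycling and cloning entries."""
    if not tensors:
        raise ValueError("cannot pad an empty TensorList")
    padded = list(tensors)
    i = 0
    while len(padded) < target_size:
        # Clone so padded entries do not alias the original tensors.
        padded.append(tensors[i % len(tensors)].clone())
        i += 1
    return padded

# Two images with different patch counts, padded to a batch of 5.
batch = [torch.randn(4, 3), torch.randn(7, 3)]
out = pad_tensor_list(batch, 5)
```

Cycling (rather than repeating one element) keeps the padded batch's patch-count distribution close to the real data, which matters because downstream code sizes vision buffers per element.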
Merging both closes #1200.

Example:

```bash
_SKYRL_USE_NEW_INFERENCE=1 uv run --extra tinker --extra fsdp -m skyrl.tinker.api \
  --base-model "Qwen/Qwen3-VL-8B-Instruct" \
  --backend fsdp \
  --backend-config '{"trainer.placement.policy_num_gpus_per_node": 8, "generator.inference_engine.num_engines": 8, "trainer.placement.colocate_all": true, "trainer.use_sample_packing": false}'
```

Cookbook:

```bash
TINKER_API_KEY=tml-dummy uv run --with tinker --with datasets --with torch python -m \
  tinker_cookbook.recipes.vlm_classifier.train \
  base_url=http://localhost:8000 \
  model_name="Qwen/Qwen3-VL-4B-Instruct" \
  dataset=caltech101
```

Train nll:
<img width="1200" height="675" alt="train_nll" src="https://github.com/user-attachments/assets/82e36767-edee-43b7-ab4a-7fbf496c8cbb" />

Val nll:
<img width="1200" height="675" alt="val_nll" src="https://github.com/user-attachments/assets/1dc6e96b-7e1b-4ead-bf0e-71e42eab0491" />

Val accuracy:
<img width="1200" height="675" alt="accuracy" src="https://github.com/user-attachments/assets/ec6f92b8-a544-42d9-9a00-4c06292e7ae3" />
## Summary

Adds VLM sampling support to the `RemoteInferenceClient` sample endpoint. Finalizes the inference-side changes for #1200.

- Adds `_render_for_sample` to handle both text-only and image-containing prompts. For text-only prompts, it flattens chunk tokens directly. When images are present, it calls `/v1/chat/completions/render` to process images, then splices placeholder tokens into the pre-tokenized text stream with adjusted offsets.
- Updates `sample()` to pass multi-modal features through to the generate payload when present.

## Test plan

- `test_vlm_inference_generation.py`
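The placeholder-splicing step described above (inserting image placeholder runs into a pre-tokenized text stream at adjusted offsets) can be sketched roughly as follows. This is a simplified illustration, not the actual `_render_for_sample` code; the function name, token ids, and offsets are assumptions.

```python
def splice_placeholders(
    text_tokens: list[int],
    placeholders: list[tuple[int, list[int]]],
) -> list[int]:
    """Insert image placeholder token runs into a pre-tokenized text stream.

    placeholders: (offset, placeholder_ids) pairs, where offset is a position
    in the original text token stream at which the run should be spliced in.
    """
    out: list[int] = []
    prev = 0
    for offset, ids in sorted(placeholders, key=lambda p: p[0]):
        out.extend(text_tokens[prev:offset])  # copy text up to the splice point
        out.extend(ids)                       # insert the image placeholder run
        prev = offset
    out.extend(text_tokens[prev:])            # copy the remaining text tokens
    return out

tokens = [1, 2, 3, 4]
# One image whose (hypothetical) placeholder run [901, 901, 901] is spliced
# in after the first two text tokens.
result = splice_placeholders(tokens, [(2, [901, 901, 901])])
# result == [1, 2, 901, 901, 901, 3, 4]
```

Because each inserted run shifts everything after it, real implementations must adjust later offsets accordingly; the sketch sidesteps this by keeping offsets relative to the original text stream.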