Add beta cluster remediation CLI commands #369

Merged
blainekasten merged 15 commits into next from blaine/tcl-6313-add-node-remediation-apis-to-sdk-and-cli
May 19, 2026

Conversation

Contributor

@blainekasten blainekasten commented May 15, 2026

Summary

  • Add tg beta clusters remediations command group with create, list, approve, cancel, and reject commands.
  • Support optional instance_id for list by using the SDK wildcard instance path when omitted.
  • Resolve approve/cancel/reject by remediation ID through SDK list calls before invoking typed SDK action methods.
  • Add remediation CLI help examples and focused CLI tests.
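The resolve-then-act flow from the summary can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Remediation`, `find_remediation`, and `approve` are hypothetical names, and the in-memory list stands in for whatever the SDK list call returns.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Remediation:
    # Hypothetical shape; the real SDK model may carry different fields.
    id: str
    instance_id: str
    status: str


def find_remediation(
    remediations: Iterable[Remediation],
    remediation_id: str,
    instance_id: Optional[str] = None,
) -> Remediation:
    """Look up a remediation by ID, optionally narrowed to one instance.

    Passing instance_id=None mirrors the "all instances" listing; how the
    SDK expresses that wildcard is not shown here.
    """
    for remediation in remediations:
        if instance_id is not None and remediation.instance_id != instance_id:
            continue
        if remediation.id == remediation_id:
            return remediation
    raise LookupError(f"No remediation found with ID {remediation_id!r}")


def approve(remediations: Iterable[Remediation], remediation_id: str) -> Remediation:
    """Resolve first, then act: the action needs the owning instance ID."""
    target = find_remediation(remediations, remediation_id)
    # A real command would call the typed SDK action here, passing
    # target.instance_id and target.id; this sketch only flips the status.
    target.status = "APPROVED"
    return target


listed = [
    Remediation(id="rem-1", instance_id="inst-a", status="PENDING"),
    Remediation(id="rem-2", instance_id="inst-b", status="PENDING"),
]
approved = approve(listed, "rem-2")
```

Resolving through the list first means the user only has to supply the remediation ID; the instance it belongs to is recovered from the listing.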

Tests

  • python3 -m ruff check src/together/lib/cli/__init__.py src/together/lib/cli/api/beta/clusters/remediations tests/cli/test_beta_clusters.py
  • python3 -m pyright src/together/lib/cli/api/beta/clusters/remediations src/together/lib/cli/__init__.py
  • python3 -m pytest tests/cli/test_beta_clusters.py -q

Images

  • method (screenshot)
  • tg beta clusters remediation --help (screenshot)
  • tg beta clusters remediation list (screenshot)
  • tg beta clusters remediation create (screenshot)
  • tg beta clusters remediation retrieve (screenshot)
  • clusters remediation create --help (screenshot)

@blainekasten blainekasten force-pushed the blaine/tcl-6313-add-node-remediation-apis-to-sdk-and-cli branch 2 times, most recently from cb3ea43 to 9444e98 Compare May 18, 2026 18:13
@blainekasten blainekasten force-pushed the blaine/tcl-6313-add-node-remediation-apis-to-sdk-and-cli branch from 9444e98 to 362807a Compare May 18, 2026 18:38
@blainekasten blainekasten marked this pull request as ready for review May 18, 2026 18:38
@blainekasten blainekasten requested a review from zainhas May 18, 2026 18:41

[dim]-[/dim] Manage node remediations:
[primary]tg beta clusters remediations ls <cluster-id>[/primary]
[primary]tg beta clusters remediations create <cluster-id> <instance-id> --mode REMEDIATION_MODE_VM_ONLY[/primary]
Collaborator

Should this be --mode VM_ONLY? As written, the example errors:

▎ $ tg beta clusters remediations create some-cluster some-instance --mode REMEDIATION_MODE_VM_ONLY
▎ Error: Invalid value "REMEDIATION_MODE_VM_ONLY" for --mode. Choose from:
▎ "VM_ONLY", "HOST_AWARE", "EVICT_WITHOUT_REPLACEMENT", "REBOOT_VM".

The BETA_CLUSTERS_REMEDIATIONS_HELP_EXAMPLES block right below uses the short form --mode VM_ONLY, which is what the command actually accepts. Can we update this one to match?
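A small check along these lines could catch this kind of drift between help examples and accepted values. It is only a sketch: `ACCEPTED_MODES` is copied from the error output above, `HELP_TEXT` from the lines under review, and `invalid_modes` is a hypothetical helper rather than anything in the repository.

```python
import re
from typing import List, Set

# Values taken from the CLI error message quoted above.
ACCEPTED_MODES: Set[str] = {
    "VM_ONLY",
    "HOST_AWARE",
    "EVICT_WITHOUT_REPLACEMENT",
    "REBOOT_VM",
}

# The help lines under review, markup included.
HELP_TEXT = """
[primary]tg beta clusters remediations ls <cluster-id>[/primary]
[primary]tg beta clusters remediations create <cluster-id> <instance-id> --mode REMEDIATION_MODE_VM_ONLY[/primary]
"""


def invalid_modes(help_text: str, accepted: Set[str]) -> List[str]:
    """Return every --mode value used in help_text that the CLI would reject."""
    # [A-Z_]+ stops at the closing markup tag, so "[/primary]" is not captured.
    used = re.findall(r"--mode[ =]([A-Z_]+)", help_text)
    return [mode for mode in used if mode not in accepted]


stale = invalid_modes(HELP_TEXT, ACCEPTED_MODES)
fixed = invalid_modes(
    HELP_TEXT.replace("REMEDIATION_MODE_VM_ONLY", "VM_ONLY"), ACCEPTED_MODES
)
```

Here `stale` flags the long-form value, and `fixed` comes back empty once the example uses the short form.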

Comment thread src/together/lib/cli/api/beta/clusters/remediations/create.py
model_dump = model.model_dump()

# Filter out keys that are not in the model
model_dump = {k: v for k, v in model_dump.items() if k in model.model_fields_set}
Collaborator

The model_fields_set filter in model_dump.py also changes behavior for evals retrieve and fine_tuning retrieve: it overrides show_nulls=True. It would be worth scoping that behind an opt-in flag.

_dump_sorted_model is shared by every retrieve command in the CLI, so filtering on model.model_fields_set silently overrides show_nulls=True for all of them.

I reproduced it locally with a model parsed the same way the SDK parses an API response:

▎ class M(BaseModel):
▎     id: str
▎     name: Optional[str] = None
▎     status: Optional[str] = "idle"
▎     extra: Optional[str] = None

▎ m = M.model_validate({"id": "abc", "status": "running"}) # name/extra not in JSON
▎ print_model_dump(m, show_nulls=True)

▎ Output:
▎ Id: abc
▎ Status: running

name and extra are gone even though the caller asked for nulls. That's a behavior change for:

  • evals/retrieve.py:25 — calls print_model_dump(response) with the default show_nulls=True, so any optional field the API doesn't return will quietly stop showing up.
  • fine_tuning/retrieve.py:37 — uses show_nulls=False, less visible but still a contract change.
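One way the opt-in flag might look is sketched below. The names `dump_fields` and `only_set_fields` are hypothetical, not the signature of `_dump_sorted_model`. To keep the sketch runnable with the standard library alone, `StandInModel` mimics only the two pydantic attributes the filter relies on (`model_dump()` and `model_fields_set`); a real pydantic model would be passed the same way.

```python
from typing import Any, Dict, Set


class StandInModel:
    """Stand-in for a parsed pydantic model (hypothetical, for illustration)."""

    def __init__(self, defaults: Dict[str, Any], payload: Dict[str, Any]) -> None:
        # Fields missing from the payload fall back to their defaults.
        self._values = {**defaults, **payload}
        # Only keys present in the payload count as explicitly set.
        self.model_fields_set: Set[str] = set(payload)

    def model_dump(self) -> Dict[str, Any]:
        return dict(self._values)


def dump_fields(
    model: Any,
    show_nulls: bool = True,
    only_set_fields: bool = False,
) -> Dict[str, Any]:
    """Dump a model sorted by key; the set-fields filter is off by default."""
    dumped = model.model_dump()
    if only_set_fields:
        # Opt-in: drop fields that were not present in the parsed payload.
        dumped = {k: v for k, v in dumped.items() if k in model.model_fields_set}
    if not show_nulls:
        dumped = {k: v for k, v in dumped.items() if v is not None}
    return dict(sorted(dumped.items()))


m = StandInModel(
    defaults={"id": None, "name": None, "status": "idle", "extra": None},
    payload={"id": "abc", "status": "running"},
)
default_view = dump_fields(m)  # name and extra survive as None
opt_in_view = dump_fields(m, only_set_fields=True)  # only id and status
```

With the flag defaulting to off, existing callers such as the evals and fine_tuning retrieve commands would keep their current output, and only the remediation commands would need to pass `only_set_fields=True`.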

@blainekasten blainekasten requested a review from zainhas May 19, 2026 19:42
@blainekasten blainekasten merged commit 59ba233 into next May 19, 2026
7 of 9 checks passed