VR200 cfgs, Lm3 70b, 405b, qwen3 30b, 235b, gpt-oss, kimi #3374

malay-nagda wants to merge 4 commits into main from
Conversation
Signed-off-by: Malay Nagda <malayn@nvidia.com>
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here.
📝 Walkthrough

The PR adds VR200 hardware target support for pretraining configurations across four model families: GPT-OSS 120B, Kimi K2, Llama 70B/405B, and Qwen3 30B/235B. For each model, new VR200-specific configuration factory functions are introduced alongside workload base configuration aliases for multiple precision formats.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 4 passed
Actionable comments posted: 4
🧹 Nitpick comments (1)
scripts/performance/configs/kimi/kimi_llm_pretrain.py (1)
Lines 140-177: Consider extracting a shared Kimi pretrain-config builder. This new VR200 function duplicates the same setup path used by other GPU-specific factories. A small internal helper would reduce drift risk between targets.
♻️ Refactor sketch

```python
def _kimi_k2_pretrain_config_for_gpu(
    *,
    gpu: str,
    precision: str = "bf16",
    config_variant: str = "v1",
    optimizer_type: str = "muon",
) -> ConfigContainer:
    base_cfg = get_workload_base_config(
        model_family_name="kimi",
        model_recipe_name="kimi_k2",
        gpu=gpu,
        compute_dtype=precision.upper(),
        task="pretrain",
        config_variant=config_variant,
    )
    cfg = pretrain_config(optimizer_type=optimizer_type)
    cfg.mixed_precision = get_precision_config(precision)
    if base_cfg.moe_flex_dispatcher_backend is not None:
        cfg.model.moe_flex_dispatcher_backend = base_cfg.moe_flex_dispatcher_backend
        apply_flex_dispatcher_backend(cfg.model, cfg.model.moe_flex_dispatcher_backend)
    if base_cfg.pp_layout:
        cfg.model.pipeline_model_parallel_layout = base_cfg.pp_layout
    else:
        cfg.model.pipeline_model_parallel_layout = _get_kimi_k2_pipeline_layout(
            base_cfg.pipeline_model_parallel_size,
            base_cfg.virtual_pipeline_model_parallel_size,
        )
    set_kimi_k2_common_configs(cfg)
    set_workload_base_configs(cfg, base_cfg)
    cfg.comm_overlap.overlap_grad_reduce = True
    return cfg
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/performance/configs/kimi/kimi_llm_pretrain.py` around lines 140 - 177, The kimi_k2_pretrain_config_vr200 function duplicates GPU-specific setup logic found in other factory functions; extract a shared builder (e.g., build_kimi_k2_pretrain_config) that accepts GPU-specific params (gpu name, base_cfg) and performs the common steps: call get_workload_base_config, create cfg via pretrain_config, attach mixed precision via get_precision_config, conditionally set cfg.model.moe_flex_dispatcher_backend and call apply_flex_dispatcher_backend, compute or assign pipeline layout (using _get_kimi_k2_pipeline_layout when base_cfg.pp_layout is empty), call set_kimi_k2_common_configs and set_workload_base_configs, and set cfg.comm_overlap.overlap_grad_reduce; then refactor kimi_k2_pretrain_config_vr200 to call this shared builder with vr200-specific args so code paths in functions like kimi_k2_pretrain_config_vr200, apply_flex_dispatcher_backend, set_kimi_k2_common_configs, and set_workload_base_configs remain consistent.
ℹ️ Review info

- Configuration used: `.coderabbit.yaml`
- Review profile: CHILL
- Plan: Pro Plus
- Run ID: b92e10ce-07f1-4bfa-b898-72b3c33f8bce
📒 Files selected for processing (14)

- scripts/performance/configs/gpt_oss/__init__.py
- scripts/performance/configs/gpt_oss/gpt_oss_llm_pretrain.py
- scripts/performance/configs/gpt_oss/gpt_oss_workload_base_configs.py
- scripts/performance/configs/kimi/__init__.py
- scripts/performance/configs/kimi/kimi_llm_pretrain.py
- scripts/performance/configs/kimi/kimi_workload_base_configs.py
- scripts/performance/configs/llama/__init__.py
- scripts/performance/configs/llama/llama31_llm_pretrain.py
- scripts/performance/configs/llama/llama31_workload_base_configs.py
- scripts/performance/configs/llama/llama3_llm_pretrain.py
- scripts/performance/configs/llama/llama3_workload_base_configs.py
- scripts/performance/configs/qwen/__init__.py
- scripts/performance/configs/qwen/qwen3_llm_pretrain.py
- scripts/performance/configs/qwen/qwen3_workload_base_configs.py
```python
def gpt_oss_120b_pretrain_config_vr200(
    precision: str = "bf16", mock: bool = True, config_variant: str = "v1"
) -> ConfigContainer:
    """VR200, baseline config."""
    base_cfg = get_workload_base_config(
        model_family_name="gpt_oss",
        model_recipe_name="gpt_oss_120b",
        gpu="vr200",
        compute_dtype=precision.upper(),
        task="pretrain",
        config_variant=config_variant,
    )
    precision_config = get_precision_config(precision)

    cfg = gpt_oss_120b_pretrain_config()
    cfg.mixed_precision = precision_config
    if base_cfg.moe_flex_dispatcher_backend is not None:
        apply_flex_dispatcher_backend(cfg.model, base_cfg.moe_flex_dispatcher_backend)
    cfg.comm_overlap = CommOverlapConfig(tp_comm_overlap=bool(base_cfg.tensor_model_parallel_size > 1))
    cfg.comm_overlap.tp_comm_overlap = False if precision == "nvfp4" else cfg.comm_overlap.tp_comm_overlap
    set_gpt_oss_common_configs(cfg)
    set_workload_base_configs(cfg, base_cfg)

    return cfg
```
Default config_variant="v1" breaks VR200 lookup.
At Line 90, the default variant is "v1", but VR200 workload entries in scripts/performance/configs/gpt_oss/gpt_oss_workload_base_configs.py are V2-only. A default call will fail config resolution.
💡 Proposed fix

```diff
 def gpt_oss_120b_pretrain_config_vr200(
-    precision: str = "bf16", mock: bool = True, config_variant: str = "v1"
+    precision: str = "bf16", mock: bool = True, config_variant: str = "v2"
 ) -> ConfigContainer:
```

As per coding guidelines, "Do not add arbitrary defaults for configs; be as explicit as possible."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/performance/configs/gpt_oss/gpt_oss_llm_pretrain.py` around lines 89
- 113, The function gpt_oss_120b_pretrain_config_vr200 currently defaults
config_variant="v1" which fails VR200 workload lookup; update the function
signature in gpt_oss_120b_pretrain_config_vr200 to use config_variant="v2" (or
remove the default and require the caller to pass the explicit variant), and
ensure any internal uses of config_variant (calls to get_workload_base_config)
continue to pass the corrected value so VR200 entries in
gpt_oss_workload_base_configs.py resolve correctly.
```python
from .llama31_llm_pretrain import (
    llama31_405b_pretrain_config_b200,
    llama31_405b_pretrain_config_b300,
    llama31_405b_pretrain_config_gb200,
    llama31_405b_pretrain_config_gb300,
    llama31_405b_pretrain_config_h100,
    llama31_405b_pretrain_config_vr200,
)
```
Fix CI-blocking F401 on llama31_llm_pretrain imports.
These imports are being flagged unused by Flake8 in the current re-export pattern, which will fail lint in CI.
🛠️ Minimal lint fix

```diff
-from .llama31_llm_pretrain import (
+from .llama31_llm_pretrain import (  # noqa: F401
     llama31_405b_pretrain_config_b200,
     llama31_405b_pretrain_config_b300,
     llama31_405b_pretrain_config_gb200,
     llama31_405b_pretrain_config_gb300,
     llama31_405b_pretrain_config_h100,
     llama31_405b_pretrain_config_vr200,
 )
```

🧰 Tools
🪛 Flake8 (7.3.0)

- [error] 36-36: '.llama31_llm_pretrain.llama31_405b_pretrain_config_b200' imported but unused (F401)
- [error] 36-36: '.llama31_llm_pretrain.llama31_405b_pretrain_config_b300' imported but unused (F401)
- [error] 36-36: '.llama31_llm_pretrain.llama31_405b_pretrain_config_gb200' imported but unused (F401)
- [error] 36-36: '.llama31_llm_pretrain.llama31_405b_pretrain_config_gb300' imported but unused (F401)
- [error] 36-36: '.llama31_llm_pretrain.llama31_405b_pretrain_config_h100' imported but unused (F401)
- [error] 36-36: '.llama31_llm_pretrain.llama31_405b_pretrain_config_vr200' imported but unused (F401)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/performance/configs/llama/__init__.py` around lines 36 - 43, The
imported symbols (llama31_405b_pretrain_config_b200,
llama31_405b_pretrain_config_b300, llama31_405b_pretrain_config_gb200,
llama31_405b_pretrain_config_gb300, llama31_405b_pretrain_config_h100,
llama31_405b_pretrain_config_vr200) are flagged as unused (F401); to fix,
explicitly export them by adding an __all__ list that includes each of those
names in the module that currently imports them, or alternatively reference them
in a re-exporting statement so ruff/flake8 recognizes they are intentionally
exposed; update the module's top-level exports accordingly and run ruff
check/format to ensure the lint error is resolved.
```python
def llama3_70b_pretrain_config_vr200(
    precision: str = "bf16", mock: bool = True, config_variant: str = "v1"
) -> ConfigContainer:
    """VR200, baseline config."""
    base_cfg = get_workload_base_config(
        model_family_name="llama",
        model_recipe_name="llama3_70b",
        gpu="vr200",
        compute_dtype=precision.upper(),
        task="pretrain",
        config_variant=config_variant,
    )
    precision_config = get_precision_config(precision)

    if precision == "bf16":
        comm_overlap_cfg = userbuffers_bf16_b200_h8192_tp2_mbs1_seqlen8192
    else:
        comm_overlap_cfg = userbuffers_fp8_b200_h8192_tp2_mbs1_seqlen8192

    cfg = llama3_70b_pretrain_config()
    cfg.mixed_precision = precision_config
    set_llama3_common_configs(cfg)
    set_workload_base_configs(cfg, base_cfg)

    if cfg.ddp.use_megatron_fsdp:
        cfg.ddp.fsdp_double_buffer = True
        cfg.model.gradient_accumulation_fusion = False  # Disabled to avoid functional errors
        cfg.ddp.suggested_communication_unit_size = 800000000

    cfg.comm_overlap.tp_comm_overlap_cfg = comm_overlap_cfg
    cfg.comm_overlap.tp_comm_overlap = False if precision == "nvfp4" else cfg.comm_overlap.tp_comm_overlap

    return cfg
```
Default VR200 config variant should not be v1.
At Line 121, defaulting config_variant to "v1" conflicts with the newly added VR200 workload presets (V2-only), so default calls will fail.
💡 Proposed fix

```diff
 def llama3_70b_pretrain_config_vr200(
-    precision: str = "bf16", mock: bool = True, config_variant: str = "v1"
+    precision: str = "bf16", mock: bool = True, config_variant: str = "v2"
 ) -> ConfigContainer:
```

As per coding guidelines, "Do not add arbitrary defaults for configs; be as explicit as possible."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/performance/configs/llama/llama3_llm_pretrain.py` around lines 120 -
153, The function llama3_70b_pretrain_config_vr200 currently defaults
config_variant="v1", which conflicts with VR200 V2-only presets; remove the
arbitrary default so callers must pass an explicit config_variant (change the
signature to config_variant: str with no default), update the function docstring
to mention config_variant is required, and keep the rest of the logic
(get_workload_base_config(..., config_variant=config_variant)) unchanged so
callers must opt into "v2" or other variants explicitly.
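The "no default" variant of the fix can be sketched like this. This is a hypothetical illustration, not the repository code: the real function returns a `ConfigContainer`, so a plain dict stands in for it here. Making `config_variant` a required keyword-only parameter means a call that omits it fails immediately with a `TypeError` instead of resolving a stale `"v1"` default.

```python
# Hypothetical sketch: config_variant becomes a required keyword-only argument
# so callers must opt into a variant explicitly (no arbitrary default).
def llama3_70b_pretrain_config_vr200(
    *, precision: str = "bf16", mock: bool = True, config_variant: str
) -> dict:
    """VR200 baseline config; config_variant is required (e.g. "v2")."""
    # Stand-in body; the real function builds and returns a ConfigContainer.
    return {"precision": precision, "mock": mock, "config_variant": config_variant}

print(llama3_70b_pretrain_config_vr200(config_variant="v2")["config_variant"])

try:
    llama3_70b_pretrain_config_vr200()  # forgetting the variant fails loudly
except TypeError:
    print("TypeError: config_variant is required")
```

Compared with switching the default to `"v2"`, this shape surfaces the choice at every call site, which matches the cited guideline more closely.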
```python
def llama31_405b_pretrain_config_vr200(
    precision: str = "bf16", mock: bool = True, config_variant: str = "v1"
) -> ConfigContainer:
    """VR200, baseline config."""
    base_cfg = get_workload_base_config(
        model_family_name="llama",
        model_recipe_name="llama31_405b",
        gpu="vr200",
        compute_dtype=precision.upper(),
        task="pretrain",
        config_variant=config_variant,
    )
    precision_config = get_precision_config(precision)

    if precision == "bf16":
        comm_overlap_cfg = userbuffers_bf16_b200_h16384_tp4_cp2_mbs1_seqlen8192
    else:
        comm_overlap_cfg = userbuffers_fp8_b200_h16384_tp4_cp2_mbs1_seqlen8192

    cfg = llama31_405b_pretrain_config()
    cfg.mixed_precision = precision_config
    set_llama31_common_configs(cfg)
    set_workload_base_configs(cfg, base_cfg)

    if cfg.ddp.use_megatron_fsdp:
        cfg.ddp.fsdp_double_buffer = True
        cfg.model.gradient_accumulation_fusion = False  # Disabled to avoid functional errors
        cfg.ddp.num_distributed_optimizer_instances = 2

    cfg.comm_overlap.tp_comm_overlap_cfg = comm_overlap_cfg
    cfg.comm_overlap.tp_comm_overlap = False if precision == "nvfp4" else cfg.comm_overlap.tp_comm_overlap

    return cfg
```
VR200 function default variant is incompatible with available presets.
At Line 116, config_variant defaults to "v1", but VR200 presets for this model are introduced as V2-only. Default calls will fail workload-base lookup.
💡 Proposed fix

```diff
 def llama31_405b_pretrain_config_vr200(
-    precision: str = "bf16", mock: bool = True, config_variant: str = "v1"
+    precision: str = "bf16", mock: bool = True, config_variant: str = "v2"
 ) -> ConfigContainer:
```

As per coding guidelines, "Do not add arbitrary defaults for configs; be as explicit as possible."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/performance/configs/llama/llama31_llm_pretrain.py` around lines 115 -
148, The function llama31_405b_pretrain_config_vr200 currently defaults
config_variant to "v1" which is incompatible with VR200 presets; change the
config_variant default to "v2" (or remove the default and require callers to
pass the variant) and ensure the get_workload_base_config(...) call uses that
corrected value so workload-base lookup succeeds; optionally add a simple
validation in llama31_405b_pretrain_config_vr200 to raise a clear error if an
unsupported variant (e.g., "v1") is passed.
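The optional validation the prompt mentions could look like the sketch below. This is an assumption-laden illustration, not the repository's code: the supported set (`"v2"` only) and the helper name `validate_vr200_config_variant` are taken from the review comment, not from the actual presets table. The benefit is that an unsupported variant fails with a readable `ValueError` at the factory boundary rather than deep inside the workload-base lookup.

```python
# Assumed from the review comment: VR200 presets exist only for "v2".
SUPPORTED_VR200_VARIANTS = {"v2"}

def validate_vr200_config_variant(config_variant: str) -> str:
    """Raise a clear error for variants with no VR200 workload-base entry."""
    if config_variant not in SUPPORTED_VR200_VARIANTS:
        raise ValueError(
            f"Unsupported VR200 config_variant {config_variant!r}; "
            f"expected one of {sorted(SUPPORTED_VR200_VARIANTS)}"
        )
    return config_variant

print(validate_vr200_config_variant("v2"))
```

Called as the first line of `llama31_405b_pretrain_config_vr200`, this would turn a cryptic lookup failure for `"v1"` into an actionable message.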
/claude review
LGTM |
What does this PR do ?
Adds support for the VR200 system for the following models: Llama 3 70B, Llama 3.1 405B, Qwen3 30B-A3B, Qwen3 235B-A22B, GPT-OSS 120B, and Kimi K2.
Changelog
GitHub Actions CI
See the CI section in the Contributing doc for how to trigger the CI. An NVIDIA developer will need to approve and trigger the CI for external contributors.
Before your PR is "Ready for review"
Pre checks:
If you haven't finished some of the above items you can still open "Draft" PR.
Additional Information
Summary by CodeRabbit
New Features