fix(audio): lazy-load audio stages and fix tutorial notebook paths#1835

Open
mohammadaaftabv wants to merge 7 commits into NVIDIA-NeMo:main from mohammadaaftabv:audio-tutorial-improvements

Conversation

@mohammadaaftabv
Contributor

Summary

  • Empty nemo_curator/stages/audio/__init__.py — removes eager imports of all 13 audio stages at package load time. Previously, importing any audio stage transitively pulled in every optional dependency (e.g. onnxruntime for SIGMOSFilterStage), causing ModuleNotFoundError in tutorials that only need a subset. This now follows the same pattern used by the image modality.

  • Fix tutorials/audio/readspeech/pipeline.py — the only internal consumer of the old top-level import. Updated to import AudioDataFilterStage from its specific subpackage (nemo_curator.stages.audio.advanced_pipelines).

  • Fix ALM tutorial notebook — use relative paths (../../../tests/fixtures/...) instead of relying on subprocess/git rev-parse for repo root discovery. Also corrects the API to use ALMManifestReader (composite stage) instead of ALMManifestReaderStage.

  • Fix FLEURS tutorial notebook — use simple relative path (./example_audio/fleurs) for data directory.
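
The failure mode behind the first bullet can be reproduced with a toy package. This is a minimal sketch, not code from this PR: the package names `eager_audio` and `lazy_audio`, the module names, and the missing dependency `not_installed_optional_dep` are all made up for illustration. It only shows why an `__init__.py` that eagerly imports every stage makes a lightweight stage unimportable when one sibling needs an optional dependency.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_package(root: Path, name: str, eager: bool) -> None:
    """Create a toy package: one light stage, one stage needing a missing dependency."""
    pkg = root / name
    pkg.mkdir()
    # Hypothetical "heavy" stage: imports an optional dependency that is not installed.
    (pkg / "heavy_stage.py").write_text("import not_installed_optional_dep\n")
    (pkg / "light_stage.py").write_text("class LightStage:\n    pass\n")
    # Eager variant re-exports every stage from __init__; lazy variant leaves it empty.
    init_source = (
        "from .heavy_stage import *\nfrom .light_stage import LightStage\n" if eager else ""
    )
    (pkg / "__init__.py").write_text(init_source)


def can_import_light_stage(package: str) -> bool:
    """Importing package.light_stage always runs package/__init__.py first."""
    try:
        importlib.import_module(f"{package}.light_stage")
        return True
    except ModuleNotFoundError:
        return False


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_package(root, "eager_audio", eager=True)
    build_package(root, "lazy_audio", eager=False)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        eager_ok = can_import_light_stage("eager_audio")
        lazy_ok = can_import_light_stage("lazy_audio")
    finally:
        sys.path.remove(tmp)

print(f"eager __init__: light stage importable = {eager_ok}")
print(f"empty __init__: light stage importable = {lazy_ok}")
```

With the empty `__init__.py`, only the submodule that is actually imported has its dependencies resolved, which is the behavior this PR relies on.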

Test Plan

  • ALM notebook executes end-to-end via jupyter nbconvert --execute
  • FLEURS notebook executes end-to-end via jupyter nbconvert --execute
  • ALM CLI commands from README verified (default, ray_data, short-window, strict-speakers, aggressive-overlap, directory input)
  • FLEURS CLI commands from README verified (en_us, hy_am)
  • ReadSpeech pipeline verified with UTMOS+VAD filters
  • pytest tests/stages/audio/alm/ -v passes
  • No regressions — existing imports in other tutorials/scripts unaffected

…and standardized sections

- Add decision table, data availability, and system deps tables to top-level audio README
- Create shared README_TEMPLATE.md for consistent tutorial documentation
- Standardize fleurs/README.md: add pipeline flow diagram, full output schema,
  composability section, WER threshold justification, and troubleshooting table

Signed-off-by: aaftaabv@gmail.com <aaftaabv@gmail.com>
…hooting

Signed-off-by: aaftaabv@gmail.com <aaftaabv@gmail.com>
Signed-off-by: aaftaabv@gmail.com <aaftaabv@gmail.com>
Signed-off-by: aaftaabv@gmail.com <aaftaabv@gmail.com>
- Empty nemo_curator/stages/audio/__init__.py to prevent eager imports
  of optional deps (onnxruntime) at package load time, matching the
  image modality pattern.
- Update readspeech/pipeline.py to import from specific subpackage.
- Fix ALM notebook: use relative paths and correct ALMManifestReader API.
- Fix FLEURS notebook: use relative paths for data directory.

Signed-off-by: aaftaabv@gmail.com <aaftaabv@gmail.com>
@mohammadaaftabv mohammadaaftabv requested a review from a team as a code owner April 20, 2026 08:47
@mohammadaaftabv mohammadaaftabv requested review from ayushdg and removed request for a team April 20, 2026 08:47
@copy-pr-bot

copy-pr-bot bot commented Apr 20, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@greptile-apps
Contributor

greptile-apps bot commented Apr 20, 2026

Greptile Summary

This PR fixes two root causes of tutorial breakage: eager imports in nemo_curator/stages/audio/__init__.py that force all optional dependencies (including onnxruntime) to load at package import time, and hard-coded subprocess/git rev-parse calls in notebooks that fail outside the developer's local checkout. The fixes are correct and the notebook execution logs confirm end-to-end success.

  • P1 doc bug: The composability example in tutorials/audio/alm/README.md (lines 561–568) was not updated alongside the notebook; it still references ALMManifestReaderStage(manifest_paths=[...]) — the wrong inner class with a constructor that accepts no arguments.

Confidence Score: 4/5

Safe to merge after fixing the stale composability example in the ALM README.

All runtime fixes (lazy loading, import paths, notebook relative paths) are correct and verified by the embedded execution output. One P1 documentation bug remains: the composability block in README.md uses the wrong class and a non-existent constructor parameter, so users who copy-paste it will hit a TypeError.

tutorials/audio/alm/README.md (lines 561–573 composability example)

Important Files Changed

| Filename | Overview |
|---|---|
| nemo_curator/stages/audio/__init__.py | Removes all eager imports (13 symbols + docstring) to enable lazy-loading, matching the image modality pattern. Breaks existing from nemo_curator.stages.audio import X usage for external consumers. |
| tutorials/audio/readspeech/pipeline.py | Updates AudioDataFilterStage import to the specific subpackage path; all other logic unchanged. Fix is correct. |
| tutorials/audio/alm/alm_tutorial.ipynb | Switches from subprocess/git rev-parse to relative ../../../tests/fixtures/... path; corrects ALMManifestReader API. The composability example in README.md was not updated consistently. |
| tutorials/audio/alm/README.md | Comprehensive docs update, but the composability code block (lines 561–573) still uses ALMManifestReaderStage(manifest_paths=["data.jsonl"]), which is the wrong inner class with a non-existent constructor parameter. |
| tutorials/audio/fleurs/fleurs_tutorial.ipynb | Replaces git-based repo root discovery with os.path.abspath("./example_audio/fleurs"); all imports and pipeline logic are correct. |
| tutorials/audio/README.md | Major expansion: adds quick-start steps, tutorial comparison table, and data-availability table. Content is accurate and matches corrected pipeline behavior. |
| tutorials/audio/fleurs/README.md | Minor updates; CLI examples and parameter tables look accurate and consistent with the notebook. |
| tutorials/audio/README_TEMPLATE.md | Template-only file added as a scaffold for future tutorial authors. No functional impact. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["User imports audio stage\n(e.g. ALMManifestReader)"] --> B{Before PR}
    B --> C["nemo_curator.stages.audio.__init__\neager-loads ALL 13 stages"]
    C --> D["Transitive import of onnxruntime\nSIGMOS, UTMOS, etc."]
    D --> E["ModuleNotFoundError\n(optional deps not installed)"]

    A --> F{After PR}
    F --> G["nemo_curator.stages.audio.__init__\n(empty — license header only)"]
    G --> H["User imports directly from submodule\ne.g. stages.audio.alm.alm_manifest_reader"]
    H --> I["Only that submodule's deps\nare loaded on demand"]
    I --> J["✅ No spurious import errors"]

Reviews (4): Last reviewed commit: "fix(tutorials): use absolute paths in FL..."

…ovements

Signed-off-by: aaftaabv@gmail.com <aaftaabv@gmail.com>
@mohammadaaftabv
Contributor Author

/ok to test dd0986b

Comment on lines +560 to +574
from nemo_curator.pipeline import Pipeline
from nemo_curator.stages.audio.alm.alm_manifest_reader import ALMManifestReaderStage
from nemo_curator.stages.audio.alm.alm_data_builder import ALMDataBuilderStage
from nemo_curator.stages.audio.alm.alm_data_overlap import ALMDataOverlapStage

pipeline = Pipeline(
    name="custom-alm",
    stages=[
        ALMManifestReaderStage(manifest_paths=["data.jsonl"]),
        ALMDataBuilderStage(target_window_duration=120.0),
        ALMDataOverlapStage(overlap_percentage=50),
        # Add downstream stages: sharding, feature extraction, etc.
    ],
)
```
Contributor


P1 Composability example uses wrong class and non-existent parameter

ALMManifestReaderStage is the low-level inner stage that receives a FileGroupTask via its process() method; it takes no constructor arguments at all. Passing manifest_paths=["data.jsonl"] will cause a TypeError at instantiation, and even if it didn't, the pipeline would stall waiting for a FileGroupTask that never arrives. The user-facing composite stage (corrected in the notebook in this very PR) is ALMManifestReader, which accepts manifest_path (singular, str | list[str]).
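
The distinction between the inner stage and the composite stage can be illustrated with toy classes. This is a sketch under stated assumptions, not the NeMo Curator implementation: `InnerReaderStage`, `CompositeReader`, and the `run()` helper are hypothetical stand-ins, and a plain list of paths stands in for a FileGroupTask. It shows only the two behaviors described above: a class without constructor parameters rejects keyword arguments with a TypeError, while a composite wrapper takes the path up front and feeds the inner stage itself.

```python
from __future__ import annotations


class InnerReaderStage:
    """Toy inner stage: no constructor parameters; input arrives through process()."""

    def process(self, task: list[str]) -> list[str]:
        return [f"read:{path}" for path in task]


class CompositeReader:
    """Toy user-facing stage: accepts one path or a list of paths at construction."""

    def __init__(self, manifest_path: str | list[str]) -> None:
        self.paths = [manifest_path] if isinstance(manifest_path, str) else list(manifest_path)

    def run(self) -> list[str]:
        # The composite builds the task and hands it to the inner stage.
        return InnerReaderStage().process(self.paths)


# Instantiating the inner stage with a keyword argument fails immediately.
try:
    InnerReaderStage(manifest_paths=["data.jsonl"])
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

result = CompositeReader(manifest_path="data.jsonl").run()
print(error_name, result)
```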

- FLEURS: use os.path.abspath() for RAW_DATA_DIR so Ray workers resolve
  the same path as the notebook kernel.
- ALM: add RAY_MAX_LIMIT_FROM_API_SERVER env var to avoid Xenna
  monitoring hitting the Ray API server limit.

Signed-off-by: aaftaabv@gmail.com <aaftaabv@gmail.com>
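
The reason for os.path.abspath() in the FLEURS change can be shown without Ray. The sketch below is illustrative only: it simulates a worker by changing the current working directory, which is an assumption about how a worker process might differ from the notebook kernel, not a description of Ray's actual behavior. A relative path resolves against whatever the current directory is, so resolving it once in the notebook and passing the absolute string avoids any disagreement.

```python
import os
import tempfile

RELATIVE_DATA_DIR = "./example_audio/fleurs"

notebook_cwd = os.getcwd()
# Resolve once, in the process that knows the intended base directory.
raw_data_dir = os.path.abspath(RELATIVE_DATA_DIR)

with tempfile.TemporaryDirectory() as worker_cwd:
    os.chdir(worker_cwd)  # simulate a process started in a different directory
    try:
        # The relative path now points somewhere else entirely.
        resolved_in_worker = os.path.abspath(RELATIVE_DATA_DIR)
    finally:
        os.chdir(notebook_cwd)

print("resolved in notebook:", raw_data_dir)
print("resolved in worker:  ", resolved_in_worker)
print("same location:", raw_data_dir == resolved_in_worker)
```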
@mohammadaaftabv mohammadaaftabv requested a review from karpnv April 20, 2026 10:39
@mohammadaaftabv mohammadaaftabv force-pushed the audio-tutorial-improvements branch from eb870af to 0fecc0c Compare April 20, 2026 10:57
Comment on lines +561 to +569
from nemo_curator.stages.audio.alm.alm_manifest_reader import ALMManifestReaderStage
from nemo_curator.stages.audio.alm.alm_data_builder import ALMDataBuilderStage
from nemo_curator.stages.audio.alm.alm_data_overlap import ALMDataOverlapStage

pipeline = Pipeline(
    name="custom-alm",
    stages=[
        ALMManifestReaderStage(manifest_paths=["data.jsonl"]),
        ALMDataBuilderStage(target_window_duration=120.0),
Contributor


P1 Composability example still uses wrong class and parameter

The composability snippet on line 568 instantiates ALMManifestReaderStage(manifest_paths=["data.jsonl"]). ALMManifestReaderStage is the low-level inner stage that expects a FileGroupTask via process() — it accepts no constructor arguments. The notebook was corrected in this PR to use ALMManifestReader(manifest_path=MANIFEST_PATH) (the composite, user-facing stage), but this documentation block was not updated.

Suggested change
- from nemo_curator.stages.audio.alm.alm_manifest_reader import ALMManifestReaderStage
- from nemo_curator.stages.audio.alm.alm_data_builder import ALMDataBuilderStage
- from nemo_curator.stages.audio.alm.alm_data_overlap import ALMDataOverlapStage
- pipeline = Pipeline(
-     name="custom-alm",
-     stages=[
-         ALMManifestReaderStage(manifest_paths=["data.jsonl"]),
-         ALMDataBuilderStage(target_window_duration=120.0),
+ from nemo_curator.pipeline import Pipeline
+ from nemo_curator.stages.audio.alm.alm_manifest_reader import ALMManifestReader
+ from nemo_curator.stages.audio.alm.alm_data_builder import ALMDataBuilderStage
+ from nemo_curator.stages.audio.alm.alm_data_overlap import ALMDataOverlapStage
+ pipeline = Pipeline(
+     name="custom-alm",
+     stages=[
+         ALMManifestReader(manifest_path="data.jsonl"),
+         ALMDataBuilderStage(target_window_duration=120.0),
+         ALMDataOverlapStage(overlap_percentage=50),
+         # Add downstream stages: sharding, feature extraction, etc.
+     ],
+ )

}
],
"source": [
"os.environ[\"RAY_MAX_LIMIT_FROM_API_SERVER\"] = \"100000\"\n",
Contributor


You might be getting the error since you are setting RAY_MAX_LIMIT_FROM_API_SERVER after Curator and Ray have been imported. Maybe setting it in the very first cell of the notebook, before any other imports, might help?
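
The ordering issue raised here can be sketched without importing Ray. The function `read_limit_at_import` is a hypothetical stand-in for a module-level constant that a library evaluates once when it is first imported, and the default of 10000 is a placeholder chosen for the demo, not a value taken from Ray. Assuming the variable is indeed read at import time, setting it afterwards has no effect on the value already captured.

```python
import os

ENV_VAR = "RAY_MAX_LIMIT_FROM_API_SERVER"

# Start from a clean state so the demo is deterministic.
os.environ.pop(ENV_VAR, None)


def read_limit_at_import(default: int = 10_000) -> int:
    """Stand-in for a library constant computed once, at import time."""
    return int(os.environ.get(ENV_VAR, default))


# Wrong order: the "library import" happens first and captures the default.
captured_too_early = read_limit_at_import()
os.environ[ENV_VAR] = "100000"

# Right order: the variable is already set when the "import" runs.
captured_after_set = read_limit_at_import()

print(captured_too_early, captured_after_set)
```

In a notebook, this corresponds to placing the `os.environ[...]` assignment in the very first cell, above every other import.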

I also noticed there isn't a RayClient in this notebook?
