fix: clarify and shorten the doc custom spelling and custom vocab #120
📝 Walkthrough
This PR rewrites and reorganizes docs for custom spelling and custom vocabulary: clarifies mechanisms (literal vs. phoneme-based), flattens custom_vocabulary parameters, expands examples (Gorish/Gladia), and replaces the spelling-dictionary build guide with parameter reference, tuning tips, and a recommended workflow.
Changes
Custom Spelling and Vocabulary Documentation Update
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ Passed checks (5 passed)
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@chapters/audio-intelligence/custom-spelling.mdx`:
- Around line 13-15: Rewrite the two sentences to tighten wording and fix the
grammatical issues: change "such brand names" to "such as brand names" and
"pronunciations entries" to "pronunciation entries", and rephrase for clarity so
the first sentence reads like "Because speech-to-text models are trained on
general vocabulary, under-represented words—such as brand names, proper nouns,
and domain-specific terms—are often transcribed incorrectly." Then simplify the
second sentence to something like "Custom Spelling is a post‑processing
operation that performs literal matching between correct words and pronunciation
entries and replaces the transcription when the match is sufficiently close."
- Around line 40-42: Fix the invalid JSON in each "Gorish" example by inserting
the missing comma between "gaureish" and "geurish" so the array reads
["ghorish", "gaurish", "gaureish", "geurish", "go rich"]; update all three
occurrences of the "Gorish" array in the document (the three snippets showing
the same key) to ensure valid, parseable JSON.
In `@chapters/audio-intelligence/custom-vocabulary.mdx`:
- Line 46: The sentence mistakenly uses "Levin" and mismatched phonemes; update
the sentence so the target term reads "Levain" everywhere (replace "Levin" with
"Levain") and adjust the phoneme examples to match the configured example above
(replace the phoneme variants 'lɛvɪn', 'le vɪn', 'ləˈvin' with the correct
phoneme forms used for "Levain" in your examples), ensuring the final phrase
says the word "Levain" is replaced with the correct word "Levain" and its
matching phoneme(s).
In `@snippets/custom-vocabulary-vs-spelling.mdx`:
- Around line 1-3: The two lead comparison sentences in
snippets/custom-vocabulary-vs-spelling.mdx are grammatically awkward and should
be rewritten for clarity: replace the first sentence with a concise explanation
that Custom spelling applies literal string matching to correct consistent
near-text matches (example: "data-science" → "Data Science") and note to include
all close variants; replace the second sentence with a concise explanation that
Custom vocabulary applies phoneme-based matching for garbled or sound-alike
output (example: "le vin" / "levine" → "Levain") and note entries should cover
different possible spellings; ensure both sentences use parallel structure and
clear imperative guidance.
In `@snippets/recommended-params/custom-vocabulary.mdx`:
- Line 6: The guidance sentence under "**Add pronunciations**" is malformed;
rewrite it to clearly instruct users to provide pronunciations (IPA) for close
spelling variants — e.g., replace "You can check and Automatic Phonemic
Transcriber (IPA) that all the different spellings are covered." with a clear
sentence referencing "Automatic Phonemic Transcriber (IPA)" that instructs
checking IPA pronunciations for each variant so all spellings are covered.
- Around line 41-44: The JSON example contains a duplicated "value": "Gladia"
key which produces invalid/ambiguous JSON; remove the redundant "value" entry so
the object only has a single "value" field and retain the "pronunciations" and
"intensity" keys (i.e., ensure the object looks like { "value": "Gladia",
"pronunciations": ["Glad","Gladio"], "intensity": 0.5 }).
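The two JSON findings above (the missing comma in the "Gorish" array and the repeated "value" key in the "Gladia" object) can be checked mechanically before pushing a fix. The following is a minimal sketch, not part of the PR: the helper names `reject_duplicate_keys` and `check_snippet` are hypothetical, and the broken strings are reconstructed from the review text rather than copied from the docs. Note that Python's `json` module accepts duplicate keys by default and keeps the last one, so the duplicate check needs an explicit `object_pairs_hook`.

```python
import json


def reject_duplicate_keys(pairs):
    """object_pairs_hook that raises instead of silently keeping the last duplicate."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


def check_snippet(text):
    """Return None if text parses as JSON with no duplicate keys, else an error message."""
    try:
        json.loads(text, object_pairs_hook=reject_duplicate_keys)
    except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
        return str(exc)
    return None


# Reconstructed from the review comments (assumption: the docs' exact text may differ).
missing_comma = '{"Gorish": ["ghorish", "gaurish", "gaureish" "geurish", "go rich"]}'
fixed_array = '{"Gorish": ["ghorish", "gaurish", "gaureish", "geurish", "go rich"]}'
duplicate_value = (
    '{"value": "Gladia", "value": "Gladia", '
    '"pronunciations": ["Glad", "Gladio"], "intensity": 0.5}'
)
fixed_object = '{"value": "Gladia", "pronunciations": ["Glad", "Gladio"], "intensity": 0.5}'

for name, snippet in [
    ("missing_comma", missing_comma),
    ("fixed_array", fixed_array),
    ("duplicate_value", duplicate_value),
    ("fixed_object", fixed_object),
]:
    error = check_snippet(snippet)
    print(name, "->", "ok" if error is None else f"invalid ({error})")
```

Running a check like this over every fenced JSON example in the four `.mdx` files would catch both classes of problem in one pass.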
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 05afa612-f165-4725-9183-6cbb1b88ea61
📒 Files selected for processing (4)
- chapters/audio-intelligence/custom-spelling.mdx
- chapters/audio-intelligence/custom-vocabulary.mdx
- snippets/custom-vocabulary-vs-spelling.mdx
- snippets/recommended-params/custom-vocabulary.mdx
Actionable comments posted: 1
♻️ Duplicate comments (1)
chapters/audio-intelligence/custom-spelling.mdx (1)
13-15: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Fix grammar in the introduction.
Two grammar issues remain from a previous review:
- Line 13: "such brand names" should be "such as brand names"
- Line 15: "pronunciations entries" should be "pronunciation entries"
📝 Proposed fix
-As Speech-to-text models are trained on general vocabulary, under-represented words such brand names, proper nouns, or domain-specific terms are often transcribed incorrectly.
+As Speech-to-text models are trained on general vocabulary, under-represented words such as brand names, proper nouns, or domain-specific terms are often transcribed incorrectly.
-Custom Spelling is a post-processing operation that applies literal matching between the correct word and the pronunciations entries. When there is a literal match, the transcribed text is replaced with your term.
+Custom Spelling is a post-processing operation that applies literal matching between the correct word and the pronunciation entries. When there is a literal match, the transcribed text is replaced with your term.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@chapters/audio-intelligence/custom-spelling.mdx` around lines 13 - 15, Update the two grammatical errors in the Custom Spelling intro: change "such brand names" to "such as brand names" and change "pronunciations entries" to "pronunciation entries" in the paragraph that starts with "As Speech-to-text models..." so the sentences read correctly and clearly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@chapters/audio-intelligence/custom-spelling.mdx`:
- Line 94: The sentence "Those strings are **case-insensitive.** and can be
multiple words (e.g. `"full stop"`)." has a stray period that breaks the
sentence flow; update the sentence in custom-spelling.mdx to remove the period
and connect the clauses (e.g., "Those strings are **case-insensitive** and can
be multiple words (e.g. `"full stop"`).") so it reads as a single coherent
sentence.
---
Duplicate comments:
In `@chapters/audio-intelligence/custom-spelling.mdx`:
- Around line 13-15: Update the two grammatical errors in the Custom Spelling
intro: change "such brand names" to "such as brand names" and change
"pronunciations entries" to "pronunciation entries" in the paragraph that starts
with "As Speech-to-text models..." so the sentences read correctly and clearly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9cd560ed-6140-4bbc-b393-0f912c5d537e
📒 Files selected for processing (4)
- chapters/audio-intelligence/custom-spelling.mdx
- chapters/audio-intelligence/custom-vocabulary.mdx
- snippets/custom-vocabulary-vs-spelling.mdx
- snippets/recommended-params/custom-vocabulary.mdx
✅ Files skipped from review due to trivial changes (1)
- chapters/audio-intelligence/custom-vocabulary.mdx
Clarifies and shortens the custom spelling and custom vocabulary docs so the choice between them is obvious:
Custom spelling — literal string replacement after transcription (wrong spelling, punctuation).
Custom vocabulary — phoneme-based replacement when output is garbled or sound-alike.
Both chapter pages now share the same structure (intro → how it works → when to use the other → examples → parameters → tuning → workflow). The comparison snippet is a short table instead of a long one. Recommended-params examples and pronunciation guidance are updated.
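To make the "literal string replacement after transcription" behavior concrete, here is a rough sketch of what such a post-processing step could look like. It is an illustration only and not Gladia's implementation: the function name `apply_custom_spelling` is hypothetical, and whole-word matching is an assumption added here. The dictionary shape (correct term mapped to a list of variants), the case-insensitivity, and the support for multi-word entries come from the review comments above.

```python
import re


def apply_custom_spelling(text, spelling):
    """Replace each listed variant with its correct term.

    Matching is literal and case-insensitive; the whole-word boundary
    check is an assumption of this sketch, not documented behavior.
    """
    for correct, variants in spelling.items():
        # Try longer variants first so multi-word entries are not
        # pre-empted by shorter overlapping ones.
        for variant in sorted(variants, key=len, reverse=True):
            pattern = r"(?<!\w)" + re.escape(variant) + r"(?!\w)"
            # A function replacement avoids backslash handling in the term.
            text = re.sub(pattern, lambda _m, c=correct: c, text, flags=re.IGNORECASE)
    return text


spelling = {"Gorish": ["ghorish", "gaurish", "gaureish", "geurish", "go rich"]}
print(apply_custom_spelling("We met Ghorish and then go rich again.", spelling))
# prints: We met Gorish and then Gorish again.
```

Because the match is purely textual, a variant that never appears verbatim in the transcript is never replaced, which is why the docs recommend listing all close variants and falling back to custom vocabulary for sound-alike output.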
Summary by CodeRabbit