
fix: clarify and shorten the doc custom spelling and custom vocab #120

Merged
karamouche merged 3 commits into main from fix/custom-vocab-fix
May 22, 2026

Conversation

Contributor

@egenthon-cmd egenthon-cmd commented May 18, 2026

Clarifies and shortens the custom spelling and custom vocabulary docs so the choice between them is obvious:

  • Custom spelling: literal string replacement after transcription (wrong spelling, punctuation).
  • Custom vocabulary: phoneme-based replacement when output is garbled or sound-alike.

Both chapter pages now share the same structure (intro → how it works → when to use the other → examples → parameters → tuning → workflow). The comparison snippet is a short table instead of a long one. Recommended-params examples and pronunciation guidance are updated.
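
To make the "literal string replacement" side of that distinction concrete, here is a minimal Python sketch. It is an illustration only, not Gladia's implementation: the function name `apply_custom_spelling` is hypothetical, matching on word boundaries is an assumption, and case-insensitive matching follows the wording quoted later in this review thread. The dictionary shape (correct term mapped to a list of variants) mirrors the "Gorish" example discussed below.

```python
import re


def apply_custom_spelling(text: str, spelling: dict[str, list[str]]) -> str:
    """Replace every listed variant with its correct term (literal match).

    Hypothetical helper: matching is case-insensitive and limited to whole
    words or phrases, which is an assumption about the real behaviour.
    """
    for correct, variants in spelling.items():
        # Try longer variants first so multi-word entries are not shadowed.
        for variant in sorted(variants, key=len, reverse=True):
            pattern = r"(?<!\w)" + re.escape(variant) + r"(?!\w)"
            # A callable replacement avoids backslash handling in the term.
            text = re.sub(pattern, lambda _m, c=correct: c, text, flags=re.IGNORECASE)
    return text


spelling = {"Gorish": ["ghorish", "gaurish", "gaureish", "geurish", "go rich"]}
print(apply_custom_spelling("I met Ghorish and go rich today", spelling))
```

A literal approach like this only fixes variants that are listed character for character; anything that merely sounds similar is outside its reach, which is the case the description routes to custom vocabulary.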

Summary by CodeRabbit

  • Documentation
    • Reworked custom spelling guidance with clearer intro, "How it works", parameter reference, tuning tips, and a recommended workflow.
    • Clarified custom vocabulary as a phoneme-based post-processing step; restructured parameter fields (value/pronunciations/intensity/default intensity) and added tuning guidance.
    • Simplified the comparison between custom spelling and custom vocabulary with updated examples and best-practice recommendations.


Contributor

coderabbitai Bot commented May 18, 2026

Walkthrough

This PR rewrites and reorganizes docs for custom spelling and custom vocabulary: clarifies mechanisms (literal vs. phoneme-based), flattens custom_vocabulary parameters, expands examples (Gorish/Gladia), and replaces the spelling-dictionary build guide with parameter reference, tuning tips, and a recommended workflow.

Changes

Custom Spelling and Vocabulary Documentation Update

| Layer | File(s) | Summary |
|---|---|---|
| Custom Spelling Documentation Restructure | `chapters/audio-intelligence/custom-spelling.mdx` | Rewrites "How it works" and imports, adds a note routing phonetic/garbled matches to custom vocabulary, expands the Gorish variants in examples, and replaces "How to build" with Parameter reference, Tuning tips, and Recommended workflow. |
| Custom Vocabulary Documentation Rewrite | `chapters/audio-intelligence/custom-vocabulary.mdx` | Replaces opening explanation with phoneme-based post-processing description, clarifies intensity, flattens `custom_vocabulary_config` fields (`vocabulary.value`, `vocabulary.pronunciations`, per-entry `vocabulary.intensity`, `vocabulary.language`, `default_intensity`), and adds Tuning tips plus a revised Recommended workflow. |
| Comparison Snippet & Recommended Params | `snippets/custom-vocabulary-vs-spelling.mdx`, `snippets/recommended-params/custom-vocabulary.mdx` | Condenses comparison into a compact table (Matches on / Best for / You provide), repositions rule-of-thumb, updates best-practice wording about pronunciations, and updates Gladia pronunciations to `["Glad", "Gladio"]` in Pre-recorded and Live examples. |
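
As a rough picture of the flattened `custom_vocabulary_config` fields listed in the summary above, the sketch below builds such a configuration as a Python dictionary and serializes it. The field names come from this walkthrough and the Gladia entry from the review comments; treating `vocabulary` as a list of entries, the `"fr"` language code, the `default_intensity` value, and the Levain pronunciations are assumptions made for illustration and should be checked against the published docs.

```python
import json

# Sketch only: field names are taken from the PR summary, the overall
# nesting and the concrete values marked below are assumptions.
custom_vocabulary_config = {
    "vocabulary": [
        {
            "value": "Gladia",
            "pronunciations": ["Glad", "Gladio"],
            "intensity": 0.5,  # per-entry intensity
        },
        {
            "value": "Levain",
            "pronunciations": ["le vin", "levine"],  # assumed from the review example
            "language": "fr",  # assumed language code
        },
    ],
    "default_intensity": 0.5,  # assumed value, applies when an entry sets none
}

print(json.dumps(custom_vocabulary_config, indent=2, ensure_ascii=False))
```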

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • gladiaio/docs#19: Related updates to custom_vocabulary documentation and usage limits.
  • gladiaio/docs#48: Overlapping edits to custom spelling examples (Gorish variants) and related snippets.
  • gladiaio/docs#111: Prior restructuring of the same custom-spelling/custom-vocabulary docs and snippets.

Suggested reviewers

  • remilejeune2
  • lrossillon-gladia
  • mfernandez-gladia
  • nmorel

Poem

🐰 I hopped through docs with careful cheer,
Spelled Gorish variants far and near.
Phonemes sang Levain’s sweet name,
Pronunciations set the game.
Tuning tips now lead the ear!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely summarizes the main change: clarifying and shortening documentation for custom spelling and custom vocabulary features. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Contributor

mintlify Bot commented May 18, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
|---|---|---|---|
| gladia | 🟢 Ready | View Preview | May 18, 2026, 8:06 PM |


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@chapters/audio-intelligence/custom-spelling.mdx`:
- Around line 13-15: Rewrite the two sentences to tighten wording and fix the
grammatical issues: change "such brand names" to "such as brand names" and
"pronunciations entries" to "pronunciation entries", and rephrase for clarity so
the first sentence reads like "Because speech-to-text models are trained on
general vocabulary, under-represented words—such as brand names, proper nouns,
and domain-specific terms—are often transcribed incorrectly." Then simplify the
second sentence to something like "Custom Spelling is a post‑processing
operation that performs literal matching between correct words and pronunciation
entries and replaces the transcription when the match is sufficiently close."
- Around line 40-42: Fix the invalid JSON in each "Gorish" example by inserting
the missing comma between "gaureish" and "geurish" so the array reads
["ghorish", "gaurish", "gaureish", "geurish", "go rich"]; update all three
occurrences of the "Gorish" array in the document (the three snippets showing
the same key) to ensure valid, parseable JSON.

In `@chapters/audio-intelligence/custom-vocabulary.mdx`:
- Line 46: The sentence mistakenly uses "Levin" and mismatched phonemes; update
the sentence so the target term reads "Levain" everywhere (replace "Levin" with
"Levain") and adjust the phoneme examples to match the configured example above
(replace the phoneme variants 'lɛvɪn', 'le vɪn', 'ləˈvin' with the correct
phoneme forms used for "Levain" in your examples), ensuring the final phrase
says the word "Levain" is replaced with the correct word "Levain" and its
matching phoneme(s).

In `@snippets/custom-vocabulary-vs-spelling.mdx`:
- Around line 1-3: The two lead comparison sentences in
snippets/custom-vocabulary-vs-spelling.mdx are grammatically awkward and should
be rewritten for clarity: replace the first sentence with a concise explanation
that Custom spelling applies literal string matching to correct consistent
near-text matches (example: "data-science" → "Data Science") and note to include
all close variants; replace the second sentence with a concise explanation that
Custom vocabulary applies phoneme-based matching for garbled or sound-alike
output (example: "le vin" / "levine" → "Levain") and note entries should cover
different possible spellings; ensure both sentences use parallel structure and
clear imperative guidance.

In `@snippets/recommended-params/custom-vocabulary.mdx`:
- Line 6: The guidance sentence under "**Add pronunciations**" is malformed;
rewrite it to clearly instruct users to provide pronunciations (IPA) for close
spelling variants — e.g., replace "You can check and Automatic Phonemic
Transcriber (IPA) that all the different spellings are covered." with a clear
sentence referencing "Automatic Phonemic Transcriber (IPA)" that instructs
checking IPA pronunciations for each variant so all spellings are covered.
- Around line 41-44: The JSON example contains a duplicated "value": "Gladia"
key which produces invalid/ambiguous JSON; remove the redundant "value" entry so
the object only has a single "value" field and retain the "pronunciations" and
"intensity" keys (i.e., ensure the object looks like { "value": "Gladia",
"pronunciations": ["Glad","Gladio"], "intensity": 0.5 }).
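
Two of the findings above (the missing comma in the "Gorish" array and the duplicated "value" key in the Gladia entry) can be caught mechanically before publishing. The following is a small sketch using only Python's standard `json` module; the helper names `load_strict` and `check` are hypothetical, and the broken snippets are reconstructions based on the review text rather than copies of the original files. Note that `json.loads` silently keeps the last of several duplicate keys, so a custom `object_pairs_hook` is needed to flag them.

```python
import json


def load_strict(text: str):
    """Parse JSON and reject objects that repeat a key."""

    def reject_duplicates(pairs):
        keys = [key for key, _ in pairs]
        duplicates = {key for key in keys if keys.count(key) > 1}
        if duplicates:
            raise ValueError(f"duplicate keys: {sorted(duplicates)}")
        return dict(pairs)

    return json.loads(text, object_pairs_hook=reject_duplicates)


def check(text: str) -> str:
    """Return "ok" or a short description of why the snippet is invalid."""
    try:
        load_strict(text)
        return "ok"
    except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
        return f"invalid: {exc}"


# Reconstructed examples of the two reported problems, plus the fixed entry.
missing_comma = '{"Gorish": ["ghorish", "gaurish", "gaureish" "geurish", "go rich"]}'
duplicate_key = ('{"value": "Gladia", "value": "Gladia", '
                 '"pronunciations": ["Glad", "Gladio"], "intensity": 0.5}')
fixed = '{"value": "Gladia", "pronunciations": ["Glad", "Gladio"], "intensity": 0.5}'

for name, snippet in [("missing_comma", missing_comma),
                      ("duplicate_key", duplicate_key),
                      ("fixed", fixed)]:
    print(name, "->", check(snippet))
```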

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 05afa612-f165-4725-9183-6cbb1b88ea61

📥 Commits

Reviewing files that changed from the base of the PR and between ec35f7d and 8e7ad64.

📒 Files selected for processing (4)
  • chapters/audio-intelligence/custom-spelling.mdx
  • chapters/audio-intelligence/custom-vocabulary.mdx
  • snippets/custom-vocabulary-vs-spelling.mdx
  • snippets/recommended-params/custom-vocabulary.mdx

Comment thread chapters/audio-intelligence/custom-spelling.mdx Outdated
Comment thread chapters/audio-intelligence/custom-spelling.mdx
Comment thread chapters/audio-intelligence/custom-vocabulary.mdx Outdated
Comment thread snippets/custom-vocabulary-vs-spelling.mdx Outdated
Comment thread snippets/recommended-params/custom-vocabulary.mdx Outdated
Comment thread snippets/recommended-params/custom-vocabulary.mdx Outdated
Comment thread chapters/audio-intelligence/custom-vocabulary.mdx Outdated
Comment thread chapters/audio-intelligence/custom-vocabulary.mdx Outdated
Comment thread chapters/audio-intelligence/custom-vocabulary.mdx
Comment thread chapters/audio-intelligence/custom-spelling.mdx Outdated
Comment thread chapters/audio-intelligence/custom-spelling.mdx
Comment thread chapters/audio-intelligence/custom-vocabulary.mdx
Comment thread snippets/custom-vocabulary-vs-spelling.mdx
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
chapters/audio-intelligence/custom-spelling.mdx (1)

13-15: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix grammar in the introduction.

Two grammar issues remain from a previous review:

  • Line 13: "such brand names" should be "such as brand names"
  • Line 15: "pronunciations entries" should be "pronunciation entries"
📝 Proposed fix
-As Speech-to-text models are trained on general vocabulary, under-represented words such brand names, proper nouns, or domain-specific terms are often transcribed incorrectly.
+As Speech-to-text models are trained on general vocabulary, under-represented words such as brand names, proper nouns, or domain-specific terms are often transcribed incorrectly.

-Custom Spelling is a post-processing operation that applies literal matching between the correct word and the pronunciations entries. When there is a literal match, the transcribed text is replaced with your term.
+Custom Spelling is a post-processing operation that applies literal matching between the correct word and the pronunciation entries. When there is a literal match, the transcribed text is replaced with your term.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@chapters/audio-intelligence/custom-spelling.mdx` around lines 13 - 15, Update
the two grammatical errors in the Custom Spelling intro: change "such brand
names" to "such as brand names" and change "pronunciations entries" to
"pronunciation entries" in the paragraph that starts with "As Speech-to-text
models..." so the sentences read correctly and clearly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@chapters/audio-intelligence/custom-spelling.mdx`:
- Line 94: The sentence "Those strings are **case-insensitive.** and can be
multiple words (e.g. `"full stop"`)." has a stray period that breaks the
sentence flow; update the sentence in custom-spelling.mdx to remove the period
and connect the clauses (e.g., "Those strings are **case-insensitive** and can
be multiple words (e.g. `"full stop"`).") so it reads as a single coherent
sentence.

---

Duplicate comments:
In `@chapters/audio-intelligence/custom-spelling.mdx`:
- Around line 13-15: Update the two grammatical errors in the Custom Spelling
intro: change "such brand names" to "such as brand names" and change
"pronunciations entries" to "pronunciation entries" in the paragraph that starts
with "As Speech-to-text models..." so the sentences read correctly and clearly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9cd560ed-6140-4bbc-b393-0f912c5d537e

📥 Commits

Reviewing files that changed from the base of the PR and between 8e7ad64 and 68b9038.

📒 Files selected for processing (4)
  • chapters/audio-intelligence/custom-spelling.mdx
  • chapters/audio-intelligence/custom-vocabulary.mdx
  • snippets/custom-vocabulary-vs-spelling.mdx
  • snippets/recommended-params/custom-vocabulary.mdx
✅ Files skipped from review due to trivial changes (1)
  • chapters/audio-intelligence/custom-vocabulary.mdx

Comment thread chapters/audio-intelligence/custom-spelling.mdx
@karamouche karamouche self-requested a review May 19, 2026 13:44
Contributor

@karamouche karamouche left a comment


wd!

@karamouche karamouche merged commit f290e0c into main May 22, 2026
8 checks passed
@karamouche karamouche deleted the fix/custom-vocab-fix branch May 22, 2026 10:49

2 participants