Custom spelling lets you force specific text replacements in the final transcript. You provide a dictionary where each key is the spelling you want, and each value is a list of variants the model might produce instead. When any of those variants appear in the transcript, they get replaced with no ambiguity and no false positives. This is a straightforward text-based find-and-replace on the transcript output. It doesn’t look at how words sound; it matches the exact strings you list.

How it works

After the transcription is generated, Gladia scans the output text for any of the variant strings you listed as dictionary values. When it finds one, it swaps it for the corresponding key. The matching is deterministic: if the text is there, it gets replaced; if it isn’t, nothing happens. Keys (replacement to write) are case sensitive. Values (variants to find) are not. Values can contain multiple words.
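To make these rules concrete, here is a minimal local sketch of the same kind of find-and-replace. It is not Gladia's implementation: the function name apply_custom_spelling is hypothetical, and the whole-word matching is an assumption, since this page does not say how word boundaries are handled.

```python
import re


def apply_custom_spelling(transcript: str, spelling_dictionary: dict[str, list[str]]) -> str:
    """Replace every listed variant with its key (local illustration only)."""
    # Map each lower-cased variant to the exact key text to write.
    replacements = {
        variant.lower(): key
        for key, variants in spelling_dictionary.items()
        for variant in variants
    }
    if not replacements:
        return transcript
    # Try longer variants first so multi-word variants win over shorter ones.
    ordered = sorted(replacements, key=len, reverse=True)
    # Assumption: variants only match as whole words, enforced by the lookarounds.
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(v) for v in ordered) + r")(?!\w)",
        re.IGNORECASE,  # values are matched case-insensitively
    )
    # Keys are written exactly as given, so their casing is preserved.
    return pattern.sub(
        lambda m: replacements.get(m.group(0).lower(), m.group(0)), transcript
    )


dictionary = {
    "Gorish": ["ghorish", "gaurish", "gaureish"],
    "Data Science": ["data-science", "data science"],
    "SQL": ["sequel"],
}
print(apply_custom_spelling("I met gaurish to talk about Sequel and data-science", dictionary))
# prints: I met Gorish to talk about SQL and Data Science
```

Note that “Sequel” is replaced even though the listed variant is lowercase, while the keys are written with exactly the casing you gave them.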

When to use custom spelling vs. custom vocabulary

Use custom spelling when the transcription already recognizes the word but writes it differently than you want. Common cases:
  • A person’s name comes through as “Gaurish” or “Gaureish” but you need “Gorish”.
  • The model writes “data-science” and you want “Data Science”.
  • You want to replace filler words or normalize punctuation (e.g. “period” → “.”).
Use custom vocabulary when the word comes out completely garbled or replaced by something phonetically similar. The transcription engine has never seen it and can’t get close on its own. Custom vocabulary uses phoneme matching to catch these cases, but it’s probabilistic and can produce false positives.
| | Custom vocabulary | Custom spelling |
|---|---|---|
| What it does | Listens to how a word sounds and replaces phonetically similar words in the transcript | Finds exact text strings in the transcript and replaces them with your preferred spelling |
| Mechanism | Phoneme-based similarity matching (probabilistic) | Text-based find-and-replace (deterministic) |
| Best for | Words that are consistently mis-transcribed: unusual proper nouns, new product names, niche jargon | Words that are recognizable but misspelled, e.g. “Gaurish” → “Gorish”, “data-science” → “Data Science” |
| Risk | Can produce false positives. Unrelated words that happen to sound similar may get replaced | No false positives, but it won’t help if the word isn’t recognized at all |
| Tuning | Adjust intensity and default_intensity to control aggressiveness | None needed. It either matches the text or it doesn’t |
Rule of thumb: start with a transcription run without any custom vocabulary. Look at what the output actually says. If the word appears but is just misspelled, custom spelling is the simpler and safer fix. If the word is completely garbled, that’s when custom vocabulary is the right tool.
If you’ve been using custom vocabulary and keep running into false positives for certain terms, try moving those terms to custom spelling instead. As long as the transcription produces something close enough for you to list as a variant, custom spelling will handle the rest, deterministically and without side effects. This is a common and recommended migration path.

Example configuration

{
  "custom_spelling": true,
  "custom_spelling_config": {
    "spelling_dictionary": {
      "Gorish": ["ghorish", "gaurish", "gaureish"],
      "Data Science": ["data-science", "data science"],
      ".": ["period", "full stop"],
      "SQL": ["sequel"]
    }
  }
}
In this example, any time the model outputs “ghorish”, “gaurish”, or “gaureish”, it will be replaced with “Gorish” in the final transcript. Similarly, “sequel” becomes “SQL”, and spoken words like “period” or “full stop” become a literal “.”.
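If you assemble the request body in code, a small helper can catch malformed dictionaries before anything is sent. The sketch below only builds the JSON. The helper name build_request_body and the audio_url field are assumptions made for illustration and are not defined on this page, so check the API reference for the actual request shape.

```python
import json


def build_request_body(audio_url: str, spelling_dictionary: dict[str, list[str]]) -> dict:
    """Assemble a request body that enables custom spelling (illustrative helper)."""
    for key, variants in spelling_dictionary.items():
        # Each key must map to a list of variant strings, as in the example above.
        if not isinstance(variants, list) or not all(isinstance(v, str) for v in variants):
            raise ValueError(f"variants for {key!r} must be a list of strings")
    return {
        "audio_url": audio_url,  # assumption: field name is not given on this page
        "custom_spelling": True,
        "custom_spelling_config": {"spelling_dictionary": spelling_dictionary},
    }


body = build_request_body(
    "https://example.com/meeting.wav",  # placeholder URL
    {"Gorish": ["ghorish", "gaurish", "gaureish"], "SQL": ["sequel"]},
)
print(json.dumps(body, indent=2))
```

Passing a bare string instead of a list (for example "SQL": "sequel") raises a ValueError here instead of producing a request that might be rejected or misread.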

How to build your spelling dictionary

  1. Run a transcription without custom spelling and review the output.
  2. Identify words that are close but not quite right. These are your candidates.
  3. Add each correct spelling as a key, and list every variant you’ve seen the model produce as values.
  4. Run again and verify the replacements look correct.
You don’t need to guess at variants upfront. Just collect them from real transcription outputs over time and add them to your dictionary as they appear.
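The collection step can be partly automated. The sketch below is one possible approach, not a Gladia feature: it uses Python's difflib to surface transcript words that look similar to a spelling you want, then merges the ones you approve into the dictionary. The function names and the 0.6 similarity cutoff are illustrative assumptions, and the suggestions are only candidates for you to review.

```python
import difflib
import re


def suggest_variants(transcript: str, correct_spelling: str, cutoff: float = 0.6) -> list[str]:
    """Return transcript words that resemble, but do not equal, the correct spelling."""
    words = {w.lower() for w in re.findall(r"[\w'-]+", transcript)}
    words.discard(correct_spelling.lower())  # an exact match needs no replacement
    return difflib.get_close_matches(
        correct_spelling.lower(), sorted(words), n=5, cutoff=cutoff
    )


def add_variants(dictionary: dict[str, list[str]], key: str, variants: list[str]) -> None:
    """Merge new variants under key, skipping case-insensitive duplicates."""
    known = dictionary.setdefault(key, [])
    seen = {v.lower() for v in known}
    for variant in variants:
        if variant.lower() not in seen:
            known.append(variant)
            seen.add(variant.lower())


spelling_dictionary = {"Gorish": ["ghorish"]}
candidates = suggest_variants("Today gaurish and gaureish joined the call", "Gorish")
add_variants(spelling_dictionary, "Gorish", candidates)  # review candidates before merging
```

Because values are matched case-insensitively, add_variants treats “Gaurish” and “gaurish” as the same variant and stores only the first one it sees.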