Glossaries ensure consistent translation of key terms throughout your documents. This is essential for technical documentation, brand names, product terminology, and domain-specific jargon.

What Are Glossaries?

A glossary maps terms in the source language to their preferred translations in the target language. When glossary building is enabled, Tinbox detects important terms during translation and adds them to the glossary for consistent use.
Benefits:
  • Consistent terminology across long documents
  • Proper translation of technical terms
  • Preservation of brand names and product terms
  • Reusable terminology across multiple documents

Basic Usage

Enable Glossary

Add the --glossary flag to enable automatic glossary building:
tinbox translate --to es --glossary --model openai:gpt-5-2025-08-07 document.txt
Tinbox will automatically:
  1. Detect important terms as it translates
  2. Build a glossary during translation
  3. Use the glossary to maintain consistency
The glossary builds progressively as translation proceeds. Early chunks establish terms that are used consistently in later chunks.
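
The progressive behaviour described above can be pictured with a toy model. This is not Tinbox's implementation: the sketch assumes each chunk yields a dictionary of newly detected terms and that the first translation recorded for a term is kept for all later chunks. The function name build_glossary and the sample data are hypothetical.

```python
def build_glossary(chunk_terms):
    """Accumulate terms chunk by chunk, keeping the first translation seen."""
    glossary = {}
    for detected in chunk_terms:
        for term, translation in detected.items():
            # setdefault keeps an earlier entry, so later chunks cannot override it.
            glossary.setdefault(term, translation)
    return glossary


# Hypothetical detections from three consecutive chunks.
chunks = [
    {"cache": "caché"},
    {"cache": "memoria caché", "database": "base de datos"},
    {"API": "API"},
]
glossary = build_glossary(chunks)
print(glossary)
```

In this model the second chunk's alternative rendering of "cache" is ignored because the first chunk already fixed the term, which is the kind of consistency a glossary is meant to provide.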

Save Glossary

Save the glossary for reuse with other documents:
tinbox translate --to es \
  --glossary \
  --save-glossary medical_terms.json \
  --model openai:gpt-5-2025-08-07 \
  medical_document.pdf
This creates a JSON file with all detected terms and their translations.

Working with Glossary Files

Glossary File Format

Glossary files use a simple JSON format:
{
  "entries": {
    "CPU": "Processeur",
    "GPU": "Carte graphique",
    "API": "Interface de programmation",
    "database": "base de données",
    "cache": "mémoire cache"
  }
}
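
Before handing a hand-edited file to Tinbox, it may help to check that it has this shape. The following is a minimal sketch based only on the format shown above (a top-level "entries" object mapping strings to strings); Tinbox's own validation may be stricter or looser, and the helper names parse_glossary and load_glossary are hypothetical.

```python
import json
from pathlib import Path


def parse_glossary(text):
    """Return the term mapping from glossary JSON text.

    Raises ValueError on invalid JSON (json.JSONDecodeError is a subclass
    of ValueError) or when the expected structure is missing.
    """
    data = json.loads(text)
    if not isinstance(data, dict) or "entries" not in data:
        raise ValueError('glossary must be a JSON object with an "entries" key')
    entries = data["entries"]
    if not isinstance(entries, dict):
        raise ValueError('"entries" must map source terms to translations')
    for term, translation in entries.items():
        if not isinstance(translation, str):
            raise ValueError(f"translation for {term!r} must be a string")
    return entries


def load_glossary(path):
    """Read a UTF-8 glossary file and validate it."""
    return parse_glossary(Path(path).read_text(encoding="utf-8"))


entries = parse_glossary('{"entries": {"CPU": "Processeur"}}')
print(entries)
```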

Using an Existing Glossary

Load a pre-existing glossary to ensure specific terms are translated consistently:
tinbox translate --to fr \
  --glossary-file company_terms.json \
  --model openai:gpt-5-2025-08-07 \
  company_document.docx
Use existing glossaries for:
  • Company-specific terminology
  • Product names and features
  • Industry-standard terms
  • Previously established translations

Extending a Glossary

Load an existing glossary and save an extended version with new terms:
tinbox translate --to de \
  --glossary-file base_terms.json \
  --save-glossary extended_terms.json \
  --model openai:gpt-5-2025-08-07 \
  document.pdf
This workflow:
  1. Loads base_terms.json as the starting glossary
  2. Detects new terms during translation
  3. Saves the combined glossary to extended_terms.json
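
After such a run it can be useful to review which terms were added. A minimal sketch, assuming the "entries" objects of the base and extended files have already been loaded into dictionaries; the function name new_terms and the sample terms are hypothetical.

```python
def new_terms(base_entries, extended_entries):
    """Return entries present in the extended glossary but not in the base."""
    return {
        term: translation
        for term, translation in extended_entries.items()
        if term not in base_entries
    }


base = {"CPU": "Prozessor"}
extended = {"CPU": "Prozessor", "cache": "Cache"}
added = new_terms(base, extended)
print(added)  # {'cache': 'Cache'}
```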

Use Cases

Technical Documentation

Maintain consistent translation of technical terms:
1. Create Initial Glossary

Translate a representative document and save the glossary:
tinbox translate --to es \
  --glossary \
  --save-glossary tech_terms.json \
  --model openai:gpt-5-2025-08-07 \
  api_documentation.md
2. Use Across All Docs

Apply the glossary to all related documents:
tinbox translate --to es \
  --glossary-file tech_terms.json \
  --model openai:gpt-5-2025-08-07 \
  user_guide.md

tinbox translate --to es \
  --glossary-file tech_terms.json \
  --model openai:gpt-5-2025-08-07 \
  developer_guide.md
3. Update and Expand

Extend the glossary with new terms:
tinbox translate --to es \
  --glossary-file tech_terms.json \
  --save-glossary tech_terms_v2.json \
  --model openai:gpt-5-2025-08-07 \
  advanced_topics.md

Brand Consistency

Ensure brand names and product terms are translated correctly:
{
  "entries": {
    "Tinbox": "Tinbox",
    "CloudSync": "CloudSync",
    "DataVault": "DataVault",
    "user dashboard": "tableau de bord utilisateur",
    "premium tier": "niveau premium"
  }
}
Save this as brand_terms.json and use it:
tinbox translate --to fr \
  --glossary-file brand_terms.json \
  --model openai:gpt-5-2025-08-07 \
  marketing_materials.docx

Medical/Legal Documents

Maintain precise terminology for specialized fields:
# Medical documents
tinbox translate --to es \
  --glossary-file medical_glossary.json \
  --save-glossary medical_glossary_extended.json \
  --model openai:gpt-5-2025-08-07 \
  patient_guide.pdf

# Legal documents
tinbox translate --to de \
  --glossary-file legal_terms.json \
  --model openai:gpt-5-2025-08-07 \
  contract.docx

Combining with Checkpoints

Glossaries can be combined with checkpoints:
tinbox translate --to es \
  --checkpoint-dir ./checkpoints \
  --glossary \
  --save-glossary book_terms.json \
  --model openai:gpt-5-2025-08-07 \
  large_book.txt
What happens:
  1. Glossary builds progressively as translation proceeds
  2. Checkpoints save both translation progress AND glossary state
  3. If interrupted and resumed, glossary state is restored
  4. Final glossary is saved when translation completes
When resuming from a checkpoint, the glossary state is automatically restored, ensuring terminology remains consistent even across interruptions.

Advanced Workflows

Multi-Document Translation Project

# Translate first document and create glossary
tinbox translate --to fr \
  --glossary \
  --save-glossary project_terms.json \
  --model openai:gpt-5-2025-08-07 \
  document_01.pdf
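
The command above covers only the first document. One plausible continuation, using only flags documented on this page, is to pass the saved glossary to each later document and save an extended copy each time. The sketch below only builds the command lines and does not run them. The helper chain_commands, the numbered glossary file names, and document_02.pdf are hypothetical.

```python
import shlex

MODEL = "openai:gpt-5-2025-08-07"


def chain_commands(documents, target_lang, glossary_stem):
    """Build one tinbox command per document, passing the glossary along."""
    commands = []
    previous = None
    for index, document in enumerate(documents, start=1):
        saved = f"{glossary_stem}_{index:02d}.json"
        cmd = ["tinbox", "translate", "--to", target_lang]
        if previous is None:
            # First document: build the glossary from scratch.
            cmd.append("--glossary")
        else:
            # Later documents: start from the glossary saved by the previous run.
            cmd += ["--glossary-file", previous]
        cmd += ["--save-glossary", saved, "--model", MODEL, document]
        commands.append(cmd)
        previous = saved
    return commands


commands = chain_commands(["document_01.pdf", "document_02.pdf"], "fr", "project_terms")
for cmd in commands:
    print(shlex.join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) to execute the translations in order. Saving to a new file per step keeps every intermediate glossary available for review.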

Manual Glossary Creation

You can create glossaries manually for precise control:
{
  "entries": {
    "machine learning": "apprentissage automatique",
    "neural network": "réseau de neurones",
    "deep learning": "apprentissage profond",
    "training data": "données d'entraînement",
    "model": "modèle",
    "inference": "inférence",
    "hyperparameter": "hyperparamètre"
  }
}
Save this as ml_terms.json and use it:
tinbox translate --to fr \
  --glossary-file ml_terms.json \
  --model openai:gpt-5-2025-08-07 \
  ml_paper.pdf
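
If the terms already live in a spreadsheet or in code, the file can also be generated instead of typed by hand. A minimal sketch, assuming the glossary format shown on this page; the helper name glossary_json is hypothetical.

```python
import json


def glossary_json(entries):
    """Serialize a term mapping in the {"entries": {...}} glossary format."""
    # ensure_ascii=False keeps accented characters readable instead of \u escapes.
    return json.dumps({"entries": entries}, ensure_ascii=False, indent=2)


text = glossary_json({
    "neural network": "réseau de neurones",
    "training data": "données d'entraînement",
})
print(text)
# To write the file, for example:
# Path("ml_terms.json").write_text(text, encoding="utf-8")  # needs: from pathlib import Path
```

Writing the file with an explicit UTF-8 encoding avoids platform-dependent defaults corrupting accented characters.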

Best Practices

When to Use Glossaries:
  • Technical documentation with specialized terminology
  • Long documents requiring consistency (books, manuals)
  • Multiple related documents (documentation sets)
  • Brand-sensitive content (marketing, product docs)
  • Regulated content (medical, legal, financial)
When to Skip Glossaries:
  • Short, simple documents
  • One-time translations
  • Creative content where variety is preferred
  • Documents with no specialized terminology
Scenario            Recommendation
Single document     Use --glossary to build consistency within the doc
Related documents   Use --glossary-file with shared glossary
New project         Start with --glossary --save-glossary
Existing glossary   Use --glossary-file and optionally --save-glossary to extend
Large documents     Combine with --checkpoint-dir

Glossary Management Tips

Organize by Domain

Keep separate glossaries for different domains:
glossaries/
├── medical_terms.json
├── legal_terms.json
├── tech_terms.json
├── marketing_terms.json
└── brand_terms.json
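
With several glossaries in play, the same source term can end up with different translations in different files. That may be intentional per domain, but it is worth reviewing. A minimal sketch, assuming each file's "entries" object has been loaded into a dictionary keyed by file name; the function name find_conflicts and the sample entries are hypothetical.

```python
def find_conflicts(glossaries):
    """Map each term to its differing translations across named glossaries."""
    seen = {}
    for name, entries in glossaries.items():
        for term, translation in entries.items():
            seen.setdefault(term, {})[name] = translation
    # Keep only terms that have more than one distinct translation.
    return {
        term: by_file
        for term, by_file in seen.items()
        if len(set(by_file.values())) > 1
    }


conflicts = find_conflicts({
    "tech_terms.json": {"cache": "mémoire cache", "model": "modèle"},
    "marketing_terms.json": {"cache": "cache", "model": "modèle"},
})
print(conflicts)
```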

Version Control

Track glossary changes with version control:
git add glossaries/tech_terms.json
git commit -m "Add new API terminology to tech glossary"

Share Across Teams

Store glossaries in a shared location for team access:
# Use shared glossary
tinbox translate --to es \
  --glossary-file /shared/glossaries/company_terms.json \
  --model openai:gpt-5-2025-08-07 \
  document.pdf

Troubleshooting

Glossary not being applied
  • Verify the glossary file path is correct
  • Check that the JSON is valid
  • Ensure terms match exactly (case-sensitive)
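
Because matching is case-sensitive, a glossary entry such as "API" will not cover "Api" in the source. The sketch below flags terms that appear in a document only with different casing. It uses a plain substring check, which is an assumption for illustration and not necessarily how Tinbox matches terms; the function name case_mismatches is hypothetical.

```python
def case_mismatches(entries, source_text):
    """List glossary terms that occur in the text only with different casing."""
    lowered = source_text.lower()
    return [
        term
        for term in entries
        if term not in source_text and term.lower() in lowered
    ]


entries = {"API": "Interface de programmation", "cache": "mémoire cache"}
mismatched = case_mismatches(entries, "The Api uses a cache.")
print(mismatched)  # ['API']
```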
Terms not being detected automatically
  • Glossary detection improves with model quality
  • Consider manually creating a glossary for critical terms
  • Use higher reasoning effort: --reasoning-effort high
Glossary file errors
  • Validate JSON syntax (use a JSON validator)
  • Ensure the "entries" key exists
  • Check for proper UTF-8 encoding

Next Steps

  • Large Documents: Combine glossaries with large document handling
  • Checkpoints & Resume: Use glossaries with checkpoint functionality
  • Translating PDFs: Apply glossaries to PDF translations
  • CLI Reference: Complete command-line reference
