Glossaries ensure consistent translation of key terms throughout your documents. This is essential for technical documentation, brand names, product terminology, and domain-specific jargon.
What Are Glossaries?
A glossary is a mapping of terms from the source language to their preferred translations in the target language. With glossary building enabled, Tinbox detects important terms during translation and adds them to the glossary so that later text reuses the same translations.
Benefits:
Consistent terminology across long documents
Proper translation of technical terms
Preservation of brand names and product terms
Reusable terminology across multiple documents
Basic Usage
Enable Glossary
Add the --glossary flag to enable automatic glossary building:
tinbox translate --to es --glossary --model openai:gpt-5-2025-08-07 document.txt
Tinbox will automatically:
Detect important terms as it translates
Build a glossary during translation
Use the glossary to maintain consistency
The glossary builds progressively as translation proceeds. Early chunks establish terms that are used consistently in later chunks.
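One way to picture progressive building is the toy loop below. It is not Tinbox's implementation: `detect_terms` is a hypothetical stand-in for whatever the model reports for each chunk, and the rule that the first translation seen for a term is kept is an assumption made for illustration.

```python
def build_progressively(chunks, detect_terms):
    """Accumulate a glossary chunk by chunk, keeping the first translation seen."""
    glossary = {}
    for chunk in chunks:
        # The detector receives the glossary built so far, as later chunks would.
        for term, translation in detect_terms(chunk, glossary).items():
            glossary.setdefault(term, translation)
    return glossary


def fake_detector(chunk, glossary_so_far):
    """Hypothetical detector: a fixed lookup standing in for model output."""
    proposals = {
        "chunk 1": {"cache": "Zwischenspeicher"},
        "chunk 2": {"cache": "Cache", "API": "API"},
    }
    return proposals.get(chunk, {})


result = build_progressively(["chunk 1", "chunk 2"], fake_detector)
print(result)  # {'cache': 'Zwischenspeicher', 'API': 'API'}
```

In this sketch the second chunk proposes a different translation for "cache", but the entry established by the first chunk is the one that survives.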
Save Glossary
Save the glossary for reuse with other documents:
tinbox translate --to es \
--glossary \
--save-glossary medical_terms.json \
--model openai:gpt-5-2025-08-07 \
medical_document.pdf
This creates a JSON file with all detected terms and their translations.
Working with Glossary Files
Glossary files use a simple JSON format:
{
"entries" : {
"CPU" : "Processeur" ,
"GPU" : "Carte graphique" ,
"API" : "Interface de programmation" ,
"database" : "base de données" ,
"cache" : "mémoire cache"
}
}
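If you want to read a glossary file in your own tooling, a minimal sketch could look like this. It assumes only the structure shown above (a top-level "entries" object mapping strings to strings); `parse_glossary` is a hypothetical helper, not part of Tinbox.

```python
import json


def parse_glossary(text):
    """Parse glossary JSON text and return its term -> translation mapping."""
    data = json.loads(text)
    entries = data.get("entries") if isinstance(data, dict) else None
    if not isinstance(entries, dict):
        raise ValueError("expected a top-level 'entries' object")
    for term, translation in entries.items():
        if not isinstance(translation, str):
            raise ValueError(f"translation for {term!r} is not a string")
    return entries


sample = '{"entries": {"CPU": "Processeur", "GPU": "Carte graphique"}}'
glossary = parse_glossary(sample)
print(glossary["CPU"])  # Processeur
```

To load a real file, read it as UTF-8 first, for example `parse_glossary(open("company_terms.json", encoding="utf-8").read())`.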
Using an Existing Glossary
Load a pre-existing glossary to ensure specific terms are translated consistently:
tinbox translate --to fr \
--glossary-file company_terms.json \
--model openai:gpt-5-2025-08-07 \
company_document.docx
Use existing glossaries for:
Company-specific terminology
Product names and features
Industry-standard terms
Previously established translations
Extending a Glossary
Load an existing glossary and save an extended version with new terms:
tinbox translate --to de \
--glossary-file base_terms.json \
--save-glossary extended_terms.json \
--model openai:gpt-5-2025-08-07 \
document.pdf
This workflow:
Loads base_terms.json as the starting glossary
Detects new terms during translation
Saves the combined glossary to extended_terms.json
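To review what an extension run added, you can compare the "entries" of the base and extended files yourself. The sketch below is a standalone helper (`diff_entries` is a hypothetical name) that works on already-loaded mappings; it makes no assumption about how Tinbox merges terms internally.

```python
def diff_entries(base, extended):
    """Compare two 'entries' mappings and report what differs between them."""
    return {
        "added": {t: extended[t] for t in extended if t not in base},
        "removed": {t: base[t] for t in base if t not in extended},
        "changed": {
            t: (base[t], extended[t])
            for t in base
            if t in extended and base[t] != extended[t]
        },
    }


base = {"database": "Datenbank"}
extended = {"database": "Datenbank", "cache": "Cache"}
report = diff_entries(base, extended)
print(report["added"])  # {'cache': 'Cache'}
```

An empty "changed" section suggests that existing translations were carried over untouched.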
Use Cases
Technical Documentation
Maintain consistent translation of technical terms:
Create Initial Glossary
Translate a representative document and save the glossary:
tinbox translate --to es \
--glossary \
--save-glossary tech_terms.json \
--model openai:gpt-5-2025-08-07 \
api_documentation.md
Use Across All Docs
Apply the glossary to all related documents:
tinbox translate --to es \
--glossary-file tech_terms.json \
--model openai:gpt-5-2025-08-07 \
user_guide.md
tinbox translate --to es \
--glossary-file tech_terms.json \
--model openai:gpt-5-2025-08-07 \
developer_guide.md
Update and Expand
Extend the glossary with new terms:
tinbox translate --to es \
--glossary-file tech_terms.json \
--save-glossary tech_terms_v2.json \
--model openai:gpt-5-2025-08-07 \
advanced_topics.md
Brand Consistency
Ensure brand names and product terms are translated correctly:
{
"entries" : {
"Tinbox" : "Tinbox" ,
"CloudSync" : "CloudSync" ,
"DataVault" : "DataVault" ,
"user dashboard" : "tableau de bord utilisateur" ,
"premium tier" : "niveau premium"
}
}
tinbox translate --to fr \
--glossary-file brand_terms.json \
--model openai:gpt-5-2025-08-07 \
marketing_materials.docx
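When the list of protected names is long, the identity entries can be generated rather than typed by hand. This is a small sketch under the assumption that mapping a name to itself, as in the example above, is how you keep it unchanged; `brand_glossary` is a hypothetical helper.

```python
import json


def brand_glossary(keep_as_is, translated):
    """Return glossary JSON in which protected names map to themselves."""
    entries = {name: name for name in keep_as_is}
    entries.update(translated)
    # ensure_ascii=False keeps accented characters readable in the output.
    return json.dumps({"entries": entries}, ensure_ascii=False, indent=2)


text = brand_glossary(
    ["Tinbox", "CloudSync", "DataVault"],
    {"premium tier": "niveau premium"},
)
```

Write `text` to brand_terms.json with UTF-8 encoding and pass that file to --glossary-file.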
Medical/Legal Documents
Maintain precise terminology for specialized fields:
# Medical documents
tinbox translate --to es \
--glossary-file medical_glossary.json \
--save-glossary medical_glossary_extended.json \
--model openai:gpt-5-2025-08-07 \
patient_guide.pdf
# Legal documents
tinbox translate --to de \
--glossary-file legal_terms.json \
--model openai:gpt-5-2025-08-07 \
contract.docx
Combining with Checkpoints
Glossaries can be combined with --checkpoint-dir:
tinbox translate --to es \
--checkpoint-dir ./checkpoints \
--glossary \
--save-glossary book_terms.json \
--model openai:gpt-5-2025-08-07 \
large_book.txt
What happens:
Glossary builds progressively as translation proceeds
Checkpoints save both translation progress AND glossary state
If interrupted and resumed, glossary state is restored
Final glossary is saved when translation completes
When resuming from a checkpoint, the glossary state is automatically restored, ensuring terminology remains consistent even across interruptions.
Advanced Workflows
Multi-Document Translation Project
The project runs in three steps: establish a glossary, apply it to all documents, then update the glossary. Step 1, establish the glossary:
# Translate first document and create glossary
tinbox translate --to fr \
--glossary \
--save-glossary project_terms.json \
--model openai:gpt-5-2025-08-07 \
document_01.pdf
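For a larger project, you might script the sequence instead of typing each command. The sketch below only assembles argument lists from flags shown in this guide: the first document builds and saves the glossary, and the remaining documents load it. The file names document_02.pdf and document_03.pdf and the helper names are hypothetical.

```python
import subprocess

MODEL = "openai:gpt-5-2025-08-07"


def translate_command(doc, target, glossary_file=None, save_glossary=None, build=False):
    """Build a `tinbox translate` argument list from flags used in this guide."""
    cmd = ["tinbox", "translate", "--to", target]
    if build:
        cmd.append("--glossary")
    if glossary_file:
        cmd += ["--glossary-file", glossary_file]
    if save_glossary:
        cmd += ["--save-glossary", save_glossary]
    return cmd + ["--model", MODEL, doc]


def project_commands(docs, target, glossary_path):
    """First document establishes the glossary; the rest load it."""
    first, *rest = docs
    cmds = [translate_command(first, target, save_glossary=glossary_path, build=True)]
    cmds += [translate_command(doc, target, glossary_file=glossary_path) for doc in rest]
    return cmds


def run_all(cmds):
    """Run each command in order, stopping at the first failure."""
    for cmd in cmds:
        subprocess.run(cmd, check=True)


commands = project_commands(
    ["document_01.pdf", "document_02.pdf", "document_03.pdf"], "fr", "project_terms.json"
)
for cmd in commands:
    print(" ".join(cmd))
```

Calling `run_all(commands)` executes the translations and requires tinbox to be installed and on your PATH.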
Manual Glossary Creation
You can create glossaries manually for precise control:
{
"entries" : {
"machine learning" : "apprentissage automatique" ,
"neural network" : "réseau de neurones" ,
"deep learning" : "apprentissage profond" ,
"training data" : "données d'entraînement" ,
"model" : "modèle" ,
"inference" : "inférence" ,
"hyperparameter" : "hyperparamètre"
}
}
Save this as ml_terms.json and use it:
tinbox translate --to fr \
--glossary-file ml_terms.json \
--model openai:gpt-5-2025-08-07 \
ml_paper.pdf
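If your terminology already lives in a spreadsheet, you can convert it to the glossary format instead of writing JSON by hand. This sketch assumes a two-column CSV export (term, translation) with no header row; `csv_to_glossary` is a hypothetical helper.

```python
import csv
import io
import json


def csv_to_glossary(csv_text):
    """Convert two-column CSV rows (term, translation) into glossary JSON."""
    entries = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2 or not row[0].strip():
            continue  # skip blank or incomplete rows
        entries[row[0].strip()] = row[1].strip()
    return json.dumps({"entries": entries}, ensure_ascii=False, indent=2)


rows = "machine learning,apprentissage automatique\n\nmodel,modèle\n"
glossary_json = csv_to_glossary(rows)
```

Save `glossary_json` as a UTF-8 file, such as ml_terms.json, before passing it to --glossary-file.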
Best Practices
When to Use Glossaries:
Technical documentation with specialized terminology
Long documents requiring consistency (books, manuals)
Multiple related documents (documentation sets)
Brand-sensitive content (marketing, product docs)
Regulated content (medical, legal, financial)
When to Skip Glossaries:
Short, simple documents
One-time translations
Creative content where variety is preferred
Documents with no specialized terminology
| Scenario | Recommendation |
| --- | --- |
| Single document | Use --glossary to build consistency within the doc |
| Related documents | Use --glossary-file with shared glossary |
| New project | Start with --glossary --save-glossary |
| Existing glossary | Use --glossary-file and optionally --save-glossary to extend |
| Large documents | Combine with --checkpoint-dir |
Glossary Management Tips
Organize by Domain
Keep separate glossaries for different domains:
glossaries/
├── medical_terms.json
├── legal_terms.json
├── tech_terms.json
├── marketing_terms.json
└── brand_terms.json
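When a document spans two domains, you may want a single combined file. The sketch below merges already-loaded "entries" mappings and reports terms that the domain glossaries translate differently. Letting earlier glossaries win is an assumption of this example, and `merge_entries` is a hypothetical helper.

```python
def merge_entries(*glossaries):
    """Merge 'entries' mappings; earlier glossaries win and conflicts are reported."""
    merged = {}
    conflicts = {}
    for entries in glossaries:
        for term, translation in entries.items():
            kept = merged.setdefault(term, translation)
            if kept != translation:
                # Record every competing translation, including the one kept.
                conflicts.setdefault(term, {kept}).add(translation)
    return merged, conflicts


tech = {"model": "Modell", "cache": "Cache"}
marketing = {"model": "Ausführung"}
merged, conflicts = merge_entries(tech, marketing)
```

Here `merged` keeps "Modell" for "model", while `conflicts` lists both candidates so a reviewer can decide which one belongs in the combined glossary.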
Version Control
Track glossary changes with version control:
git add glossaries/tech_terms.json
git commit -m "Add new API terminology to tech glossary"
Share Across Teams
Store glossaries in a shared location for team access:
# Use shared glossary
tinbox translate --to es \
--glossary-file /shared/glossaries/company_terms.json \
--model openai:gpt-5-2025-08-07 \
document.pdf
Troubleshooting
Glossary not being applied
Verify the glossary file path is correct
Check that the JSON format is valid
Ensure terms match exactly (case-sensitive)
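To check the case-sensitivity point, you can compare glossary terms against the source text before translating. This is a rough sketch with a hypothetical helper, `match_report`. It uses plain substring checks, so a short term such as "API" also matches inside longer words.

```python
def match_report(source_text, entries):
    """Classify glossary terms by how they appear in the source text."""
    lowered = source_text.lower()
    report = {"exact": [], "case_mismatch": [], "absent": []}
    for term in entries:
        if term in source_text:
            report["exact"].append(term)
        elif term.lower() in lowered:
            report["case_mismatch"].append(term)
        else:
            report["absent"].append(term)
    return report


text = "The Database layer sits behind the API."
report = match_report(text, {"API": "API", "database": "base", "GPU": "GPU"})
print(report)
# {'exact': ['API'], 'case_mismatch': ['database'], 'absent': ['GPU']}
```

Terms listed under "case_mismatch" are candidates for an extra glossary entry with the capitalization used in the document.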
Terms not being detected automatically
Glossary detection improves with model quality
Consider manually creating a glossary for critical terms
Use higher reasoning effort: --reasoning-effort high
Glossary file errors
Validate JSON syntax (use a JSON validator)
Ensure the "entries" key exists
Check for proper UTF-8 encoding
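The three checks above can be run in one pass over the raw file contents. The following sketch uses only the Python standard library; `diagnose_glossary` is a hypothetical helper, and the messages are illustrative rather than Tinbox's own error output.

```python
import json


def diagnose_glossary(raw):
    """Return a list of problems found in the raw bytes of a glossary file."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8 (byte offset {exc.start})"]
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]
    if not isinstance(data, dict) or "entries" not in data:
        return ['missing top-level "entries" key']
    if not isinstance(data["entries"], dict):
        return ['"entries" must be an object mapping terms to translations']
    return []


print(diagnose_glossary(b'{"terms": {}}'))  # ['missing top-level "entries" key']
```

Read the file in binary mode, for example `open(path, "rb").read()`, so that encoding problems are reported instead of raised while reading.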
Next Steps
Large Documents: Combine glossaries with large document handling
Checkpoints & Resume: Use glossaries with checkpoint functionality
Translating PDFs: Apply glossaries to PDF translations
CLI Reference: Complete command-line reference