The TranslationConfig class is an immutable Pydantic model that holds all configuration options for a translation task.
Basic Usage
Required Fields
Source language code (e.g., "en", "fr", "ja"). Use standard language codes.
Target language code (e.g., "de", "es", "zh"). Use standard language codes.
LLM model provider. Options:
- "openai" - OpenAI models (GPT-4, etc.)
- "anthropic" - Anthropic models (Claude)
- "ollama" - Local models via Ollama
- "gemini" - Google’s Gemini models
Specific model name within the provider. Examples:
- OpenAI: "gpt-4o", "gpt-4-turbo", "gpt-5-2025-08-07"
- Anthropic: "claude-3-sonnet", "claude-3-opus", "claude-3-5-sonnet"
- Gemini: "gemini-2.5-pro", "gemini-1.5-flash"
- Ollama: "llama3.1", "mistral-small"
Translation algorithm to use:
- "page" - Translate each page independently (fast, no context)
- "sliding-window" - Use overlapping windows (good for continuity)
- "context-aware" - Smart chunking with previous/next context (best quality)
Path to the input document. Supported formats: .pdf, .docx, .txt.
Optional Fields
Path to save the translated output. If None, output is returned but not saved.
UI and Progress Settings
Whether to show detailed progress information during translation.
Callback function to update progress. Receives the number of tokens processed.
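
A progress callback of the kind described above could look like the following minimal sketch. The helper name `make_progress_tracker` is hypothetical, and the sketch assumes each call delivers the tokens processed since the previous call rather than a running total; the source does not say which.

```python
from typing import Callable, Dict, Tuple


def make_progress_tracker(total_tokens: int) -> Tuple[Callable[[int], None], Dict[str, int]]:
    """Build a callback that accumulates token counts (hypothetical helper)."""
    state = {"done": 0}

    def on_progress(tokens: int) -> None:
        # Assumption: `tokens` is the increment since the last call.
        state["done"] += tokens
        pct = 100 * state["done"] / total_tokens
        print(f"{state['done']}/{total_tokens} tokens ({pct:.0f}%)")

    return on_progress, state


# Simulate two progress updates from a translation run.
callback, state = make_progress_tracker(total_tokens=1000)
callback(250)
callback(250)
```

Keeping the running total in a closure means the callback itself still takes a single integer, which matches the description above.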
Cost Control Settings
Maximum cost threshold in USD. Translation will stop if this limit is exceeded. Must be >= 0.
Whether to skip cost and size warnings. Use with caution.
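
One way to picture the cost threshold is a running total that is checked after every request. This is only a sketch: `CostGuard`, `CostLimitExceeded`, and the per-chunk cost figures are hypothetical and do not describe the library's internals.

```python
from typing import Optional


class CostLimitExceeded(RuntimeError):
    """Raised when accumulated cost passes the limit (hypothetical name)."""


class CostGuard:
    """Tracks spend against an optional maximum cost in USD."""

    def __init__(self, max_cost: Optional[float]) -> None:
        # Mirrors the documented constraint: the threshold must be >= 0.
        if max_cost is not None and max_cost < 0:
            raise ValueError("max_cost must be >= 0")
        self.max_cost = max_cost
        self.spent = 0.0

    def add(self, cost_usd: float) -> None:
        self.spent += cost_usd
        if self.max_cost is not None and self.spent > self.max_cost:
            raise CostLimitExceeded(
                f"spent ${self.spent:.2f}, limit ${self.max_cost:.2f}"
            )


guard = CostGuard(max_cost=0.05)
stopped_at = None
for index, chunk_cost in enumerate([0.02, 0.02, 0.02]):  # made-up costs
    try:
        guard.add(chunk_cost)
    except CostLimitExceeded:
        stopped_at = index  # translation would stop at this chunk
        break
```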
Sliding Window Algorithm Settings
Window size in characters for sliding window translation. Must be > 0. Larger windows provide more context but use more tokens per request.
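
How the window size interacts with the overlap setting described next can be sketched as a simple character-based splitter. The function name `sliding_windows` and the parameter name `overlap_size` are assumptions made for illustration; only `window_size` appears in this reference, and the library's actual splitting logic may differ.

```python
from typing import List


def sliding_windows(text: str, window_size: int, overlap_size: int) -> List[str]:
    """Split text into overlapping character windows (illustrative only)."""
    if window_size <= 0:
        raise ValueError("window_size must be > 0")
    if not 0 < overlap_size < window_size:
        raise ValueError("overlap_size must be > 0 and < window_size")
    if not text:
        return []

    step = window_size - overlap_size  # how far each window advances
    windows = []
    start = 0
    while True:
        windows.append(text[start:start + window_size])
        if start + window_size >= len(text):
            break  # this window already reaches the end of the text
        start += step
    return windows


windows = sliding_windows("abcdefghij", window_size=4, overlap_size=2)
# → ['abcd', 'cdef', 'efgh', 'ghij']
```

Each window repeats the last `overlap_size` characters of the previous one, which is what gives the model continuity at the cost of re-sending some text.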
Overlap size in characters between windows. Must be > 0 and < window_size. Overlap helps maintain continuity between windows.
Context-Aware Algorithm Settings
Target chunk size in characters for context-aware translation. Must be > 0. Text is split at natural boundaries (paragraphs, sentences) near this size.
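
Splitting at natural boundaries near a target size can be approximated by packing whole paragraphs greedily. The sketch below is a simplification under stated assumptions: `chunk_by_paragraph` is a hypothetical name, only paragraph breaks (blank lines) are treated as boundaries, and a single paragraph longer than the target is kept whole rather than split at sentences.

```python
from typing import List


def chunk_by_paragraph(text: str, context_size: int) -> List[str]:
    """Greedily pack paragraphs into chunks of roughly context_size characters."""
    if context_size <= 0:
        raise ValueError("context_size must be > 0")

    chunks: List[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > context_size:
            # Adding this paragraph would overshoot, so close the chunk here.
            chunks.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


doc = "First paragraph.\n\nSecond one.\n\nThird paragraph here."
chunks = chunk_by_paragraph(doc, context_size=30)
# → ['First paragraph.\n\nSecond one.', 'Third paragraph here.']
```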
Custom token to split text on for the context-aware algorithm. When provided, context_size is ignored.
Checkpoint Settings
Directory to store translation checkpoints. Enables resuming interrupted translations.
Save checkpoint every N pages/chunks. Must be > 0. Higher values reduce I/O overhead but increase potential re-work if interrupted.
Whether to try resuming from checkpoint if one exists.
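
The trade-off between checkpoint frequency and re-work can be illustrated with a small loop that saves every N chunks and resumes from the last save. Everything here is an assumption for illustration: the function name, the `checkpoint.json` file name, and the JSON format are not taken from the library.

```python
import json
import tempfile
from pathlib import Path


def translate_with_checkpoints(chunks, checkpoint_dir, frequency, translate, resume=True):
    """Translate chunks, saving progress every `frequency` chunks (sketch)."""
    if frequency <= 0:
        raise ValueError("frequency must be > 0")
    path = Path(checkpoint_dir) / "checkpoint.json"  # hypothetical file name
    done = []
    if resume and path.exists():
        done = json.loads(path.read_text(encoding="utf-8"))
    for index in range(len(done), len(chunks)):
        done.append(translate(chunks[index]))
        if len(done) % frequency == 0:
            path.write_text(json.dumps(done), encoding="utf-8")
    return done


with tempfile.TemporaryDirectory() as tmp:
    calls = []

    def flaky(chunk):
        # Fails the first time it sees "c" to simulate an interrupted run.
        calls.append(chunk)
        if chunk == "c" and calls.count("c") == 1:
            raise RuntimeError("simulated interruption")
        return chunk.upper()

    try:
        translate_with_checkpoints(["a", "b", "c", "d"], tmp, 2, flaky)
    except RuntimeError:
        pass  # the first run stops after the checkpoint at chunk 2
    result = translate_with_checkpoints(["a", "b", "c", "d"], tmp, 2, flaky)
```

With a frequency of 2, the second run picks up at the third chunk and does not translate the first two again. A larger frequency would write less often but could lose more completed work on interruption.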
Advanced Features
Enable glossary for consistent term translations. The model will maintain a glossary of key terms and their translations throughout the document. Adds approximately 20% token overhead but improves consistency.
Model reasoning effort level. Higher levels improve translation quality but significantly increase cost and time.
- "minimal" - Fast, cost-effective (default)
- "low" - Slight improvement, moderate cost increase
- "medium" - Better quality, higher cost
- "high" - Best quality, much higher cost
Configuration Behavior
TranslationConfig is immutable (frozen=True). Once created, fields cannot be modified. Create a new config instance to change settings.
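
The copy-instead-of-mutate pattern can be shown with a standard-library stand-in. This sketch uses a frozen dataclass rather than the real Pydantic model, and the class name `SketchConfig`, its field names, and the default value are all illustrative assumptions. On a Pydantic v2 model the analogous call would likely be `model_copy(update={...})`.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class SketchConfig:
    """Stand-in for TranslationConfig; field names are illustrative."""

    source_lang: str
    target_lang: str
    window_size: int = 2000  # hypothetical default


base = SketchConfig(source_lang="en", target_lang="de")

try:
    base.window_size = 4000  # assignment on a frozen instance is rejected
    mutated = True
except FrozenInstanceError:
    mutated = False

# Derive a new instance with one field changed; `base` is left untouched.
updated = replace(base, window_size=4000)
```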