Overview
The tinbox translate command translates documents using Large Language Models (LLMs). It supports multiple file formats (PDF, TXT, DOCX, MD), translation algorithms, and cloud or local models.
Basic Usage
Arguments
The input file to translate. Must be an existing file in a supported format (PDF, TXT, DOCX, MD).
Example: ./examples/elara_story.txt
Options
Core Translation Options
Model specification in the format provider:model-name.
Supported providers:
- openai - OpenAI models (requires OPENAI_API_KEY)
- anthropic - Anthropic Claude models (requires ANTHROPIC_API_KEY)
- google - Google Gemini models (requires GOOGLE_API_KEY)
- ollama - Local Ollama models (no API key required)
Examples: openai:gpt-4o, anthropic:claude-3-sonnet, google:gemini-1.5-pro, ollama:mistral-small
Alias: -m

Target language code (ISO 639-1 format).
Examples: en, es, fr, de, zh, ja, ko
Default: en (English)
Alias: -t

Source language code (ISO 639-1 format). If not specified, the language is auto-detected.
Examples: en, es, fr, de, zh, ja
Default: Auto-detect
Alias: -f

The output file path. If not specified, prints the translation to stdout.
Example: --output translated.txt
Alias: -o

Output format for the translation result.
Options:
- text - Plain text output (default)
- json - JSON format with metadata and statistics
- markdown - Markdown formatted output
Default: text
Alias: -F
Algorithm & Processing Options
Translation algorithm to use. Auto-selects based on file type if not specified.
Options:
- page - Process document page-by-page (required for PDFs)
- sliding-window - Use overlapping windows for context
- context-aware - Smart chunking based on content structure
Auto-selection:
- PDF files → page algorithm
- Text files → context-aware algorithm
PDF files require the page algorithm.
Alias: -a

Target chunk size in characters for the context-aware algorithm.
Default: 2000

Custom token to split text on when using the context-aware algorithm.
Example: --split-token "\n\n" (split on double newlines)

DPI (dots per inch) for PDF rasterization. Higher values produce better quality but consume more tokens and increase cost. PDF files only.
Recommended values:
- 150 - Low quality, faster, cheaper
- 200 - Balanced (default)
- 300 - High quality, slower, more expensive
Default: 200
Cost & Safety Options
Estimate cost and tokens without performing the actual translation.
Shows:
- Estimated tokens
- Estimated cost (USD)
- Estimated time
- Cost level (low/medium/high)
- Warnings
Default: false

Maximum cost threshold in USD. The translation will abort if the estimated cost exceeds this value.
Example: --max-cost 5.00

Skip warning confirmations and proceed with translation automatically.
Default: false
Checkpoint & Resume Options
Directory to store translation checkpoints. Enables resuming interrupted translations.
Example: --checkpoint-dir ./checkpoints

Save checkpoint every N pages or chunks.
Default: 1 (save after every page/chunk)
Glossary Options
Enable glossary for consistent term translations across the document.
Default: false

Path to an existing glossary file (JSON format) to load initial terms from.
Example: --glossary-file technical-terms.json

Path to save the updated glossary after translation.
Example: --save-glossary updated-terms.json
Model Configuration
Model reasoning effort level. Higher levels improve translation quality but significantly increase cost and processing time.
Options:
- minimal - Fastest, lowest cost (default)
- low - Slight quality improvement
- medium - Balanced quality and cost
- high - Best quality, highest cost and time
Default: minimal
Output & Logging Options
Show detailed progress information during translation.
Default: false
Global Options
These options are available for all Tinbox commands and must be specified before the command name.

Set the logging level.
Options: DEBUG, INFO, WARNING, ERROR, CRITICAL
Default: INFO
Alias: -l
Example: tinbox --log-level DEBUG translate document.pdf --model openai:gpt-4o --to es

Output logs in JSON format.
Default: false
Alias: -j
Example: tinbox --json translate document.pdf --model openai:gpt-4o --to es

Show version information and exit.
Alias: -v
Example: tinbox --version
Examples
Translate PDF to Spanish
Estimate Cost Before Translation
Use Local Model (Ollama)
Translation with Glossary
Resume Interrupted Translation
High-Quality PDF Translation
Set Maximum Cost Limit
JSON Output with Metadata
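Several of the scenarios named above can be assembled from flags that appear verbatim on this page (--model, --to, --checkpoint-dir, --max-cost, --log-level). The following is a minimal Python sketch that builds those command lines; the helper tinbox_translate is hypothetical and not part of Tinbox, and scenarios whose flag names are not spelled out on this page are deliberately left out.

```python
import shlex


def tinbox_translate(input_file, *, global_opts=(), **options):
    """Build an argv list for `tinbox translate` (hypothetical helper).

    Global options are placed before the command name, as this page requires.
    Keyword names map to long flags: max_cost -> --max-cost.
    """
    argv = ["tinbox", *global_opts, "translate", input_file]
    for name, value in options.items():
        argv += ["--" + name.replace("_", "-"), str(value)]
    return argv


# Translate a PDF to Spanish
pdf_to_spanish = tinbox_translate("document.pdf", model="openai:gpt-4o", to="es")

# Use a local Ollama model (no API key required)
local_model = tinbox_translate("document.pdf", model="ollama:mistral-small", to="es")

# Store checkpoints so an interrupted translation can be resumed
resumable = tinbox_translate(
    "document.pdf", model="openai:gpt-4o", to="es", checkpoint_dir="./checkpoints"
)

# Abort if the estimated cost exceeds 5 USD
capped = tinbox_translate(
    "document.pdf", model="openai:gpt-4o", to="es", max_cost="5.00"
)

# Debug logging: the global option precedes the command name
debug = tinbox_translate(
    "document.pdf",
    global_opts=["--log-level", "DEBUG"],
    model="openai:gpt-4o",
    to="es",
)

for argv in (pdf_to_spanish, local_model, resumable, capped, debug):
    print(shlex.join(argv))
```

Each list can be passed to subprocess.run as-is; shlex.join is used only to print a copy-pasteable command line.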
Translation Workflow
- Language Validation: Source and target language codes are validated
- Cost Estimation: Token count and cost are estimated
- User Confirmation: Warnings are shown if applicable (skip with --force)
- Document Loading: Input file is loaded and processed
- Translation: Content is translated using the specified algorithm
- Progress Tracking: Real-time progress bar shows completion status
- Output Generation: Translated content is written to file or stdout
- Statistics Display: Final metrics (time, tokens, cost) are shown
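The cost estimation and confirmation steps above amount to a gate that runs before any translation work. The sketch below illustrates that gate in Python under stated assumptions: the Estimate type, the should_proceed function, and the confirm callback are all hypothetical names, and the ordering (cost limit checked first, --force affecting only warning prompts) is an assumption rather than documented Tinbox behavior.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Estimate:
    """Hypothetical container for the values a cost estimate reports."""
    tokens: int
    cost_usd: float
    warnings: List[str] = field(default_factory=list)


def should_proceed(
    estimate: Estimate,
    max_cost: Optional[float] = None,
    force: bool = False,
    confirm: Callable[[List[str]], bool] = lambda warnings: False,
) -> bool:
    """Decide whether translation should start.

    Assumption: the --max-cost limit is a hard stop that --force does not
    bypass; --force only skips the warning confirmation.
    """
    if max_cost is not None and estimate.cost_usd > max_cost:
        return False  # estimated cost exceeds the threshold: abort
    if estimate.warnings and not force:
        return confirm(estimate.warnings)  # ask the user to confirm
    return True


over_budget = should_proceed(Estimate(50_000, 6.0), max_cost=5.0, force=True)
forced = should_proceed(Estimate(1_000, 1.0, ["large document"]), force=True)
declined = should_proceed(Estimate(1_000, 1.0, ["large document"]))
clean = should_proceed(Estimate(1_000, 1.0), max_cost=5.0)
```

Passing confirm as a callback keeps the decision logic independent of how the prompt is shown, so the same function works for interactive and scripted runs.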
Error Handling
- Invalid Language Codes: Returns clear error messages for unsupported codes
- Missing API Keys: Reports which environment variable is missing
- File Type Errors: Validates file format compatibility
- Algorithm Conflicts: Prevents using incompatible algorithms (e.g., non-page algorithms with PDFs)
- Cost Limits: Aborts translation if estimated cost exceeds --max-cost
- Failed Pages: Reports which pages failed with specific error messages
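Two of the checks listed above, missing API keys and algorithm conflicts, can be sketched as small pre-flight functions. The provider-to-variable mapping and the algorithm names come from this page; the function names are hypothetical, and treating every non-PDF file as a text file is an assumption made for illustration.

```python
import os
from pathlib import Path

# Provider -> required environment variable (None: no key needed), per this page
PROVIDER_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
    "ollama": None,
}
ALGORITHMS = ("page", "sliding-window", "context-aware")


def parse_model(spec):
    """Split 'provider:model-name' into its two parts."""
    provider, sep, name = spec.partition(":")
    if not sep or not name or provider not in PROVIDER_KEYS:
        raise ValueError(f"expected provider:model-name, got {spec!r}")
    return provider, name


def missing_api_key(spec, env=None):
    """Return the name of the missing environment variable, or None."""
    env = os.environ if env is None else env
    provider, _ = parse_model(spec)
    key = PROVIDER_KEYS[provider]
    return key if key and not env.get(key) else None


def pick_algorithm(path, requested=None):
    """Auto-select an algorithm, or reject one that conflicts with the file type.

    Assumption: any file that is not a PDF is handled like a text file.
    """
    is_pdf = Path(path).suffix.lower() == ".pdf"
    if requested is None:
        return "page" if is_pdf else "context-aware"
    if requested not in ALGORITHMS:
        raise ValueError(f"unknown algorithm: {requested!r}")
    if is_pdf and requested != "page":
        raise ValueError("PDF files require the page algorithm")
    return requested
```

For example, missing_api_key("openai:gpt-4o", env={}) returns "OPENAI_API_KEY", while the same call with an ollama model returns None because no key is required.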
See Also
- tinbox doctor - Diagnostic tool for checking setup
- Translation Algorithms - Detailed algorithm documentation
- Supported Models - Complete model provider list