tinbox translate

Overview

The tinbox translate command translates documents using Large Language Models (LLMs). It supports several file formats (PDF, TXT, DOCX, MD), three translation algorithms, and both cloud-hosted and local models.

Basic Usage

tinbox translate <input-file> --model <provider:model> --to <target-lang>

tinbox translate document.pdf --model openai:gpt-4o --to es

Arguments

input-file

Path

required

The input file to translate. Must be an existing file in a supported format (PDF, TXT, DOCX, MD).

Example: ./examples/elara_story.txt

Options

Core Translation Options

--model

string

required

Model specification in the format provider:model-name.

Supported providers:

openai - OpenAI models (requires OPENAI_API_KEY)
anthropic - Anthropic Claude models (requires ANTHROPIC_API_KEY)
google - Google Gemini models (requires GOOGLE_API_KEY)
ollama - Local Ollama models (no API key required)

Examples:

openai:gpt-4o
anthropic:claude-3-sonnet
google:gemini-1.5-pro
ollama:mistral-small

Alias: -m
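
As a pre-flight check, the provider-to-key mapping above can be sketched in shell. The helpers `required_key_for` and `check_model_key` are hypothetical names introduced for illustration, not part of Tinbox. The sketch only inspects the environment and never calls the CLI.

```shell
# Hypothetical helper: print the environment variable a provider needs,
# following the provider list above. Prints nothing for ollama.
required_key_for() {
  provider=${1%%:*}   # text before the first colon, e.g. "openai"
  case "$provider" in
    openai)    echo "OPENAI_API_KEY" ;;
    anthropic) echo "ANTHROPIC_API_KEY" ;;
    google)    echo "GOOGLE_API_KEY" ;;
    ollama)    : ;;   # local models need no API key
    *)         echo "unknown provider: $provider" >&2; return 1 ;;
  esac
}

# Hypothetical helper: succeed only if the key for the given model spec is set.
check_model_key() {
  key=$(required_key_for "$1") || return 1
  [ -z "$key" ] && return 0
  eval "value=\${$key:-}"
  if [ -z "$value" ]; then
    echo "missing environment variable: $key" >&2
    return 1
  fi
}
```

Calling `check_model_key openai:gpt-4o` before a long run reports a missing key early, without starting a translation.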

--to

string

default:"en"

Target language code (ISO 639-1 format).

Examples: en, es, fr, de, zh, ja, ko

Default: en (English)

Alias: -t

--from

string

default:"auto"

Source language code (ISO 639-1 format). If not specified, the language is auto-detected.

Examples: en, es, fr, de, zh, ja

Default: Auto-detect

Alias: -f

--output

Path

The output file path. If not specified, prints the translation to stdout.

Example: --output translated.txt

Alias: -o

--format

enum

default:"text"

Output format for the translation result.

Options:

text - Plain text output (default)
json - JSON format with metadata and statistics
markdown - Markdown formatted output

Default: text

Alias: -F

Algorithm & Processing Options

--algorithm

string

default:"auto"

Translation algorithm to use. Auto-selects based on file type if not specified.

Options:

page - Process document page-by-page (required for PDFs)
sliding-window - Use overlapping windows for context
context-aware - Smart chunking based on content structure

Auto-selection:

PDF files → page algorithm
Text files → context-aware algorithm

Note: PDF files only support the page algorithm.

Alias: -a
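
The auto-selection rule can be sketched as a small shell function. `pick_algorithm` is a hypothetical helper, and treating every non-PDF input as a "text file" is an assumption of this sketch, not documented Tinbox behavior.

```shell
# Hypothetical helper mirroring the auto-selection rules above.
# Assumption: every non-PDF input is treated as a text file.
pick_algorithm() {
  case "$1" in
    *.pdf|*.PDF) echo "page" ;;
    *)           echo "context-aware" ;;
  esac
}

pick_algorithm document.pdf   # prints: page
pick_algorithm story.txt      # prints: context-aware
```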

--context-size

integer

default:"2000"

Target chunk size in characters for the context-aware algorithm.

Default: 2000

--split-token

string

Custom token to split text on when using the context-aware algorithm.

Example: --split-token "\n\n" (split on double newlines)
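
To give a feel for how a target chunk size and a split token interact, here is a rough sketch that greedily packs blank-line-separated paragraphs into chunks. It is not Tinbox's actual chunking logic. The function name `chunk_paragraphs`, the greedy strategy, and the `-----` separator are all assumptions made for illustration.

```shell
# Hypothetical sketch: pack paragraphs (split on blank lines) into chunks of
# at most `size` characters. A single oversized paragraph is kept whole.
# Note: some awk implementations count bytes rather than characters in length().
chunk_paragraphs() {
  awk -v size="$1" '
    BEGIN { RS = ""; chunk = "" }   # RS = "" makes each paragraph one record
    {
      candidate = (chunk == "") ? $0 : chunk "\n\n" $0
      if (chunk != "" && length(candidate) > size) {
        print chunk                 # emit the finished chunk
        print "-----"               # visual separator between chunks
        chunk = $0
      } else {
        chunk = candidate
      }
    }
    END { if (chunk != "") print chunk }
  '
}

# Three 4-character paragraphs with a 10-character target: two chunks.
printf 'aaaa\n\nbbbb\n\ncccc\n' | chunk_paragraphs 10
```

A larger target size yields fewer, longer chunks, which means fewer model calls but more text per call.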

--pdf-dpi

integer

default:"200"

DPI (dots per inch) for PDF rasterization. Higher values produce better quality but consume more tokens and increase cost.

PDF files only.

Recommended values:

150 - Low quality, faster, cheaper
200 - Balanced (default)
300 - High quality, slower, more expensive

Default: 200
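
A quick calculation shows why DPI affects cost: pixel dimensions scale linearly with DPI, so the pixel count scales with its square. The sketch below assumes a US Letter page (8.5 x 11 inches). `page_pixels` is a hypothetical helper, and how pixels map to tokens depends on the model.

```shell
# Pixel dimensions of a rasterized page: inches * DPI.
# Assumption: a US Letter page (8.5 x 11 in); real PDF page sizes vary.
page_pixels() {
  dpi=$1
  width=$(( 85 * dpi / 10 ))   # 8.5 in, kept in integer arithmetic
  height=$(( 11 * dpi ))
  echo "${width}x${height} = $(( width * height )) pixels"
}

page_pixels 150   # prints: 1275x1650 = 2103750 pixels
page_pixels 200   # prints: 1700x2200 = 3740000 pixels
page_pixels 300   # prints: 2550x3300 = 8415000 pixels
```

Doubling the DPI from 150 to 300 quadruples the number of pixels per page.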

Cost & Safety Options

--dry-run

boolean

default:"false"

Estimate cost and tokens without performing the actual translation.

Shows:

Estimated tokens
Estimated cost (USD)
Estimated time
Cost level (low/medium/high)
Warnings

Default: false

--max-cost

float

Maximum cost threshold in USD. The translation will abort if the estimated cost exceeds this value.

Example: --max-cost 5.00
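
The threshold rule is a plain comparison: an estimate above the limit aborts, while an estimate equal to or below it proceeds. A minimal sketch of that check, using a hypothetical `within_budget` helper (awk is used because POSIX `test` cannot compare decimals):

```shell
# Hypothetical helper: succeed if the estimated cost does not exceed the limit.
within_budget() {
  awk -v est="$1" -v max="$2" 'BEGIN { exit !(est + 0 <= max + 0) }'
}

if within_budget 4.20 5.00; then echo "proceed"; else echo "abort"; fi   # prints: proceed
if within_budget 5.01 5.00; then echo "proceed"; else echo "abort"; fi   # prints: abort
```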

--force

boolean

default:"false"

Skip warning confirmations and proceed with translation automatically.

Default: false

Checkpoint & Resume Options

--checkpoint-dir

Path

Directory to store translation checkpoints. Enables resuming interrupted translations.

Example: --checkpoint-dir ./checkpoints

--checkpoint-frequency

integer

default:"1"

Save checkpoint every N pages or chunks.

Default: 1 (save after every page/chunk)
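
To see what a given frequency means in practice, the sketch below lists the page or chunk numbers after which a checkpoint would be written. `checkpoint_points` is a hypothetical helper, and counting units from 1 is an assumption of the sketch.

```shell
# Hypothetical sketch: list the pages/chunks after which a checkpoint is saved
# when saving every N units. Assumption: units are counted from 1.
checkpoint_points() {
  total=$1; every=$2
  [ "$every" -ge 1 ] || return 1   # guard against a zero or negative step
  i=$every; out=""
  while [ "$i" -le "$total" ]; do
    out="${out:+$out }$i"          # append with a single space separator
    i=$(( i + every ))
  done
  echo "$out"
}

checkpoint_points 10 1   # prints: 1 2 3 4 5 6 7 8 9 10
checkpoint_points 10 4   # prints: 4 8
```

A higher frequency value writes fewer checkpoints, so more work may need to be repeated after an interruption.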

Glossary Options

--glossary

boolean

default:"false"

Enable glossary for consistent term translations across the document.

Default: false

--glossary-file

Path

Path to an existing glossary file (JSON format) to load initial terms from.

Example: --glossary-file technical-terms.json

--save-glossary

Path

Path to save the updated glossary after translation.

Example: --save-glossary updated-terms.json

Model Configuration

--reasoning-effort

enum

default:"minimal"

Model reasoning effort level. Higher levels improve translation quality but significantly increase cost and processing time.

Options:

minimal - Fastest, lowest cost (default)
low - Slight quality improvement
medium - Balanced quality and cost
high - Best quality, highest cost and time

Default: minimal

Output & Logging Options

--verbose

boolean

default:"false"

Show detailed progress information during translation.

Default: false

Global Options

These options are available for all Tinbox commands and must be specified before the command name.

--log-level

string

default:"INFO"

Set the logging level.

Options: DEBUG, INFO, WARNING, ERROR, CRITICAL

Default: INFO

Alias: -l

Example: tinbox --log-level DEBUG translate document.pdf --model openai:gpt-4o --to es

--json

boolean

default:"false"

Output logs in JSON format.

Default: false

Alias: -j

Example: tinbox --json translate document.pdf --model openai:gpt-4o --to es

--version

boolean

Show version information and exit.

Alias: -v

Example: tinbox --version

Examples

Translate PDF to Spanish

tinbox translate document.pdf --model openai:gpt-4o --to es --output document_es.txt

Estimate Cost Before Translation

tinbox translate large-book.pdf --model anthropic:claude-3-sonnet --to fr --dry-run

Use Local Model (Ollama)

tinbox translate story.txt --model ollama:mistral-small --to de --output story_de.txt

Translation with Glossary

tinbox translate technical-manual.md \
  --model openai:gpt-4o \
  --to ja \
  --glossary \
  --glossary-file existing-terms.json \
  --save-glossary updated-terms.json

Resume Interrupted Translation

# First run (gets interrupted)
tinbox translate long-document.pdf \
  --model anthropic:claude-3-sonnet \
  --to zh \
  --checkpoint-dir ./checkpoints

# Resume from checkpoint
tinbox translate long-document.pdf \
  --model anthropic:claude-3-sonnet \
  --to zh \
  --checkpoint-dir ./checkpoints
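
Because resuming is just rerunning the same command, the pattern can be wrapped in a retry loop. The `retry` function below is a hypothetical helper, and it assumes the command exits with a non-zero status when interrupted or failed, which this page does not confirm.

```shell
# Hypothetical wrapper: rerun the same command until it succeeds or the
# attempt limit is reached.
# Assumption: the command exits non-zero when it is interrupted or fails.
retry() {
  max_attempts=$1; shift
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    "$@" && return 0
    echo "attempt $attempt failed" >&2
    attempt=$(( attempt + 1 ))
  done
  return 1
}

# Usage:
#   retry 3 tinbox translate long-document.pdf \
#     --model anthropic:claude-3-sonnet --to zh --checkpoint-dir ./checkpoints
```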

High-Quality PDF Translation

tinbox translate scanned-document.pdf \
  --model openai:gpt-4o \
  --to es \
  --pdf-dpi 300 \
  --reasoning-effort medium

Set Maximum Cost Limit

tinbox translate document.pdf \
  --model openai:gpt-4o \
  --to fr \
  --max-cost 10.00

JSON Output with Metadata

tinbox translate report.txt \
  --model anthropic:claude-3-sonnet \
  --to de \
  --format json \
  --output report_de.json

Translation Workflow

1. Language Validation: Source and target language codes are validated
2. Cost Estimation: Token count and cost are estimated
3. User Confirmation: Warnings are shown if applicable (skip with --force)
4. Document Loading: Input file is loaded and processed
5. Translation: Content is translated using the specified algorithm
6. Progress Tracking: Real-time progress bar shows completion status
7. Output Generation: Translated content is written to file or stdout
8. Statistics Display: Final metrics (time, tokens, cost) are shown
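
For scripted use, the estimation step can be made explicit by running a dry run first and then the real translation with a cost cap. The wrapper below is a sketch. `estimate_then_translate` and the `TINBOX_BIN` variable are assumptions introduced here, not Tinbox features, and the sketch assumes a failed dry run exits non-zero.

```shell
# Hypothetical wrapper: estimate first, then translate with a cost cap.
# TINBOX_BIN is an assumption of this sketch (not a Tinbox setting); it lets
# the binary be swapped out, e.g. for `echo` to preview the commands.
estimate_then_translate() {
  input=$1; model=$2; lang=$3; max_cost=$4
  bin=${TINBOX_BIN:-tinbox}
  # Step 1: show estimated tokens, cost, and warnings without translating.
  "$bin" translate "$input" --model "$model" --to "$lang" --dry-run || return 1
  # Step 2: run the translation, aborting if the estimate exceeds the cap.
  "$bin" translate "$input" --model "$model" --to "$lang" --max-cost "$max_cost"
}

# Usage:
#   estimate_then_translate document.pdf openai:gpt-4o es 5.00
```

Setting `TINBOX_BIN=echo` prints the two commands instead of running them, which is handy for checking the arguments before spending anything.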

Error Handling

Invalid Language Codes: Returns clear error messages for unsupported codes
Missing API Keys: Reports which environment variable is missing
File Type Errors: Validates file format compatibility
Algorithm Conflicts: Prevents using incompatible algorithms (e.g., non-page algorithms with PDFs)
Cost Limits: Aborts translation if estimated cost exceeds --max-cost
Failed Pages: Reports which pages failed with specific error messages
