Tinbox supports three output formats to suit different use cases, from simple translated text to detailed reports with metadata.
- **Text**: Plain translated text (the default)
- **JSON**: Structured data with metadata for programmatic use
- **Markdown**: Formatted report with translation and statistics

| Format | Use Case | Output To | Metadata | Statistics |
|----------|----------|-----------|----------|------------|
| Text | General use, human reading | File or stdout | ❌ No | ❌ No |
| JSON | API integration, data processing | File or stdout | ✅ Full | ✅ Full |
| Markdown | Documentation, reports | File or stdout | ✅ Full | ✅ Full |
Text Format
The default output format containing only the translated text.
Usage
```bash
# Default format (text)
tinbox translate --to es --model openai:gpt-4o document.txt

# Explicit text format
tinbox translate --to es --format text --model openai:gpt-4o document.txt

# Save to file
tinbox translate --to es --output translated.txt --model openai:gpt-4o document.txt
```
Implementation
```python
# From output.py:97-115
class TextOutputHandler:
    """Handler for plain text output format."""

    def write(
        self,
        output: TranslationOutput,
        file: Path | None = None,
    ) -> None:
        # Just write the translated text
        if file:
            file.write_text(output.result.text, encoding="utf-8")
        else:
            print(output.result.text)
```
Example Output
```text
Esta es la traducción de su documento. El texto aparece exactamente
como fue traducido por el modelo, sin metadatos adicionales ni formato.
```

If any pages failed, placeholder markers appear inline in the text:

```text
[TRANSLATION_FAILED: Page 5]
Reason: API timeout
[/TRANSLATION_FAILED]
```
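Because failure markers are embedded inline rather than reported separately, downstream scripts can scan the text for them. A minimal sketch, assuming the marker format shown above (`find_failed_pages` is a hypothetical helper, not part of tinbox):

```python
import re

# Marker format as shown in the example output above
FAILED_RE = re.compile(
    r"\[TRANSLATION_FAILED: Page (\d+)\]\s*Reason: (.*?)\s*\[/TRANSLATION_FAILED\]",
    re.DOTALL,
)

def find_failed_pages(text: str) -> dict[int, str]:
    """Return {page_number: reason} for every failure marker in the text."""
    return {int(page): reason.strip() for page, reason in FAILED_RE.findall(text)}

sample = (
    "Texto traducido...\n"
    "[TRANSLATION_FAILED: Page 5]\n"
    "Reason: API timeout\n"
    "[/TRANSLATION_FAILED]\n"
)
print(find_failed_pages(sample))  # {5: 'API timeout'}
```

For richer failure handling (structured page numbers and error messages), prefer the JSON format described below.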
Use Cases
When you just need the translated text without metadata:

```bash
tinbox translate --to fr --model openai:gpt-4o email.txt > email_fr.txt
```

When feeding output to another tool:

```bash
tinbox translate --to es --model openai:gpt-4o input.txt | wc -w
```

When the translation is for direct human consumption:

```bash
tinbox translate --to de --output article_de.txt --model openai:gpt-4o article.txt
```
Text format is the most versatile: it can be piped, redirected, or saved without any post-processing.
JSON Format
Structured output with complete metadata, designed for programmatic use.
Usage
```bash
# JSON to stdout
tinbox translate --to es --format json --model openai:gpt-4o document.txt

# Save JSON to file
tinbox translate --to es --format json --output result.json --model openai:gpt-4o document.txt
```
Implementation
```python
# From output.py:63-94
class JSONOutputHandler:
    """Handler for JSON output format."""

    def write(
        self,
        output: TranslationOutput,
        file: Path | None = None,
    ) -> None:
        # Convert to JSON-serializable dict
        data = output.model_dump(
            mode="json",
            exclude_none=True,
        )
        # Convert Path objects to strings
        data["metadata"]["input_file"] = str(data["metadata"]["input_file"])

        # Format JSON with indentation
        json_str = json.dumps(data, indent=2)

        if file:
            file.write_text(json_str, encoding="utf-8")
        else:
            print(json_str)
```
Schema
```typescript
interface TranslationOutput {
  metadata: {
    source_lang: string;      // e.g., "en"
    target_lang: string;      // e.g., "es"
    model: string;            // e.g., "openai"
    algorithm: string;        // e.g., "page"
    input_file: string;       // Path to input file
    input_file_type: string;  // e.g., "pdf", "txt", "docx"
    timestamp: string;        // ISO 8601 format
  };
  result: {
    text: string;             // Translated text
    tokens_used: number;      // Total tokens consumed
    cost: number;             // Total cost in USD
    time_taken: number;       // Time in seconds
    failed_pages?: number[];  // Array of failed page numbers
    page_errors?: {           // Error messages by page
      [page: number]: string;
    };
  };
  warnings?: string[];        // Warning messages
  errors?: string[];          // Error messages
}
```
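For type-checked access from Python, the schema can be mirrored with `TypedDict`s. This is a sketch derived from the interface above; the class names are illustrative, not an official tinbox API:

```python
from typing import TypedDict

class ResultDict(TypedDict, total=False):
    text: str
    tokens_used: int
    cost: float
    time_taken: float
    failed_pages: list[int]
    page_errors: dict[str, str]  # JSON object keys arrive as strings

class MetadataDict(TypedDict):
    source_lang: str
    target_lang: str
    model: str
    algorithm: str
    input_file: str
    input_file_type: str
    timestamp: str

class OutputDict(TypedDict, total=False):
    metadata: MetadataDict
    result: ResultDict
    warnings: list[str]
    errors: list[str]
```

With these in place, `data: OutputDict = json.loads(...)` lets a type checker flag typos like `data["result"]["token_used"]` before they reach production.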
Example Output
```json
{
  "metadata": {
    "source_lang": "en",
    "target_lang": "es",
    "model": "openai",
    "algorithm": "page",
    "input_file": "/home/user/document.pdf",
    "input_file_type": "pdf",
    "timestamp": "2026-03-01T14:30:45.123456"
  },
  "result": {
    "text": "Esta es la traducción completa...",
    "tokens_used": 12543,
    "cost": 0.187,
    "time_taken": 45.3,
    "failed_pages": [5, 12],
    "page_errors": {
      "5": "API timeout after 30s",
      "12": "Rate limit exceeded"
    }
  },
  "warnings": [
    "Translation incomplete: 2 page(s) failed to translate (pages: 5, 12)"
  ],
  "errors": []
}
```
Use Cases
Parse JSON output in your application:

```python
import json
import subprocess

result = subprocess.run([
    "tinbox", "translate",
    "--to", "es",
    "--format", "json",
    "--model", "openai:gpt-4o",
    "document.pdf",
], capture_output=True, text=True)

data = json.loads(result.stdout)
print(f"Cost: ${data['result']['cost']:.2f}")
print(f"Tokens: {data['result']['tokens_used']:,}")
```
Track costs across multiple translations:

```bash
for doc in *.pdf; do
    tinbox translate --to es --format json --model openai:gpt-4o "$doc" > "${doc%.pdf}.json"
done

# Analyze costs
jq '.result.cost' *.json | jq -s add
```
Detect and log failed pages:

```python
data = json.loads(output)
if data["result"].get("failed_pages"):
    for page, error in data["result"]["page_errors"].items():
        log.error(f"Page {page} failed: {error}")
```
Use JSON format when you need to programmatically access translation statistics, costs, or handle failures.
Markdown Format
A human-readable report with formatted sections and statistics.
Usage
```bash
# Markdown to stdout
tinbox translate --to es --format markdown --model openai:gpt-4o document.txt

# Save Markdown to file
tinbox translate --to es --format markdown --output report.md --model openai:gpt-4o document.txt
```
Implementation
```python
# From output.py:118-187
class MarkdownOutputHandler:
    """Handler for Markdown output format."""

    def write(
        self,
        output: TranslationOutput,
        file: Path | None = None,
    ) -> None:
        # Build Markdown content
        md_lines = [
            "# Translation Results\n",
            "## Metadata",
            f"- Source Language: {output.metadata.source_lang}",
            f"- Target Language: {output.metadata.target_lang}",
            f"- Model: {output.metadata.model.value}",
            f"- Algorithm: {output.metadata.algorithm}",
            f"- Input File: {output.metadata.input_file.name}",
            f"- File Type: {output.metadata.input_file_type.value}",
            f"- Timestamp: {output.metadata.timestamp.isoformat()}\n",
            "## Translation",
            "```text",
            output.result.text,
            "```\n",
            "## Statistics",
            f"- Tokens Used: {output.result.tokens_used:,}",
            f"- Cost: ${output.result.cost:.4f}",
            f"- Time Taken: {output.result.time_taken:.1f}s\n",
        ]
        # Add warnings and errors sections...
```
# Add warnings and errors sections...
Example Output
# Translation Results
## Metadata
- Source Language: en
- Target Language: es
- Model: openai
- Algorithm: page
- Input File: document.pdf
- File Type: pdf
- Timestamp: 2026-03-01T14:30:45.123456
## Translation
Esta es la traducción completa del documento.
El texto aparece formateado en un bloque de código.
## Statistics
- Tokens Used: 12,543
- Cost: $0.1870
- Time Taken: 45.3s
## Warnings
- Translation incomplete: 2 page(s) failed to translate (pages: 5, 12)
## Errors
[ None ]
Use Cases
Keep translation records with full context:

```bash
tinbox translate --to es --format markdown --output translation-report.md --model openai:gpt-4o whitepaper.pdf
```

Share the Markdown report with stakeholders.

Maintain an archive of all translations with metadata:

```bash
# Create archive directory
mkdir -p "translations/$(date +%Y-%m-%d)"

# Translate and save report
tinbox translate --to es \
    --format markdown \
    --output "translations/$(date +%Y-%m-%d)/report.md" \
    --model openai:gpt-4o \
    document.pdf
```

Generate readable cost reports for accounting:

```bash
for doc in *.pdf; do
    tinbox translate --to es \
        --format markdown \
        --output "reports/${doc%.pdf}-es.md" \
        --model openai:gpt-4o \
        "$doc"
done

# Reports include timestamp, cost, and tokens for each translation
```
Markdown format is perfect for human-readable reports that you can commit to version control or share with teams.
Both structured formats (JSON and Markdown) include this metadata:
```python
# From output.py:25-34
class TranslationMetadata(BaseModel):
    """Metadata about the translation process."""

    source_lang: str
    target_lang: str
    model: ModelType
    algorithm: str
    input_file: Path
    input_file_type: FileType
    timestamp: datetime = Field(default_factory=datetime.now)
```
Fields Explained
- `source_lang`: Source language code (ISO 639-1). Examples: "en", "zh", "es", "auto" (if auto-detected)
- `target_lang`: Target language code (ISO 639-1). Examples: "es", "fr", "de", "ja"
- `model`: Model provider used for translation. Values: "openai", "anthropic", "gemini", "ollama"
- `algorithm`: Translation algorithm used. Values: "page", "context-aware", "sliding-window"
- `input_file`: Path to the input file that was translated. Example: "/home/user/documents/report.pdf"
- `input_file_type`: Type of input file. Values: "pdf", "docx", "txt"
- `timestamp`: ISO 8601 timestamp of when translation started. Example: "2026-03-01T14:30:45.123456"
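Because the timestamp is plain ISO 8601, Python's standard library parses it without extra dependencies (the value below is the example from the metadata fields above):

```python
from datetime import datetime

# Parse the timestamp exactly as it appears in the JSON metadata
ts = datetime.fromisoformat("2026-03-01T14:30:45.123456")
print(ts.date(), ts.time())  # 2026-03-01 14:30:45.123456
```

This makes it easy to, say, group archived translation reports by day or compute the age of a translation.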
Translation Result Structure
```python
# From output.py:37-43
class TranslationOutput(BaseModel):
    """Complete translation output including metadata."""

    metadata: TranslationMetadata
    result: TranslationResult
    warnings: list[str] = Field(default_factory=list)
    errors: list[str] = Field(default_factory=list)
```
Result Fields
| Field | Type | Description |
|-------|------|-------------|
| `text` | string | The complete translated text |
| `tokens_used` | integer | Total tokens consumed (input + output) |
| `cost` | float | Total cost in USD |
| `time_taken` | float | Translation time in seconds |
| `failed_pages` | integer[] | Array of page numbers that failed (optional) |
| `page_errors` | object | Map of page numbers to error messages (optional) |
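As a quick illustration of how these fields combine, a hypothetical helper that derives per-token cost from a `result` object (the field names come from the table above; `summarize` itself is not part of tinbox):

```python
def summarize(result: dict) -> str:
    """One-line summary of a tinbox JSON 'result' object."""
    tokens = result["tokens_used"]
    cost = result["cost"]
    per_1k = cost / tokens * 1000 if tokens else 0.0
    failed = len(result.get("failed_pages", []))
    return f"{tokens:,} tokens, ${cost:.4f} (${per_1k:.4f}/1k tokens), {failed} failed page(s)"

print(summarize({"tokens_used": 12543, "cost": 0.187, "time_taken": 45.3,
                 "failed_pages": [5, 12]}))
# 12,543 tokens, $0.1870 ($0.0149/1k tokens), 2 failed page(s)
```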
Identify Your Use Case
- Human reading: Text or Markdown
- Programmatic use: JSON
- Documentation: Markdown
- Data processing: JSON

Consider Your Workflow

- Pipeline integration: Text (easiest to pipe)
- API responses: JSON (structured data)
- Reports and archives: Markdown (readable + metadata)
- Simple translation: Text (clean output)
Select and Test
```bash
# Try each format on a small document
tinbox translate --to es --format text --model openai:gpt-4o sample.txt
tinbox translate --to es --format json --model openai:gpt-4o sample.txt
tinbox translate --to es --format markdown --model openai:gpt-4o sample.txt
```
Save Translation and Report Separately
```bash
# 1. Save translated text
tinbox translate --to es \
    --format text \
    --output translated.txt \
    --model openai:gpt-4o \
    document.pdf

# 2. Generate report
tinbox translate --to es \
    --format markdown \
    --output report.md \
    --model openai:gpt-4o \
    document.pdf
```

This will translate the document twice and charge you double. Better approach:

```bash
# Translate once as JSON, then extract text separately
tinbox translate --to es --format json --model openai:gpt-4o document.pdf > result.json
jq -r '.result.text' result.json > translated.txt
```
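If jq isn't available, the same extraction is a few lines of Python. A sketch, assuming the `result.json` produced above (`extract_text` is a hypothetical helper, not part of tinbox):

```python
import json
from pathlib import Path

def extract_text(json_path: str, txt_path: str) -> None:
    """Copy result.text from a tinbox JSON output file into a plain text file."""
    data = json.loads(Path(json_path).read_text(encoding="utf-8"))
    Path(txt_path).write_text(data["result"]["text"], encoding="utf-8")

# Usage: extract_text("result.json", "translated.txt")
```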
Batch Translation with JSON Logging
```bash
#!/bin/bash
for doc in *.pdf; do
    echo "Translating $doc..."

    # Translate and save both outputs
    tinbox translate --to es \
        --format json \
        --model openai:gpt-4o \
        "$doc" > "logs/${doc%.pdf}.json"

    # Extract text from JSON
    jq -r '.result.text' "logs/${doc%.pdf}.json" > "translated/${doc%.pdf}_es.txt"

    # Log cost
    cost=$(jq '.result.cost' "logs/${doc%.pdf}.json")
    echo "$doc: ${cost}" >> cost_log.txt
done

# Calculate total cost
total=$(jq -s 'map(.result.cost) | add' logs/*.json)
echo "Total cost: ${total}"
```
See also:

- Algorithm Comparison: understand which algorithm produces which output structure
- Cost Optimization: monitor costs using the JSON output format