Output Formats

Tinbox supports three output formats to suit different use cases, from simple translated text to detailed reports with metadata.

Available Formats

Text

Plain translated text, default format

JSON

Structured data with metadata for programmatic use

Markdown

Formatted report with translation and statistics

Format Comparison

Format	Use Case	Output To	Metadata	Statistics
Text	General use, human reading	File or stdout	❌ No	❌ No
JSON	API integration, data processing	File or stdout	✅ Full	✅ Full
Markdown	Documentation, reports	File or stdout	✅ Full	✅ Full

Text Format

The default output format containing only the translated text.

Usage

# Default format (text)
tinbox translate --to es --model openai:gpt-4o document.txt

# Explicit text format
tinbox translate --to es --format text --model openai:gpt-4o document.txt

# Save to file
tinbox translate --to es --output translated.txt --model openai:gpt-4o document.txt

Implementation

# From output.py:97-115
class TextOutputHandler:
    """Handler for plain text output format."""

    def write(
        self,
        output: TranslationOutput,
        file: Path | None = None,
    ) -> None:
        # Just write the translated text
        if file:
            file.write_text(output.result.text, encoding="utf-8")
        else:
            print(output.result.text)

Example Output

Esta es la traducción de su documento. El texto aparece exactamente
como fue traducido por el modelo, sin metadatos adicionales ni formato.

Si hay páginas que fallaron, verá marcadores de posición en el texto:
[TRANSLATION_FAILED: Page 5]
Reason: API timeout
[/TRANSLATION_FAILED]
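
Because failures are embedded inline in text output, a script that consumes the text may want to detect them before passing it on. The following is a minimal sketch, assuming the placeholder layout matches the example above exactly (a `[TRANSLATION_FAILED: Page N]` line, a `Reason:` line, and a closing `[/TRANSLATION_FAILED]` line). `find_failed_pages` is a hypothetical helper, not part of Tinbox.

```python
import re

# Pattern assumed from the example output above; adjust it if the real markers differ.
FAILED_BLOCK = re.compile(
    r"\[TRANSLATION_FAILED: Page (\d+)\]\s*\n"
    r"Reason: (.*?)\s*\n"
    r"\[/TRANSLATION_FAILED\]"
)


def find_failed_pages(text: str) -> dict[int, str]:
    """Return a mapping of failed page number to the reported reason."""
    return {int(page): reason for page, reason in FAILED_BLOCK.findall(text)}


sample = (
    "Texto traducido.\n"
    "[TRANSLATION_FAILED: Page 5]\n"
    "Reason: API timeout\n"
    "[/TRANSLATION_FAILED]\n"
)
print(find_failed_pages(sample))
```

If failure handling matters to your workflow, the JSON format reports the same information in structured form and avoids this kind of text scanning.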

Use Cases

Quick Translations

When you just need the translated text without metadata:

tinbox translate --to fr --model openai:gpt-4o email.txt > email_fr.txt

Pipeline Integration

When feeding output to another tool:

tinbox translate --to es --model openai:gpt-4o input.txt | wc -w

Human Reading

When the translation is for direct human consumption:

tinbox translate --to de --output article_de.txt --model openai:gpt-4o article.txt

Text format is the most versatile: it can be piped, redirected, or saved without any post-processing.

JSON Format

Structured output with complete metadata, perfect for programmatic use.

Usage

# JSON to stdout
tinbox translate --to es --format json --model openai:gpt-4o document.txt

# Save JSON to file
tinbox translate --to es --format json --output result.json --model openai:gpt-4o document.txt

Implementation

# From output.py:63-94
class JSONOutputHandler:
    """Handler for JSON output format."""

    def write(
        self,
        output: TranslationOutput,
        file: Path | None = None,
    ) -> None:
        # Convert to JSON-serializable dict
        data = output.model_dump(
            mode="json",
            exclude_none=True,
        )

        # Convert Path objects to strings
        data["metadata"]["input_file"] = str(data["metadata"]["input_file"])
        
        # Format JSON with indentation
        json_str = json.dumps(data, indent=2)
        
        if file:
            file.write_text(json_str, encoding="utf-8")
        else:
            print(json_str)

Schema

interface TranslationOutput {
  metadata: {
    source_lang: string;        // e.g., "en"
    target_lang: string;        // e.g., "es"
    model: string;              // e.g., "openai"
    algorithm: string;          // e.g., "page"
    input_file: string;         // Path to input file
    input_file_type: string;    // e.g., "pdf", "txt", "docx"
    timestamp: string;          // ISO 8601 format
  };
  result: {
    text: string;               // Translated text
    tokens_used: number;        // Total tokens consumed
    cost: number;               // Total cost in USD
    time_taken: number;         // Time in seconds
    failed_pages?: number[];    // Array of failed page numbers
    page_errors?: {             // Error messages by page
      [page: number]: string;
    };
  };
  warnings?: string[];          // Warning messages
  errors?: string[];            // Error messages
}
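
When consuming this schema from Python, a quick structural check can catch truncated or unexpected output early. The sketch below is an assumption-laden illustration: the required field names are taken from the schema above, and `missing_fields` is a hypothetical helper rather than anything Tinbox provides.

```python
# Field names copied from the schema above; optional fields are not checked.
REQUIRED_FIELDS = {
    "metadata": {
        "source_lang", "target_lang", "model", "algorithm",
        "input_file", "input_file_type", "timestamp",
    },
    "result": {"text", "tokens_used", "cost", "time_taken"},
}


def missing_fields(data: dict) -> list[str]:
    """List dotted names of required fields absent from a parsed JSON result."""
    missing = []
    for section, required in REQUIRED_FIELDS.items():
        present = set(data.get(section, {}))
        missing += [f"{section}.{name}" for name in sorted(required - present)]
    return missing


partial = {"metadata": {"source_lang": "en"}, "result": {"text": "Hola", "cost": 0.01}}
print(missing_fields(partial))
```

An empty list means every required field is present. The optional fields (`failed_pages`, `page_errors`, `warnings`, `errors`) are deliberately left out of the check.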

Example Output

{
  "metadata": {
    "source_lang": "en",
    "target_lang": "es",
    "model": "openai",
    "algorithm": "page",
    "input_file": "/home/user/document.pdf",
    "input_file_type": "pdf",
    "timestamp": "2026-03-01T14:30:45.123456"
  },
  "result": {
    "text": "Esta es la traducción completa...",
    "tokens_used": 12543,
    "cost": 0.187,
    "time_taken": 45.3,
    "failed_pages": [5, 12],
    "page_errors": {
      "5": "API timeout after 30s",
      "12": "Rate limit exceeded"
    }
  },
  "warnings": [
    "Translation incomplete: 2 page(s) failed to translate (pages: 5, 12)"
  ],
  "errors": []
}
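
Note that in this example the `page_errors` keys are JSON strings (`"5"`, `"12"`), while `failed_pages` holds integers, because JSON object keys are always strings. A minimal sketch of pairing the two in Python, assuming the structure shown above; `failure_report` is a hypothetical helper:

```python
import json


def failure_report(data: dict) -> list[tuple[int, str]]:
    """Pair each failed page number with its error message, sorted by page."""
    result = data["result"]
    # Convert the string keys of page_errors back to integers for lookup.
    errors = {int(page): msg for page, msg in result.get("page_errors", {}).items()}
    return [
        (page, errors.get(page, "unknown error"))
        for page in sorted(result.get("failed_pages", []))
    ]


example = json.loads("""
{"result": {"failed_pages": [12, 5],
            "page_errors": {"5": "API timeout after 30s", "12": "Rate limit exceeded"}}}
""")
for page, message in failure_report(example):
    print(f"Page {page}: {message}")
```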

Use Cases

API Integration

Parse JSON output in your application:

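import json
import subprocess

# Run tinbox and capture the JSON report from stdout
result = subprocess.run(
    [
        "tinbox", "translate",
        "--to", "es",
        "--format", "json",
        "--model", "openai:gpt-4o",
        "document.pdf",
    ],
    capture_output=True,
    text=True,
    check=True,  # raise CalledProcessError if tinbox exits with a non-zero status
)

# Parse the report and read the statistics
data = json.loads(result.stdout)
print(f"Cost: ${data['result']['cost']:.2f}")
print(f"Tokens: {data['result']['tokens_used']:,}")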

Batch Processing

Track costs across multiple translations:

for doc in *.pdf; do
  tinbox translate --to es --format json --model openai:gpt-4o "$doc" > "${doc%.pdf}.json"
done

# Analyze costs
jq '.result.cost' *.json | jq -s add
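
The same aggregation can be done in Python when you also want token totals or further processing. This is a sketch under the assumption that every `*.json` file in the directory is a Tinbox result with the `result.cost` and `result.tokens_used` fields described in the schema; `load_results` and `summarize` are hypothetical helper names.

```python
import json
from pathlib import Path


def load_results(directory: Path) -> list[dict]:
    """Parse every *.json file in the directory as a Tinbox JSON result."""
    return [
        json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(directory.glob("*.json"))
    ]


def summarize(outputs: list[dict]) -> dict[str, float]:
    """Total the cost and token usage over a list of parsed results."""
    return {
        "files": len(outputs),
        "cost": sum(item["result"]["cost"] for item in outputs),
        "tokens": sum(item["result"]["tokens_used"] for item in outputs),
    }
```

After the loop above has finished, `summarize(load_results(Path(".")))` returns the number of files, the total cost in USD, and the total tokens.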

Quality Monitoring

Detect and log failed pages:

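import json
import logging

log = logging.getLogger(__name__)
data = json.loads(output)  # `output` holds the JSON string printed by tinbox
if data["result"].get("failed_pages"):
    for page, error in data["result"].get("page_errors", {}).items():
        log.error(f"Page {page} failed: {error}")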

Use JSON format when you need to programmatically access translation statistics, costs, or handle failures.

Markdown Format

Human-readable report with formatted sections and statistics.

Usage

# Markdown to stdout
tinbox translate --to es --format markdown --model openai:gpt-4o document.txt

# Save Markdown to file
tinbox translate --to es --format markdown --output report.md --model openai:gpt-4o document.txt

Implementation

# From output.py:118-187
class MarkdownOutputHandler:
    """Handler for Markdown output format."""

    def write(
        self,
        output: TranslationOutput,
        file: Path | None = None,
    ) -> None:
        # Build Markdown content
        md_lines = [
            "# Translation Results\n",
            "## Metadata",
            f"- Source Language: {output.metadata.source_lang}",
            f"- Target Language: {output.metadata.target_lang}",
            f"- Model: {output.metadata.model.value}",
            f"- Algorithm: {output.metadata.algorithm}",
            f"- Input File: {output.metadata.input_file.name}",
            f"- File Type: {output.metadata.input_file_type.value}",
            f"- Timestamp: {output.metadata.timestamp.isoformat()}\n",
            "## Translation",
            "```text",
            output.result.text,
            "```\n",
            "## Statistics",
            f"- Tokens Used: {output.result.tokens_used:,}",
            f"- Cost: ${output.result.cost:.4f}",
            f"- Time Taken: {output.result.time_taken:.1f}s\n",
        ]

        # Add warnings and errors sections...

Example Output

# Translation Results

## Metadata
- Source Language: en
- Target Language: es
- Model: openai
- Algorithm: page
- Input File: document.pdf
- File Type: pdf
- Timestamp: 2026-03-01T14:30:45.123456

## Translation
Esta es la traducción completa del documento.
El texto aparece formateado en un bloque de código.

## Statistics
- Tokens Used: 12,543
- Cost: $0.1870
- Time Taken: 45.3s

## Warnings
- Translation incomplete: 2 page(s) failed to translate (pages: 5, 12)

## Errors
[None]

Use Cases

Documentation

Keep translation records with full context:

tinbox translate --to es --format markdown --output translation-report.md --model openai:gpt-4o whitepaper.pdf

Share the Markdown report with stakeholders.

Translation Archives

Maintain an archive of all translations with metadata:

# Create archive directory
archive_dir="translations/$(date +%Y-%m-%d)"
mkdir -p "$archive_dir"

# Translate and save report
tinbox translate --to es \
  --format markdown \
  --output "$archive_dir/report.md" \
  --model openai:gpt-4o \
  document.pdf

Cost Reporting

Generate readable cost reports for accounting:

for doc in *.pdf; do
  tinbox translate --to es \
    --format markdown \
    --output "reports/${doc%.pdf}-es.md" \
    --model openai:gpt-4o \
    "$doc"
done

# Reports include timestamp, cost, and tokens for each translation
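
To total the figures across a folder of Markdown reports, the statistics lines can be read back with regular expressions. The sketch below assumes the `## Statistics` section is laid out exactly as in the implementation excerpt and example output above; `parse_statistics` is a hypothetical helper.

```python
import re

# Line formats assumed from the Markdown example output above.
STAT_PATTERNS = {
    "tokens": re.compile(r"^- Tokens Used: ([\d,]+)$", re.MULTILINE),
    "cost": re.compile(r"^- Cost: \$([\d.]+)$", re.MULTILINE),
    "seconds": re.compile(r"^- Time Taken: ([\d.]+)s$", re.MULTILINE),
}


def parse_statistics(report: str) -> dict[str, float]:
    """Extract tokens, cost, and time from the Statistics section of a report."""
    # Only look after the last "## Statistics" heading so that similar-looking
    # lines inside the translated text are not picked up.
    _, heading, section = report.rpartition("## Statistics")
    if not heading:
        return {}
    stats = {}
    for name, pattern in STAT_PATTERNS.items():
        match = pattern.search(section)
        if match:
            stats[name] = float(match.group(1).replace(",", ""))
    return stats
```

If machine-readable totals are the main goal, the JSON format exposes the same values without any parsing of formatted text.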

Markdown format is perfect for human-readable reports that you can commit to version control or share with teams.

Metadata Structure

The JSON and Markdown formats include this metadata; the text format does not:

# From output.py:25-34
class TranslationMetadata(BaseModel):
    """Metadata about the translation process."""

    source_lang: str
    target_lang: str
    model: ModelType
    algorithm: str
    input_file: Path
    input_file_type: FileType
    timestamp: datetime = Field(default_factory=datetime.now)

Fields Explained

source_lang

Source language code (ISO 639-1). Examples: "en", "zh", "es", "auto" (if auto-detected)

target_lang

Target language code (ISO 639-1). Examples: "es", "fr", "de", "ja"

model

Model provider used for translation. Values: "openai", "anthropic", "gemini", "ollama"

algorithm

Translation algorithm used. Values: "page", "context-aware", "sliding-window"

input_file

Path to the input file that was translated. Example: "/home/user/documents/report.pdf"

input_file_type

Type of input file. Values: "pdf", "docx", "txt"

timestamp

ISO 8601 timestamp of when translation started. Example: "2026-03-01T14:30:45.123456"
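
The example timestamp carries no UTC offset, which is consistent with the `datetime.now` default shown in the model above (it returns naive local time). A minimal sketch of reading the value back in Python, assuming the timestamp string appears exactly as in the example:

```python
from datetime import datetime

# Parse the ISO 8601 string from the metadata back into a datetime object.
started = datetime.fromisoformat("2026-03-01T14:30:45.123456")

print(started.strftime("%Y-%m-%d %H:%M"))
print(started.tzinfo is None)  # True for a naive timestamp with no UTC offset
```

When comparing timestamps produced on machines in different time zones, keep in mind that a naive value does not say which zone it was recorded in.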

Translation Result Structure

# From output.py:37-43
class TranslationOutput(BaseModel):
    """Complete translation output including metadata."""

    metadata: TranslationMetadata
    result: TranslationResult
    warnings: list[str] = Field(default_factory=list)
    errors: list[str] = Field(default_factory=list)

Result Fields

Field	Type	Description
`text`	string	The complete translated text
`tokens_used`	integer	Total tokens consumed (input + output)
`cost`	float	Total cost in USD
`time_taken`	float	Translation time in seconds
`failed_pages`	integer[]	Array of page numbers that failed (optional)
`page_errors`	object	Map of page numbers to error messages (optional)
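
These fields combine naturally into derived figures such as cost per 1,000 tokens or throughput. The following sketch assumes a parsed `result` object with the fields listed above; `derived_metrics` is a hypothetical helper and the sample numbers are taken from the JSON example earlier on this page.

```python
def derived_metrics(result: dict) -> dict[str, float]:
    """Compute cost per 1K tokens and tokens per second from a result object."""
    tokens = result["tokens_used"]
    seconds = result["time_taken"]
    return {
        # Guard against division by zero for empty or instantaneous runs.
        "cost_per_1k_tokens": result["cost"] / tokens * 1000 if tokens else 0.0,
        "tokens_per_second": tokens / seconds if seconds else 0.0,
    }


metrics = derived_metrics({"tokens_used": 12543, "cost": 0.187, "time_taken": 45.3})
print(f"{metrics['cost_per_1k_tokens']:.4f} USD per 1K tokens")
print(f"{metrics['tokens_per_second']:.1f} tokens/s")
```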

Choosing the Right Format

Identify Your Use Case

Human reading: Text or Markdown
Programmatic use: JSON
Documentation: Markdown
Data processing: JSON

Consider Your Workflow

Pipeline integration: Text (easiest to pipe)
API responses: JSON (structured data)
Reports and archives: Markdown (readable + metadata)
Simple translation: Text (clean output)

Select and Test

# Try each format on a small document
tinbox translate --to es --format text --model openai:gpt-4o sample.txt
tinbox translate --to es --format json --model openai:gpt-4o sample.txt
tinbox translate --to es --format markdown --model openai:gpt-4o sample.txt

Format Combination Examples

Save Translation and Report Separately

# 1. Save translated text
tinbox translate --to es \
  --format text \
  --output translated.txt \
  --model openai:gpt-4o \
  document.pdf

# 2. Generate report
tinbox translate --to es \
  --format markdown \
  --output report.md \
  --model openai:gpt-4o \
  document.pdf

Running both commands translates the document twice, so you pay for it twice. A better approach is to translate once as JSON and extract the text from the result:

# Translate once as JSON, then extract text separately
tinbox translate --to es --format json --model openai:gpt-4o document.pdf > result.json
jq -r '.result.text' result.json > translated.txt
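
The report can be rebuilt from the same `result.json` as well, so a single translation yields both files. The sketch below mirrors the headings from the Markdown handler excerpt earlier on this page, but it is an approximation: it is not guaranteed to match Tinbox's Markdown output exactly, and it omits the warnings and errors sections because the excerpt elides them. `render_report` and `split_outputs` are hypothetical helper names.

```python
import json
from pathlib import Path

FENCE = "`" * 3  # Markdown code fence delimiter


def render_report(data: dict) -> str:
    """Build a Markdown report from a parsed Tinbox JSON result."""
    meta, result = data["metadata"], data["result"]
    lines = [
        "# Translation Results\n",
        "## Metadata",
        f"- Source Language: {meta['source_lang']}",
        f"- Target Language: {meta['target_lang']}",
        f"- Model: {meta['model']}",
        f"- Algorithm: {meta['algorithm']}",
        f"- Input File: {Path(meta['input_file']).name}",
        f"- File Type: {meta['input_file_type']}",
        f"- Timestamp: {meta['timestamp']}\n",
        "## Translation",
        f"{FENCE}text",
        result["text"],
        f"{FENCE}\n",
        "## Statistics",
        f"- Tokens Used: {result['tokens_used']:,}",
        f"- Cost: ${result['cost']:.4f}",
        f"- Time Taken: {result['time_taken']:.1f}s\n",
    ]
    return "\n".join(lines)


def split_outputs(json_path: Path, text_path: Path, report_path: Path) -> None:
    """Write the plain translation and a Markdown report from one JSON result."""
    data = json.loads(json_path.read_text(encoding="utf-8"))
    text_path.write_text(data["result"]["text"], encoding="utf-8")
    report_path.write_text(render_report(data), encoding="utf-8")
```

For example, `split_outputs(Path("result.json"), Path("translated.txt"), Path("report.md"))` produces both files from the JSON saved by the command above.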

Batch Translation with JSON Logging

#!/bin/bash

# Make sure the output directories exist
mkdir -p logs translated

for doc in *.pdf; do
  echo "Translating $doc..."

  # Translate once and save the JSON result
  tinbox translate --to es \
    --format json \
    --model openai:gpt-4o \
    "$doc" > "logs/${doc%.pdf}.json"

  # Extract text from JSON
  jq -r '.result.text' "logs/${doc%.pdf}.json" > "translated/${doc%.pdf}_es.txt"

  # Log cost
  cost=$(jq '.result.cost' "logs/${doc%.pdf}.json")
  echo "$doc: ${cost}" >> cost_log.txt
done

# Calculate total cost
total=$(jq -s 'map(.result.cost) | add' logs/*.json)
echo "Total cost: ${total}"

Related Topics

Algorithm Comparison

Understand which algorithm produces which output structure

Cost Optimization

Monitor costs using JSON output format
