Tinbox provides a flexible translation system with multiple algorithms for different use cases. The system uses a protocol-based interface for translator implementations.
ModelInterface
Protocol defining the interface for LLM translation models.
from typing import Protocol
from tinbox.core.translation import ModelInterface, TranslationRequest, TranslationResponse
class ModelInterface(Protocol):
    async def translate(
        self,
        request: TranslationRequest,
    ) -> TranslationResponse:
        """Translate content using the model."""
        ...

    async def validate_model(self) -> bool:
        """Check if the model is available and properly configured."""
        ...
translate
Translate content according to a translation request.
request
TranslationRequest
required
Translation request containing source/target languages, content, and configuration.
Translation response with translated text, token usage, cost, and timing information.
Raised if translation fails due to API errors, network issues, or invalid configuration.
validate_model
Validate that the model is properly configured and accessible.
True if the model is available and can be used for translation.
TranslationRequest
Configuration for a single translation request.
from tinbox.core.translation import TranslationRequest
from tinbox.core.types import ModelType
# Basic text translation
request = TranslationRequest(
    source_lang="en",
    target_lang="fr",
    content="Hello, world!",
    content_type="text/plain",
    model=ModelType.OPENAI,
    model_params={"model_name": "gpt-4o"},
)
# Translation with context
request = TranslationRequest(
    source_lang="en",
    target_lang="de",
    content="He said it was great.",
    context="[PREVIOUS_CHUNK]\nThe restaurant opened yesterday.\n[/PREVIOUS_CHUNK]",
    content_type="text/plain",
    model=ModelType.ANTHROPIC,
    model_params={"model_name": "claude-3-sonnet"},
)
# Translation with glossary and reasoning
from tinbox.core.types import Glossary, GlossaryEntry

glossary = Glossary(entries={"API": "API", "cloud": "Cloud"})
request = TranslationRequest(
    source_lang="en",
    target_lang="ja",
    content="The API connects to the cloud.",
    content_type="text/plain",
    model=ModelType.GEMINI,
    model_params={"model_name": "gemini-2.5-pro"},
    glossary=glossary,
    reasoning_effort="high",
)
Source language code (e.g., "en", "fr", "ja").
Target language code (e.g., "de", "es", "zh").
Content to translate:
str for text content
bytes for image content (PNG format for scanned documents)
MIME type of the content. Must match pattern ^(text|image)/.+$.
"text/plain" - Plain text content
"image/png" - Image content (scanned PDFs)
Model provider to use. Options:
ModelType.OPENAI - OpenAI models
ModelType.ANTHROPIC - Anthropic Claude models
ModelType.GEMINI - Google Gemini models
ModelType.OLLAMA - Local Ollama models
Optional context information to improve translation quality and consistency. Context-aware algorithm provides:
[PREVIOUS_CHUNK] tags with previous content
[PREVIOUS_CHUNK_TRANSLATION] tags with previous translation
[NEXT_CHUNK] tags with upcoming content
context = """[PREVIOUS_CHUNK]
The meeting started at 9 AM.
[/PREVIOUS_CHUNK]
[PREVIOUS_CHUNK_TRANSLATION]
Die Besprechung begann um 9 Uhr.
[/PREVIOUS_CHUNK_TRANSLATION]
Use this context to maintain consistency in terminology and style."""
Additional model-specific parameters. Common parameters:
model_name: Specific model to use (e.g., "gpt-4o", "claude-3-sonnet")
temperature: Sampling temperature (if supported)
max_tokens: Maximum output tokens (if supported)
glossary
Glossary | None
default: "None"
Optional glossary for consistent term translations. The model will use these terms when translating.
from tinbox.core.types import Glossary

glossary = Glossary(entries={
    "API": "API",
    "cloud computing": "Cloud-Computing",
    "database": "Datenbank",
})
reasoning_effort
Literal['minimal', 'low', 'medium', 'high']
default: "minimal"
Model reasoning effort level:
"minimal" - Fast, cost-effective
"low" - Slight improvement, moderate cost increase
"medium" - Better quality, higher cost
"high" - Best quality, significantly higher cost
Higher reasoning efforts can multiply costs by 3-10x.
TranslationRequest is immutable (frozen=True).
TranslationResponse
Response from a translation request or algorithm.
from tinbox import translate_document, load_document, create_translator, TranslationConfig
from pathlib import Path
content = await load_document(Path("document.pdf"))
config = TranslationConfig(...)
translator = create_translator(config)
response = await translate_document(content, config, translator)
print(f"Translated text: {response.text[:100]}...")
print(f"Tokens used: {response.tokens_used:,}")
print(f"Cost: ${response.cost:.4f}")
print(f"Time taken: {response.time_taken:.2f}s")
if response.failed_pages:
    print(f"Failed pages: {response.failed_pages}")
    for page, error in response.page_errors.items():
        print(f"  Page {page}: {error}")
if response.warnings:
    for warning in response.warnings:
        print(f"Warning: {warning}")
if response.glossary_updates:
    print(f"New glossary entries: {len(response.glossary_updates)}")
    for entry in response.glossary_updates:
        print(f"  {entry.term} -> {entry.translation}")
The translated text. For page-by-page algorithm with failed pages, contains placeholders: [TRANSLATION_FAILED: Page 3]
Reason: API timeout
[/TRANSLATION_FAILED]
Total number of tokens used (input + output). Must be >= 0.
Total cost in USD. Must be >= 0.0.
Time taken in seconds. Must be >= 0.0.
glossary_updates
list[GlossaryEntry]
default: "[]"
New glossary entries discovered during translation (when glossary is enabled). Each entry contains:
term: Term in source language
translation: Translation in target language
List of page numbers that failed to translate (page-by-page algorithm only). Page numbers are 1-indexed.
page_errors
dict[int, str]
default: "{}"
Mapping from page number to error message for failed pages.
if response.page_errors:
    for page, error in response.page_errors.items():
        print(f"Page {page} failed: {error}")
Non-fatal warnings during translation. Common warnings:
Incomplete translation due to failed pages
Cost approaching threshold
Algorithm-specific issues
TranslationResponse is immutable (frozen=True).
Translation Algorithms
Tinbox provides three translation algorithms, each optimized for different scenarios.
Page-by-Page
Translates each page independently without context.
config = TranslationConfig(
    ...,
    algorithm="page",
)
Characteristics:
Fastest algorithm
No context between pages
Good for documents with independent sections
Best for simple documents or when speed is priority
Supports resume from checkpoint
Can continue despite individual page failures
Best for:
Simple documents
Presentations with independent slides
Documents where each page is self-contained
Quick translations where context isn’t critical
Sliding Window
Processes text using overlapping windows for continuity.
config = TranslationConfig(
    ...,
    algorithm="sliding-window",
    window_size=3000,   # Characters per window
    overlap_size=300,   # Overlap between windows
)
Characteristics:
Good balance between speed and quality
Overlapping windows maintain some continuity
Not suitable for image content (text only)
Windows are merged intelligently after translation
Supports resume from checkpoint
Best for:
Long text documents
Content requiring some continuity
When context-aware overhead is too high
Technical documentation with cross-references
Configuration:
window_size: Size of each window (default: 2000 characters)
overlap_size: Overlap between windows (default: 200 characters)
Context-Aware
Splits text at natural boundaries with full context from adjacent chunks.
config = TranslationConfig(
    ...,
    algorithm="context-aware",
    context_size=2500,  # Target chunk size
)
# Or with custom split token
config = TranslationConfig(
    ...,
    algorithm="context-aware",
    custom_split_token="\n---\n",  # Split on horizontal rules
)
Characteristics:
Highest quality translations
Splits text at natural boundaries (paragraphs, sentences, etc.)
Provides previous/next chunk context for each translation
Higher input token usage (3-4x multiplier due to context)
Not suitable for image content (text only)
Supports resume from checkpoint
Best for:
Literary works and books
Content requiring high consistency
Documents with narrative flow
Technical manuals with interconnected sections
Configuration:
context_size: Target chunk size in characters (default: 2000)
custom_split_token: Custom token to split on, ignoring context_size
Context Format:
The algorithm provides context in tagged format:
[PREVIOUS_CHUNK]
The previous section's content...
[/PREVIOUS_CHUNK]
[PREVIOUS_CHUNK_TRANSLATION]
Die Übersetzung des vorherigen Abschnitts...
[/PREVIOUS_CHUNK_TRANSLATION]
[NEXT_CHUNK]
The next section's content...
[/NEXT_CHUNK]
Use this context to maintain consistency in terminology and style.
TranslationError
Exception raised when translation fails.
from tinbox.core.translation import TranslationError
from tinbox import translate_document
try:
    result = await translate_document(content, config, translator)
except TranslationError as e:
    print(f"Translation failed: {e}")
Common causes:
API authentication failures
Network errors
Rate limiting
Invalid model configuration
Cost exceeding max_cost threshold
Unknown algorithm specified
Model-specific errors (context length exceeded, etc.)
Examples
Using Different Algorithms
from tinbox import translate_document, load_document, create_translator, TranslationConfig
from pathlib import Path
# Load document once
content = await load_document(Path("document.pdf"))

# Page-by-page (fastest)
config_page = TranslationConfig(
    source_lang="en",
    target_lang="de",
    model="openai",
    model_name="gpt-4o",
    algorithm="page",
    input_file=Path("document.pdf"),
)
translator = create_translator(config_page)
result_page = await translate_document(content, config_page, translator)

# Sliding window (balanced)
config_window = config_page.model_copy(update={
    "algorithm": "sliding-window",
    "window_size": 3000,
    "overlap_size": 300,
})
result_window = await translate_document(content, config_window, translator)

# Context-aware (highest quality)
config_context = config_page.model_copy(update={
    "algorithm": "context-aware",
    "context_size": 2500,
    "use_glossary": True,
})
result_context = await translate_document(content, config_context, translator)

print(f"Page: {result_page.cost:.2f}, {result_page.time_taken:.1f}s")
print(f"Window: {result_window.cost:.2f}, {result_window.time_taken:.1f}s")
print(f"Context: {result_context.cost:.2f}, {result_context.time_taken:.1f}s")
Handling Translation Errors
from tinbox.core.translation import TranslationError
try:
    result = await translate_document(content, config, translator)
    # Check for partial failures
    if result.failed_pages:
        print(f"Warning: {len(result.failed_pages)} pages failed")
        for page in result.failed_pages:
            error = result.page_errors.get(page, "Unknown error")
            print(f"  Page {page}: {error}")
    # Check warnings
    for warning in result.warnings:
        print(f"Warning: {warning}")
except TranslationError as e:
    print(f"Translation failed completely: {e}")
    # Handle total failure
Custom Translation with Context
from tinbox.core.translation import TranslationRequest, create_translator
from tinbox.core.types import ModelType
# Create translator
config = TranslationConfig(...)
translator = create_translator(config)

# Manual translation with custom context
request = TranslationRequest(
    source_lang="en",
    target_lang="fr",
    content="This improves the quality significantly.",
    context="""[PREVIOUS_CHUNK]
We implemented the new algorithm.
[/PREVIOUS_CHUNK]
[PREVIOUS_CHUNK_TRANSLATION]
Nous avons implémenté le nouvel algorithme.
[/PREVIOUS_CHUNK_TRANSLATION]
Use this context to maintain consistency.""",
    content_type="text/plain",
    model=ModelType.ANTHROPIC,
    model_params={"model_name": "claude-3-sonnet"},
)
response = await translator.translate(request)
print(response.text)