The extract-file command (or simply struktur with no command) processes input files, text, or artifacts and extracts structured data according to a JSON schema using AI models.
Usage
Input options
You must specify exactly one input source:
Path to an input file to parse and extract from
Raw text input to extract from
Read raw text from stdin (auto-detected when piped)
Path to artifact JSON file, or - to read from stdin
Artifact JSON as an inline string
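The "exactly one input source" rule, together with stdin auto-detection, can be sketched roughly as follows. This is an illustrative Python sketch, not struktur's actual implementation: the function name, parameter names, and source labels are hypothetical, and it assumes an explicitly given source wins over piped stdin.

```python
from typing import Optional, Tuple


def pick_input_source(
    file_path: Optional[str] = None,
    text: Optional[str] = None,
    artifact_path: Optional[str] = None,
    artifact_json: Optional[str] = None,
    stdin_is_tty: bool = True,
) -> Tuple[str, Optional[str]]:
    """Return (kind, value) for the single input source in use.

    All names here are hypothetical; they only mirror the five sources
    listed above.
    """
    explicit = {
        "file": file_path,
        "text": text,
        "artifact-file": artifact_path,
        "artifact-json": artifact_json,
    }
    chosen = [(kind, value) for kind, value in explicit.items() if value is not None]
    if len(chosen) > 1:
        kinds = ", ".join(kind for kind, _ in chosen)
        raise ValueError(f"exactly one input source is allowed, got: {kinds}")
    if chosen:
        return chosen[0]
    # Assumption: stdin is only used when nothing explicit was given and
    # input is piped (stdin is not a terminal).
    if not stdin_is_tty:
        return ("stdin", None)
    raise ValueError("no input source given")
```

A caller would typically pass `stdin_is_tty=sys.stdin.isatty()` so that piped input is detected without an explicit option.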
Schema options
You must specify a JSON schema for extraction:
Path to JSON schema file or HTTP(S) URL. Remote schemas are fetched with an application/schema+json Accept header.
JSON schema as an inline string
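To make the remote-schema behaviour concrete, here is a minimal Python sketch of loading a schema from either a local path or an HTTP(S) URL, sending the application/schema+json Accept header for the remote case. The helper names are hypothetical and the sketch only approximates what the CLI is described as doing; it uses the standard library alone.

```python
import json
import urllib.request
from pathlib import Path


def build_schema_request(url: str) -> urllib.request.Request:
    """Build a GET request that asks for a JSON schema document."""
    return urllib.request.Request(
        url, headers={"Accept": "application/schema+json"}
    )


def load_schema(source: str) -> dict:
    """Load a JSON schema from an HTTP(S) URL or a local file path."""
    if source.startswith(("http://", "https://")):
        with urllib.request.urlopen(build_schema_request(source)) as response:
            return json.load(response)
    # Anything else is treated as a path on disk (an assumption of this sketch).
    return json.loads(Path(source).read_text(encoding="utf-8"))
```

Keeping the request construction in its own function makes the header easy to check without touching the network.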
Model and strategy options
Model identifier in the format provider/model (e.g., openai/gpt-4o, anthropic/claude-4-sonnet). If not specified, uses the configured default model or the cheapest model from the first configured provider.
Extraction strategy to use. Available strategies:
simple - Single-pass extraction (default)
parallel - Parallel chunked extraction with merge
sequential - Sequential chunked extraction
parallelAutoMerge - Parallel chunks with automatic deduplication
sequentialAutoMerge - Sequential chunks with automatic deduplication
doublePass - Two-pass extraction with merge
doublePassAutoMerge - Two-pass extraction with automatic deduplication
Token budget per batch for chunked strategies. Only applies to strategies that support chunking.
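The idea of a per-batch token budget can be illustrated with a small greedy batching sketch in Python. This is not struktur's chunking code: the function names are hypothetical, and the token estimate (about four characters per token) is a rough assumption standing in for a real tokenizer.

```python
from typing import Callable, List


def estimate_tokens(text: str) -> int:
    """Rough token estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


def batch_by_token_budget(
    chunks: List[str],
    budget: int,
    count_tokens: Callable[[str], int] = estimate_tokens,
) -> List[List[str]]:
    """Greedily group chunks so each batch stays within the token budget.

    A single chunk larger than the budget still gets a batch of its own.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for chunk in chunks:
        cost = count_tokens(chunk)
        # Start a new batch when adding this chunk would exceed the budget.
        if current and used + cost > budget:
            batches.append(current)
            current, used = [], 0
        current.append(chunk)
        used += cost
    if current:
        batches.append(current)
    return batches
```

Under this model, a parallel strategy would send the batches to the model concurrently and merge the results, while a sequential strategy would process them one after another.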
Output options
Output path for extracted JSON. Use - for stdout (default).
Examples
Extract from a text file
Extract from stdin with inline schema
Use a chunking strategy for large documents
Extract from raw text
Fetch schema from URL
Progress output
When stderr is a TTY, extraction progress is displayed.
Error handling
Schema validation errors
If the extracted data fails schema validation, the error details are displayed.
Environment variables
Provider API keys can be set via environment variables:
OPENAI_API_KEY - OpenAI API key
ANTHROPIC_API_KEY - Anthropic API key
GOOGLE_GENERATIVE_AI_API_KEY - Google Generative AI API key
OPENCODE_API_KEY - OpenCode API key
OPENROUTER_API_KEY - OpenRouter API key
STRUKTUR_CONFIG_DIR - Override config directory (default: ~/.config/struktur)
AI_SDK_LOG_WARNINGS - Enable AI SDK warnings (default: false)
Keys stored with struktur auth set take precedence over environment variables.
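The precedence rule can be sketched as a small lookup in Python: split the provider/model identifier, prefer a stored key, and fall back to the provider's environment variable. Everything here is illustrative. The function names and the `stored_keys` mapping are hypothetical, and the provider identifiers other than openai and anthropic are assumptions, since only those two appear in the examples above.

```python
import os
from typing import Mapping, Optional, Tuple

# Environment variable names come from the list above. The provider
# identifiers "google", "opencode", and "openrouter" are assumed.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_GENERATIVE_AI_API_KEY",
    "opencode": "OPENCODE_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}


def split_model_id(model_id: str) -> Tuple[str, str]:
    """Split 'provider/model' at the first slash."""
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected provider/model, got {model_id!r}")
    return provider, model


def resolve_api_key(
    model_id: str,
    stored_keys: Mapping[str, str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the key for a model, preferring stored keys over the environment."""
    provider, _ = split_model_id(model_id)
    if provider in stored_keys:
        return stored_keys[provider]
    env_var = PROVIDER_ENV_VARS.get(provider)
    return env.get(env_var) if env_var else None
```

Passing `env` explicitly keeps the lookup easy to exercise without modifying the real process environment.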