Using the CLI

The Struktur CLI provides a complete interface for running extractions, managing provider tokens, and querying available models without writing code.

Installation

The CLI is included with the Struktur package:

bun install @mateffy/struktur

Run directly with bun:

bun struktur --help

Or use the global binary if installed:

struktur --help

Commands

Struktur provides four main commands:

extract-file: Extract structured data (default command)
verify: Validate artifact JSON format
auth: Manage provider API tokens
models: List available models

extract-file command

The default command for running extractions.

Basic usage

struktur extract-file \
  --input document.pdf \
  --schema schema.json \
  --model openai/gpt-4o-mini

Input options

Struktur accepts multiple input formats:

# Extract from a file
struktur extract-file \
  --input document.pdf \
  --schema schema.json

Schema options

# Schema from file
struktur extract-file \
  --input document.pdf \
  --schema schema.json

Model selection

# Explicit model
struktur extract-file \
  --input document.pdf \
  --schema schema.json \
  --model anthropic/claude-3-5-haiku-20241022

# Use configured default (if set)
struktur extract-file \
  --input document.pdf \
  --schema schema.json

# Auto-select cheapest from configured providers
# (no --model flag, no configured default)
struktur extract-file \
  --input document.pdf \
  --schema schema.json

Model format: provider/model-name

Examples:

openai/gpt-4o-mini
anthropic/claude-3-5-haiku-20241022
google/gemini-1.5-flash
openrouter/anthropic/claude-3.5-sonnet
opencode/gpt-5-nano
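
Scripts that accept a model identifier can check its shape before calling the CLI. The sketch below assumes the provider is everything before the first slash, which is consistent with the openrouter example above but is not confirmed by the CLI itself. All three function names are hypothetical.

```shell
# Hypothetical helpers; assumes the provider is the text before the first "/".

# Print the provider part of a "provider/model-name" identifier.
model_provider() {
  printf '%s\n' "${1%%/*}"
}

# Print everything after the first "/" (this part may itself contain slashes).
model_name() {
  printf '%s\n' "${1#*/}"
}

# Succeed only if the identifier has a non-empty provider and model name.
is_model_id() {
  case "$1" in
    */*) [ -n "${1%%/*}" ] && [ -n "${1#*/}" ] ;;
    *) return 1 ;;
  esac
}
```

With these definitions, model_provider openrouter/anthropic/claude-3.5-sonnet prints openrouter, and model_name prints anthropic/claude-3.5-sonnet.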

Strategy selection

# Simple strategy (default)
struktur extract-file \
  --input document.pdf \
  --schema schema.json

# Parallel strategy
struktur extract-file \
  --input document.pdf \
  --schema schema.json \
  --strategy parallel \
  --chunk-size 10000

# Sequential strategy
struktur extract-file \
  --input document.pdf \
  --schema schema.json \
  --strategy sequential \
  --chunk-size 8000

# Auto-merge strategies
struktur extract-file \
  --input document.pdf \
  --schema schema.json \
  --strategy parallelAutoMerge \
  --chunk-size 10000

# Double-pass strategies
struktur extract-file \
  --input document.pdf \
  --schema schema.json \
  --strategy doublePass \
  --chunk-size 10000

Available strategies:

simple: Single-pass extraction (default)
parallel: Concurrent batches with LLM merge
sequential: Sequential batches with context
parallelAutoMerge: Parallel with schema-aware merge
sequentialAutoMerge: Sequential with schema-aware merge
doublePass: Parallel then sequential refinement
doublePassAutoMerge: Auto-merge then sequential refinement
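
A wrapper script can reject a misspelled strategy before any model call is made. This is a minimal sketch built only from the seven names listed above. The helper is_known_strategy is hypothetical, and the sketch assumes the CLI treats strategy names as case-sensitive.

```shell
# Hypothetical helper: succeed only for a strategy name documented above.
is_known_strategy() {
  case "$1" in
    simple|parallel|sequential) return 0 ;;
    parallelAutoMerge|sequentialAutoMerge) return 0 ;;
    doublePass|doublePassAutoMerge) return 0 ;;
    *) return 1 ;;
  esac
}
```

A script might call it as: is_known_strategy "$strategy" || { echo "unknown strategy: $strategy" >&2; exit 1; }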

Output options

# Output to stdout (default)
struktur extract-file \
  --input document.pdf \
  --schema schema.json

# Output to file
struktur extract-file \
  --input document.pdf \
  --schema schema.json \
  --output result.json

# Pipe to another command
struktur extract-file \
  --input document.pdf \
  --schema schema.json | jq '.title'

Chunk size configuration

For strategies that support chunking:

struktur extract-file \
  --input large-document.pdf \
  --schema schema.json \
  --strategy parallel \
  --chunk-size 15000

Default chunk size: 10,000 tokens

Adjust based on:

Model context window size
Document complexity
Cost vs. accuracy tradeoffs
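
As a rough planning aid, the number of chunks a document produces is its token count divided by the chunk size, rounded up. The sketch below assumes non-overlapping chunks and a token count you already know or have estimated. The helper estimate_chunks is hypothetical, and the CLI's real chunking may differ.

```shell
# Hypothetical helper: ceil(tokens / chunk_size) using integer arithmetic.
# The second argument defaults to the documented chunk size of 10,000 tokens.
estimate_chunks() {
  tokens=$1
  chunk_size=${2:-10000}
  echo $(( (tokens + chunk_size - 1) / chunk_size ))
}
```

For example, estimate_chunks 45000 15000 prints 3, while estimate_chunks 45001 15000 prints 4. Fewer, larger chunks mean fewer model calls, but each call needs more of the model's context window.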

Complete example

struktur extract-file \
  --input research-paper.pdf \
  --schema paper-schema.json \
  --model anthropic/claude-3-5-haiku-20241022 \
  --strategy parallel \
  --chunk-size 12000 \
  --output extracted-data.json

auth command

Manage API tokens for AI providers.

auth set

Store a provider token:

# Set token with command-line argument
struktur auth set --provider openai --token sk-...

# Or read from stdin
echo "sk-..." | struktur auth set --provider openai --token-stdin

# Set as default provider
struktur auth set --provider openai --token sk-... --default

The --default flag automatically configures the cheapest model from that provider as your default model.
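
Passing a token with --token leaves it in your shell history, so the stdin form is often preferable. The sketch below is one way to prompt for a token without echoing it and then pipe it to auth set with --token-stdin. The helper set_token_from_prompt is hypothetical. The sketch sends the token followed by a newline, just as the echo example above does.

```shell
# Hypothetical helper: read a token from stdin, hiding keystrokes when stdin
# is a terminal, then pass it to `struktur auth set --token-stdin`.
set_token_from_prompt() {
  provider=$1
  if [ -t 0 ]; then
    printf 'Token for %s: ' "$provider" >&2
    stty -echo            # stop the terminal from echoing keystrokes
    IFS= read -r token
    stty echo             # restore echo
    printf '\n' >&2
  else
    IFS= read -r token    # piped input: nothing to hide
  fi
  printf '%s\n' "$token" | struktur auth set --provider "$provider" --token-stdin
}
```

Run set_token_from_prompt openai and paste the token at the prompt. If the script is interrupted while echo is off, running stty echo restores the terminal.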

Storage options:

# Auto-detect storage (default)
struktur auth set --provider openai --token sk-...

# Force keychain storage (macOS only)
struktur auth set --provider openai --token sk-... --storage keychain

# Force file storage
struktur auth set --provider openai --token sk-... --storage file

Supported providers:

openai
anthropic
google
opencode
openrouter

auth default

Set or update your default model:

# Use cheapest model from a provider
struktur auth default openai

# Use a specific model
struktur auth default --model anthropic/claude-3-5-haiku-20241022

Output:

{
  "defaultModel": "openai/gpt-4o-mini"
}

auth get

Retrieve a stored token:

# Masked output (default)
struktur auth get --provider openai
# Output: sk-ab...xy12

# Raw token
struktur auth get --provider openai --raw
# Output: sk-abcdef1234567890...

auth list

List all configured providers:

struktur auth list

Output:

{
  "providers": [
    { "provider": "openai", "storage": "keychain" },
    { "provider": "anthropic", "storage": "file" },
    { "provider": "google", "storage": "keychain" }
  ]
}

auth delete

Remove a provider token:

struktur auth delete --provider openai

Output:

{
  "provider": "openai",
  "deleted": true
}

models command

Query available models from providers.

List all providers

struktur models --all

Output:

{
  "providers": [
    {
      "provider": "openai",
      "ok": true,
      "models": [
        "gpt-4o",
        "gpt-4o-mini",
        "gpt-4-turbo",
        "gpt-3.5-turbo"
      ]
    },
    {
      "provider": "anthropic",
      "ok": true,
      "models": [
        "claude-3-5-sonnet-20241022",
        "claude-3-5-haiku-20241022",
        "claude-3-opus-20240229"
      ]
    },
    {
      "provider": "google",
      "ok": false,
      "error": "No token available"
    }
  ]
}
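
Because the output is JSON, jq can pull out the providers that failed. The sketch below relies only on the provider, ok, and error fields shown in the sample output above. The helper failed_providers is hypothetical and requires jq to be installed.

```shell
# Hypothetical helper: read `struktur models --all` JSON on stdin and print
# one "provider: error" line for each provider whose "ok" field is false.
failed_providers() {
  jq -r '.providers[] | select(.ok | not) | "\(.provider): \(.error)"'
}
```

Usage: struktur models --all | failed_providers. For the sample output above, this prints google: No token available.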

List specific provider

struktur models --provider openai

Output:

{
  "providers": [
    {
      "provider": "openai",
      "ok": true,
      "models": [
        "gpt-4o",
        "gpt-4o-mini",
        "gpt-4-turbo",
        "gpt-3.5-turbo"
      ]
    }
  ]
}

verify command

Validate artifact JSON format.

# Verify file
struktur verify --input artifacts.json

# Verify from stdin
cat artifacts.json | struktur verify

Output on success:

{
  "valid": true,
  "artifacts": 3
}

Output on failure:

Error: Invalid artifact format: ...

Progress indicators

The CLI automatically shows a progress bar when running in a TTY:

◈ ▰▰▰▰▰▰▰▰▰▰▱▱▱▱▱▱▱▱▱▱ 50% | processing 5/10

Progress is hidden when:

Output is redirected to a file
Piping to another command
Running in non-interactive mode
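
Your own scripts can make the same distinction. The sketch below uses the standard shell test for whether stdout is a terminal. Whether the CLI checks stdout, stderr, or both is not stated here, so treat this as an illustration of the general technique rather than the CLI's actual logic. The helper output_mode is hypothetical.

```shell
# Hypothetical helper: report whether stdout is attached to a terminal.
output_mode() {
  if [ -t 1 ]; then
    echo "interactive"
  else
    echo "non-interactive"
  fi
}
```

Calling output_mode directly in a terminal prints interactive. Calling it as output_mode | cat, or with stdout redirected to a file, yields non-interactive.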

Environment variables

The CLI respects standard environment variables:

# Provider API keys
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_GENERATIVE_AI_API_KEY="..."
export OPENROUTER_API_KEY="sk-or-..."
export OPENCODE_API_KEY="..."

# Config directory (default: ~/.config/struktur)
export STRUKTUR_CONFIG_DIR="/custom/path"

# Disable keychain on macOS
export STRUKTUR_DISABLE_KEYCHAIN=1

# Custom keychain service name (default: struktur)
export STRUKTUR_KEYCHAIN_SERVICE="my-app"
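
Before a long run, it can help to see what the environment will supply. The sketch below uses only the variable names and the default config directory listed above. Both helpers are hypothetical, and the sketch makes no assumption about whether environment keys or stored tokens take precedence.

```shell
# Hypothetical helper: print the config directory, falling back to the
# documented default of ~/.config/struktur.
struktur_config_dir() {
  printf '%s\n' "${STRUKTUR_CONFIG_DIR:-$HOME/.config/struktur}"
}

# Hypothetical helper: print each provider whose API key variable is set
# and non-empty, one per line.
providers_with_env_keys() {
  if [ -n "${OPENAI_API_KEY:-}" ]; then echo openai; fi
  if [ -n "${ANTHROPIC_API_KEY:-}" ]; then echo anthropic; fi
  if [ -n "${GOOGLE_GENERATIVE_AI_API_KEY:-}" ]; then echo google; fi
  if [ -n "${OPENROUTER_API_KEY:-}" ]; then echo openrouter; fi
  if [ -n "${OPENCODE_API_KEY:-}" ]; then echo opencode; fi
  return 0
}
```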

Complete workflow example

Configure your provider

struktur auth set --provider anthropic --token sk-ant-... --default

Create a schema file

schema.json

{
  "type": "object",
  "properties": {
    "title": { "type": "string" },
    "authors": {
      "type": "array",
      "items": { "type": "string" }
    },
    "abstract": { "type": "string" },
    "citations": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "title": { "type": "string" },
          "year": { "type": "number" }
        },
        "required": ["title"]
      }
    }
  },
  "required": ["title", "authors"],
  "additionalProperties": false
}

Run the extraction

struktur extract-file \
  --input research-paper.pdf \
  --schema schema.json \
  --strategy parallel \
  --chunk-size 12000 \
  --output result.json

Process the results

# Pretty-print results
jq . result.json

# Extract specific fields
jq '.title, .authors' result.json

# Count citations
jq '.citations | length' result.json

Error handling

The CLI exits with status code 1 on errors and writes error messages to stderr:

struktur extract-file --input missing.pdf --schema schema.json
# Error: ENOENT: no such file or directory
# Exit code: 1

struktur auth get --provider unconfigured
# Error: No token stored for provider: unconfigured
# Exit code: 1

This allows for proper error handling in scripts:

if struktur extract-file --input doc.pdf --schema schema.json > output.json; then
  echo "Extraction succeeded"
  cat output.json | jq .
else
  echo "Extraction failed"
  exit 1
fi
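
Since error messages go to stderr, a script can also keep them in a log while preserving the exit code. The following is a minimal sketch using only the flags shown in this guide. The helper extract_with_log and its argument order are hypothetical.

```shell
# Hypothetical helper: run an extraction, save stderr to a log file, and
# return the CLI's exit code unchanged.
# Arguments: input file, schema file, output file, log file.
extract_with_log() {
  input=$1
  schema=$2
  output=$3
  log=$4
  if struktur extract-file --input "$input" --schema "$schema" \
      --output "$output" 2>"$log"; then
    return 0
  else
    status=$?   # exit code of the failed struktur call
    printf 'extraction of %s failed (exit %s), see %s\n' \
      "$input" "$status" "$log" >&2
    return "$status"
  fi
}
```

Usage: extract_with_log doc.pdf schema.json output.json extract.log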

Tips and tricks

Batch processing

mkdir -p results  # create the output directory if it does not exist
for file in documents/*.pdf; do
  name=$(basename "$file" .pdf)
  struktur extract-file \
    --input "$file" \
    --schema schema.json \
    --output "results/${name}.json"
done
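
For larger batches, you may want the loop to continue past a failed document and report the failures at the end. The sketch below relies on the non-zero exit code described under Error handling. The helper batch_extract and its arguments are hypothetical.

```shell
# Hypothetical helper: extract every PDF in a directory, continue past
# failures, and return non-zero if any document failed.
# Arguments: source directory, schema file, output directory.
batch_extract() {
  src=$1
  schema=$2
  out=$3
  failed=0
  mkdir -p "$out"
  for file in "$src"/*.pdf; do
    [ -e "$file" ] || continue   # the glob matched nothing
    name=$(basename "$file" .pdf)
    if ! struktur extract-file --input "$file" --schema "$schema" \
        --output "$out/$name.json"; then
      echo "failed: $file" >&2
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
}
```

Usage: batch_extract documents schema.json results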

Using with jq for schema generation

# Generate schema from example JSON
echo '{"title": "Example", "count": 42}' | \
  jq '{type: "object", properties: (to_entries | map({key: .key, value: {type: (.value | type)}}) | from_entries), required: keys}' > schema.json

Piping between commands

# Extract, then filter, then format
struktur extract-file \
  --input document.pdf \
  --schema schema.json | \
  jq '.citations[] | select(.year > 2020)' | \
  jq -s '.' > recent-citations.json
