ExtractionLLMBuilder provides a fluent interface for extracting structured data from documents using different extraction strategies. It’s designed for processing large documents and extracting specific information according to a schema.
Configuration Methods
model
Set the LLM model to use for extraction.
The model identifier string (e.g., 'gpt-4o', 'claude-3-5-sonnet') or an LLM instance
Returns the builder instance for method chaining
schema
Set the JSON schema for the data to extract.
JSON schema array defining the structure of data to extract
Returns the builder instance for method chaining
strategy
Set the extraction strategy to use.
Strategy name: 'simple', 'sequential', 'sequential-auto-merge', 'parallel', 'parallel-auto-merge', 'double-pass', 'double-pass-auto-merge', or a registered custom strategy
Returns the builder instance for method chaining
instructions
Set custom output instructions for the extraction.
Additional instructions to guide the extraction process
Returns the builder instance for method chaining
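As a rough illustration of how the configuration methods chain, here is a minimal sketch. It assumes `$builder` is an ExtractionLLMBuilder obtained elsewhere (this section does not say how one is created). The helper name `configureInvoiceExtraction`, the schema fields, and the instruction text are hypothetical.

```php
<?php

// Hypothetical helper: $builder is assumed to be an ExtractionLLMBuilder.
// How the builder is constructed is not covered in this section.
function configureInvoiceExtraction(object $builder): object
{
    // Illustrative JSON schema in array form; the field names are made up.
    $schema = [
        'type' => 'object',
        'properties' => [
            'invoice_number' => ['type' => 'string'],
            'total' => ['type' => 'number'],
        ],
        'required' => ['invoice_number'],
    ];

    // Each configuration method returns the builder, so calls can be chained.
    return $builder
        ->model('gpt-4o')
        ->schema($schema)
        ->strategy('parallel-auto-merge')
        ->instructions('Report totals as plain numbers without currency symbols.');
}
```

Because every call returns the same builder, the order of these four calls should not matter.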
Artifact Methods
file
Add a file to extract data from.
Path to the file to process
Laravel disk name for file storage (null uses default disk)
Whether to replace existing artifacts (true) or add to them (false)
Returns the builder instance for method chaining
files
Add multiple files to extract data from.
Array of file paths to process
Whether to replace existing artifacts (true) or add to them (false)
Returns the builder instance for method chaining
artifact
Add a custom artifact to extract data from.
An Artifact instance to process
Whether to replace existing artifacts (true) or add to them (false)
Returns the builder instance for method chaining
artifacts
Add multiple custom artifacts.
Array of Artifact instances to process
Whether to replace existing artifacts (true) or add to them (false)
Returns the builder instance for method chaining
getArtifacts
Get the current artifacts array.
Returns an array of all added artifacts
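The replace flag is easiest to see in a short chain. The sketch below assumes `$builder` is an ExtractionLLMBuilder and that arguments are passed positionally in the order the parameters are listed above (path, disk, replace). That order, the file paths, the 's3' disk name, and the helper name `attachContractFiles` are assumptions made for illustration.

```php
<?php

// Hypothetical helper: $builder is assumed to be an ExtractionLLMBuilder.
// Positional argument order mirrors the parameter lists above (an assumption).
function attachContractFiles(object $builder): array
{
    $builder
        // replace = true: discard any artifacts added earlier.
        ->file('contracts/master-agreement.pdf', null, true)
        // replace = false: append to the existing artifacts; 's3' is an example disk.
        ->file('contracts/amendment-1.pdf', 's3', false)
        // files() takes an array of paths plus the same replace flag.
        ->files(['contracts/annex-a.pdf', 'contracts/annex-b.pdf'], false);

    // getArtifacts() returns the artifacts array rather than the builder,
    // so it ends the chain.
    return $builder->getArtifacts();
}
```

Passing `true` on the first call and `false` afterwards gives a predictable starting point when the same builder is reused.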
Processing Options
chunkSize
Set the maximum chunk size in tokens for document splitting.
Maximum tokens per chunk, or null to use the default from configuration
Returns the builder instance for method chaining
contextOptions
Set context filtering and processing options.
ContextOptions instance defining how to process document context
Returns the builder instance for method chaining
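One way the processing options might be applied together is sketched below. It assumes `$builder` is an ExtractionLLMBuilder and that the caller has already built any ContextOptions instance, since its constructor is not described here. The helper name `applyProcessingOptions` is hypothetical.

```php
<?php

// Hypothetical helper: $builder is assumed to be an ExtractionLLMBuilder,
// and $contextOptions (if given) an already-built ContextOptions instance.
function applyProcessingOptions(
    object $builder,
    ?int $maxTokensPerChunk,
    ?object $contextOptions = null
): object {
    // Passing null keeps the default chunk size from configuration.
    $builder->chunkSize($maxTokensPerChunk);

    // Only set context options when the caller supplied an instance.
    if ($contextOptions !== null) {
        $builder->contextOptions($contextOptions);
    }

    return $builder;
}
```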
Callback Methods
onMessage
Set a callback for completed messages during extraction.
Closure that receives each completed Message
Returns the builder instance for method chaining
onMessageProgress
Set a callback for streaming message progress.
Closure that receives partial Messages during streaming
Returns the builder instance for method chaining
onTokenStats
Set a callback for token usage statistics.
Closure that receives TokenStats with usage information
Returns the builder instance for method chaining
onDataProgress
Set a callback for extraction data progress.
Closure that receives extracted data arrays as they are processed
Returns the builder instance for method chaining
onActorTelemetry
Set a callback for actor telemetry data.
Closure that receives ActorTelemetry for each extraction actor
Returns the builder instance for method chaining
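The callback methods can be attached in one chain, as in the following sketch. It assumes `$builder` is an ExtractionLLMBuilder. The closures leave the Message and TokenStats parameters untyped and only record each object's type, because this section does not describe the properties of those objects. The helper name `attachProgressCallbacks` and the log format are hypothetical.

```php
<?php

// Hypothetical helper: $builder is assumed to be an ExtractionLLMBuilder.
// Events are appended to $log, which the caller passes by reference.
function attachProgressCallbacks(object $builder, array &$log): object
{
    return $builder
        ->onMessage(function ($message) use (&$log) {
            // Record only the type; Message fields are not documented here.
            $log[] = 'message: ' . get_debug_type($message);
        })
        ->onTokenStats(function ($stats) use (&$log) {
            $log[] = 'token stats: ' . get_debug_type($stats);
        })
        ->onDataProgress(function (array $data) use (&$log) {
            // The callback receives the extracted data as an array.
            $log[] = 'data progress: ' . count($data) . ' top-level entries';
        });
}
```

onMessageProgress and onActorTelemetry are documented as accepting a Closure in the same way, so they could be added to the same chain.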
Tool Methods
tools
Register tools that the extraction model can call.
Array of tools (closures or InvokableTool instances) with string keys as tool names
Returns the builder instance for method chaining
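A small sketch of registering closure-based tools follows. It assumes `$builder` is an ExtractionLLMBuilder. The tool names, the closure signatures, and the helper name `registerLookupTools` are hypothetical, and this section does not say how tool parameters are described to the model.

```php
<?php

// Hypothetical helper: $builder is assumed to be an ExtractionLLMBuilder.
function registerLookupTools(object $builder): object
{
    return $builder->tools([
        // The string key is the tool name.
        'normalize_currency' => function (string $code): string {
            return strtoupper(trim($code));
        },
        'normalize_date' => function (string $value): ?string {
            // date_create() returns false for input it cannot parse.
            $date = date_create($value);
            return $date ? $date->format('Y-m-d') : null;
        },
    ]);
}
```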
Execution Methods
stream
Execute the extraction with streaming (processes data as it’s extracted).
Returns a Laravel Collection of extracted data items matching the schema
send
Execute the extraction without streaming (waits for complete results).
Returns a Laravel Collection of extracted data items matching the schema
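To show how the two execution methods might be used interchangeably, here is a minimal sketch. It assumes `$builder` is a fully configured ExtractionLLMBuilder. The helper name `runExtraction` and the `$streaming` flag are hypothetical. The code relies only on the returned collection being iterable, which holds for Laravel Collections.

```php
<?php

// Hypothetical helper: $builder is assumed to be a configured ExtractionLLMBuilder.
function runExtraction(object $builder, bool $streaming = false): array
{
    // Both methods are documented as returning a collection of extracted items.
    $items = $streaming ? $builder->stream() : $builder->send();

    // Copy the items into a plain PHP array.
    $rows = [];
    foreach ($items as $item) {
        $rows[] = $item;
    }

    return $rows;
}
```

With `stream()`, progress callbacks such as onMessageProgress are the natural place to observe partial output while the call is still running.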