Overview
Streaming lets your application receive and display AI responses in real time as they are generated, instead of waiting for the complete response. Users see output sooner, which matters most for long responses.
Quick Start
Basic Streaming
Use the stream() method instead of send() to enable streaming:
Difference: stream() vs send()
Real-time Progress
onMessageProgress Callback
Receive chunks as they arrive:
Complete Message Callback
Get notified when complete messages are received:
Streaming in Web Applications
Laravel HTTP Streaming
Stream responses directly to the browser:
JavaScript Client (Server-Sent Events)
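On the client, a Server-Sent Events stream arrives as `data:` lines separated by blank lines, and one event may be split across two network chunks. The following is a minimal parsing sketch, assuming the server emits standard `data:` fields. The name `createSseParser` is illustrative and not part of any library, and the payload format of the real endpoint may differ.

```javascript
// Minimal Server-Sent Events parser (sketch). It buffers partial input so an
// event split across two network chunks is still delivered exactly once.
// Simplification: only "data:" fields are read, and bare "\r" line endings
// (which the SSE format also permits) are not handled.
function createSseParser(onEvent) {
  let buffer = "";

  return {
    // Feed decoded text as it arrives from the network.
    feed(text) {
      // Normalize on the whole buffer so a "\r\n" split across chunks is caught.
      buffer = (buffer + text).replace(/\r\n/g, "\n");

      let boundary;
      // A blank line terminates one event.
      while ((boundary = buffer.indexOf("\n\n")) !== -1) {
        const rawEvent = buffer.slice(0, boundary);
        buffer = buffer.slice(boundary + 2);

        const dataLines = [];
        for (const line of rawEvent.split("\n")) {
          if (line.startsWith("data:")) {
            // The format allows one optional space after the colon.
            dataLines.push(line.slice(5).replace(/^ /, ""));
          }
        }
        if (dataLines.length > 0) onEvent(dataLines.join("\n"));
      }
    },
  };
}

// Example: the first event is split across two chunks.
const received = [];
const parser = createSseParser((data) => received.push(data));
parser.feed("data: Hel");
parser.feed("lo\n\ndata: world\n\n");
// received is now ["Hello", "world"]
```

In a browser, the built-in `EventSource` API does this parsing for GET endpoints; a hand-rolled parser like this is mainly useful when the stream is read through `fetch()`, for example with a POST request.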
Consume the stream in your frontend:
React Example
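One way to keep a streaming component predictable is to put the state transitions in a pure reducer. The sketch below is an assumption about how such state could be modeled, not the library's own React integration; the action names (`start`, `chunk`, `done`, `error`) are hypothetical.

```javascript
// Pure state transitions for a streaming chat message. No React import is
// needed here; the function has the (state, action) => state shape that
// React's useReducer hook expects.
const initialStreamState = { text: "", status: "idle", error: null };

function streamReducer(state, action) {
  switch (action.type) {
    case "start":
      // Reset any previous response when a new stream begins.
      return { text: "", status: "streaming", error: null };
    case "chunk":
      // Append instead of replace so earlier chunks stay on screen.
      return { ...state, text: state.text + action.text };
    case "done":
      return { ...state, status: "done" };
    case "error":
      // Keep the partial text so the user does not lose what already arrived.
      return { ...state, status: "error", error: action.message };
    default:
      return state;
  }
}

// Simulate the actions a component would dispatch while a stream arrives.
const finalState = [
  { type: "start" },
  { type: "chunk", text: "Hello, " },
  { type: "chunk", text: "world" },
  { type: "done" },
].reduce(streamReducer, initialStreamState);
// finalState.text === "Hello, world" and finalState.status === "done"
```

Inside a component this would typically be wired up with `const [state, dispatch] = useReducer(streamReducer, initialStreamState);`, dispatching a `chunk` action for each piece of text received.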
Use streaming in a React component:
Streaming with Tool Calls
Streaming works seamlessly with tool calls:
Tool Call Progress
Monitor tool execution during streaming:
Advanced Streaming
Custom Data Packets
Access raw streaming data:
Token Statistics During Streaming
Track token usage as the response streams:
Error Handling
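A stream can fail partway through, after some text has already been shown. As a client-side sketch of that case, the helper below consumes any async iterable of text chunks, reports the failure through a callback, and returns the partial text. The names `consumeWithErrorHandling` and `flakyStream` are hypothetical and only illustrate the pattern.

```javascript
// Consume any async iterable of text chunks, reporting failures through a
// callback instead of letting them escape as unhandled rejections.
async function consumeWithErrorHandling(chunks, { onChunk, onError }) {
  let text = "";
  try {
    for await (const chunk of chunks) {
      text += chunk;
      onChunk(chunk);
    }
    return { ok: true, text };
  } catch (err) {
    // Surface a readable message and keep whatever arrived before the failure.
    onError(err instanceof Error ? err.message : String(err));
    return { ok: false, text };
  }
}

// Hypothetical stream that fails midway, standing in for a dropped connection.
async function* flakyStream() {
  yield "partial ";
  yield "answer";
  throw new Error("connection lost");
}

// Usage: show chunks as they arrive and report the failure to the user.
// consumeWithErrorHandling(flakyStream(), {
//   onChunk: (chunk) => console.log(chunk),
//   onError: (message) => console.error("Stream failed:", message),
// });
```

Returning the partial text alongside the `ok` flag lets the interface decide whether to keep the incomplete response visible or offer a retry.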
Handle errors during streaming:
Complete Streaming Example
Here’s a complete Laravel controller with streaming:
Frontend for Complete Example
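A frontend for a streaming endpoint can read the response body incrementally with `fetch()`. The sketch below is not the original example: the `/chat/stream` URL, the `message` field, and the function names are placeholders, and it assumes the endpoint returns UTF-8 text. If the endpoint emits Server-Sent Events, the decoded text would still need to be split into events before display.

```javascript
// Read a fetch() response body incrementally and pass decoded text to a callback.
async function readTextStream(response, onText) {
  const reader = response.body.getReader();
  const decoder = new TextDecoder("utf-8");
  let full = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // { stream: true } keeps multi-byte characters intact across chunk boundaries.
    const text = decoder.decode(value, { stream: true });
    full += text;
    if (text) onText(text);
  }

  // Flush any bytes the decoder was still holding.
  const tail = decoder.decode();
  if (tail) {
    full += tail;
    onText(tail);
  }
  return full;
}

// Hypothetical usage; the URL and request body shape are placeholders.
async function sendPrompt(prompt, outputElement) {
  const response = await fetch("/chat/stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message: prompt }),
  });
  if (!response.ok) throw new Error(`Request failed: ${response.status}`);

  // Append each piece of text to the page as soon as it is decoded.
  await readTextStream(response, (text) => {
    outputElement.textContent += text;
  });
}
```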
Best Practices
Always Use Streaming
Use stream() for better UX, especially for responses longer than a few sentences.
Flush Output
Call flush() after echoing content to ensure immediate browser delivery.
Handle Errors
Wrap streaming in try-catch and provide user feedback on errors.
Set Headers
Use the proper headers for Server-Sent Events: Content-Type: text/event-stream and Cache-Control: no-cache.
Performance Considerations
Optimize Chunk Size
Balance responsiveness against overhead: very small chunks increase per-chunk overhead, while very large chunks delay visible output.
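One way to reduce per-chunk overhead on the consuming side is to coalesce small chunks before acting on them. The sketch below assumes text chunks; `createChunkBatcher` is a hypothetical helper and `minChars` is an illustrative threshold, not a recommended value.

```javascript
// Coalesce small chunks so the consumer is called once per batch instead of
// once per token.
function createChunkBatcher(onBatch, minChars = 32) {
  let pending = "";

  return {
    push(chunk) {
      pending += chunk;
      // Deliver only once enough text has accumulated.
      if (pending.length >= minChars) {
        onBatch(pending);
        pending = "";
      }
    },
    // Call when the stream ends so the final partial batch is not lost.
    flush() {
      if (pending.length > 0) {
        onBatch(pending);
        pending = "";
      }
    },
  };
}

// Example with a threshold of 5 characters.
const batches = [];
const batcher = createChunkBatcher((text) => batches.push(text), 5);
for (const token of ["ab", "cd", "ef", "g"]) batcher.push(token);
batcher.flush();
// batches is now ["abcdef", "g"]
```

A size threshold alone can hold back a short trailing fragment until `flush()` is called, so a real implementation might also deliver on a timer to cap latency.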
Troubleshooting
Stream Not Updating in Browser
Make sure you’re flushing output:
Nginx Buffering
Disable nginx buffering for streaming endpoints:
Large Responses Timeout
Increase PHP execution time for long streams:
Next Steps
Chat API
Learn more about the Chat API features
Extraction
Extract structured data from documents
Embeddings
Generate embeddings for semantic search
API Reference
Explore the complete API documentation