Overview
Memori works seamlessly with streaming responses from LLM providers. Memories are captured even when responses are streamed chunk by chunk.
Basic Streaming
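A minimal sketch of the basic pattern: the `fake_stream` generator below stands in for a real provider stream (for example, an OpenAI chat completion with `stream=True`), since Memori captures memories through interception rather than through explicit calls inside the loop.

```python
def fake_stream():
    """Stand-in for a provider's streaming response (e.g. OpenAI with stream=True)."""
    for chunk in ["Hello", ", ", "world", "!"]:
        yield chunk

# Consume the stream chunk by chunk, accumulating the full reply.
# With Memori enabled, no extra calls are needed in this loop:
# memory capture happens once the stream has been fully consumed.
full_reply = ""
for chunk in fake_stream():
    full_reply += chunk
    print(chunk, end="", flush=True)  # render incrementally

print()
```

Swapping the stand-in for a real provider stream leaves the consumption loop unchanged.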
Streaming with Async
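The same pattern works with async iteration. This sketch uses an async generator as a stand-in for an awaited provider stream; the `async for` loop is the part that carries over to real SDK usage.

```python
import asyncio

async def fake_async_stream():
    """Stand-in for an async provider stream."""
    for chunk in ["Async ", "streaming ", "works."]:
        await asyncio.sleep(0)  # yield control, simulating network latency
        yield chunk

async def main() -> str:
    reply = ""
    # Consume the async stream to completion so memory capture can run.
    async for chunk in fake_async_stream():
        reply += chunk
    return reply

result = asyncio.run(main())
print(result)
```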
Real-Time Web Application
Here’s how to build a streaming chat application with Memori.
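For a web application, streamed chunks are typically relayed to the browser as Server-Sent Events. The helper below sketches only the SSE wire framing a FastAPI `StreamingResponse` (or an Express `res.write`) would emit; the framework wiring and route definitions are omitted, and the `[DONE]` sentinel is an illustrative convention, not a Memori requirement.

```python
import json

def sse_event(data: dict) -> str:
    """Frame one chunk as a Server-Sent Events message."""
    return f"data: {json.dumps(data)}\n\n"

def stream_body(chunks):
    """Yield one SSE frame per model chunk, then a done sentinel."""
    for chunk in chunks:
        yield sse_event({"content": chunk})
    yield "data: [DONE]\n\n"

body = "".join(stream_body(["Hi", " there"]))
print(body)
```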
Client-Side Integration
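On the receiving side, the client parses the same SSE frames back into content chunks. This is a framework-agnostic sketch of that parsing step (shown in Python for consistency with the other examples; a browser client would do the equivalent with `EventSource` or a fetch reader).

```python
import json

def parse_sse(stream_text: str):
    """Extract content chunks from an SSE body, stopping at the done sentinel."""
    chunks = []
    for line in stream_text.splitlines():
        if not line.startswith("data: "):
            continue  # skip blank separators and comments
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunks.append(json.loads(payload)["content"])
    return chunks

text = 'data: {"content": "Hi"}\n\ndata: {"content": " there"}\n\ndata: [DONE]\n\n'
print(parse_sse(text))
```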
Streaming with Anthropic
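Anthropic's streaming API emits typed events rather than bare text chunks, so consumption means accumulating text deltas. The event tuples below are a simplified stand-in for the real SDK's event objects; the accumulation logic is the part that transfers.

```python
from dataclasses import dataclass

@dataclass
class TextDelta:
    text: str

def fake_anthropic_events():
    """Stand-in for an Anthropic message stream, which interleaves
    text deltas with lifecycle events."""
    yield ("content_block_delta", TextDelta("Claude "))
    yield ("content_block_delta", TextDelta("says hi."))
    yield ("message_stop", None)

reply = ""
for event_type, delta in fake_anthropic_events():
    if event_type == "content_block_delta":
        reply += delta.text  # accumulate only the text deltas

print(reply)
```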
How Memory Works with Streaming
Request Initiated
Memori intercepts the streaming request and recalls relevant memories before the stream begins.
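The lifecycle described above (recall before the stream begins, capture after it completes) can be sketched as plain function calls. All names here are illustrative stand-ins for what Memori's interception does automatically; the point is the ordering.

```python
events = []  # records the order of operations

def recall_memories(prompt: str):
    """Illustrative stand-in: runs before the stream begins."""
    events.append("recall")
    return ["user prefers short answers"]

def stream_response():
    """Illustrative stand-in for the provider stream."""
    for chunk in ["ok", "!"]:
        events.append("chunk")
        yield chunk

def capture_memory(full_reply: str):
    """Illustrative stand-in: runs after the stream completes
    (asynchronously, in practice)."""
    events.append("capture")

memories = recall_memories("hi")     # 1. recall before streaming
full = "".join(stream_response())    # 2. chunks flow to the caller
capture_memory(full)                 # 3. capture after completion

print(events)
```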
Best Practices
Handle Stream Completion
Always process the stream to completion. Prematurely closing the stream may prevent memory capture.
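Why completion matters can be seen with a toy generator: any work placed after the yield loop only runs if the consumer exhausts the stream, which mirrors why closing early may prevent capture.

```python
captured = []

def stream_with_capture(chunks):
    """Generator whose post-loop step (a stand-in for memory capture)
    only runs if the stream is consumed to completion."""
    reply = ""
    for chunk in chunks:
        reply += chunk
        yield chunk
    captured.append(reply)  # reached only when the stream is exhausted

# Fully consumed: the capture step runs.
list(stream_with_capture(["a", "b"]))

# Closed early: GeneratorExit skips the post-loop capture step.
partial = stream_with_capture(["c", "d"])
next(partial)
partial.close()

print(captured)  # only the fully-consumed reply was captured
```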
Wait for Augmentation
For short-lived applications, call mem.augmentation.wait() after streaming to ensure memories are saved.
Error Handling
Wrap streaming logic in try-catch blocks. Network errors during streaming won’t affect memory capture.
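A sketch of that error-handling shape: the stand-in stream below fails partway through, and the try/except keeps whatever was streamed before the failure while surfacing the error to the user.

```python
def flaky_stream():
    """Stand-in for a stream that drops mid-response."""
    yield "partial "
    yield "answer"
    raise ConnectionError("connection dropped mid-stream")

reply = ""
try:
    for chunk in flaky_stream():
        reply += chunk
except ConnectionError as exc:
    # Surface the error to the user; per the note above, a network
    # failure here does not affect Memori's memory capture.
    print(f"stream failed: {exc}")

print(reply)
```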
Performance
Streaming with Memori adds minimal latency. Memory processing happens asynchronously after the stream.
Next Steps
Async Operations: Learn about async memory operations.
Custom Embeddings: Use custom embedding models.