Overview
The Docling Haystack integration provides:
- Document conversion for Haystack pipelines
- Support for multiple document formats (PDF, DOCX, PPTX, etc.)
- High-fidelity table and layout extraction
- Easy integration with existing Haystack workflows
Installation
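The integration is distributed as a Python package. Assuming the PyPI distribution is named `docling-haystack` and its import name is `docling_haystack`, it would typically be installed with `pip install docling-haystack`. The sketch below checks whether the module can be found before any pipeline is built; the helper name `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(module_name: str = "docling_haystack") -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if not is_installed():
    # Assumption: the PyPI distribution name is "docling-haystack".
    print("docling_haystack not found; try: pip install docling-haystack")
```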
Quick Start
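A minimal sketch of standalone conversion, assuming the package exposes `DoclingConverter` and `ExportType` in `docling_haystack.converter` and that the component's `run` method accepts `paths` and returns a dictionary with a `documents` key. The helper names `existing_paths` and `convert_to_documents` are hypothetical, and the Docling import is deferred so the module still loads when the package is absent.

```python
from pathlib import Path
from typing import Iterable, List


def existing_paths(candidates: Iterable[str]) -> List[str]:
    """Keep only local files that exist, so the converter is not handed missing paths."""
    return [p for p in candidates if Path(p).is_file()]


def convert_to_documents(paths: List[str]):
    """Convert local files into Haystack documents (requires docling-haystack)."""
    # Deferred import: only needed when a conversion is actually requested.
    from docling_haystack.converter import DoclingConverter, ExportType

    # Assumption: ExportType.MARKDOWN yields one Markdown document per input file.
    converter = DoclingConverter(export_type=ExportType.MARKDOWN)
    return converter.run(paths=paths)["documents"]
```

Calling `convert_to_documents(existing_paths(["report.pdf"]))` would then return a list of Haystack documents, where `report.pdf` stands in for any local file.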
In a Haystack pipeline, Docling acts as the converter component that turns source files into Haystack documents.
Building a RAG Pipeline
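The indexing half of a RAG setup could look like the sketch below: Docling converts and chunks the files, and a writer stores the chunks. It assumes Haystack 2.x (`Pipeline`, `DocumentWriter`) and the `DoclingConverter` and `ExportType` names from `docling_haystack.converter`. The function names are hypothetical, and an embedding step is deliberately left out to keep the example small.

```python
from typing import Dict, Iterable, List


def indexing_inputs(paths: Iterable[str]) -> Dict[str, Dict[str, List[str]]]:
    """Build the run() payload that routes file paths to the converter component."""
    return {"converter": {"paths": list(paths)}}


def build_indexing_pipeline(document_store):
    """Wire converter -> writer (requires haystack-ai and docling-haystack)."""
    # Deferred imports keep this module loadable without the packages installed.
    from docling_haystack.converter import DoclingConverter, ExportType
    from haystack import Pipeline
    from haystack.components.writers import DocumentWriter

    pipeline = Pipeline()
    # Assumption: DOC_CHUNKS makes the converter emit one document per chunk.
    pipeline.add_component(
        "converter", DoclingConverter(export_type=ExportType.DOC_CHUNKS)
    )
    pipeline.add_component("writer", DocumentWriter(document_store=document_store))
    pipeline.connect("converter.documents", "writer.documents")
    return pipeline
```

With an `InMemoryDocumentStore` from `haystack.document_stores.in_memory`, indexing would be run as `build_indexing_pipeline(store).run(indexing_inputs(["report.pdf"]))`.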
Advanced Configuration
Custom Conversion Options
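One way to customize conversion is to configure a Docling `DocumentConverter` and hand it to the Haystack component. The sketch assumes `DoclingConverter` accepts a `converter` argument and that Docling's `PdfPipelineOptions` exposes `do_ocr` and `do_table_structure`. The helper names are hypothetical.

```python
from typing import Any, Dict


def pdf_pipeline_kwargs(ocr: bool = True, tables: bool = True) -> Dict[str, Any]:
    """Collect the PDF options this sketch toggles: OCR and table structure."""
    return {"do_ocr": ocr, "do_table_structure": tables}


def build_custom_converter(ocr: bool = True, tables: bool = True):
    """Create a DoclingConverter with custom PDF options (requires docling-haystack)."""
    # Deferred imports: docling and docling-haystack are only needed here.
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption
    from docling_haystack.converter import DoclingConverter, ExportType

    pipeline_options = PdfPipelineOptions(**pdf_pipeline_kwargs(ocr, tables))
    docling_converter = DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
        }
    )
    # Assumption: the Haystack component reuses the converter passed in here.
    return DoclingConverter(
        converter=docling_converter, export_type=ExportType.MARKDOWN
    )
```

Disabling OCR with `build_custom_converter(ocr=False)` may be a reasonable choice for born-digital PDFs, where the text layer is already present.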
Batch Processing
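For large collections, one possible approach is to feed the converter fixed-size groups of paths, so that a failure affects a single group rather than the whole run. The sketch below is not a built-in batching API: `batched` and `convert_in_batches` are hypothetical helpers, and the `DoclingConverter` usage assumes `run(paths=...)` returns a `documents` list.

```python
from typing import Iterable, Iterator, List


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    materialized = list(items)
    for start in range(0, len(materialized), size):
        yield materialized[start:start + size]


def convert_in_batches(paths: Iterable[str], batch_size: int = 8):
    """Convert paths group by group (requires docling-haystack)."""
    # Deferred import keeps the batching helper usable on its own.
    from docling_haystack.converter import DoclingConverter, ExportType

    converter = DoclingConverter(export_type=ExportType.DOC_CHUNKS)
    documents = []
    for batch in batched(paths, batch_size):
        # One converter instance is reused so models are loaded only once.
        documents.extend(converter.run(paths=batch)["documents"])
    return documents
```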
Features
Pipeline Integration
Seamlessly integrates into Haystack pipelines
Multi-Format Support
Supports PDF, DOCX, PPTX, HTML, and more
Table Extraction
Accurately extracts table structures
OCR Support
Process scanned documents and images
Complete RAG Application
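The query side of such an application could be sketched as follows, assuming Haystack 2.x with an in-memory store that already holds documents produced by Docling. The generator is passed in by the caller and is assumed to accept a `prompt` input, as Haystack's non-chat generators do. The template text and function names are illustrative, not taken from the official example.

```python
from typing import Dict

# Jinja template rendered by PromptBuilder; `documents` and `question` are its inputs.
PROMPT_TEMPLATE = """Answer the question using only the context below.

Context:
{% for doc in documents %}
{{ doc.content }}
{% endfor %}

Question: {{ question }}
Answer:"""


def query_inputs(question: str) -> Dict[str, Dict[str, str]]:
    """Build the run() payload: the question goes to both retriever and prompt."""
    return {
        "retriever": {"query": question},
        "prompt_builder": {"question": question},
    }


def build_query_pipeline(document_store, generator):
    """Wire retriever -> prompt builder -> generator (requires haystack-ai)."""
    # Deferred imports keep this module loadable without Haystack installed.
    from haystack import Pipeline
    from haystack.components.builders import PromptBuilder
    from haystack.components.retrievers.in_memory import InMemoryBM25Retriever

    pipeline = Pipeline()
    pipeline.add_component(
        "retriever", InMemoryBM25Retriever(document_store=document_store)
    )
    pipeline.add_component("prompt_builder", PromptBuilder(template=PROMPT_TEMPLATE))
    pipeline.add_component("llm", generator)
    pipeline.connect("retriever.documents", "prompt_builder.documents")
    # Assumption: the generator component exposes a `prompt` input socket.
    pipeline.connect("prompt_builder.prompt", "llm.prompt")
    return pipeline
```

Running `build_query_pipeline(store, generator).run(query_inputs("What does the report conclude?"))` would return the generator's output under the `llm` key. BM25 retrieval is used here only because it needs no embedding model, and a dense retriever could be swapped in.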
Use Cases
Resources
Documentation
Official Haystack integration docs
GitHub
Source code and examples
Example Notebook
Complete RAG example
PyPI
Package repository
Next Steps
- Explore the Haystack documentation
- Check out the example notebook
- Learn about pipeline options
- Build your first Haystack application