Installation
Get ScrapeGraphAI up and running in your Python environment. This guide covers installation with pip, virtual environment setup, and post-installation configuration.
Requirements
Before installing ScrapeGraphAI, ensure you have:
- Python 3.10 or higher (up to Python 3.12)
- pip package manager
- Internet connection for downloading dependencies
It is strongly recommended to install ScrapeGraphAI in a virtual environment to avoid conflicts with other libraries.
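As a quick pre-flight check, the requirements above can be tested from Python itself. The following is a minimal sketch using only the standard library; the helper names python_version_supported and in_virtual_environment are illustrative and not part of ScrapeGraphAI.

```python
import sys

# Supported range taken from the requirements above: 3.10 through 3.12.
MIN_VERSION = (3, 10)
MAX_VERSION = (3, 12)


def python_version_supported(version=None):
    """Return True if the (major, minor) version is within the supported range."""
    major_minor = tuple(version or sys.version_info)[:2]
    return MIN_VERSION <= major_minor <= MAX_VERSION


def in_virtual_environment():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    # Note: conda environments are not detected by this comparison.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Python version supported:", python_version_supported())
    print("Running in a virtual environment:", in_virtual_environment())
```

If the second check reports False, activate your environment before installing so the packages do not land in the system interpreter.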
Installation Steps
Create a Virtual Environment (Recommended)
Create an isolated Python environment for your project using Python's built-in venv module. Alternatively, use conda if you prefer.
Install ScrapeGraphAI
Install the library using pip (pip install scrapegraphai). This will install ScrapeGraphAI along with its core dependencies, including:
- langchain and related packages
- beautifulsoup4 for HTML parsing
- playwright for browser automation
- pydantic for data validation
- Other required dependencies
Install Playwright Browsers
This step is critical for fetching website content. Install the Playwright browser binaries with the playwright install command. This downloads the Chromium, Firefox, and WebKit browsers needed for scraping dynamic websites.
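The browser download can also be started through the active interpreter, which helps when the playwright executable is not on your PATH. The sketch below assumes the playwright package is already installed in this environment and can be invoked as python -m playwright; the helper names are illustrative.

```python
import subprocess
import sys


def playwright_install_command(browser=None):
    """Build the command that downloads Playwright browser binaries.

    With no argument, all default browsers are downloaded; passing a name
    such as "chromium" limits the download to that browser.
    """
    command = [sys.executable, "-m", "playwright", "install"]
    if browser:
        command.append(browser)
    return command


def install_browsers(browser=None):
    """Run the download; raises CalledProcessError if the command fails."""
    subprocess.run(playwright_install_command(browser), check=True)
```

Calling install_browsers("chromium") downloads only Chromium, which may be enough if you do not need Firefox or WebKit.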
Optional Dependencies
ScrapeGraphAI offers optional features that require additional packages:
Burr Integration
For advanced workflow visualization and debugging.
NVIDIA AI Integration
For using NVIDIA AI endpoints.
OCR Support
For extracting text from images and PDFs.
LLM Provider Setup
ScrapeGraphAI works with various LLM providers. You’ll need to set up at least one:
OpenAI
- Get an API key from OpenAI Platform
- Set it as an environment variable, or store it in a .env file
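Once the key is available in the environment, a script can read it at runtime instead of hard-coding it. The sketch below assumes the variable is named OPENAI_API_KEY and that the graph configuration takes an llm section with api_key and model entries; the model name is only a placeholder, and build_openai_config is a hypothetical helper.

```python
import os


def build_openai_config(model="openai/gpt-4o-mini"):
    """Return a graph configuration dict using the key from the environment."""
    # Assumed variable name; adjust if your setup uses a different one.
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return {"llm": {"api_key": api_key, "model": model}}
```

Failing early with a clear error is usually easier to diagnose than an authentication failure in the middle of a scraping run.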
Ollama (Local Models)
- Install Ollama from ollama.com
- Download a model
- Ensure Ollama is running
Ollama runs locally and doesn’t require an API key, making it great for development and privacy-sensitive applications.
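To confirm that the local server is reachable before running a scraper, you can probe it over HTTP. This is a minimal sketch that assumes Ollama listens on its default address, http://localhost:11434; the function name is illustrative.

```python
import urllib.request

# Default address of a local Ollama server (an assumption; change it if
# your server is bound to a different host or port).
OLLAMA_URL = "http://localhost:11434"


def ollama_is_running(base_url=OLLAMA_URL, timeout=2.0):
    """Return True if an HTTP server answers with status 200 at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, and urllib's URLError/HTTPError.
        return False
```

A False result usually means the server has not been started or is listening on a different port.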
Other Providers
Environment Configuration
Using python-dotenv
Install python-dotenv to manage environment variables, then create a .env file in your project root.
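Loading the file at the start of a script could look like the sketch below. It uses load_dotenv from python-dotenv when that package is installed; the fallback parser is a simplified illustration, not a replacement for the library, and load_env is a hypothetical helper name.

```python
import os

try:
    from dotenv import load_dotenv  # provided by python-dotenv
except ImportError:
    load_dotenv = None


def load_env(path=".env"):
    """Load KEY=VALUE pairs from path into os.environ without overriding."""
    if load_dotenv is not None:
        load_dotenv(dotenv_path=path)
        return
    # Minimal fallback for illustration only: it strips surrounding quotes
    # but does not handle escapes, multi-line values, or variable expansion.
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            os.environ.setdefault(key.strip(), value.strip().strip("'\""))
```

Call load_env() before reading any keys so that values from the file are visible through os.environ. Keep the .env file out of version control, since it typically holds secrets.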
Troubleshooting
Import Errors
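If you encounter import errors, confirm that your virtual environment is activated and reinstall ScrapeGraphAI with pip.
Playwright Issues
If Playwright browsers are not found, repeat the browser installation step from Install Playwright Browsers.
Version Conflicts
If you have dependency conflicts, install ScrapeGraphAI in a fresh virtual environment.
Telemetry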
ScrapeGraphAI collects anonymous usage metrics to improve the library. To opt out, disable telemetry through an environment variable, set either in your shell or in your .env file.
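One way to opt out from inside a script is to set the variable before the library is imported. The sketch assumes the opt-out variable is named SCRAPEGRAPHAI_TELEMETRY_ENABLED, as in the project README; verify the name against the version you have installed.

```python
import os

# Assumed variable name. Set it before importing scrapegraphai so the
# library sees the value when it initializes.
os.environ["SCRAPEGRAPHAI_TELEMETRY_ENABLED"] = "false"
```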
Verification Script
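A minimal sketch of such a check is shown here. It only confirms that the packages named in this guide can be located for import; the module list is an assumption based on the dependencies listed earlier (beautifulsoup4 is imported as bs4), and the function names are illustrative.

```python
import importlib.util
import sys

# Import names assumed for the packages this guide installs.
REQUIRED_MODULES = ["scrapegraphai", "langchain", "bs4", "playwright", "pydantic"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def main():
    print("Python", sys.version.split()[0])
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")


if __name__ == "__main__":
    main()
```

This does not confirm that Playwright browsers are downloaded or that an LLM provider is reachable; those need to be checked separately.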
Run this script to verify your installation is complete.
Next Steps
Quick Start
Now that you have ScrapeGraphAI installed, build your first scraper!
