Overview
The Bright Data MCP Server provides:

- Real-time Web Access - Access up-to-date information directly from the web
- Bypass Geo-restrictions - Access content regardless of location constraints
- Web Unlocker Technology - Navigate websites with advanced bot detection protection
- Browser Control - Optional remote browser automation capabilities
- Seamless Integration - Works with all MCP-compatible AI assistants
Prerequisites
- Node.js 20+ and pnpm installed
- Bright Data account (sign up - new users get free credits)
Environment Variables
Backend Configuration
Add these environment variables to your backend `.env` file:

backend/.env

- Add your Bright Data API token to the `.env` file as `BRIGHT_DATA_API_TOKEN`
- For remote browser automation, add your browser credentials to the `.env` file as `BRIGHT_DATA_BROWSER_AUTH`

By default, DecipherIt creates a Web Unlocker zone automatically using your API token. For advanced use cases, you can configure your own zone instead.
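A minimal sketch of the expected entries (the values below are placeholders; check your Bright Data dashboard for the exact credential format):

```bash
# Required: your Bright Data API token
BRIGHT_DATA_API_TOKEN=your-api-token-here

# Optional: credentials for remote browser automation
BRIGHT_DATA_BROWSER_AUTH=your-browser-credentials-here
```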
Integration Implementation
The Bright Data MCP Server is integrated using CrewAI's `MCPServerAdapter`:
backend/agents/topic_research_agent.py
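A minimal sketch of the adapter wiring, assuming the MCP server is launched over stdio. The package name `@brightdata/mcp`, the launch command, and the env variable names passed to the server are assumptions; verify them against the Bright Data MCP README. It requires `crewai`, `crewai-tools`, `mcp`, and a valid token, so treat it as illustrative rather than runnable as-is:

```python
import os

from mcp import StdioServerParameters
from crewai import Agent
from crewai_tools import MCPServerAdapter

# Launch the Bright Data MCP server as a subprocess over stdio.
# Adjust the command if your setup uses pnpm instead of npx.
server_params = StdioServerParameters(
    command="npx",
    args=["-y", "@brightdata/mcp"],
    env={"API_TOKEN": os.environ["BRIGHT_DATA_API_TOKEN"]},
)

with MCPServerAdapter(server_params) as mcp_tools:
    # Expose only the tools the research agents actually need.
    research_tools = [
        t for t in mcp_tools if t.name in ("search_engine", "scrape_as_markdown")
    ]
    agent = Agent(
        role="Link Discovery Specialist",
        goal="Find relevant, authoritative sources for the topic",
        backstory="An expert researcher with unrestricted web access.",
        tools=research_tools,
    )
```

The context manager ensures the MCP server subprocess is shut down cleanly when the crew finishes.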
Available Tools
DecipherIt leverages two key tools from the Bright Data MCP server:

search_engine
Search the web for relevant information and discover sources:

backend/config/topic_research/tasks.py
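A hypothetical task definition (the field names and wording below are assumptions, not DecipherIt's actual configuration) showing how a CrewAI-style task can direct an agent to the search_engine tool:

```python
# Illustrative task spec: the agent is told explicitly which MCP tool to use.
link_collection_task = {
    "description": (
        "Use the search_engine tool to run each planned search query and "
        "collect the most relevant, authoritative URLs for the topic."
    ),
    "expected_output": "A deduplicated list of source URLs with titles.",
    "agent": "link_collector",
}

print(link_collection_task["agent"])
```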
scrape_as_markdown
Extract and convert web content to clean, structured Markdown format:

backend/config/topic_research/tasks.py
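Because scrape_as_markdown returns clean Markdown, downstream code can parse it with simple patterns. The helper below is a hypothetical illustration (not DecipherIt's actual code) of extracting links from a scraped page:

```python
import re


def extract_links(markdown_text: str) -> list[tuple[str, str]]:
    """Pull (title, url) pairs out of Markdown returned by scrape_as_markdown."""
    return re.findall(r"\[([^\]]+)\]\((https?://[^)\s]+)\)", markdown_text)


sample = "See [Bright Data](https://brightdata.com) for details."
print(extract_links(sample))
```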
Multi-Agent Workflow
DecipherIt uses parallel execution for efficient scraping:

backend/agents/topic_research_agent.py
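A self-contained sketch of the parallel scraping pattern using Python's `ThreadPoolExecutor`. The `scrape_url` function here is a placeholder; the real code would call scrape_as_markdown through the MCP adapter:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def scrape_url(url: str) -> str:
    # Placeholder for a scrape_as_markdown call via the MCP adapter.
    return f"# Markdown for {url}"


def scrape_all(urls: list[str], max_workers: int = 5) -> dict[str, str]:
    """Scrape many URLs concurrently; individual failures are recorded, not fatal."""
    results: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(scrape_url, u): u for u in urls}
        for fut in as_completed(futures):
            url = futures[fut]
            try:
                results[url] = fut.result()
            except Exception as exc:
                results[url] = f"ERROR: {exc}"
    return results


print(scrape_all(["https://example.com/a", "https://example.com/b"]))
```

Collecting errors per URL (rather than raising) lets one blocked page fail without aborting the whole research run.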
AI Agents Using Bright Data
Several specialized agents use Bright Data tools:

Web Scraping Planner

Role: Web Scraping Strategy Expert
Goal: Design optimal web scraping plans with targeted search queries to comprehensively gather relevant information.
Capabilities: Creates strategic search patterns that ensure comprehensive coverage while avoiding redundancy.
Link Collector Agent
Role: Link Discovery Specialist
Tools: search_engine
Goal: Discover and curate the most comprehensive and relevant collection of web sources.
Capabilities: Uses Bright Data's search engine to find authoritative sources globally, bypassing geo-restrictions.

Web Scraper Agent
Role: Web Scraping Engineer
Tools: scrape_as_markdown
Goal: Navigate complex websites and extract targeted information while maintaining data integrity.
Capabilities: Uses Bright Data's Web Unlocker to extract clean, structured content from discovered URLs.

Security Best Practices
DecipherIt automatically implements security measures:

- Data Validation - Filters and validates all web data before processing
- Structured Extraction - Uses structured data extraction rather than raw text
- Rate Limiting - Implements rate limiting and error handling
- Error Recovery - Gracefully handles scraping failures with retries
backend/agents/topic_research_agent.py
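The error-recovery behavior described above (retries with graceful failure handling) can be sketched as follows; the function name and backoff parameters are illustrative, not DecipherIt's exact implementation:

```python
import time


def scrape_with_retries(url, scrape, max_attempts=5, base_delay=1.0):
    """Call `scrape(url)`, retrying up to max_attempts times with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return scrape(url)
        except Exception:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            # 1s, 2s, 4s, 8s ... between attempts
            time.sleep(base_delay * 2 ** (attempt - 1))
```

The default of five attempts matches the retry limit noted in the Troubleshooting section below.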
Monitoring and Logging
DecipherIt logs all scraping operations for debugging:

backend/agents/topic_research_agent.py
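The logging setup might look like this self-contained sketch; the helper name is hypothetical, but the file naming follows the `logs/web_scraping_crew_*.log` pattern referenced under Troubleshooting:

```python
import logging
from datetime import datetime
from pathlib import Path


def setup_scraping_logger(log_dir: str = "logs") -> logging.Logger:
    """Write scraping logs to logs/web_scraping_crew_<timestamp>.log."""
    Path(log_dir).mkdir(exist_ok=True)
    log_file = Path(log_dir) / f"web_scraping_crew_{datetime.now():%Y%m%d_%H%M%S}.log"
    logger = logging.getLogger("web_scraping_crew")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_file)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


logger = setup_scraping_logger()
logger.info("Scraping started")
```

A timestamped file per run keeps each crew's operations separable when debugging failed scrapes.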
Troubleshooting
Connection Issues
If you encounter connection errors:

- Verify your `BRIGHT_DATA_API_TOKEN` is correct
- Check that `BRIGHT_DATA_BROWSER_AUTH` is properly configured
- Ensure pnpm is installed and accessible: `pnpm --version`
Rate Limiting
The system includes built-in rate limiting:

backend/agents/topic_research_agent.py
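A self-contained sketch of a sliding-window rate limiter (the class and its parameters are illustrative, not DecipherIt's exact implementation):

```python
import time


class RateLimiter:
    """Allow at most `rate` calls per `per` seconds (simple sliding window)."""

    def __init__(self, rate: int, per: float):
        self.rate, self.per = rate, per
        self.calls: list[float] = []

    def wait(self) -> None:
        now = time.monotonic()
        # Drop timestamps that have aged out of the window.
        self.calls = [t for t in self.calls if now - t < self.per]
        if len(self.calls) >= self.rate:
            # Sleep until the oldest call leaves the window.
            time.sleep(self.per - (now - self.calls[0]))
        self.calls.append(time.monotonic())


limiter = RateLimiter(rate=5, per=1.0)
for url in ["https://example.com/a", "https://example.com/b"]:
    limiter.wait()
    # a scrape_as_markdown call for `url` would go here
```

Calling `wait()` before every scrape keeps request bursts under the configured ceiling.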
Scraping Failures
If specific URLs fail to scrape:

- Check the logs in `logs/web_scraping_crew_*.log`
- Verify the URL is accessible
- The system will retry up to 5 times automatically
Next Steps
- Learn about AI Model Integrations
- Explore Storage Configuration
- Review Architecture Overview