Overview
The backend configuration is managed through config.py, which contains all core settings for the Web Scraping Hub application. This file controls application versioning, scraping targets, and API endpoints.
Configuration File Location
The main configuration file is located at:
Application Configuration
Version Settings
Current version of the application. This is used for version checking and update notifications.
URL to check for the latest version available on GitHub.
URL to fetch the changelog from GitHub.
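The version settings above feed an update check. A minimal sketch of how such a comparison might look, assuming plain dotted numeric versions such as 1.2.3; the names APP_VERSION, parse_version, and is_update_available are hypothetical and not taken from config.py:

```python
# Hypothetical names and values; the real config.py may differ.
APP_VERSION = "1.0.0"  # placeholder, not the project's actual version


def parse_version(text: str) -> tuple:
    """Turn '1.2.3' or 'v1.2.3' into (1, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().lstrip("vV").split("."))


def is_update_available(current: str, latest: str) -> bool:
    """Return True when the latest published version is newer than the current one."""
    return parse_version(latest) > parse_version(current)
```

Comparing tuples of integers rather than raw strings avoids treating 1.10.0 as older than 1.9.0.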
Scraping Configuration
Base URL
The base URL for all scraping operations. All target URLs are constructed relative to this base.
Target URLs Configuration
List of scraping targets, each containing a name and URL. This defines all content categories available in the application.
Each target is a dictionary with:
nombre (string): Display name for the section
url (string): Full URL to scrape
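The target structure described above can be sketched as follows. The base URL, section names, and the names BASE_URL, TARGET_URLS, and invalid_targets are placeholders chosen for illustration; only the nombre and url keys come from the description:

```python
from urllib.parse import urljoin

# Hypothetical values: the real base URL and sections live in config.py.
BASE_URL = "https://example.com/"

TARGET_URLS = [
    {"nombre": "Movies", "url": urljoin(BASE_URL, "movies")},
    {"nombre": "Series", "url": urljoin(BASE_URL, "series")},
]


def invalid_targets(targets: list) -> list:
    """Return the indexes of targets missing a non-empty 'nombre' or 'url' string."""
    bad = []
    for index, target in enumerate(targets):
        for key in ("nombre", "url"):
            value = target.get(key)
            if not isinstance(value, str) or not value.strip():
                bad.append(index)
                break
    return bad
```

Running invalid_targets over the list after editing it is a cheap way to catch a missing key before the scraper starts.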
Flask Server Configuration
The Flask server is configured in backend/app.py with the following settings:
Server Settings
Server host address. Using 0.0.0.0 allows external connections.
Port number for the Flask server.
Enable debug mode for development. Should be false in production.
CORS Configuration
Caching Configuration
Maximum age for file caching. Set to 0 to disable caching.
Disable ETags for responses.
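One possible way to apply the two caching settings above is sketched below. SEND_FILE_MAX_AGE_DEFAULT is a standard Flask config key; removing the ETag header in a response hook is an assumption about how ETags might be disabled, and the helper names are hypothetical rather than copied from backend/app.py:

```python
def apply_cache_settings(app) -> None:
    """Disable file caching on a Flask app (or any object with a dict-like .config)."""
    # SEND_FILE_MAX_AGE_DEFAULT is Flask's config key for the max-age of served files.
    app.config["SEND_FILE_MAX_AGE_DEFAULT"] = 0


def strip_etag(response):
    """Remove the ETag header from a response and return the same response object."""
    response.headers.pop("ETag", None)
    return response


# Wiring with Flask (assumed, not copied from backend/app.py):
#   app = Flask(__name__)
#   apply_cache_settings(app)
#   app.after_request(strip_etag)
```

Both helpers only touch the attributes they need, so they can be exercised with simple stand-in objects without starting a server.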
HTTP Client Configuration
The application uses cloudscraper to bypass anti-bot protection:
Request Timeouts
Default timeout for HTTP requests in seconds.
The timeout for version checks is reduced to 5 seconds to prevent blocking during startup.
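The two timeouts above could be applied roughly as in this sketch. The 5-second version-check timeout comes from the text; the default of 10 seconds and the function names are assumptions. cloudscraper.create_scraper() returns a requests-style session, so the helpers accept any object with a compatible get method:

```python
from typing import Optional

DEFAULT_TIMEOUT = 10        # assumed value; the default is not stated here
VERSION_CHECK_TIMEOUT = 5   # shorter timeout for the startup version check


def make_scraper():
    """Create a cloudscraper session (imported lazily so this module loads without it)."""
    import cloudscraper
    return cloudscraper.create_scraper()


def fetch_text(session, url: str, timeout: float = DEFAULT_TIMEOUT) -> str:
    """GET a URL with an explicit timeout and return the response body."""
    response = session.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def fetch_latest_version(session, version_url: str) -> Optional[str]:
    """Fetch the latest version string, returning None on any failure."""
    try:
        return fetch_text(session, version_url, timeout=VERSION_CHECK_TIMEOUT).strip()
    except Exception:
        # A failed or slow version check should never stop the application from starting.
        return None
```

Passing the session in, rather than creating it inside each function, lets one scraper be reused across requests and makes the helpers easy to test with a stub.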
Custom Configuration Example
To customize the configuration for your needs:
Modifying Server Settings
To change the server port or host, edit backend/app.py:
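A hedged sketch of how the host, port, and debug values might be checked before they reach Flask's app.run(). The function name server_settings, the default port of 5000, and the rule against combining debug mode with 0.0.0.0 are illustrative assumptions, not the project's actual code:

```python
def server_settings(host: str = "0.0.0.0", port: int = 5000, debug: bool = False) -> dict:
    """Validate the server settings and return them as keyword arguments for app.run()."""
    if not 1 <= port <= 65535:
        raise ValueError(f"port must be between 1 and 65535, got {port}")
    if debug and host == "0.0.0.0":
        # Debug mode exposes an interactive debugger; this sketch refuses to
        # combine it with a host that accepts external connections.
        raise ValueError("debug mode should not be enabled on an externally reachable host")
    return {"host": host, "port": port, "debug": debug}


# In backend/app.py this could be used roughly as:
#   app.run(**server_settings(host="127.0.0.1", port=8080, debug=True))
```

Binding to 127.0.0.1 during development keeps debug mode local, while 0.0.0.0 with debug disabled matches the production guidance given earlier.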
Configuration Validation
To verify your configuration is working:
Related Resources
Target URLs
Learn how to configure scraping targets
Environment Variables
Set up environment-specific configuration