Welcome to ScrapeAccraProperties
ScrapeAccraProperties is a Scrapy + Playwright project for collecting rental listings in Greater Accra from Jiji Ghana and Meqasa. It features an interactive CLI with a rich progress UI, JavaScript-rendered scraping, and resume support that skips listings already scraped.
Installation
Get started with Python 3.12+, install dependencies, and set up the Playwright browser
Quick Start
Run your first scrape in minutes with the interactive CLI
Workflow
Learn the two-phase workflow: collect URLs, then scrape listings
Configuration
Configure spiders, customize settings, and optimize performance
Key Features
JavaScript-Rendered Scraping
Uses scrapy-playwright with Chromium to handle dynamically loaded content from modern property listing sites. Asset blocking is enabled for images, media, fonts, and stylesheets to improve performance.
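The asset blocking described here could be configured along the following lines. This is a minimal sketch, not the project's actual settings: `BLOCKED_RESOURCE_TYPES` and `should_abort_request` are hypothetical names, while `PLAYWRIGHT_ABORT_REQUEST`, `PLAYWRIGHT_BROWSER_TYPE`, and the download handler path are settings documented by scrapy-playwright.

```python
# Sketch of Scrapy settings for scrapy-playwright with asset blocking.
# BLOCKED_RESOURCE_TYPES and should_abort_request are illustrative names,
# not taken from the project's source.

# Playwright resource types matching "images, media, fonts, and stylesheets".
BLOCKED_RESOURCE_TYPES = {"image", "media", "font", "stylesheet"}


def should_abort_request(request) -> bool:
    """Return True for requests that Playwright should abort."""
    return request.resource_type in BLOCKED_RESOURCE_TYPES


# Settings as they might appear in a Scrapy settings module.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
PLAYWRIGHT_BROWSER_TYPE = "chromium"
PLAYWRIGHT_ABORT_REQUEST = should_abort_request
```

Keeping the predicate as a plain function makes it easy to check in isolation, without launching a browser.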
Interactive CLI Runner
Rich progress UI with per-spider summaries guides you through the entire workflow. Choose platforms, configure pagination, and monitor scraping progress all from an intuitive menu system.
Two-Phase Workflow
Phase 1: Collect listing URLs from search/result pages
Phase 2: Visit each listing URL and extract structured data
This separation allows you to validate URLs before scraping and enables efficient resume operations.
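The two phases can be pictured as two independent functions joined only by a list of URLs. The sketch below is an illustration under assumptions, not the project's spiders: `collect_urls`, `scrape_listings`, and the `example.com` data are hypothetical, and the callables stand in for Playwright-rendered page fetches.

```python
from typing import Callable, Iterable


def collect_urls(search_pages: Iterable[str],
                 extract_links: Callable[[str], Iterable[str]]) -> list[str]:
    """Phase 1: gather listing URLs from result pages, in order, without duplicates."""
    seen: set[str] = set()
    urls: list[str] = []
    for page in search_pages:
        for url in extract_links(page):
            if url not in seen:
                seen.add(url)
                urls.append(url)
    return urls


def scrape_listings(urls: Iterable[str],
                    fetch_listing: Callable[[str], dict]) -> list[dict]:
    """Phase 2: visit each collected URL and build one record per listing."""
    return [fetch_listing(url) for url in urls]


# Stand-in data; a real run would render these pages in a browser.
fake_pages = {
    "page-1": ["https://example.com/a", "https://example.com/b"],
    "page-2": ["https://example.com/b", "https://example.com/c"],
}
urls = collect_urls(fake_pages, lambda page: fake_pages[page])
records = scrape_listings(urls, lambda url: {"url": url})
```

Because Phase 2 depends only on the URL list, that list can be inspected or filtered before any listing page is visited.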
Incremental CSV Output
Listing data is written to CSV files incrementally as items are scraped. URL deduplication prevents duplicate entries, and all outputs are organized under the outputs/ directory.
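Incremental output with URL deduplication might look like the sketch below. It is one possible approach built on the standard `csv` module, not the project's pipeline: the `IncrementalCsvWriter` class and the assumption that every row carries a `url` column are hypothetical.

```python
import csv
from pathlib import Path


class IncrementalCsvWriter:
    """Append rows to a CSV one at a time, skipping URLs already written."""

    def __init__(self, path: Path, fieldnames: list[str]) -> None:
        self.path = path
        self.fieldnames = fieldnames
        self.seen_urls: set[str] = set()
        # Load URLs from an earlier run so a restart does not duplicate rows.
        if path.exists() and path.stat().st_size > 0:
            with path.open(newline="", encoding="utf-8") as f:
                self.seen_urls = {
                    row["url"] for row in csv.DictReader(f) if row.get("url")
                }

    def write(self, item: dict) -> bool:
        """Write one item; return False if its URL was already recorded."""
        if item["url"] in self.seen_urls:
            return False
        is_new_file = not self.path.exists() or self.path.stat().st_size == 0
        self.path.parent.mkdir(parents=True, exist_ok=True)
        # Open in append mode per item so each row reaches disk immediately.
        with self.path.open("a", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(
                f, fieldnames=self.fieldnames, extrasaction="ignore"
            )
            if is_new_file:
                writer.writeheader()
            writer.writerow(item)
        self.seen_urls.add(item["url"])
        return True
```

Reopening the file for each row trades some speed for durability: an interrupted run loses at most the item in flight.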
Resume Mode
Automatically compares URL CSVs to data CSVs and queues only missing URLs. Resume from where you left off without re-scraping existing listings.
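The comparison behind resume mode amounts to a set difference between the two files. The following is a minimal sketch under assumptions: the helper names and the shared `url` column are hypothetical and may not match the project's CSV layout.

```python
import csv
from pathlib import Path


def read_url_column(path: Path, column: str = "url") -> list[str]:
    """Return the non-empty values of one column, or [] if the file is absent."""
    if not path.exists():
        return []
    with path.open(newline="", encoding="utf-8") as f:
        return [row[column] for row in csv.DictReader(f) if row.get(column)]


def pending_urls(url_csv: Path, data_csv: Path) -> list[str]:
    """URLs collected in Phase 1 that have no row yet in the Phase 2 data CSV."""
    done = set(read_url_column(data_csv))
    pending: list[str] = []
    for url in read_url_column(url_csv):
        if url not in done:
            done.add(url)  # also drops repeats within the URL file
            pending.append(url)
    return pending
```

Preserving the order of the URL file keeps resumed runs predictable, and a missing data CSV simply means every URL is still pending.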
Data Cleaning Pipeline
Jiji listings are automatically cleaned after scraping using clean.py, producing a standardized dataset at outputs/data/raw.csv.
Supported Platforms
Jiji Ghana
Scrapes rental listings with detailed property attributes including:
- Title, location, and description
- House type, bedrooms, and bathrooms
- Price and amenities
- Custom property attributes
Meqasa
Extracts comprehensive listing information:
- Title, price, and rate type
- Full description
- Dynamic property details from listing tables
- Flexible schema adapts to varying attributes
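One way to accommodate listings whose detail tables carry different attributes is to normalize each label into a column name and build the header from the union of keys. This sketch is an assumption about how such a schema could work, not the project's code: all function names and the sample labels are hypothetical.

```python
import re
from typing import Iterable


def normalize_label(label: str) -> str:
    """Turn a table label such as 'Furnished Status:' into 'furnished_status'."""
    return re.sub(r"[^a-z0-9]+", "_", label.strip().lower()).strip("_")


def details_to_record(pairs: Iterable[tuple[str, str]]) -> dict[str, str]:
    """Convert (label, value) table rows into a record, skipping blank labels."""
    record: dict[str, str] = {}
    for label, value in pairs:
        key = normalize_label(label)
        if key:
            record[key] = value.strip()
    return record


def union_fieldnames(records: Iterable[dict[str, str]]) -> list[str]:
    """Collect every key seen across records, in first-seen order."""
    fieldnames: list[str] = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    return fieldnames
```

Passing the result of `union_fieldnames` to `csv.DictWriter` leaves missing attributes as empty cells, since `DictWriter` fills absent keys with its `restval` default.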
Output Structure
All scraping outputs are automatically organized under the outputs/ directory.
Responsible Use: Review and respect target site Terms of Service and robots.txt. Keep request volume and crawl frequency reasonable. Use scraped data in line with applicable laws and privacy obligations.
Next Steps
Install Now
Set up Python dependencies and the Playwright browser
Run Your First Scrape
Follow the quickstart tutorial to collect your first listings