Welcome to ScrapeAccraProperties

ScrapeAccraProperties is a Scrapy + Playwright project for collecting rental listings in Greater Accra from Jiji Ghana and Meqasa. It features an interactive CLI with a rich progress UI, JavaScript-rendered scraping, and intelligent resume capabilities.

Installation

Get started with Python 3.12+, install dependencies, and set up the Playwright browser

Quick Start

Run your first scrape in minutes with the interactive CLI

Workflow

Learn the two-phase workflow: collect URLs, then scrape listings

Configuration

Configure spiders, customize settings, and optimize performance

Key Features

Uses scrapy-playwright with Chromium to handle dynamically loaded content from modern property listing sites. Asset blocking is enabled for images, media, fonts, and stylesheets to improve performance.
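One way asset blocking can be wired up is through scrapy-playwright's PLAYWRIGHT_ABORT_REQUEST hook. This is a minimal sketch, not the project's actual settings.py; the handler strings are scrapy-playwright's standard registration:

```python
# Sketch of asset blocking with scrapy-playwright (assumed configuration;
# check the project's real settings.py). PLAYWRIGHT_ABORT_REQUEST receives
# each in-page Playwright request and returns True to abort it before fetch.

BLOCKED_RESOURCE_TYPES = {"image", "media", "font", "stylesheet"}

def should_abort_request(request):
    """Abort requests for heavy static assets to speed up page loads."""
    return request.resource_type in BLOCKED_RESOURCE_TYPES

# settings.py fragment: route requests through the Playwright handler.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
PLAYWRIGHT_ABORT_REQUEST = should_abort_request
```

Blocking at the browser layer means the page's JavaScript still runs, but the bandwidth-heavy assets are never downloaded.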
Rich progress UI with per-spider summaries guides you through the entire workflow. Choose platforms, configure pagination, and monitor scraping progress all from an intuitive menu system.
Phase 1: Collect listing URLs from search/result pages
Phase 2: Visit each listing URL and extract structured data
This separation allows you to validate URLs before scraping and enables efficient resume operations.
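The handoff between the two phases can be sketched as reading the phase-1 CSV back in before phase 2 starts. This is an illustration, not the project's actual code; the `url` column name is an assumption:

```python
import csv

def load_listing_urls(path):
    """Phase 2 input: read the listing URLs collected in phase 1 from a CSV.
    Assumes a 'url' column header (an assumption; the real name may differ)."""
    with open(path, newline="", encoding="utf-8") as f:
        return [row["url"] for row in csv.DictReader(f) if row.get("url")]
```

Because phase 1's output is a plain CSV, you can inspect or prune the URL list by hand before phase 2 ever touches the network.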
Listing data is written to CSV files incrementally as items are scraped. URL deduplication prevents duplicate entries, and all outputs are organized under the outputs/ directory.
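Incremental writing with deduplication can look roughly like this (a sketch under assumed field names, not the project's pipeline code):

```python
import csv
import os

def append_item(path, item, seen_urls, fieldnames):
    """Append one scraped item to a CSV, writing the header on first use
    and skipping URLs that were already written (deduplication).
    Returns True if the row was written, False if it was a duplicate."""
    if item["url"] in seen_urls:
        return False
    new_file = not os.path.exists(path)
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if new_file:
            writer.writeheader()
        writer.writerow(item)
    seen_urls.add(item["url"])
    return True
```

Appending row by row means a crash mid-run loses at most the item in flight, which is what makes resume possible.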
Automatically compares URL CSVs to data CSVs and queues only missing URLs. Resume from where you left off without re-scraping existing listings.
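The comparison boils down to a set difference between the two CSVs. A minimal sketch, again assuming a `url` column in both files:

```python
import csv

def pending_urls(urls_csv, data_csv):
    """Return URLs present in the phase-1 URL CSV but absent from the data
    CSV, i.e. the listings still to scrape. A missing data CSV means
    nothing has been scraped yet, so every URL is pending."""
    def urls_in(path):
        try:
            with open(path, newline="", encoding="utf-8") as f:
                return {row["url"] for row in csv.DictReader(f) if row.get("url")}
        except FileNotFoundError:
            return set()
    return sorted(urls_in(urls_csv) - urls_in(data_csv))
```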
Jiji listings are automatically cleaned after scraping using clean.py, producing a standardized dataset at outputs/data/raw.csv.

Supported Platforms

Jiji Ghana

Scrapes rental listings with detailed property attributes including:
  • Title, location, and description
  • House type, bedrooms, and bathrooms
  • Price and amenities
  • Custom property attributes
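As a rough shape of the record these attributes produce (field names here are illustrative, not the project's exact item definition — Scrapy accepts plain dataclasses as items):

```python
from dataclasses import dataclass, field

@dataclass
class JijiListing:
    # Illustrative field names; the project's actual item may differ.
    title: str = ""
    location: str = ""
    description: str = ""
    house_type: str = ""
    bedrooms: str = ""
    bathrooms: str = ""
    price: str = ""
    amenities: list = field(default_factory=list)
    attributes: dict = field(default_factory=dict)  # custom key/value attrs
```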

Meqasa

Extracts comprehensive listing information:
  • Title, price, and rate type
  • Full description
  • Dynamic property details from listing tables
  • Flexible schema adapts to varying attributes
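A flexible schema over a details table usually means flattening (label, value) rows into a dict rather than fixing columns up front. A sketch of that idea (not the project's parser):

```python
def details_to_record(pairs):
    """Flatten (label, value) rows scraped from a listing's details table
    into a dict with normalized snake_case keys, so listings with varying
    attribute sets all fit one flexible schema."""
    record = {}
    for label, value in pairs:
        key = label.strip().lower().replace(" ", "_")
        record[key] = value.strip()
    return record
```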

Output Structure

All scraping outputs are automatically organized:
outputs/
├── urls/
│   ├── jiji_urls.csv
│   ├── meqasa_urls.csv
│   └── *_resume_queue.csv (temporary)
└── data/
    ├── jiji_data.csv
    ├── meqasa_data.csv
    └── raw.csv (cleaned Jiji data)
Responsible Use: Review and respect target site Terms of Service and robots.txt. Keep request volume and crawl frequency reasonable. Use scraped data in line with applicable laws and privacy obligations.
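In Scrapy terms, keeping request volume reasonable maps onto a handful of standard settings. The values below are illustrative starting points, not the project's configuration:

```python
# Illustrative Scrapy settings for polite crawling; tune to the target
# sites' terms and your needs. The setting names are standard Scrapy.
POLITE_SETTINGS = {
    "ROBOTSTXT_OBEY": True,               # honor robots.txt
    "DOWNLOAD_DELAY": 2.0,                # seconds between requests per domain
    "CONCURRENT_REQUESTS_PER_DOMAIN": 1,  # one request at a time per site
    "AUTOTHROTTLE_ENABLED": True,         # back off when the server slows down
}
```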

Next Steps

Install Now

Set up Python dependencies and Playwright browser

Run Your First Scrape

Follow the quickstart tutorial to collect your first listings
