scrapy crawl commands. This is ideal for automation, scripting, and CI/CD pipelines.
URL Collection Spiders
URL spiders collect property listing URLs from search result pages and write them to CSV files.

Jiji URL Spider
Parameters
- start_page: The search results page number to start from. Must be >= 1.
- max_pages: Maximum number of pages to scrape. When set, the spider scrapes exactly this many pages starting from start_page. Example: max_pages=5 scrapes 5 pages total. Cannot be used together with total_listing.
- total_listing: Expected total number of listings to collect. The spider calculates how many pages are needed based on Jiji's listings per page. Example: total_listing=200 might scrape ~9 pages (Jiji shows ~24 listings per page). Cannot be used together with max_pages.

Output
Writes to: outputs/urls/jiji_urls.csv
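The relationship between total_listing and the number of pages scraped is a ceiling division over the per-page listing count. A minimal sketch, assuming the ~24 listings-per-page figure noted above; the function name is illustrative, not the spider's actual code:

```python
import math

# Jiji shows roughly 24 listings per search results page.
LISTINGS_PER_PAGE = 24

def pages_needed(total_listing: int, per_page: int = LISTINGS_PER_PAGE) -> int:
    """Round up so the final, partially filled page is still fetched."""
    return math.ceil(total_listing / per_page)

print(pages_needed(200))  # total_listing=200 -> 9 pages
```

This is why total_listing=200 yields roughly 9 pages rather than an exact listing count: the spider works in whole pages.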
Meqasa URL Spider
Parameters
- start_page: The search results page number to start from. Must be >= 1.
- total_pages: Total number of pages to scrape. When set, the spider scrapes exactly this many pages starting from start_page. Example: total_pages=5 scrapes 5 pages total.

Output
Writes to: outputs/urls/meqasa_urls.csv
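Because total_pages counts pages from start_page, the exact page numbers a run requests follow directly. A small sketch of that arithmetic (the function name is illustrative):

```python
def page_range(start_page: int, total_pages: int) -> list[int]:
    """Page numbers a run with start_page and total_pages would request."""
    if start_page < 1:
        raise ValueError("start_page must be >= 1")
    return list(range(start_page, start_page + total_pages))

print(page_range(3, 5))  # -> [3, 4, 5, 6, 7]
```

So start_page=3 with total_pages=5 requests pages 3 through 7, exactly five pages.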
Listing Detail Spiders
Listing spiders read URLs from CSV files and scrape full property details.

Jiji Listing Spider
Parameters
Path to the CSV file containing URLs to scrape. Must have a url column. Default: outputs/urls/jiji_urls.csv. Can be absolute or relative to project root.

Output
Writes to: outputs/data/jiji_data.csv
Key fields:
- url
- fetch_date
- title
- location
- house_type
- bedrooms
- bathrooms
- price
- properties (serialized mapping)
- amenities (serialized list)
- description
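The properties and amenities fields are serialized so that a mapping and a list fit into single CSV cells. The exact serialization format is not specified here; assuming JSON strings, a round-trip looks like this (field values are made-up examples):

```python
import json

# Assumption: properties/amenities are stored as JSON strings in the CSV.
# If the project uses a different serialization, adjust accordingly.
row = {
    "properties": json.dumps({"Furnishing": "Furnished", "Parking": "Yes"}),
    "amenities": json.dumps(["Pool", "Gym"]),
}

properties = json.loads(row["properties"])  # back to a mapping
amenities = json.loads(row["amenities"])    # back to a list
print(properties["Parking"], amenities)  # -> Yes ['Pool', 'Gym']
```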
After scraping completes, the interactive runner (main.py) automatically runs clean.py to produce a cleaned dataset at outputs/data/raw.csv. This cleaning step does not run when using direct scrapy crawl commands.

Meqasa Listing Spider
Parameters
Path to the CSV file containing URLs to scrape. Must have a url column. Default: outputs/urls/meqasa_urls.csv. Can be absolute or relative to project root.

Output
Writes to: outputs/data/meqasa_data.csv
Base fields:
- url
- Title
- Price
- Rate
- Description
- fetch_date
Custom CSV Paths
You can specify custom paths for both input (URL CSVs) and output (data CSVs) by modifying the spider arguments.

Reading from Custom URL CSV
The CSV must contain a url column. Other columns are ignored.

Example: Full Workflow
Automation Example
Create a bash script to run the full workflow:

scrape.sh
Error Handling
Common errors:

CSV Output Behavior
- Incremental writes: Each scraped item is written immediately (item-by-item)
- URL deduplication: The same URL is not added twice to URL CSVs
- Append mode: Listing spiders append to existing data CSVs by default
- Auto-created directories: outputs/urls/ and outputs/data/ are created automatically if missing
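The directory creation, append mode, and URL deduplication behaviors above can be sketched together. This is a simplified stand-in for the project's actual pipeline code, not the implementation itself:

```python
import csv
import os

def append_unique_urls(path: str, urls: list[str]) -> None:
    """Append URLs to a CSV with a url column, skipping duplicates.

    Mirrors the documented behavior: the parent directory is created
    if missing, the file is opened in append mode, the header is
    written only once, and a URL already present is not added again.
    """
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)

    new_file = not os.path.exists(path)
    seen = set()
    if not new_file:
        with open(path, newline="") as f:
            seen = {row["url"] for row in csv.DictReader(f)}

    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["url"])
        if new_file:
            writer.writeheader()
        for url in urls:
            if url not in seen:
                writer.writerow({"url": url})
                seen.add(url)
```

Writing row-by-row in append mode is what makes the incremental, item-by-item output possible: a crashed or interrupted run still leaves every already-scraped URL on disk.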