Overview

Historia Para Gandules uses Instaloader, a Python library, to automatically scrape video content from the @historiaparagandules Instagram account. The scraper collects metadata, engagement metrics, and video information for analysis and visualization.

Implementation

The scraping implementation is straightforward and focuses on collecting video posts (reels) from the Instagram profile.

Core Script

The main scraping script (scraping5.py) performs the following operations:
import instaloader
import csv

# Create Instaloader instance
L = instaloader.Instaloader()

# Target Instagram profile
profile_name = "historiaparagandules"
profile = instaloader.Profile.from_username(L.context, profile_name)

# Open CSV file for writing
with open("informacion_reels_simple.csv", mode="w", newline="", encoding="utf-8") as file:
    writer = csv.writer(file)
    
    # Write CSV headers
    writer.writerow(["Fecha", "Texto del reel", "Likes", "Comentarios", "URL del video", 
                     "Visualizaciones", "Duración del video (s)", "URL del Post"])
    
    # Iterate through posts and filter videos
    for post in profile.get_posts():
        if post.is_video:
            fecha = post.date.strftime('%Y-%m-%d %H:%M:%S')
            texto = post.caption or "Sin texto"
            likes = post.likes or 0
            comentarios = post.comments or 0
            url_video = post.video_url or "Sin URL"
            visualizaciones = post.video_view_count or "No disponible"
            duracion_video = post.video_duration or "No disponible"
            url_post = f"https://www.instagram.com/p/{post.shortcode}/"
            
            writer.writerow([fecha, texto, likes, comentarios, url_video, 
                             visualizaciones, duracion_video, url_post])
            print(f"Scraping post from {fecha}")

print("Data saved to 'informacion_reels_simple.csv'")

How It Works

1. Initialize Instaloader

Create an Instaloader instance to interact with Instagram's public web interface.
L = instaloader.Instaloader()

2. Load Profile

Fetch the target profile by username.
profile_name = "historiaparagandules"
profile = instaloader.Profile.from_username(L.context, profile_name)

3. Filter Video Posts

Iterate through all posts and keep only videos (reels).
for post in profile.get_posts():
    if post.is_video:
        ...  # process video post

4. Extract Metadata

For each video post, extract engagement metrics, timestamps, and URLs.
fecha = post.date.strftime('%Y-%m-%d %H:%M:%S')
likes = post.likes or 0
visualizaciones = post.video_view_count or "No disponible"

5. Save to CSV

Write all collected data to a CSV file for further processing.
writer.writerow([fecha, texto, likes, comentarios, url_video,
                 visualizaciones, duracion_video, url_post])
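The fallback logic in steps 4 and 5 (empty captions become "Sin texto", missing counts become 0, missing metrics become "No disponible") can be factored into a small row-building helper. The sketch below is not part of scraping5.py: the `build_row` function and the `FakePost` stand-in are hypothetical, used here only to show the fallbacks without hitting Instagram.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class FakePost:
    """Hypothetical stand-in for an instaloader.Post (illustration only)."""
    date: datetime
    caption: Optional[str]
    likes: int
    comments: int
    video_url: Optional[str]
    video_view_count: Optional[int]
    video_duration: Optional[float]
    shortcode: str

def build_row(post):
    """Apply the same fallbacks scraping5.py uses before writing a CSV row."""
    return [
        post.date.strftime('%Y-%m-%d %H:%M:%S'),
        post.caption or "Sin texto",              # empty caption -> placeholder
        post.likes or 0,
        post.comments or 0,
        post.video_url or "Sin URL",
        post.video_view_count or "No disponible",
        post.video_duration or "No disponible",
        f"https://www.instagram.com/p/{post.shortcode}/",
    ]

# A post with no caption and no video URL exercises the fallbacks
row = build_row(FakePost(datetime(2024, 1, 15, 14, 30), None, 1250, 45,
                         None, 15000, 45.5, "ABC123"))
```

With a real `instaloader.Post`, the same function could replace the eight inline assignments in the main loop.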

Key Features

No Authentication Required

Scrapes publicly available Instagram content without login credentials.

Video-Only Filtering

Automatically filters posts to collect only video content (reels).

Comprehensive Metadata

Captures 8 different fields including engagement metrics and timestamps.

CSV Export

Outputs data in a structured CSV format for easy analysis.

Data Collection Fields

The scraper collects the following fields for each video post:
| Field | Type | Description |
| --- | --- | --- |
| Fecha | DateTime | Publication timestamp (YYYY-MM-DD HH:MM:SS) |
| Texto del reel | String | Caption/description text |
| Likes | Integer | Number of likes |
| Comentarios | Integer | Number of comments |
| URL del video | String | Direct video file URL |
| Visualizaciones | Integer | Video view count |
| Duración del video (s) | Float | Video duration in seconds |
| URL del Post | String | Instagram post permalink |
See the Schema documentation for detailed field specifications.

Installation

To run the scraper, install Instaloader:
pip install instaloader

Usage

Run the scraping script:
python scraping5.py
The script will:
  1. Connect to the Historia Para Gandules Instagram profile
  2. Iterate through all posts
  3. Filter video content
  4. Extract metadata and metrics
  5. Save results to informacion_reels_simple.csv
The scraping process may take several minutes depending on the number of posts on the account.

Limitations

Instagram may rate-limit requests. If you encounter errors, consider adding delays between requests or running the scraper less frequently.
  • Public data only: Only publicly available information is collected
  • No authentication: The scraper does not log in to Instagram
  • Rate limiting: Instagram may throttle excessive requests
  • Field availability: Some fields (like view count) may not always be available depending on Instagram’s API changes
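One way to soften rate limiting, as the note above suggests, is to pause between requests and back off after failures. The stdlib-only sketch below is hypothetical, not part of scraping5.py: `fetch_with_backoff` is a helper you could wrap around each post fetch. In practice Instaloader raises its own exception types (e.g. instaloader.exceptions.ConnectionException); the built-in ConnectionError here keeps the sketch dependency-free.

```python
import time

def fetch_with_backoff(fetch, retries=3, base_delay=2.0, sleep=time.sleep):
    """Call fetch(); on failure wait base_delay, 2*base_delay, ... then retry."""
    for attempt in range(retries):
        try:
            return fetch()
        except ConnectionError:
            if attempt == retries - 1:
                raise  # out of retries: surface the error
            sleep(base_delay * (2 ** attempt))  # exponential backoff

# Demo: a flaky fetch that fails twice, then succeeds
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("throttled")
    return "post data"

result = fetch_with_backoff(flaky, sleep=lambda s: None)  # skip real sleeping in the demo
```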

Output Format

The scraper generates a CSV file (informacion_reels_simple.csv) with UTF-8 encoding:
Fecha,Texto del reel,Likes,Comentarios,URL del video,Visualizaciones,Duración del video (s),URL del Post
2024-01-15 14:30:00,"Historical content about...",1250,45,https://...,15000,45.5,https://www.instagram.com/p/ABC123/
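For analysis, the file can be read back with Python's standard csv module. A minimal sketch, assuming the header row shown above and using made-up sample values; numeric fields are converted where possible and left as None when the scraper wrote the "No disponible" placeholder.

```python
import csv
import io

# Made-up sample data matching the CSV layout above (not real scraper output)
sample = (
    "Fecha,Texto del reel,Likes,Comentarios,URL del video,"
    "Visualizaciones,Duración del video (s),URL del Post\n"
    '2024-01-15 14:30:00,"Historical content",1250,45,https://example.com/v.mp4,'
    "No disponible,45.5,https://www.instagram.com/p/ABC123/\n"
)

def to_number(value, cast):
    """Return cast(value), or None for the 'No disponible' placeholder."""
    try:
        return cast(value)
    except ValueError:
        return None

rows = []
for row in csv.DictReader(io.StringIO(sample)):
    row["Likes"] = to_number(row["Likes"], int)
    row["Comentarios"] = to_number(row["Comentarios"], int)
    row["Visualizaciones"] = to_number(row["Visualizaciones"], int)
    row["Duración del video (s)"] = to_number(row["Duración del video (s)"], float)
    rows.append(row)
```

With the real output file, replace `io.StringIO(sample)` with `open("informacion_reels_simple.csv", encoding="utf-8")`.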

Next Steps

Data Sources

Learn about the Instagram account and content types

Data Schema

Explore the complete data structure and field specifications
