
Overview

Historia Para Gandules is an interactive geospatial visualization platform that maps historical Instagram content from the Canary Islands. This quickstart guide will walk you through scraping data, generating an interactive map, and performing basic analysis.
Make sure you’ve completed the Installation steps before proceeding.

Quick Start Workflow

  1. Scrape Instagram Data - Use the scraper to collect post metadata from the Historia Para Gandules Instagram account.
  2. Generate Interactive Map - Create a Folium-based interactive map with geolocated posts.
  3. Perform Exploratory Analysis - Analyze engagement metrics and content patterns using the Jupyter notebook.

Step 1: Scraping Instagram Data

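The scraping5.py script uses Instaloader to collect metadata for each video post on the Instagram profile and writes it to a CSV file. It records the video URL but does not save the video files themselves.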

Run the Scraper

python scraping5.py

What It Does

The scraper collects the following information for each video post:
  • Date of publication
  • Post caption/text
  • Number of likes
  • Number of comments
  • Video URL
  • View count
  • Video duration
  • Post URL

Code Example

import instaloader
import csv

# Create an instance of Instaloader
L = instaloader.Instaloader()

# Instagram profile name
profile_name = "historiaparagandules"

# Get the profile
profile = instaloader.Profile.from_username(L.context, profile_name)

# Create CSV file
with open("informacion_reels_simple.csv", mode="w", newline="", encoding="utf-8") as file:
    writer = csv.writer(file)
    writer.writerow(["Fecha", "Texto del reel", "Likes", "Comentarios", "URL del video", 
                     "Visualizaciones", "Duración del video (s)", "URL del Post"])

    # Iterate over posts and write to CSV
    for post in profile.get_posts():
        if post.is_video:  # Filter only videos
            fecha = post.date.strftime('%Y-%m-%d %H:%M:%S')
            texto = post.caption or "Sin texto"
            likes = post.likes or 0
            comentarios = post.comments or 0
            url_video = post.video_url or "Sin URL"
            visualizaciones = post.video_view_count or "No disponible"
            duracion_video = post.video_duration or "No disponible"
            url_post = f"https://www.instagram.com/p/{post.shortcode}/"

            writer.writerow([fecha, texto, likes, comentarios, url_video, 
                             visualizaciones, duracion_video, url_post])
            print(f"Scrapeando post del {fecha}")

print("Información guardada en 'informacion_reels_simple.csv'")

Output

The scraper creates a CSV file named informacion_reels_simple.csv with all the collected data.
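Because the scraper writes the placeholder "No disponible" when a view count or duration is missing, those columns can mix numbers and text. The following is a minimal sketch of reading the CSV back with the standard library and turning placeholders into None. The helper names to_number and load_rows are hypothetical and not part of the project, and the sample row is made up for illustration.

```python
import csv
import io

# Values the scraper writes when a field is missing
PLACEHOLDERS = {"", "No disponible"}

NUMERIC_COLUMNS = ["Likes", "Comentarios", "Visualizaciones", "Duración del video (s)"]


def to_number(value):
    """Return a float for numeric text, or None for placeholders and other non-numeric text."""
    text = str(value).strip()
    if text in PLACEHOLDERS:
        return None
    try:
        return float(text)
    except ValueError:
        return None


def load_rows(file_obj):
    """Read scraper rows from an open CSV file and coerce the numeric columns."""
    rows = []
    for row in csv.DictReader(file_obj):
        for column in NUMERIC_COLUMNS:
            row[column] = to_number(row.get(column))
        rows.append(row)
    return rows


# Made-up sample row in the scraper's column layout (not real data)
sample = (
    "Fecha,Likes,Comentarios,Visualizaciones,Duración del video (s)\n"
    "2024-01-01 10:00:00,120,4,No disponible,35.5\n"
)
rows = load_rows(io.StringIO(sample))
```

For the real file, pass open("informacion_reels_simple.csv", newline="", encoding="utf-8") to load_rows instead of the in-memory sample.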
Instagram may rate-limit requests. If you encounter errors, wait a few minutes and try again. Consider adding delays between requests for large-scale scraping.
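One way to add delays between requests is to wrap the post iterator in a small generator that pauses after each item. This is a sketch under assumptions: throttled is a hypothetical helper, the delay values are arbitrary, and a pause like this may reduce rate-limit errors but does not guarantee avoiding them.

```python
import random
import time


def throttled(items, min_delay=2.0, max_delay=5.0, sleep=time.sleep):
    """Yield items one by one, pausing a random interval after each item.

    The pause happens before the next item is requested from `items`,
    so a lazy iterator is not advanced again until the delay has passed.
    """
    for item in items:
        yield item
        sleep(random.uniform(min_delay, max_delay))


# Usage in the scraper loop (assumed integration, not part of scraping5.py):
# for post in throttled(profile.get_posts()):
#     ...
```

The sleep parameter exists only so the delay can be replaced in tests. The delay bounds are illustrative and may need tuning.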

Step 2: Generating the Interactive Map

The mapita5.py script creates an interactive Folium map with markers for each geolocated post.

Prepare Your Data

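Before running the map generator, make sure you have an Excel file (excel_info_1.xlsx) with the columns listed below. The CSV produced in Step 1 does not contain Localización or URL de imagen, so these values have to be added when you prepare the Excel file: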
  • Localización - Coordinates in “latitude,longitude” format (e.g., “28.0,-15.0”)
  • Texto del reel - Post caption
  • URL de imagen - Image URL for the thumbnail
  • URL del Post - Instagram post URL
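The "latitude,longitude" format can be checked before the map is generated. The sketch below assumes a hypothetical helper named parse_location. Unlike obtener_coordenadas in mapita5.py, it also rejects values outside the valid latitude and longitude ranges.

```python
def parse_location(value):
    """Parse 'latitude,longitude' text into a (lat, lon) tuple, or return None if invalid."""
    try:
        # Exactly two comma-separated parts are expected; anything else raises ValueError
        lat_text, lon_text = str(value).split(",")
        lat, lon = float(lat_text), float(lon_text)
    except ValueError:
        return None
    # Reject coordinates outside the valid geographic ranges (this also rejects NaN)
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None
    return lat, lon


print(parse_location("28.0,-15.0"))
print(parse_location("28,5,-16,25"))  # decimal commas produce four parts, so this is invalid
```

Values written with decimal commas are a common cause of parsing failures, because the comma is also the separator between latitude and longitude.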

Run the Map Generator

python mapita5.py

Code Example

import folium
import pandas as pd
import requests
import os

# Create directory for images (no error if it already exists)
os.makedirs("imagenes", exist_ok=True)

# Load the Excel file
df = pd.read_excel('excel_info_1.xlsx')

# Function to get coordinates
def obtener_coordenadas(localizacion):
    try:
        lat, lon = map(float, localizacion.split(','))
        return lat, lon
    except Exception as e:
        print(f"Error al procesar la ubicación: {localizacion} - {e}")
        return None, None

# Download and store images locally
def descargar_imagen(url, index):
    try:
        # The timeout keeps a stalled download from hanging the script
        response = requests.get(url, stream=True, timeout=10)
        if response.status_code == 200:
            ruta_imagen = f"imagenes/imagen_{index}.jpg"
            with open(ruta_imagen, 'wb') as file:
                for chunk in response.iter_content(1024):
                    file.write(chunk)
            return ruta_imagen
        else:
            print(f"No se pudo descargar la imagen: {url}")
            return None
    except Exception as e:
        print(f"Error al descargar la imagen: {url} - {e}")
        return None

# Create the map
m = folium.Map(location=[28.0, -15.0], zoom_start=6)

# Iterate over the DataFrame
for index, row in df.iterrows():
    lat, lon = obtener_coordenadas(row['Localización'])
    if lat is not None and lon is not None:
        ruta_imagen = descargar_imagen(row['URL de imagen'], index)
        if ruta_imagen:
            # str() guards against empty (NaN) captions, which have no .split()
            popup_content = f"""
            <div>
                <h4>{str(row['Texto del reel']).split(' ')[0]}</h4>
                <img src="{ruta_imagen}" alt="Imagen del Reel" style="width:200px;height:auto;">
                <a href="{row['URL del Post']}">Ver publicación</a>
            </div>
            """
            folium.Marker(
                location=[lat, lon],
                popup=folium.Popup(popup_content, max_width=300),
            ).add_to(m)

# Save the map as an HTML file
m.save("mapa_ubicaciones_reels_with_thumbnails.html")
print("Mapa generado con imágenes y enlaces: 'mapa_ubicaciones_reels_with_thumbnails.html'")

View Your Map

Open the generated HTML file in your browser:
# macOS
open mapa_ubicaciones_reels_with_thumbnails.html

# Linux
xdg-open mapa_ubicaciones_reels_with_thumbnails.html

# Windows
start mapa_ubicaciones_reels_with_thumbnails.html

Map Features

  • Interactive Markers - Click markers to view post thumbnails and details
  • Thumbnails - Each marker displays a small image preview
  • Direct Links - Click “Ver publicación” to open the post on Instagram
  • Zoom Controls - Navigate the map centered on the Canary Islands
The map is centered at coordinates [28.0, -15.0] with a zoom level of 6, ideal for viewing the Canary Islands archipelago.
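The popup in mapita5.py inserts the caption and URLs into HTML without escaping them, so a caption containing characters such as < or & could break the popup markup. The following is a minimal sketch of building the same popup with escaping. The function build_popup_html is a hypothetical helper, not part of the project.

```python
import html


def build_popup_html(caption, image_path, post_url):
    """Build popup markup with the first word of the caption as title, escaping all values."""
    # Non-string captions (for example NaN from an empty Excel cell) get an empty title
    if isinstance(caption, str) and caption.strip():
        title = caption.split()[0]
    else:
        title = ""
    return (
        "<div>"
        f"<h4>{html.escape(title)}</h4>"
        f'<img src="{html.escape(str(image_path))}" alt="Imagen del Reel" '
        'style="width:200px;height:auto;">'
        f'<a href="{html.escape(str(post_url))}">Ver publicación</a>'
        "</div>"
    )


# Usage inside the marker loop (assumed integration):
# popup_content = build_popup_html(row['Texto del reel'], ruta_imagen, row['URL del Post'])
```

The thumbnail path is relative, so the generated HTML file needs to stay next to the imagenes/ directory for the images to load.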

Step 3: Exploratory Data Analysis

The EDA.ipynb Jupyter notebook analyzes engagement metrics and content patterns across the scraped posts.

Launch Jupyter Notebook

jupyter notebook EDA.ipynb

Analysis Capabilities

The notebook includes:
  1. Descriptive Statistics - Summary statistics for likes, comments, views, and video duration
  2. Top Performing Content - Identification of posts with highest engagement
  3. Category Analysis - Breakdown by content categories:
    • Toponimia de Lugares (Place Names)
    • Curiosidades Históricas (Historical Curiosities)
    • Biografías de Personajes Históricos (Historical Biographies)
    • Arquitectura (Architecture)
    • Acontecimientos Históricos (Historical Events)
  4. Visualizations:
    • Scatter plots showing likes vs. comments correlation
    • Category-based engagement analysis
    • Interactive Plotly charts
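The category breakdown and the likes vs. comments correlation can be sketched with the standard library alone. Everything here is illustrative: the column name "Categoría" is an assumption about how categories are stored, the function names are hypothetical, and the sample rows are invented rather than taken from the dataset.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def mean_likes_by_category(rows, category_key="Categoría", likes_key="Likes"):
    """Group rows by category and return the average likes for each category."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[category_key]].append(row[likes_key])
    return {category: mean(values) for category, values in grouped.items()}


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences, or None if either is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return cov / sqrt(sxx * syy)


# Invented sample rows (not real data)
sample = [
    {"Categoría": "Arquitectura", "Likes": 100, "Comentarios": 5},
    {"Categoría": "Arquitectura", "Likes": 300, "Comentarios": 15},
    {"Categoría": "Toponimia de Lugares", "Likes": 50, "Comentarios": 2},
]
by_category = mean_likes_by_category(sample)
likes_vs_comments = pearson(
    [r["Likes"] for r in sample], [r["Comentarios"] for r in sample]
)
```

With pandas, the same ideas correspond to a groupby on the category column followed by mean(), and to the corr() method on the two numeric columns.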

Key Insights Example

The following snippet, based on the analysis in the notebook, prints descriptive statistics and the five most-liked videos. It reads the Excel file excel26deenero.xlsx:
import pandas as pd

# Load data
df = pd.read_excel('excel26deenero.xlsx')

# Descriptive statistics
print("Estadísticas Descriptivas:")
print(df[['Likes', 'Comentarios', 'Visualizaciones', 'Duración del video (s)']].describe())

# Top 5 videos by likes
top_5_likes = df[['Texto del reel', 'Likes', 'Comentarios', 'Visualizaciones']].sort_values(
    by='Likes', ascending=False
).head(5)

print("\nTop 5 videos con más Likes:")
print(top_5_likes)

Sample Statistics

The analysis reveals:
  • 121 total video posts analyzed
  • Average engagement: ~1,316 likes, ~39 comments, ~15,392 views
  • Top performing video: 14,659 likes, 361 comments, 255,191 views
  • Average video duration: ~50 seconds

Pro Tip

Use the category analysis to understand which historical topics resonate most with your audience and optimize content strategy accordingly.

Common Workflows

Complete Data Pipeline

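Run the pipeline from scraping to visualization. Step 2 reads excel_info_1.xlsx, which must be prepared with location data after scraping (see Prepare Your Data):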
# 1. Scrape data
python scraping5.py

# 2. Generate map
python mapita5.py

# 3. Launch analysis
jupyter notebook EDA.ipynb

Update Existing Map

To update your map with new posts:
  1. Update the excel_info_1.xlsx file with new location data
  2. Re-run the map generator:
    python mapita5.py
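As written, mapita5.py downloads every image again on each run. A cache check could skip images that are already on disk. This is a sketch with a hypothetical helper, cached_image_path. Note that the file names are keyed on the row index, so the cache is only valid as long as existing rows keep their positions in the Excel file.

```python
import os


def cached_image_path(index, folder="imagenes"):
    """Return the local path for row `index` if a non-empty image file exists, else None."""
    path = f"{folder}/imagen_{index}.jpg"
    if os.path.isfile(path) and os.path.getsize(path) > 0:
        return path
    return None


# Usage inside the marker loop (assumed integration):
# ruta_imagen = cached_image_path(index) or descargar_imagen(row['URL de imagen'], index)
```

Empty files are treated as missing so that a download that failed partway is retried on the next run.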
    

Export Analysis Results

Export processed data for external use:
import pandas as pd

df = pd.read_excel('excel26deenero.xlsx')

# Export to CSV
df.to_csv('analysis_results.csv', index=False)

# Export summary statistics
df.describe().to_csv('summary_stats.csv')

Troubleshooting

  • Scraper fails or returns no posts - Check that the Instagram profile name is correct and that the profile is public. Instagram may also rate-limit requests.
  • Markers missing from the map - Verify that coordinates in the Excel file are in the correct format: “latitude,longitude” (e.g., “28.5,-16.25”).
  • Thumbnails not showing in popups - Ensure the imagenes/ directory exists and that images were downloaded successfully. Check the console output for download errors.
  • Memory errors or slow processing - This may be due to large datasets. Try reducing the data size or increasing available memory.
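If memory is the constraint, one option is to aggregate the scraper's CSV row by row instead of loading it all at once. The sketch below computes a mean for one column while holding a single row in memory. The function streaming_mean is a hypothetical helper, and the sample data is invented.

```python
import csv
import io


def streaming_mean(file_obj, column):
    """Average a numeric column row by row, skipping missing or non-numeric values."""
    total = 0.0
    count = 0
    for row in csv.DictReader(file_obj):
        try:
            total += float(row[column])
        except (KeyError, TypeError, ValueError):
            # Skips absent columns and placeholders such as "No disponible"
            continue
        count += 1
    return total / count if count else None


# Invented sample in the scraper's column layout (not real data)
sample = "Likes,Visualizaciones\n10,100\n30,No disponible\n"
mean_likes = streaming_mean(io.StringIO(sample), "Likes")
mean_views = streaming_mean(io.StringIO(sample), "Visualizaciones")
```

For the real file, pass open("informacion_reels_simple.csv", newline="", encoding="utf-8") as the first argument.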

Next Steps

Data Sources

Learn about data collection and management

API Reference

Explore detailed API documentation

Interactive Maps

Advanced mapping and visualization techniques

Analytics

Deep dive into engagement analytics
