Troubleshooting

This page covers common issues you might encounter and their solutions.

Instagram Scraping Issues

Authentication Errors

Problem: Cannot access Instagram data or receive authentication errors. Symptoms:

Login error: Challenge required
HTTP Error 401: Unauthorized

Solutions:

Login with Instagram credentials:

import instaloader

L = instaloader.Instaloader()
L.login('your_username', 'your_password')

Use session file:
- Login once and save session
- Reuse session to avoid repeated logins
```
L.save_session_to_file()  # After the first successful login
L.load_session_from_file('your_username')  # On later runs, instead of L.login()
```

Check Instagram restrictions:
- Instagram may limit API access
- Try again after a few hours
- Consider using a different account

Instagram frequently updates its security measures. If authentication issues persist, check the Instaloader documentation for the latest solutions.
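
The session-file approach above can be sketched as a small helper. This is a minimal sketch, not part of the project's scripts: `login_with_session` is a hypothetical name, `loader` is assumed to be an `instaloader.Instaloader` instance, and the code assumes `load_session_from_file` raises `FileNotFoundError` when no session has been saved yet.

```python
def login_with_session(loader, username, password):
    """Reuse a saved session if one exists; otherwise log in and save it.

    `loader` is expected to be an instaloader.Instaloader instance.
    Returns True when a saved session was reused, False after a fresh login.
    """
    try:
        loader.load_session_from_file(username)
        return True
    except FileNotFoundError:
        # No session file yet: log in once and store the session for next time.
        loader.login(username, password)
        loader.save_session_to_file()
        return False
```

With this helper, a full login only happens on the first run; later runs load the stored session, which may reduce how often Instagram asks for a challenge.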

Rate Limiting

Problem: Scraping stops or slows down significantly. Symptoms:

Rate limit exceeded
Too many requests
429 Error

Solutions:

Add delays between requests:

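import time

# Assumes `profile` was loaded earlier, e.g.
# profile = instaloader.Profile.from_username(L.context, 'target_username')
for post in profile.get_posts():
    if post.is_video:
        pass  # Process the video post here
    time.sleep(2)  # Wait 2 seconds between posts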

Reduce request frequency:
- Scrape in smaller batches
- Run during off-peak hours
- Spread scraping across multiple sessions
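
One way to scrape in smaller batches is to pull a fixed number of items at a time and pause before fetching the next group. The helper below is an illustrative sketch: `in_batches`, `batch_size`, and `pause_seconds` are hypothetical names, and the default 60-second pause is an arbitrary assumption, not a value recommended by Instagram or Instaloader.

```python
import time
from itertools import islice


def in_batches(items, batch_size, pause_seconds=60, sleep=time.sleep):
    """Yield lists of up to `batch_size` items, pausing between batches."""
    iterator = iter(items)
    while True:
        # Pulling from a lazy iterator is what triggers the underlying requests.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
        if len(batch) < batch_size:
            return  # Source exhausted; no need to pause again
        sleep(pause_seconds)  # Rest before fetching the next batch
```

It could be used as `for batch in in_batches(profile.get_posts(), 50): ...`, where `profile` is the object from the example above.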

Use Instaloader’s built-in rate limiting:

L = instaloader.Instaloader(
    sleep=True,  # Sleep between requests
    quiet=False,  # Show progress
    user_agent='Mozilla/5.0',
    max_connection_attempts=3
)
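
If requests still fail with rate-limit errors, a common complementary technique is retrying with exponential backoff. The sketch below is a generic pattern and not Instaloader's own rate controller: `call_with_backoff` is a hypothetical helper, and catching every `Exception` is a simplification that should be narrowed to the error actually raised in your setup.

```python
import time


def call_with_backoff(func, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception as exc:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: let the caller see the error
            delay = base_delay * (2 ** attempt)
            print(f"Attempt {attempt + 1} failed ({exc}); retrying in {delay:.0f}s")
            sleep(delay)
```

With the defaults, the waits between attempts are 2, 4, and 8 seconds, so repeated failures slow the scraper down instead of hammering the server.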

Missing Video Data

Problem: Some posts don’t have video URLs or duration. Symptoms:

url_video shows “Sin URL” (“No URL”)
duracion_video shows “No disponible” (“Not available”)

Solutions:

Verify post type:

if post.is_video and post.video_url:
    url_video = post.video_url
else:
    url_video = "Sin URL"

Handle private or expired content:
- Some videos may be deleted or made private
- Add error handling:

try:
    url_video = post.video_url
except Exception as e:
    print(f"Error getting video URL: {e}")
    url_video = "Sin URL"
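
The two checks above can be combined into one helper that fills both fields and falls back to the placeholder strings. This is a sketch under assumptions: `extraer_datos_video` is a hypothetical name, and it relies on the post object exposing `is_video`, `video_url`, and `video_duration` attributes, as Instaloader's `Post` does. The `SimpleNamespace` objects are stand-ins for illustration only.

```python
from types import SimpleNamespace


def extraer_datos_video(post):
    """Return video URL and duration for a post, using the fallback strings."""
    url_video = "Sin URL"
    duracion_video = "No disponible"
    try:
        if post.is_video:
            url_video = post.video_url or url_video
            duracion = post.video_duration
            if duracion is not None:
                duracion_video = duracion
    except Exception as e:
        # Deleted or private content may fail here; keep whatever was read so far.
        print(f"Error getting video data: {e}")
    return {"url_video": url_video, "duracion_video": duracion_video}


# Stand-in objects for illustration; real posts come from profile.get_posts()
video = SimpleNamespace(is_video=True, video_url="https://example.com/v.mp4",
                        video_duration=12.5)
foto = SimpleNamespace(is_video=False)
print(extraer_datos_video(video))
print(extraer_datos_video(foto))
```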

Map Generation Issues

Coordinate Parsing Errors

Problem: Locations not appearing on the map. Symptoms:

Error al procesar la ubicación: could not convert string to float

Solutions:

Check coordinate format:
- Must be: "latitude,longitude"
- Example: "28.1234,-15.5678"
- No spaces, comma-separated

Validate data before processing:

def obtener_coordenadas(localizacion):
    try:
        if pd.isna(localizacion):
            return None, None
        lat, lon = map(float, str(localizacion).split(','))
        return lat, lon
    except Exception as e:
        print(f"Error: {localizacion} - {e}")
        return None, None

Clean Excel data:
- Remove extra spaces
- Check for invalid characters
- Ensure numeric values
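
The cleaning steps above can also be applied in code before the map is built. The following is a minimal sketch using only the standard library: `limpiar_coordenadas` is a hypothetical helper, and the range check (latitude within ±90, longitude within ±180) is an added assumption about what counts as a valid location.

```python
def limpiar_coordenadas(valor):
    """Parse 'lat,lon' text into a (lat, lon) tuple, or return None if invalid."""
    partes = str(valor).strip().split(",")
    if len(partes) != 2:
        return None
    try:
        lat, lon = (float(p.strip()) for p in partes)
    except ValueError:
        return None
    # Reject values outside the valid geographic ranges (this also rejects NaN)
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None
    return lat, lon
```

Rows for which this returns `None` can be logged and skipped, so one malformed cell does not hide the remaining markers.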

Image Download Failures

Problem: Thumbnail images not displaying in map popups. Symptoms:

No se pudo descargar la imagen
HTTP Error 404

Solutions:

Check image URLs:
- Verify URLs are valid and accessible
- Instagram CDN links may expire

Add retry logic:

import time

import requests

def descargar_imagen(url, index, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = requests.get(url, stream=True, timeout=10)
            response.raise_for_status()  # Treat HTTP errors such as 404 as failures
            ruta_imagen = f"imagenes/imagen_{index}.jpg"
            with open(ruta_imagen, 'wb') as file:
                for chunk in response.iter_content(1024):
                    file.write(chunk)
            return ruta_imagen
        except Exception as e:
            print(f"Attempt {attempt + 1} failed: {e}")
            time.sleep(2)  # Wait before the next attempt
    return None

Use fallback images:
- Provide a default placeholder image
- Skip markers without images
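
The fallback options above can be expressed as a small path-selection helper. This is a sketch with assumed names: `imagen_para_popup` and the `imagenes/placeholder.jpg` default are hypothetical and not taken from the project's scripts.

```python
import os


def imagen_para_popup(ruta_imagen, placeholder="imagenes/placeholder.jpg"):
    """Return the image path to use in a popup, or None to skip the marker."""
    if ruta_imagen and os.path.isfile(ruta_imagen):
        return ruta_imagen  # The download succeeded and the file exists
    if os.path.isfile(placeholder):
        return placeholder  # Fall back to the default image
    return None  # Nothing usable: the caller can skip this marker
```

Passing the result of the download function to this helper gives a single place to decide between the real thumbnail, the placeholder, and skipping the marker.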

Map Not Loading

Problem: HTML file opens but map doesn’t display. Symptoms:

Blank page
JavaScript console errors

Solutions:

Check file paths:
- Ensure imagenes/ folder is in the same directory as HTML
- Use relative paths for images

Verify Folium installation:
```
pip install --upgrade folium
```

Test with simple map:

import folium

m = folium.Map(location=[28.0, -15.0], zoom_start=6)
m.save("test_map.html")
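
To check file paths automatically, the generated HTML can be scanned for image references that do not exist on disk. This is a rough sketch: `imagenes_faltantes` is a hypothetical helper, and it assumes image tags appear in the file as plain `<img ... src="...">` text, which may not hold for every Folium version or popup style.

```python
import os
import re

# Matches the src value of <img> tags written with single or double quotes
IMG_SRC = re.compile(r"<img[^>]*?\ssrc=[\"']([^\"']+)[\"']", re.IGNORECASE)


def imagenes_faltantes(ruta_html):
    """List relative image paths referenced in the HTML that are missing on disk."""
    with open(ruta_html, encoding="utf-8") as f:
        contenido = f.read()
    base = os.path.dirname(os.path.abspath(ruta_html))
    faltantes = []
    for src in IMG_SRC.findall(contenido):
        if src.startswith(("http://", "https://", "data:", "//")):
            continue  # Remote or inline images are not local files
        if not os.path.isfile(os.path.join(base, src)):
            faltantes.append(src)
    return faltantes
```

An empty list means every relative image path resolves next to the HTML file. Any entries returned point to files that need to be copied or downloaded again.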

Dependency Issues

Missing Package Errors

Problem: Import errors when running scripts. Symptoms:

ModuleNotFoundError: No module named 'instaloader'
ImportError: cannot import name 'xxx'

Solutions:

Install all requirements:

pip install instaloader pandas folium requests matplotlib seaborn plotly openpyxl

Use virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

Check Python version:

python --version  # Should be 3.8 or higher
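
The version and package checks above can also be run from Python itself. The snippet below is a small sketch that relies only on the standard library. `modulos_faltantes` is a hypothetical helper, and the module list mirrors the `pip install` command shown earlier.

```python
import importlib.util
import sys


def modulos_faltantes(nombres):
    """Return the module names from `nombres` that cannot be found."""
    return [n for n in nombres if importlib.util.find_spec(n) is None]


if sys.version_info < (3, 8):
    print("Python 3.8 or higher is required")

requeridos = ["instaloader", "pandas", "folium", "requests",
              "matplotlib", "seaborn", "plotly", "openpyxl"]
print("Missing packages:", modulos_faltantes(requeridos) or "none")
```

Running it inside the activated virtual environment shows which packages that environment is missing, which helps when the wrong interpreter is being used.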

Excel File Reading Errors

Problem: Cannot read .xlsx files. Symptoms:

ImportError: Missing optional dependency 'openpyxl'

Solutions:

Install openpyxl:
```
pip install openpyxl
```

Alternative - convert to CSV:

import pandas as pd

# Read the CSV export instead of the .xlsx file
df = pd.read_csv('data.csv')

Data Analysis Issues

Jupyter Notebook Not Starting

Problem: Cannot open .ipynb files. Solutions:

Install Jupyter:

pip install jupyter notebook
jupyter notebook

Use JupyterLab:
```
pip install jupyterlab
jupyter lab
```

Use VS Code:
- Install Python extension
- Open .ipynb files directly

Plotting Errors

Problem: Visualizations not displaying. Solutions:

Enable inline plotting:
```
%matplotlib inline
```

Update plotting libraries:

pip install --upgrade matplotlib seaborn plotly

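Check the backend (for scripts on machines without a display):

import matplotlib
matplotlib.use('Agg')  # Non-interactive backend: renders to files only, so save figures with plt.savefig()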

Performance Issues

Slow Scraping

Problem: Scraping takes too long. Solutions:

Limit post count:

from itertools import islice

posts = islice(profile.get_posts(), 100)  # Only first 100 posts

Skip non-video posts early:

for post in profile.get_posts():
    if not post.is_video:
        continue
    # Process only videos

Use multiprocessing:
- Process multiple posts in parallel (advanced)
- Be careful with rate limits
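
As a rough illustration of parallel processing, the sketch below uses a thread pool rather than the `multiprocessing` module, because work such as downloading thumbnails is mostly waiting on the network. `procesar_en_paralelo` is a hypothetical helper, and the default of 4 workers is an arbitrary assumption chosen to stay gentle on rate limits.

```python
from concurrent.futures import ThreadPoolExecutor


def procesar_en_paralelo(elementos, funcion, max_workers=4):
    """Apply `funcion` to each element using a small pool of worker threads.

    Results are returned in the same order as `elementos`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(funcion, elementos))
```

Keeping `max_workers` small matters more than raw speed here, since each extra worker multiplies the request rate Instagram sees.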

Large File Sizes

Problem: HTML map or CSV files are too large. Solutions:

Compress images (requires Pillow, installed with pip install pillow):

from PIL import Image

img = Image.open(ruta_imagen)
img.save(ruta_imagen, quality=70, optimize=True)

Limit data:
- Filter by date range
- Select top N posts
- Remove unnecessary columns

Use external image hosting:
- Link to Instagram URLs directly
- Don’t download thumbnails locally
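
The "Limit data" suggestions above (date range, top N posts, fewer columns) can be sketched with plain Python structures. Everything named here is illustrative: `limitar_datos` is a hypothetical helper, and the `fecha`, `likes`, and `url` keys are assumed column names that may differ from the project's actual data.

```python
from datetime import date


def limitar_datos(filas, desde, hasta, top_n, columnas):
    """Keep rows dated within [desde, hasta], take the top_n by likes, keep `columnas`."""
    en_rango = [f for f in filas if desde <= f["fecha"] <= hasta]
    mejores = sorted(en_rango, key=lambda f: f["likes"], reverse=True)[:top_n]
    return [{c: f[c] for c in columnas} for f in mejores]


# Made-up sample rows for illustration only
filas = [
    {"fecha": date(2024, 1, 5), "likes": 10, "url": "a", "extra": 1},
    {"fecha": date(2024, 2, 1), "likes": 50, "url": "b", "extra": 2},
    {"fecha": date(2024, 2, 20), "likes": 30, "url": "c", "extra": 3},
    {"fecha": date(2024, 5, 1), "likes": 99, "url": "d", "extra": 4},
]
reducido = limitar_datos(filas, date(2024, 1, 1), date(2024, 3, 31), 2, ["url", "likes"])
print(reducido)
```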

Getting Help

If you encounter issues not covered here:

Check documentation:
- Official documentation for the libraries involved (for example, Instaloader, Folium, and pandas)

Search existing issues:
- GitHub Issues for each library
- Stack Overflow

Enable debug logging:

import logging
logging.basicConfig(level=logging.DEBUG)

For project-specific issues, review the source code in scraping5.py and mapita5.py to understand the exact implementation.
