The EDL Pipeline includes robust error handling to ensure data integrity and graceful degradation when individual scripts fail.

Error Handling Philosophy

The pipeline follows a fail-forward approach:
  1. Critical failures (Phase 1) halt the pipeline
  2. Non-critical failures (Phase 2+) log errors but continue execution
  3. Enrichment failures (Phase 4) skip problematic stocks but complete the run
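The fail-forward policy above can be sketched as a small predicate. This is an illustrative sketch, not code from run_full_pipeline.py; the `should_halt` helper and the `CRITICAL_SCRIPTS` set are hypothetical names:

```python
# Phase 1 scripts whose failure must stop the run (illustrative set)
CRITICAL_SCRIPTS = {"fetch_dhan_data.py", "bulk_market_analyzer.py"}

def should_halt(script_name, succeeded):
    """Halt the pipeline only when a critical (Phase 1) script fails."""
    return not succeeded and script_name in CRITICAL_SCRIPTS

print(should_halt("fetch_dhan_data.py", False))   # True  -> stop the run
print(should_halt("fetch_market_news.py", False)) # False -> log and continue
```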

Critical vs Non-Critical Failures

Critical Failures (Pipeline Stops)

Phase 1: Core Data
  • fetch_dhan_data.py failure → No master_isin_map.json → STOP
  • bulk_market_analyzer.py failure → No base JSON → STOP
# From run_full_pipeline.py (lines 207-212)
results["fetch_dhan_data.py"] = run_script("fetch_dhan_data.py", "Phase 1")

if not results["fetch_dhan_data.py"]:
    print("\n🛑 CRITICAL: fetch_dhan_data.py failed. Cannot continue.")
    print("   This script produces master_isin_map.json which ALL other scripts need.")
    return

Non-Critical Failures (Pipeline Continues)

Phase 2: Enrichment
  • Individual enrichment scripts can fail without stopping the pipeline
  • Example: fetch_market_news.py fails → News fields will be empty, but pipeline completes
# From run_full_pipeline.py (lines 123-126)
if result.returncode == 0:
    print(f"  ✅ {script_name} ({elapsed:.1f}s)")
    return True
else:
    print(f"  ❌ {script_name} FAILED ({elapsed:.1f}s)")
    return False  # Caller logs the failure; only the Phase 1 checks halt the run

Error Types and Solutions

1. Network Errors

Symptoms: requests.exceptions.ConnectionError, ReadTimeout, HTTPError
Causes:
  • API endpoint temporarily down
  • Network connectivity issues
  • Rate limiting by Dhan/NSE servers
Solutions:
# Example from fetch_fundamental_data.py
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Add retry logic
session = requests.Session()
retry = Retry(
    total=3,
    backoff_factor=1,
    status_forcelist=[500, 502, 503, 504],
    allowed_methods=["GET", "POST"]  # POST is not retried by default
)
adapter = HTTPAdapter(max_retries=retry)
session.mount('http://', adapter)
session.mount('https://', adapter)

try:
    response = session.post(url, json=payload, headers=headers, timeout=30)
    response.raise_for_status()
except requests.exceptions.RequestException as e:
    print(f"Network error: {e}")
    # Continue with next stock or retry

2. Timeout Errors

Symptoms: Script hangs or subprocess.TimeoutExpired
Causes:
  • API response delay
  • Large data transfer
  • System resource constraints
Pipeline timeout: 1800 seconds (30 minutes) per script
# From run_full_pipeline.py (lines 112-130)
try:
    result = subprocess.run(
        [sys.executable, script_path],
        cwd=BASE_DIR,
        text=True,
        timeout=1800  # 30 minutes
    )
except subprocess.TimeoutExpired:
    print(f"  ⏰ {script_name} TIMED OUT (>30 min)")
    return False
Fix: Increase timeout for slow scripts
# For fetch_all_ohlcv.py, extend to 3600 seconds (1 hour)
timeout=3600
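Rather than editing the constant in place for each slow script, one option is a per-script timeout table with a fallback. This is a hypothetical sketch (the `TIMEOUTS` table and `timeout_for` helper are not in run_full_pipeline.py, which hardcodes 1800):

```python
# Hypothetical per-script overrides; everything else uses the default
TIMEOUTS = {
    "fetch_all_ohlcv.py": 3600,  # large historical download
}
DEFAULT_TIMEOUT = 1800  # 30 minutes, the pipeline's current limit

def timeout_for(script_name):
    """Return the timeout (seconds) to pass to subprocess.run for a script."""
    return TIMEOUTS.get(script_name, DEFAULT_TIMEOUT)

print(timeout_for("fetch_all_ohlcv.py"))   # 3600
print(timeout_for("fetch_market_news.py")) # 1800
```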

3. Data Quality Errors

Symptoms: Missing fields, None values, type mismatches
Example:
# From bulk_market_analyzer.py (lines 5-9)
def get_float(value_str):
    try:
        return float(value_str)
    except (ValueError, TypeError):
        return 0.0  # Safe fallback
Best Practices:
  • Use defensive .get() instead of direct key access
  • Provide sensible defaults (0 for numbers, "" for strings, [] for arrays)
  • Validate critical fields before processing
# Good: safe field access with a default
pe = stock.get('P/E', 0)
if pe > 0:
    ...  # process P/E here

# Bad: direct access (raises KeyError if the field is missing)
pe = stock['P/E']  # May crash

4. File I/O Errors

Symptoms: FileNotFoundError, PermissionError, OSError
Common Causes:
  • Missing input files (e.g., master_isin_map.json not created)
  • Disk space exhausted
  • Permission issues on output directory
Prevention:
import os

# Check input file exists before processing
if not os.path.exists(INPUT_FILE):
    print(f"Error: {INPUT_FILE} not found. Run fetch_dhan_data.py first.")
    return

# Create output directory if missing
os.makedirs(OUTPUT_DIR, exist_ok=True)

# Check disk space before writing large files (shutil.disk_usage is cross-platform)
import shutil
free_space_mb = shutil.disk_usage(BASE_DIR).free / (1024 * 1024)
if free_space_mb < 500:  # Less than 500 MB free
    print(f"Warning: Low disk space ({free_space_mb:.1f} MB)")

5. JSON Parsing Errors

Symptoms: json.decoder.JSONDecodeError
Causes:
  • Malformed API response
  • Incomplete file write (crashed mid-write)
  • Encoding issues
Handling:
import json

try:
    with open('data.json', 'r', encoding='utf-8') as f:
        data = json.load(f)
except json.JSONDecodeError as e:
    print(f"JSON parsing error: {e}")
    print(f"Line {e.lineno}, Column {e.colno}")
    # Option 1: Skip file
    data = []
    # Option 2: Attempt manual repair
    # Option 3: Re-fetch data
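The incomplete-write cause can be prevented rather than handled: write to a temporary file first, then rename it into place. `os.replace` is atomic on the same filesystem, so readers never see a half-written file. A minimal sketch (the `atomic_write_json` helper is illustrative, not an existing pipeline function):

```python
import json
import os
import tempfile

def atomic_write_json(path, data):
    """Write JSON to a temp file in the target directory, then rename into place."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
        os.replace(tmp, path)  # atomic on the same filesystem
    except BaseException:
        os.remove(tmp)  # clean up the partial temp file
        raise
```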

Multi-threaded Error Handling

ThreadPoolExecutor Patterns

Many scripts use threading for parallel API calls:
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

import requests

def fetch_with_retry(item, max_retries=3):
    """Fetch with exponential backoff retry."""
    for attempt in range(max_retries):
        try:
            # API call (url and payload are assumed to be built from `item`)
            response = requests.post(url, json=payload, timeout=10)
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            if attempt == max_retries - 1:
                print(f"Failed after {max_retries} attempts: {e}")
                return None
            # Exponential backoff: 1s, 2s, 4s
            time.sleep(2 ** attempt)
    return None

# Execute with a thread pool
success_count = error_count = 0
with ThreadPoolExecutor(max_workers=20) as executor:
    future_to_stock = {
        executor.submit(fetch_with_retry, item): item["Symbol"] 
        for item in stock_list
    }
    
    for future in as_completed(future_to_stock):
        symbol = future_to_stock[future]
        try:
            result = future.result()
            if result:
                success_count += 1
            else:
                error_count += 1
        except Exception as e:
            print(f"Unexpected error for {symbol}: {e}")
            error_count += 1

Logging and Diagnostics

Enable Detailed Logging

Add logging to scripts for better debugging:
import logging

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('pipeline.log'),
        logging.StreamHandler()
    ]
)

logger = logging.getLogger(__name__)

# Use in scripts
logger.info(f"Processing {symbol}")
logger.warning(f"Missing field 'P/E' for {symbol}")
logger.error(f"API call failed for {symbol}: {error}")

Check Pipeline Logs

# View real-time logs
tail -f pipeline.log

# Search for errors
grep ERROR pipeline.log

# Count failures by type
grep "Network error" pipeline.log | wc -l

Recovery Strategies

Partial Re-runs

If a Phase 2 script fails, you can re-run just that script:
# Re-fetch company filings without re-running full pipeline
python3 fetch_company_filings.py

# Then re-run enrichment phases
python3 advanced_metrics_processor.py
python3 add_corporate_events.py

Checkpoint-based Recovery

Modify scripts to skip already-processed items:
# Example: Skip stocks with existing files
output_path = f"{OUTPUT_DIR}/{symbol}_news.json"
if os.path.exists(output_path) and os.path.getsize(output_path) > 10:
    print(f"Skipping {symbol} (already exists)")
    return "skipped"

# Fetch and save
# ...
Enable checkpoint mode:
# In fetch_company_filings.py (line 12)
FORCE_UPDATE = False  # Skip existing files

Common Troubleshooting Scenarios

Scenario 1: “master_isin_map.json not found”

Cause: fetch_dhan_data.py failed or didn’t run
Solution:
python3 fetch_dhan_data.py
# Check output
ls -lh master_isin_map.json

Scenario 2: “Empty fundamental_data.json”

Cause: API endpoint changed or rate limited
Solution:
  1. Check API endpoint in fetch_fundamental_data.py
  2. Test API call manually with curl
  3. Add delays between requests
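For step 3, a simple throttle that spaces requests out works in most cases. This is an illustrative sketch (the `throttled` wrapper and `MIN_INTERVAL` constant are hypothetical, and the interval should be tuned to the API's actual limits):

```python
import time

MIN_INTERVAL = 0.5  # seconds between calls; tune to the API's rate limit
_last_call = 0.0

def throttled(fetch, *args):
    """Call fetch(*args), keeping calls at least MIN_INTERVAL apart."""
    global _last_call
    wait = MIN_INTERVAL - (time.monotonic() - _last_call)
    if wait > 0:
        time.sleep(wait)
    _last_call = time.monotonic()
    return fetch(*args)
```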

Scenario 3: “Compression failed”

Cause: Disk full or corrupted JSON file
Solution:
# Check disk space
df -h

# Validate JSON syntax
jq . all_stocks_fundamental_analysis.json > /dev/null

# Manual compression
gzip -9 all_stocks_fundamental_analysis.json

Scenario 4: “OHLCV data missing dates”

Cause: Market holiday or API gap
Solution: OHLCV fetcher auto-fills gaps; verify the date range:
import pandas as pd
df = pd.read_csv('ohlcv_data/RELIANCE.csv')
print(df['Date'].min(), df['Date'].max())
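To separate expected holiday gaps from genuine API gaps, you can list the weekdays missing between the first and last date; anything left after removing known market holidays is suspect. A stdlib sketch (the `missing_weekdays` helper is hypothetical):

```python
from datetime import date, timedelta

def missing_weekdays(dates):
    """Return weekdays (Mon-Fri) absent between min(dates) and max(dates)."""
    have = set(dates)
    gaps = []
    d = min(dates)
    while d <= max(dates):
        if d.weekday() < 5 and d not in have:
            gaps.append(d)
        d += timedelta(days=1)
    return gaps

dates = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 4), date(2024, 1, 5)]
print(missing_weekdays(dates))  # [datetime.date(2024, 1, 3)]
```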

Monitoring and Alerts

Email Alerts on Failure

import smtplib
from email.mime.text import MIMEText

def send_alert(subject, body):
    msg = MIMEText(body)
    msg['Subject'] = subject
    msg['From'] = '[email protected]'
    msg['To'] = '[email protected]'
    
    with smtplib.SMTP('smtp.gmail.com', 587) as server:
        server.starttls()
        server.login('user', 'password')  # Use an app-specific password from an env var, never a hardcoded one
        server.send_message(msg)

# In run_full_pipeline.py
if failed > 0:
    send_alert(
        'Pipeline Failures Detected',
        f'{failed} scripts failed. Check logs.'
    )

Health Check Script

#!/bin/bash
# check_pipeline_health.sh

FILE="all_stocks_fundamental_analysis.json.gz"
MAX_AGE_HOURS=36

if [ ! -f "$FILE" ]; then
    echo "ERROR: Output file missing"
    exit 1
fi

AGE=$(( $(date +%s) - $(stat -c %Y "$FILE") ))
AGE_HOURS=$(( AGE / 3600 ))

if [ $AGE_HOURS -gt $MAX_AGE_HOURS ]; then
    echo "WARNING: Data is $AGE_HOURS hours old"
    exit 1
fi

echo "OK: Data is fresh ($AGE_HOURS hours old)"
exit 0

Next Steps

  • Performance Tuning: optimize threading, batching, and timeouts
  • Incremental Updates: set up automated daily updates with monitoring
