Error Handling Philosophy
The pipeline follows a fail-forward approach:
- Critical failures (Phase 1) halt the pipeline
- Non-critical failures (Phase 2+) log errors but continue execution
- Enrichment failures (Phase 4) skip problematic stocks but complete the run
Critical vs Non-Critical Failures
Critical Failures (Pipeline Stops)
Phase 1: Core Data
- fetch_dhan_data.py failure → No master_isin_map.json → STOP
- bulk_market_analyzer.py failure → No base JSON → STOP
Non-Critical Failures (Pipeline Continues)
Phase 2: Enrichment
- Individual enrichment scripts can fail without stopping the pipeline
- Example: fetch_market_news.py fails → News fields will be empty, but pipeline completes
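The critical versus non-critical split could be sketched as a small runner. This is a minimal sketch under stated assumptions, not the pipeline's actual orchestrator: `run_pipeline`, `run_script`, and the grouping into two lists are hypothetical; only the script names come from this page.

```python
import subprocess
import sys

# Script names are taken from this page; grouping them into these two
# lists is an assumption made for illustration.
CRITICAL = ["fetch_dhan_data.py", "bulk_market_analyzer.py"]
NON_CRITICAL = ["fetch_market_news.py"]


def run_script(script):
    """Run one script with the current interpreter; True on exit code 0."""
    return subprocess.run([sys.executable, script]).returncode == 0


def run_pipeline(critical, non_critical, runner=run_script):
    """Stop on the first critical failure; log and continue on the rest.

    Returns (completed, failed_non_critical).
    """
    for script in critical:
        if not runner(script):
            print(f"CRITICAL: {script} failed, stopping pipeline")
            return False, []
    failed = []
    for script in non_critical:
        if not runner(script):
            print(f"WARNING: {script} failed, continuing")
            failed.append(script)
    return True, failed
```

Passing the runner as a parameter keeps the stop/continue policy separate from how a script is launched, which also makes the policy easy to exercise without running real scripts.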
Error Types and Solutions
1. Network Errors
Symptoms: requests.exceptions.ConnectionError, ReadTimeout, HTTPError
Causes:
- API endpoint temporarily down
- Network connectivity issues
- Rate limiting by Dhan/NSE servers
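Transient causes like these are commonly handled by retrying with an increasing delay. The helper below is a minimal sketch, assuming the failing call can be wrapped in a zero-argument function; `retry` and its parameters are hypothetical names, and the delays are illustrative rather than tuned for Dhan or NSE rate limits.

```python
import time


def retry(func, exceptions, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on a listed exception wait and retry with doubling delay.

    Re-raises the last exception once all attempts are used up.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions as exc:
            if attempt == attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            print(f"attempt {attempt} failed ({exc!r}); retrying in {delay}s")
            sleep(delay)
```

With the requests library, the `exceptions` tuple would typically hold `requests.exceptions.ConnectionError`, `requests.exceptions.ReadTimeout`, and `requests.exceptions.HTTPError`; note that `HTTPError` is only raised if the wrapped function calls `response.raise_for_status()`.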
2. Timeout Errors
Symptoms: Script hangs or raises subprocess.TimeoutExpired
Causes:
- API response delay
- Large data transfer
- System resource constraints
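A hung script can be bounded by giving `subprocess.run` a timeout. The following is a minimal sketch, assuming scripts are launched as child processes; `run_with_timeout` is a hypothetical helper and any timeout value is a placeholder to be tuned per script.

```python
import subprocess
import sys


def run_with_timeout(args, timeout_s):
    """Run a command; return True on exit code 0, False on failure or timeout."""
    try:
        result = subprocess.run(args, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising TimeoutExpired.
        print(f"{args[-1]} exceeded {timeout_s}s and was terminated")
        return False
    return result.returncode == 0
```

For example, `run_with_timeout([sys.executable, "fetch_market_news.py"], 300)` would give that script five minutes before it is treated as failed.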
3. Data Quality Errors
Symptoms: Missing fields, None values, type mismatches
Example:
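A minimal sketch of defensive parsing, assuming each stock arrives as a dict decoded from an API response. The field names (`symbol`, `price`, `sector`, `news`) and both function names are hypothetical and only illustrate the pattern.

```python
def to_float(value, default=0.0):
    """Convert value to float, falling back to default on None or bad types."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return default


def normalize_record(raw):
    """Return a cleaned record, or None when the critical field is missing.

    The field names used here are hypothetical.
    """
    symbol = raw.get("symbol")
    if not symbol:  # validate the critical field before processing
        return None
    return {
        "symbol": symbol,
        "price": to_float(raw.get("price")),  # numeric default
        "sector": raw.get("sector") or "",    # string default
        "news": raw.get("news") or [],        # array default
    }
```

Using `raw.get("sector") or ""` rather than `raw.get("sector", "")` also covers the case where the key is present but its value is None.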
- Use defensive .get() instead of direct key access
- Provide sensible defaults (0 for numbers, "" for strings, [] for arrays)
- Validate critical fields before processing
4. File I/O Errors
Symptoms: FileNotFoundError, PermissionError, OSError
Common Causes:
- Missing input files (e.g., master_isin_map.json not created)
- Disk space exhausted
- Permission issues on output directory
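The three causes above can be checked before a script starts any real work. This is a minimal sketch; `preflight` is a hypothetical helper, and the 100 MB free-space threshold is an arbitrary illustrative default.

```python
import os
import shutil


def preflight(input_files, output_dir, min_free_bytes=100 * 1024 * 1024):
    """Return a list of problems found; an empty list means checks passed."""
    problems = []
    for path in input_files:
        if not os.path.isfile(path):
            problems.append(f"missing input file: {path}")
    if not os.path.isdir(output_dir):
        problems.append(f"output directory does not exist: {output_dir}")
        return problems  # remaining checks need an existing directory
    if not os.access(output_dir, os.W_OK):
        problems.append(f"output directory not writable: {output_dir}")
    if shutil.disk_usage(output_dir).free < min_free_bytes:
        problems.append(f"low disk space under {output_dir}")
    return problems
```

Returning a list of messages instead of raising on the first problem lets a single run report everything that needs fixing.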
5. JSON Parsing Errors
Symptoms: json.decoder.JSONDecodeError
Causes:
- Malformed API response
- Incomplete file write (crashed mid-write)
- Encoding issues
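Two of these causes can be addressed in code: a malformed file can be caught on read, and a half-written file can be avoided by writing to a temporary file and renaming it into place. The helpers below are a minimal sketch with hypothetical names, not functions taken from the pipeline.

```python
import json
import os
import tempfile


def load_json(path, default=None):
    """Load JSON from path; return default if the file is missing or malformed."""
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return default
    except (json.JSONDecodeError, UnicodeDecodeError) as exc:
        print(f"could not parse {path}: {exc}")
        return default


def write_json_atomic(path, data):
    """Write to a temp file in the same directory, then rename over the target.

    A crash mid-write leaves the previous file intact instead of a truncated one.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh)
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

The temporary file is created in the target's own directory because `os.replace` is only atomic when source and destination are on the same filesystem.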
Multi-threaded Error Handling
ThreadPoolExecutor Patterns
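When API calls run in a thread pool, an exception raised in a worker only surfaces when its future's result is read, so catching it per future keeps one bad symbol from aborting the batch. A minimal sketch, assuming a hypothetical `fetch_one(symbol)` callable; `fetch_all` is an illustrative name, not a function from the pipeline.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_all(symbols, fetch_one, max_workers=8):
    """Run fetch_one(symbol) in parallel; collect results and errors separately."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_one, s): s for s in symbols}
        for future in as_completed(futures):
            symbol = futures[future]
            try:
                # result() re-raises any exception raised inside the worker
                results[symbol] = future.result()
            except Exception as exc:
                errors[symbol] = repr(exc)
    return results, errors
```

The returned `errors` dict can be logged or used to drive a targeted re-run of only the symbols that failed.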
Many scripts use threading for parallel API calls.
Logging and Diagnostics
Enable Detailed Logging
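A minimal sketch of a shared logging setup using the standard `logging` module. The function name, the logger name `pipeline`, and the message format are assumptions chosen for illustration.

```python
import logging


def setup_logging(log_file=None, level=logging.INFO):
    """Configure root logging to the console and, optionally, a file."""
    handlers = [logging.StreamHandler()]
    if log_file:
        handlers.append(logging.FileHandler(log_file, encoding="utf-8"))
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        handlers=handlers,
        force=True,  # replace any handlers configured earlier (Python 3.8+)
    )
    return logging.getLogger("pipeline")  # logger name is an assumption
```

Inside an `except` block, calling `log.exception("...")` on the returned logger records the full traceback along with the message, which is usually the most useful detail when diagnosing a failed fetch.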
Add logging to scripts for better debugging.
Check Pipeline Logs
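One way to review a run is to pull out only the lines logged at error level. This is a minimal sketch, assuming the pipeline writes a text log whose lines contain standard level names such as ERROR; the log path is left to the caller and `find_errors` is a hypothetical helper.

```python
def find_errors(log_path, levels=("ERROR", "CRITICAL")):
    """Return log lines that mention any of the given level names."""
    matches = []
    with open(log_path, encoding="utf-8", errors="replace") as fh:
        for line in fh:
            if any(level in line for level in levels):
                matches.append(line.rstrip("\n"))
    return matches
```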
Recovery Strategies
Partial Re-runs
If a Phase 2 script fails, you can re-run just that script on its own.
Checkpoint-based Recovery
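A minimal sketch of checkpointing, assuming items can be identified by a string ID such as a symbol and that progress is stored in a small JSON file. The checkpoint format and both function names are hypothetical.

```python
import json
import os


def load_checkpoint(path):
    """Return the set of already-processed item IDs (empty if no checkpoint)."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as fh:
        return set(json.load(fh))


def process_with_checkpoint(items, process_one, checkpoint_path):
    """Process items, skipping those recorded in the checkpoint file."""
    done = load_checkpoint(checkpoint_path)
    for item in items:
        if item in done:
            continue  # handled in an earlier run
        try:
            process_one(item)
        except Exception as exc:
            print(f"skipping {item}: {exc!r}")
            continue  # not recorded, so the next run retries it
        done.add(item)
        with open(checkpoint_path, "w", encoding="utf-8") as fh:
            json.dump(sorted(done), fh)
    return done
```

Saving after every item costs some I/O but means a crash loses at most the item in flight; for large runs the save could be batched instead.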
Modify scripts to skip already-processed items.
Common Troubleshooting Scenarios
Scenario 1: “master_isin_map.json not found”
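Cause: fetch_dhan_data.py failed or didn’t run
Solution: Re-run fetch_dhan_data.py and confirm that master_isin_map.json is created before starting the later phases.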
Scenario 2: “Empty fundamental_data.json”
Cause: API endpoint changed or rate limited
Solution:
- Check API endpoint in fetch_fundamental_data.py
- Test API call manually with curl
- Add delays between requests
Scenario 3: “Compression failed”
Cause: Disk full or corrupted JSON file
Solution: Check free disk space and confirm the JSON file parses cleanly before re-running compression.
Scenario 4: “OHLCV data missing dates”
Cause: Market holiday or API gap
Solution: OHLCV fetcher auto-fills gaps; verify the date range.
Monitoring and Alerts
Email Alerts on Failure
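A minimal sketch of a failure alert using the standard library's `smtplib` and `email.message`. The SMTP host, port, addresses, and function names are placeholders; a real deployment would likely also need TLS and authentication for its mail server.

```python
import smtplib
from email.message import EmailMessage


def build_alert(script, error, sender, recipient):
    """Build a plain-text failure alert for one pipeline script."""
    msg = EmailMessage()
    msg["Subject"] = f"Pipeline failure: {script}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"{script} failed with:\n\n{error}")
    return msg


def send_alert(msg, host="localhost", port=25):
    """Send via an SMTP relay; host and port are placeholders for your server."""
    with smtplib.SMTP(host, port, timeout=10) as smtp:
        smtp.send_message(msg)
```

Keeping message construction separate from sending makes the alert content easy to check without a mail server.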
Health Check Script
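A minimal sketch of a health check, assuming the pipeline's outputs are JSON files that should be refreshed at least once a day. The function names and the 24-hour freshness window are assumptions.

```python
import json
import os
import time


def check_output(path, max_age_hours=24):
    """Return "ok" or a short description of what is wrong with one file."""
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    age_hours = (time.time() - os.path.getmtime(path)) / 3600
    if age_hours > max_age_hours:
        return f"stale ({age_hours:.1f}h old)"
    try:
        with open(path, encoding="utf-8") as fh:
            json.load(fh)
    except ValueError:  # JSONDecodeError and UnicodeDecodeError both derive from it
        return "invalid JSON"
    return "ok"


def health_check(paths, max_age_hours=24):
    """Map each path to its status; the run is healthy if all are "ok"."""
    return {p: check_output(p, max_age_hours) for p in paths}
```

It could be pointed at files named on this page, for example `health_check(["master_isin_map.json", "fundamental_data.json"])`, with any status other than "ok" treated as a reason to raise an alert.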
Next Steps
Performance Tuning
Optimize threading, batching, and timeouts
Incremental Updates
Set up automated daily updates with monitoring