Overview
Incremental updates allow you to refresh market data daily without re-fetching the entire historical dataset. The pipeline intelligently detects existing data and only fetches what’s new.
Runtime: ~2-5 minutes (vs ~30 minutes for first-time full fetch)
How Incremental Updates Work
Smart OHLCV Incremental Logic
The fetch_all_ohlcv.py script implements intelligent incremental fetching:
Identify last recorded date
Reads the last row of each CSV to find the most recent date:
Last date: 2026-03-02
Fetch only missing dates
Requests data from (last_date + 1 day) to today:
Only fetches 1-2 days of new data instead of 500+ days.
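The two steps above can be sketched roughly as follows. This is a minimal illustration, not the code in fetch_all_ohlcv.py: the function names, the `date` column name, and the `YYYY-MM-DD` date format are assumptions made for the example.

```python
import csv
from datetime import date, datetime, timedelta
from pathlib import Path
from typing import Optional, Tuple


def last_recorded_date(csv_path: Path, date_column: str = "date") -> Optional[date]:
    """Return the date in the final data row, or None if the file has no rows.

    Assumes a header row and a column named `date_column` (hypothetical).
    """
    last_row = None
    with csv_path.open(newline="") as fh:
        for row in csv.DictReader(fh):
            last_row = row  # keep overwriting; the final value is the last row
    if last_row is None:
        return None
    return datetime.strptime(last_row[date_column], "%Y-%m-%d").date()


def missing_range(last: date, today: date) -> Optional[Tuple[date, date]]:
    """Return (start, end) of the dates still to fetch, or None if up to date."""
    start = last + timedelta(days=1)
    return (start, today) if start <= today else None


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "SAMPLE.csv"  # hypothetical file name
        sample.write_text(
            "date,open,high,low,close,volume\n"
            "2026-02-27,10,11,9,10.5,1000\n"
            "2026-03-02,10.5,12,10,11.8,1200\n"
        )
        last = last_recorded_date(sample)
        print("Last date:", last)
        print("Fetch range:", missing_range(last, date(2026, 3, 4)))
```

When `last_recorded_date` returns None (an empty or missing history), a caller would presumably fall back to a full fetch for that stock.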
Other Data Sources
Most other fetchers re-fetch fresh data daily (lightweight):
| Script | Update Strategy | Runtime |
|---|---|---|
| fetch_fundamental_data.py | Full refresh (quarterly data changes slowly) | ~18s |
| fetch_company_filings.py | Fetches last 100 filings (new ones appear daily) | ~45s |
| fetch_market_news.py | Fetches last 50 news items per stock | ~30s |
| fetch_corporate_actions.py | Fetches upcoming + 2yr history | ~8s |
| fetch_bulk_block_deals.py | Fetches last 30 days | ~5s |
| fetch_circuit_stocks.py | Live snapshot (today’s circuits) | ~3s |
| fetch_surveillance_lists.py | Current ASM/GSM lists | ~4s |
| fetch_incremental_price_bands.py | Today’s band changes CSV | ~2s |
Running Daily Updates
Ensure OHLCV data exists
Verify the ohlcv_data/ directory from the previous run. It should show ~2,775+ CSV files.
Run the full pipeline
Monitor incremental fetch
Watch Phase 2.5 output:
First run: ~30 min (fetching 500+ days per stock)
Incremental run: ~2-5 min (fetching 1-2 days per stock)
Performance Optimization
Adjust Thread Count
For faster incremental updates, increase parallelization by editing the thread count in fetch_all_ohlcv.py (line 14). Keep these trade-offs in mind:
- Higher threads = faster execution
- Too many threads = rate limiting or connection errors
- Recommended range: 10-25 threads
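To make the trade-off concrete, here is a small sketch of a bounded thread pool whose size is a single tunable constant. The names `MAX_WORKERS`, `fetch_all`, and `fetch_one` are hypothetical and are not taken from fetch_all_ohlcv.py; the example only shows how a worker count typically controls parallel fetching and how per-stock failures (such as rate limiting) can be collected instead of aborting the run.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable

MAX_WORKERS = 15  # hypothetical name; the guide recommends staying within 10-25


def fetch_all(
    symbols: Iterable[str],
    fetch_one: Callable[[str], object],
    max_workers: int = MAX_WORKERS,
) -> Dict[str, object]:
    """Run fetch_one for every symbol on a bounded thread pool.

    Returns a dict mapping each symbol to its result, or to the exception
    that fetch_one raised for that symbol.
    """
    results: Dict[str, object] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_one, sym): sym for sym in symbols}
        for future in as_completed(futures):
            sym = futures[future]
            try:
                results[sym] = future.result()
            except Exception as exc:  # e.g. rate limiting or connection errors
                results[sym] = exc
    return results
```

If the share of exceptions in the result rises after raising the worker count, that is a reasonable signal to step back toward the lower end of the range.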
Skip OHLCV for Quick Refresh
If you only need fundamental/event updates without price data, edit run_full_pipeline.py line 64 to turn off the OHLCV fetch. The following metrics depend on OHLCV data and will not be updated:
- ADR (Average Daily Range)
- RVOL (Relative Volume)
- ATH (All-Time High) and % from ATH
- All returns calculations (1D, 1W, 1M, 3M, 6M, 1Y)
Set FETCH_OHLCV = True later to backfill.
Selective Script Execution
If you only need specific data updated, run individual scripts:
Update Only Fundamental Data
Update Only Technical Indicators
Update Only Events & News
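A subset of scripts can also be driven from one small helper. The sketch below is an assumption-laden illustration: `run_scripts` is a hypothetical helper, and the `EVENTS_AND_NEWS` grouping is a guess assembled from script names listed earlier in this guide, not the pipeline's own grouping.

```python
import subprocess
import sys
from pathlib import Path
from typing import Dict, Iterable


def run_scripts(scripts: Iterable[str], workdir: Path = Path(".")) -> Dict[str, int]:
    """Run each script with the current interpreter; return script -> exit code.

    Stops at the first non-zero exit code so later steps do not run on
    incomplete data.
    """
    codes: Dict[str, int] = {}
    for name in scripts:
        proc = subprocess.run([sys.executable, name], cwd=workdir)
        codes[name] = proc.returncode
        if proc.returncode != 0:
            break
    return codes


# Assumed grouping, built only from script names that appear in this guide.
EVENTS_AND_NEWS = [
    "fetch_company_filings.py",
    "fetch_market_news.py",
    "fetch_corporate_actions.py",
]

if __name__ == "__main__":
    print(run_scripts(EVENTS_AND_NEWS))
```

Using `sys.executable` keeps the child scripts on the same interpreter and virtual environment as the helper itself.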
Automated Daily Updates
Using Cron (Linux/Mac)
Schedule automatic execution after market close:
Using systemd Timer (Linux)
For more control and better logging:
Monitoring & Alerts
Log Analysis
The pipeline outputs structured logs. Parse for key metrics:
Error Notifications
Send email if pipeline fails:
Slack Webhook Integration
Notify Slack on completion:
Data Validation
Verify Output Integrity
After each update, validate the output:
Compare with Previous Run
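A comparison against the previous run could look something like the sketch below. It assumes, purely for illustration, that each run produces a CSV with a `symbol` column; the actual output format and file names are not specified here, so adjust the key and paths to match your output.

```python
import csv
from pathlib import Path
from typing import Dict, Set


def load_symbols(path: Path, key: str = "symbol") -> Set[str]:
    """Collect the values of the key column (hypothetical name) from a CSV."""
    with path.open(newline="") as fh:
        return {row[key] for row in csv.DictReader(fh)}


def compare_runs(previous: Path, current: Path, key: str = "symbol") -> Dict[str, object]:
    """Summarize how the set of rows changed between two pipeline outputs."""
    old = load_symbols(previous, key)
    new = load_symbols(current, key)
    return {
        "previous_count": len(old),
        "current_count": len(new),
        "added": sorted(new - old),
        "removed": sorted(old - new),
    }
```

A large `removed` list after a routine daily update is usually worth investigating before the new output replaces the old one.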
Backup Strategy
Archive Previous Versions
Before each update, back up the previous output:
OHLCV Data Backup
The ohlcv_data/ directory grows over time (~200 MB). Back it up weekly:
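One possible way to produce a date-stamped archive uses only the Python standard library. The `backup_directory` helper and the `backups/` destination are hypothetical choices for this sketch; scheduling it weekly is left to cron or a systemd timer.

```python
import shutil
from datetime import date
from pathlib import Path
from typing import Optional


def backup_directory(source: Path, backup_root: Path, stamp: Optional[date] = None) -> Path:
    """Create <backup_root>/<source name>_<YYYYMMDD>.tar.gz and return its path."""
    stamp = stamp or date.today()
    source = source.resolve()
    backup_root = backup_root.resolve()
    backup_root.mkdir(parents=True, exist_ok=True)
    base = backup_root / f"{source.name}_{stamp:%Y%m%d}"
    # Archive only the source directory itself, stored under its own name.
    archive = shutil.make_archive(
        str(base), "gztar", root_dir=str(source.parent), base_dir=source.name
    )
    return Path(archive)


if __name__ == "__main__":
    # Hypothetical locations; adjust to your layout.
    print(backup_directory(Path("ohlcv_data"), Path("backups")))
```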
Troubleshooting Incremental Updates
OHLCV Not Updating Incrementally
Symptom: Phase 2.5 still takes 30 minutes instead of 2-5 minutes
Cause: CSV files may be corrupted or have incorrect last dates
Solutions:
- Check a sample CSV for integrity. The last row should have today’s or yesterday’s date.
- Verify last date parsing.
- If corrupted, delete the specific CSV to re-fetch it.
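The checks above can be automated across the whole directory. The sketch below is illustrative only: the `date` column name, the `YYYY-MM-DD` format, and the `max_age_days` tolerance (set loosely to allow for weekends and holidays) are assumptions, and the script only reports suspect files rather than deleting them.

```python
import csv
from datetime import date, datetime
from pathlib import Path
from typing import Dict


def check_csv(path: Path, today: date, max_age_days: int = 4,
              date_column: str = "date") -> str:
    """Classify a CSV as 'ok', 'stale', or 'corrupt' based on its last row."""
    try:
        with path.open(newline="") as fh:
            rows = list(csv.DictReader(fh))
        last = datetime.strptime(rows[-1][date_column], "%Y-%m-%d").date()
    except (OSError, csv.Error, IndexError, KeyError, ValueError, TypeError):
        return "corrupt"  # unreadable, empty, or last date does not parse
    if last > today:
        return "corrupt"  # a future date suggests a bad write
    return "ok" if (today - last).days <= max_age_days else "stale"


def find_suspect_csvs(directory: Path, today: date) -> Dict[str, str]:
    """Return file name -> status for every CSV that is not 'ok'."""
    report = {}
    for path in sorted(directory.glob("*.csv")):
        status = check_csv(path, today)
        if status != "ok":
            report[path.name] = status
    return report


if __name__ == "__main__":
    print(find_suspect_csvs(Path("ohlcv_data"), date.today()))
```

Files reported as corrupt are the candidates to delete so that the next run re-fetches them in full.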
Missing Recent Data
Symptom: Latest quarter or news not showing in output
Cause: Source API may not have published data yet
Solutions:
- Wait 1-2 hours after market close for data availability
- Check source manually (Dhan ScanX website)
- Re-run pipeline after delay
Stale Event Markers
Symptom: Old events still showing (e.g., “Results Recently Out” from 10 days ago)
Cause: Event marker logic uses fixed time windows (7 days for results, 15 days for insider trading)
Solution: This is expected behavior. Events auto-expire after their window:
- Results: 7 days
- Insider Trading: 15 days
- Block Deals: 7 days
These windows are set in the add_corporate_events.py logic.
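As a rough illustration of fixed-window expiry, the sketch below uses the three window lengths listed above. The names `EVENT_WINDOWS_DAYS` and `is_marker_active` are hypothetical, and treating the last day of the window as still active is an assumption; add_corporate_events.py may handle the boundary differently.

```python
from datetime import date

# Window lengths from this guide; the dictionary keys are hypothetical labels.
EVENT_WINDOWS_DAYS = {
    "results": 7,
    "insider_trading": 15,
    "block_deals": 7,
}


def is_marker_active(event_type: str, event_date: date, today: date) -> bool:
    """Return True while the event is within its window (boundary inclusive)."""
    age_days = (today - event_date).days
    return 0 <= age_days <= EVENT_WINDOWS_DAYS[event_type]


if __name__ == "__main__":
    today = date(2026, 3, 12)
    # Same event date, different windows.
    print(is_marker_active("results", date(2026, 3, 2), today))
    print(is_marker_active("insider_trading", date(2026, 3, 2), today))
```

With the same event date, a marker can already be expired under the 7-day results window while remaining active under the 15-day insider trading window.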
Incremental Fetch Skipping Dates
Symptom: Some dates missing in OHLCV (e.g., 2026-03-03 present, 2026-03-04 missing)
Cause: Market holiday or trading halt
Solution: This is normal. OHLCV only contains trading days. Non-trading days (weekends, holidays) are automatically skipped.
Best Practices for Incremental Updates
- Run once daily after market close (after 3:30 PM IST)
- Keep FETCH_OHLCV=True for continuous incremental updates
- Monitor first few incremental runs to ensure 2-5 min runtime
- Backup before first incremental run to test rollback
- Validate output after each run with automated checks
- Archive old outputs with date stamps for historical analysis
- Set up failure alerts to catch issues immediately
- Test manual execution before automating with cron/systemd
Next Steps
- Running Full Pipeline - First-time setup guide
- Single Stock Analysis - Analyze individual stocks
- Troubleshooting - Common errors and fixes