These two scripts download complete historical Open, High, Low, Close, Volume (OHLCV) data for all NSE stocks and indices. They support smart incremental updates and are optimized for minimal API load.

Overview

| Script | Output Directory | Records | Update Mode |
| --- | --- | --- | --- |
| fetch_all_ohlcv.py | ohlcv_data/ | ~2,775 CSV files | Incremental (2-5 min) |
| fetch_indices_ohlcv.py | indices_ohlcv_data/ | ~194 CSV files | Incremental (30-60 sec) |

fetch_all_ohlcv.py

What It Does

Fetches lifetime daily OHLCV data for all NSE stocks, starting from the earliest available date (some stocks go back to the 1990s). Uses smart incremental updates to fetch only the missing dates.

API Endpoint

URL: https://openweb-ticks.dhan.co/getDataH
Method: POST

Payload:
{
  "EXCH": "NSE",
  "SYM": "RELIANCE",
  "SEG": "E",
  "INST": "EQUITY",
  "SEC_ID": "2885",
  "EXPCODE": 0,
  "INTERVAL": "D",          // Daily candles
  "START": 215634600,        // Oct 31, 1976 (forces max history)
  "END": 1735689600          // Current timestamp
}
Parameters Explained:
  • START: Unix timestamp (default: 215634600 = Oct 31, 1976)
    • This forces the API to return all available history
  • INTERVAL: D for daily candles (use W for weekly, M for monthly)
  • SEC_ID: Security ID from master_isin_map.json (created by fetch_dhan_data.py)
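The payload above can be assembled with a small helper. This is an illustrative sketch, not a function from the script; in practice the symbol and SEC_ID come from master_isin_map.json:

```python
from datetime import datetime, timezone

def build_history_payload(symbol: str, sec_id: str, start: int = 215634600) -> dict:
    """Build the POST body for the getDataH endpoint (sketch)."""
    return {
        "EXCH": "NSE",
        "SYM": symbol,
        "SEG": "E",
        "INST": "EQUITY",
        "SEC_ID": sec_id,
        "EXPCODE": 0,
        "INTERVAL": "D",  # daily candles; "W"/"M" for weekly/monthly
        "START": start,   # Oct 31, 1976 -> forces max available history
        "END": int(datetime.now(timezone.utc).timestamp()),
    }

payload = build_history_payload("RELIANCE", "2885")
```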

Smart Incremental Logic

1. Check Existing Data

Reads the last date from ohlcv_data/{SYMBOL}.csv and resumes from the day after:
last_date = rows[-1]["Date"]  # e.g., "2024-01-15"
last_dt = datetime.strptime(last_date, "%Y-%m-%d")
target_start = int(last_dt.timestamp()) + 86400  # start of the next day
2. Fetch Missing Chunks

Downloads missing data in 180-day chunks, walking backwards from today to the last saved date:
CHUNK_DAYS = 180
chunk_ptr = int(time.time())  # walk backwards from today
while chunk_ptr > target_start:
    c_start = max(target_start, chunk_ptr - (CHUNK_DAYS * 86400))
    fetch_chunk(c_start, chunk_ptr)
    chunk_ptr = c_start - 86400
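The backward walk effectively schedules a set of (start, end) windows. A standalone sketch of that scheduling (plan_chunks is illustrative, not a function from the script):

```python
def plan_chunks(target_start: int, now: int, chunk_days: int = 180) -> list[tuple[int, int]]:
    """Return the (start, end) fetch windows, newest first."""
    chunks = []
    chunk_ptr = now
    while chunk_ptr > target_start:
        c_start = max(target_start, chunk_ptr - chunk_days * 86400)
        chunks.append((c_start, chunk_ptr))
        chunk_ptr = c_start - 86400  # step past the window just scheduled
    return chunks

# A 30-day gap fits in one window; a year of history needs three 180-day windows.
```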
3. Merge Live Data

Fetches today’s live snapshot from ScanX API:
live_snapshot = get_live_snapshots()  # Real-time OHLC
today_row = {
    'Date': '2024-03-03',
    'Open': snapshot['Open'],
    'High': snapshot['High'],
    'Low': snapshot['Low'],
    'Close': snapshot['Ltp'],
    'Volume': snapshot['Volume']
}
4. Deduplicate & Save

Merges historical + live data, removes duplicates by date:
merged = {r['Date']: r for r in existing_rows + new_rows}
final_rows = sorted(merged.values(), key=lambda x: x['Date'])

Output Format

Directory: ohlcv_data/
Files: one CSV per stock (e.g., RELIANCE.csv, TCS.csv)
Date,Open,High,Low,Close,Volume
2020-01-01,1450.50,1468.30,1442.10,1461.75,8234567
2020-01-02,1463.20,1475.80,1458.40,1472.35,7654321
2020-01-03,1474.10,1489.60,1471.25,1485.90,9123456
...
2024-03-03,2543.80,2567.20,2538.50,2555.40,6789012
Volume = 0 for some older dates (pre-2010) where NSE did not publish volume data.

Configuration

Edit these constants in fetch_all_ohlcv.py (lines 11-16):
CHUNK_DAYS = 180     # Fetch in 180-day chunks (reduce for slower APIs)
MAX_THREADS = 15     # Concurrent downloads (increase if API is fast)
SCANX_URL = "https://ow-scanx-analytics.dhan.co/customscan/fetchdt"
TICK_API_URL = "https://openweb-ticks.dhan.co/getDataH"

Performance Benchmarks

| Scenario | Time Taken | API Calls |
| --- | --- | --- |
| First Run (Full History) | ~25-35 min | ~8,000 chunks |
| Daily Update (Next Day) | ~2-3 min | ~2,775 stocks |
| Weekly Update (7 Days) | ~2-5 min | ~2,775 stocks |
| Monthly Update (30 Days) | ~3-6 min | ~5,500 chunks |
Run this script once per day after market close (3:45 PM IST) to keep data fresh.

Usage

python3 fetch_all_ohlcv.py
Output (First Run):
Fetching live snapshots for stocks (Today's data)...
Syncing OHLCV for 2775 stocks (Hybrid Multi-Chunk Mode)...
Done! Updated: 2753 | UpToDate: 22 | Errors: 0
Output (Incremental Update):
Fetching live snapshots for stocks (Today's data)...
Syncing OHLCV for 2775 stocks (Hybrid Multi-Chunk Mode)...
Done! Updated: 2775 | UpToDate: 0 | Errors: 0

Error Handling

def fetch_history_chunk(payload):
    try:
        response = requests.post(TICK_API_URL, json=payload, headers=get_headers(), timeout=15)
        if response.status_code == 200:
            data = response.json().get("data", {})
            # Parse OHLCV arrays
            return rows
    except Exception:
        pass  # Silent fail; counted in the Errors total
    return []
If a stock has no data (newly listed or delisted), the CSV will not be created. Check Errors count in output.

fetch_indices_ohlcv.py

What It Does

Fetches lifetime daily OHLCV data for all NSE indices (NIFTY 50, NIFTY Bank, etc.) with the same incremental update logic.

Differences from Stock OHLCV

| Feature | fetch_all_ohlcv.py | fetch_indices_ohlcv.py |
| --- | --- | --- |
| Input File | dhan_data_response.json | all_indices_list.json |
| Output Directory | ohlcv_data/ | indices_ohlcv_data/ |
| Threads | 15 | 60 (indices are faster) |
| Chunk Size | 180 days | 120 days |
| Volume Data | Real trading volume | 0 (indices don't have volume) |
| Filename | RELIANCE.csv | Nifty_50.csv (sanitized) |

API Endpoint

URL: https://openweb-ticks.dhan.co/getDataH
Method: POST

Payload:
{
  "EXCH": "IDX",
  "SYM": "Nifty 50",
  "SEG": "IDX",
  "INST": "IDX",
  "SEC_ID": "13",
  "EXPCODE": 0,
  "INTERVAL": "D",
  "START": 215634600,
  "END": 1735689600
}

Filename Sanitization

Index names contain spaces/special chars, so filenames are sanitized:
def get_safe_sym(sym):
    return "".join([c if c.isalnum() else "_" for c in sym])

Examples:
"Nifty 50"           → Nifty_50.csv
"Nifty Bank"         → Nifty_Bank.csv
"NIFTY Alpha 50"NIFTY_Alpha_50.csv

Output Format

Directory: indices_ohlcv_data/
Example: Nifty_50.csv
Date,Open,High,Low,Close,Volume
2015-01-01,8282.70,8328.50,8276.95,8314.20,0
2015-01-02,8314.20,8343.80,8298.40,8334.70,0
2015-01-03,8334.70,8375.15,8312.90,8368.50,0
...
2024-03-03,21450.30,21523.80,21398.50,21487.65,0
Volume is always 0 for indices because NSE does not publish index volume. For constituent volumes, use stock OHLCV data.

Configuration

Edit these constants in fetch_indices_ohlcv.py (lines 18-19):
CHUNK_DAYS = 120     # Smaller chunks for faster APIs
MAX_THREADS = 60     # More threads (indices are lightweight)

Performance Benchmarks

| Scenario | Time Taken | API Calls |
| --- | --- | --- |
| First Run (Full History) | ~30-60 sec | ~600 chunks |
| Daily Update | ~10-15 sec | ~194 indices |
| Weekly Update | ~15-20 sec | ~300 chunks |

Usage

python3 fetch_indices_ohlcv.py
Output:
Checking 194 indices for sync...
Executing 612 API chunks for history...
Merging with Live Snapshots and saving CSVs...
Successfully updated all index CSVs with Today's Live data.

Pipeline Integration

Both scripts are optionally included in the pipeline based on the FETCH_OHLCV flag.

Enable OHLCV in Pipeline

Edit run_full_pipeline.py (line 64):
# OHLCV: Auto-detect mode
# True = always fetch (incremental update: ~2-5 min if data exists)
# False = skip entirely (ADR, RVOL, ATH fields will be 0)
FETCH_OHLCV = True  # Change from False to True

Pipeline Execution (Phase 2.5)

python3 run_full_pipeline.py
Output:
📊 PHASE 2.5: OHLCV History (Smart Incremental)
────────────────────────────────────────────────
  ▶ Running fetch_all_ohlcv.py...
  ✅ fetch_all_ohlcv.py (142.3s)
  ▶ Running fetch_indices_ohlcv.py...
  ✅ fetch_indices_ohlcv.py (18.7s)
If FETCH_OHLCV = False, Phase 2.5 is skipped, and fields like ADR, RVOL, % from ATH will be 0 in the final JSON.

Why OHLCV Data is Critical

The OHLCV CSVs power 14 advanced fields in the final output:
Calculated from High/Low ranges:
  • 5 Days MA ADR(%)
  • 14 Days MA ADR(%)
  • 20 Days MA ADR(%)
  • 30 Days MA ADR(%)
Formula:
ADR = ((High - Low) / Close) * 100
5-Day MA ADR = Average of last 5 days' ADR
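The ADR formulas above can be sketched as follows (illustrative only; `ma_adr` and the sample rows are not from the pipeline's code):

```python
def ma_adr(rows: list[dict], days: int) -> float:
    """Moving-average ADR (%) over the last `days` rows of OHLCV data."""
    window = rows[-days:]
    adrs = [(r["High"] - r["Low"]) / r["Close"] * 100 for r in window]
    return sum(adrs) / len(adrs)

rows = [
    {"High": 102.0, "Low": 98.0, "Close": 100.0},   # ADR = 4.0%
    {"High": 205.0, "Low": 195.0, "Close": 200.0},  # ADR = 5.0%
]
# 2-day MA ADR = (4.0 + 5.0) / 2 = 4.5
```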
Calculated from Volume column:
  • RVOL (Relative Volume vs 20-day avg)
  • 200 Days EMA Volume
  • % from 52W High 200 Days EMA Volume
  • Daily Rupee Turnover 20/50/100(Cr.)
  • 30 Days Average Rupee Volume(Cr.)
Formula:
RVOL = Today's Volume / Avg(Last 20 Days Volume)
Rupee Turnover = Volume * Close / 10,000,000
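A sketch of the two volume formulas above (function names are illustrative; 1 crore = 1e7 rupees):

```python
def rvol(volumes: list[int]) -> float:
    """Today's volume relative to the average of the prior 20 sessions."""
    today, history = volumes[-1], volumes[-21:-1]
    return today / (sum(history) / len(history))

def rupee_turnover_cr(volume: int, close: float) -> float:
    """Daily rupee turnover expressed in crores."""
    return volume * close / 10_000_000

vols = [1_000_000] * 20 + [2_000_000]  # today is double the 20-day average
```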
  • % from ATH (Distance from all-time high)
  • Implicitly used in Returns since Earnings(%) calculation
Formula:
ATH = max(Close for all dates)
% from ATH = ((Current Price - ATH) / ATH) * 100
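As a runnable sketch of the ATH formula (illustrative, not the processor's code), note the result is negative whenever the current price is below the all-time high:

```python
def pct_from_ath(closes: list[float]) -> float:
    """Distance (%) of the latest close from the all-time-high close."""
    ath = max(closes)
    current = closes[-1]
    return (current - ath) / ath * 100

# A stock that peaked at 200 and now trades at 150 is 25% below its ATH.
```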
Requires OHLCV to calculate post-earnings returns:
  • Returns since Earnings(%)
  • Max Returns since Earnings(%)
Formula:
Earnings Date = "2024-01-20"
Pre-Earnings Close = OHLCV["2024-01-19"]["Close"]
Returns = ((Current Price - Pre-Earnings Close) / Pre-Earnings Close) * 100
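Both earnings-return formulas can be sketched against a date-keyed OHLCV mapping (illustrative helpers, not the pipeline's functions; ISO dates compare correctly as strings):

```python
def returns_since_earnings(ohlcv: dict, pre_date: str, current_price: float) -> float:
    """Return (%) from the pre-earnings close to the current price."""
    base = ohlcv[pre_date]["Close"]
    return (current_price - base) / base * 100

def max_returns_since_earnings(ohlcv: dict, pre_date: str) -> float:
    """Best return (%) from the pre-earnings close to any later close."""
    base = ohlcv[pre_date]["Close"]
    later = [r["Close"] for d, r in ohlcv.items() if d > pre_date]
    return (max(later) - base) / base * 100

ohlcv = {
    "2024-01-19": {"Close": 100.0},  # pre-earnings close
    "2024-01-22": {"Close": 120.0},  # post-earnings peak
    "2024-01-23": {"Close": 110.0},
}
```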
Cross-validates SMA/EMA levels from fetch_advanced_indicators.py
If OHLCV data is missing, advanced_metrics_processor.py and process_earnings_performance.py will set these fields to 0 or N/A.

Data Quality Checks

Both scripts validate data integrity:
# 1. Date Format Validation
dt_str = t if isinstance(t, str) else datetime.fromtimestamp(t).strftime("%Y-%m-%d")

# 2. Volume Sanitization (no negative values)
if isinstance(vol, (int, float)) and vol < 0: 
    vol = 0

# 3. Deduplication by Date
merged = {r['Date']: r for r in existing_rows + new_rows}

# 4. Chronological Sorting
final_rows = sorted(merged.values(), key=lambda x: x['Date'])

Storage Requirements

| Dataset | Files | Avg Size/File | Total Size |
| --- | --- | --- | --- |
| Stocks OHLCV | ~2,775 | 50-150 KB | ~300-400 MB |
| Indices OHLCV | ~194 | 30-100 KB | ~10-20 MB |
| Total | ~2,969 | - | ~320-420 MB |
If using CLEANUP_INTERMEDIATE = True in the pipeline, OHLCV data is preserved after cleanup (it’s not considered intermediate).

Troubleshooting

Cause: fetch_dhan_data.py has not been run. Solution:
python3 fetch_dhan_data.py
python3 fetch_all_ohlcv.py
Cause: fetch_all_indices.py has not been run. Solution:
python3 fetch_all_indices.py
python3 fetch_indices_ohlcv.py
Cause: API rate limiting or network issues. Solution:
  • Reduce MAX_THREADS from 15 to 5
  • Increase timeout from 15s to 30s
  • Re-run script (incremental mode will fill gaps)
Gaps on weekends and holidays are normal: NSE only publishes data for trading days. To forward-fill a continuous daily series, use:
import pandas as pd
df = pd.read_csv('RELIANCE.csv', parse_dates=['Date'])
df = df.set_index('Date').asfreq('D', method='ffill')  # Forward-fill

Next Steps

F&O Data Scripts

Fetch Futures & Options data

Indices & ETF Scripts

Fetch market indices and ETF data

Pipeline Flags

Configure FETCH_OHLCV and FETCH_OPTIONAL

Advanced Metrics

Learn how OHLCV powers ADR, RVOL, ATH
