Understand the OHLCV data format, how to load and access price data in strategies, and how bar-by-bar revelation works during a backtest.
backtesting.py uses standard OHLCV (Open, High, Low, Close, Volume) data stored as a pandas DataFrame. Understanding how data is structured and how it is accessed inside a strategy is essential for building correct strategies.
Column names are case-sensitive. If the Volume column is missing, it is filled with NaN internally.
The DataFrame index should be a pd.DatetimeIndex. A monotonic pd.RangeIndex is also accepted but produces a warning.
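As a minimal sketch of the expected layout, the snippet below builds a small frame by hand and shows one way to repair lowercase column names. The prices are made-up sample values, and the `raw` and `fixed` names are only for illustration.

```python
import pandas as pd

# Made-up sample prices; any data source works as long as the
# column names match the capitalized OHLCV names exactly.
index = pd.date_range("2024-01-01", periods=4, freq="D")
df = pd.DataFrame(
    {
        "Open": [100.0, 101.5, 102.0, 101.0],
        "High": [102.0, 103.0, 102.5, 102.0],
        "Low": [99.5, 101.0, 100.5, 100.0],
        "Close": [101.5, 102.0, 101.0, 101.8],
        "Volume": [1200, 1500, 900, 1100],
    },
    index=index,
)

# Data from other sources often arrives with lowercase names.
raw = df.rename(columns=str.lower)

# Because names are case-sensitive, restore the capitalized form.
fixed = raw.rename(columns=str.capitalize)
```

The resulting `fixed` frame has a pd.DatetimeIndex and the five capitalized columns, which is the shape described above.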
NaN values in the OHLC columns will cause Backtest.__init__ to raise a ValueError. Strip them with df.dropna() or fill them with df.interpolate() before passing to Backtest.
self.data.pip returns the smallest price unit of change, computed as 10 ** -decimal_places based on the Close column. Useful for setting stop-loss distances:
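As a rough illustration, the sketch below mimics the idea in plain Python. The `estimate_pip` helper is hypothetical and not part of backtesting.py; it assumes the number of decimal places can be read from the printed form of each close. Inside a strategy you would read self.data.pip instead of computing it yourself.

```python
from statistics import median


def estimate_pip(closes):
    """Hypothetical helper: 10 ** -d, where d is the typical
    number of decimal places among the close prices."""
    decimals = [len(str(price).partition(".")[-1]) for price in closes]
    return float(10 ** -median(decimals))


# Made-up forex-style closes quoted to four decimal places.
closes = [1.0842, 1.0851, 1.0847, 1.0839]
pip = estimate_pip(closes)

# Place a stop 20 pips below an assumed entry price.
entry_price = closes[2]
stop_price = entry_price - 20 * pip
```

Inside next(), the same distance could be passed as the sl argument of self.buy(), for example `sl=self.data.Close[-1] - 20 * self.data.pip`.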
Data access behaves differently in init() versus next(), which is the key mechanism that prevents look-ahead bias.
In init()
self.data arrays are available at full length — all bars from start to end. This is required so indicator libraries can compute their rolling windows.
def init(self):
    # full Close array available here
    self.sma = self.I(SMA, self.data.Close, 20)
In next()
self.data arrays are sliced to the current bar. Only the current bar and all prior bars are visible. The last element [-1] is always the current bar.
def next(self):
    # only sees data up to current bar
    current = self.data.Close[-1]
    prior = self.data.Close[-2]
The same bar-by-bar slicing applies to all indicator arrays declared with self.I(). This ensures the strategy can only act on information that was available at the time, making the simulation realistic.
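The revelation mechanism can be pictured with a toy simulation in plain Python. This is only a sketch of the concept, not the library's internals: the `sma` function, the sample closes, and the `seen` list are all illustrative assumptions.

```python
def sma(values, n):
    """Simple moving average; None until n values are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < n:
            out.append(None)
        else:
            window = values[i + 1 - n : i + 1]
            out.append(sum(window) / n)
    return out


closes = [10.0, 11.0, 12.0, 13.0, 14.0]

# "init" phase: the indicator is computed once over the full series.
full_sma = sma(closes, 3)

# "next" phase: each step sees only a prefix ending at the current bar.
seen = []
for i in range(len(closes)):
    visible_close = closes[: i + 1]
    visible_sma = full_sma[: i + 1]
    # [-1] is the current bar for both the price and the indicator.
    seen.append((len(visible_close), visible_close[-1], visible_sma[-1]))
```

At every step the last visible element is the current bar, and nothing past it can be indexed, which is the property that rules out look-ahead.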
# Remove rows with any NaN in OHLC columns
df = df.dropna(subset=['Open', 'High', 'Low', 'Close'])

# Or fill gaps with linear interpolation instead
df[['Open', 'High', 'Low', 'Close']] = (
    df[['Open', 'High', 'Low', 'Close']].interpolate()
)

# Sort by date if not already sorted
df = df.sort_index()
Always sort your data by date before passing it to Backtest. If the index is not monotonically increasing, Backtest will sort it automatically and emit a warning.
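One way to avoid both the ValueError and the sorting warning is to tidy the frame in a single step before constructing Backtest. The `prepare_ohlcv` helper below is a hypothetical sketch, not a backtesting.py function, and the sample data is made up. It drops incomplete rows rather than interpolating them, which is only one of the two options described above.

```python
import pandas as pd


def prepare_ohlcv(df):
    """Hypothetical helper: validate and tidy a frame before backtesting."""
    required = ["Open", "High", "Low", "Close"]
    missing = [col for col in required if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Drop rows where any OHLC value is NaN.
    df = df.dropna(subset=required)
    # Sort only when needed so an already ordered index is left alone.
    if not df.index.is_monotonic_increasing:
        df = df.sort_index()
    return df


# Made-up sample: out-of-order dates and one row with a missing Close.
index = pd.to_datetime(["2024-01-03", "2024-01-01", "2024-01-02", "2024-01-04"])
messy = pd.DataFrame(
    {
        "Open": [3.0, 1.0, 2.0, 4.0],
        "High": [3.6, 1.6, 2.6, 4.6],
        "Low": [2.9, 0.9, 1.9, 3.9],
        "Close": [3.5, 1.5, None, 4.5],
    },
    index=index,
)

clean = prepare_ohlcv(messy)
```

The returned frame keeps three rows in ascending date order and can be passed to Backtest as usual.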