backtest_data

sidebar_position: 3

Backtest Data

Learn how to retrieve and work with backtest data for analysis, visualization, and strategy development.

Overview

When developing trading strategies, you often need access to the raw market data used during backtesting. This is particularly important for:

Visualizing strategy signals overlaid on price charts
Debugging indicator calculations that depend on warmup periods
Creating custom analysis reports with price data and indicators
Validating data quality before running backtests

The get_backtest_data method provides a convenient way to retrieve all data sources with their corresponding data for a given strategy and backtest window.

The `get_backtest_data` Method

Method Signature

def get_backtest_data(
    self,
    strategy: TradingStrategy,
    backtest_date_range: BacktestDateRange,
    show_progress: bool = True,
    fill_missing_data: bool = True,
) -> Dict[str, Any]

Parameters

Parameter	Type	Default	Description
`strategy`	`TradingStrategy`	Required	The strategy containing the data sources to retrieve data for.
`backtest_date_range`	`BacktestDateRange`	Required	The date range for the backtest window.
`show_progress`	`bool`	`True`	Whether to show progress bars during data retrieval.
`fill_missing_data`	`bool`	`True`	If `True`, missing time series data entries will be filled automatically.

Returns

A dictionary where:

Keys are data source identifiers (e.g., "btc_data", "eth_eur_1h")
Values are the corresponding data (typically pandas DataFrames)

Why This Matters for Quant Developers

1. Warmup Window Handling

Many technical indicators require historical data to "warm up" before producing valid signals. For example:

A 200-period moving average needs 200 data points before it can be calculated
RSI typically uses 14 periods of data
MACD requires even more historical data for its components

When you define a warmup_window in your DataSource, the get_backtest_data method automatically includes this additional historical data. This means the data you retrieve for visualization will match exactly what your strategy sees during backtesting.

DataSource(
    identifier="btc_data",
    symbol="BTC/EUR",
    time_frame="1h",
    warmup_window=200,  # Strategy needs 200 candles for indicators
    market="BITVAVO"
)

2. Consistent Data for Visualization

When creating charts that overlay indicators on price data, you need the exact same data your strategy used. The get_backtest_data method ensures consistency by:

Using the same data providers as the backtest
Applying the same warmup window calculations
Handling missing data the same way

3. Debugging and Validation

Before running a full backtest, you can inspect the data to:

Verify data quality and completeness
Check that indicators calculate correctly
Validate that signals appear at expected times

Basic Usage

Simple Example

from investing_algorithm_framework import (
    create_app, TradingStrategy, BacktestDateRange, 
    DataSource, TimeUnit, PortfolioConfiguration
)
from datetime import datetime, timezone

# Define your strategy with data sources
class MyStrategy(TradingStrategy):
    time_unit = TimeUnit.HOUR
    interval = 4
    
    data_sources = [
        DataSource(
            identifier="btc_data",
            symbol="BTC/EUR",
            time_frame="4h",
            warmup_window=100,
            market="BITVAVO"
        )
    ]
    
    def generate_buy_signals(self, data):
        # Strategy logic here
        pass
    
    def generate_sell_signals(self, data):
        # Strategy logic here
        pass

# Create and configure the app
app = create_app()
app.add_portfolio_configuration(
    PortfolioConfiguration(
        initial_balance=10000,
        market="BITVAVO",
        trading_symbol="EUR"
    )
)

# Define backtest range
backtest_range = BacktestDateRange(
    start_date=datetime(2024, 1, 1, tzinfo=timezone.utc),
    end_date=datetime(2024, 6, 1, tzinfo=timezone.utc)
)

# Get the backtest data
data = app.get_backtest_data(
    strategy=MyStrategy(),
    backtest_date_range=backtest_range
)

# Access the data by identifier
btc_df = data["btc_data"]
print(f"Data shape: {btc_df.shape}")
print(f"Date range: {btc_df.index.min()} to {btc_df.index.max()}")
print(btc_df.head())

Multiple Data Sources

class MultiAssetStrategy(TradingStrategy):
    time_unit = TimeUnit.HOUR
    interval = 1
    
    data_sources = [
        DataSource(
            identifier="btc_1h",
            symbol="BTC/EUR",
            time_frame="1h",
            warmup_window=200,
            market="BITVAVO"
        ),
        DataSource(
            identifier="eth_1h",
            symbol="ETH/EUR",
            time_frame="1h",
            warmup_window=200,
            market="BITVAVO"
        ),
        DataSource(
            identifier="btc_4h",
            symbol="BTC/EUR",
            time_frame="4h",
            warmup_window=50,
            market="BITVAVO"
        )
    ]

# Get all data sources at once
data = app.get_backtest_data(
    strategy=MultiAssetStrategy(),
    backtest_date_range=backtest_range
)

# Access each data source
btc_hourly = data["btc_1h"]
eth_hourly = data["eth_1h"]
btc_4hourly = data["btc_4h"]

Visualization Examples

Plotting Price with Moving Averages

import matplotlib.pyplot as plt
import pandas as pd

# Get the data
data = app.get_backtest_data(
    strategy=MyStrategy(),
    backtest_date_range=backtest_range
)

df = data["btc_data"]

# Calculate indicators (same as your strategy would)
df["SMA_20"] = df["Close"].rolling(window=20).mean()
df["SMA_50"] = df["Close"].rolling(window=50).mean()
df["SMA_200"] = df["Close"].rolling(window=200).mean()

# Plot
fig, ax = plt.subplots(figsize=(14, 7))

ax.plot(df.index, df["Close"], label="BTC/EUR", alpha=0.7)
ax.plot(df.index, df["SMA_20"], label="SMA 20", linewidth=1)
ax.plot(df.index, df["SMA_50"], label="SMA 50", linewidth=1)
ax.plot(df.index, df["SMA_200"], label="SMA 200", linewidth=1)

ax.set_title("BTC/EUR with Moving Averages")
ax.set_xlabel("Date")
ax.set_ylabel("Price (EUR)")
ax.legend()
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

Visualizing Buy/Sell Signals

import matplotlib.pyplot as plt

# Get backtest data
data = app.get_backtest_data(
    strategy=MyStrategy(),
    backtest_date_range=backtest_range
)

df = data["btc_data"]

# Calculate your indicators
short_ma = df["Close"].rolling(window=20).mean()
long_ma = df["Close"].rolling(window=50).mean()

# Generate signals (same logic as your strategy)
buy_signals = (short_ma > long_ma) & (short_ma.shift(1) <= long_ma.shift(1))
sell_signals = (short_ma < long_ma) & (short_ma.shift(1) >= long_ma.shift(1))

# Plot
fig, ax = plt.subplots(figsize=(14, 7))

ax.plot(df.index, df["Close"], label="Price", alpha=0.7)
ax.plot(df.index, short_ma, label="MA 20", linewidth=1)
ax.plot(df.index, long_ma, label="MA 50", linewidth=1)

# Mark buy signals
buy_dates = df.index[buy_signals]
buy_prices = df.loc[buy_signals, "Close"]
ax.scatter(buy_dates, buy_prices, marker="^", color="green", 
           s=100, label="Buy Signal", zorder=5)

# Mark sell signals
sell_dates = df.index[sell_signals]
sell_prices = df.loc[sell_signals, "Close"]
ax.scatter(sell_dates, sell_prices, marker="v", color="red", 
           s=100, label="Sell Signal", zorder=5)

ax.set_title("Strategy Signals Visualization")
ax.legend()
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

Creating a Multi-Panel Chart

import matplotlib.pyplot as plt
from pyindicators import rsi, ema

# Get data
data = app.get_backtest_data(
    strategy=MyStrategy(),
    backtest_date_range=backtest_range
)

df = data["btc_data"]

# Calculate indicators
df = ema(df, period=20, source_column="Close", result_column="EMA_20")
df = ema(df, period=50, source_column="Close", result_column="EMA_50")
df = rsi(df, period=14, source_column="Close", result_column="RSI")

# Create multi-panel figure
fig, axes = plt.subplots(3, 1, figsize=(14, 10), 
                          gridspec_kw={'height_ratios': [3, 1, 1]})

# Price panel
axes[0].plot(df.index, df["Close"], label="Price", alpha=0.7)
axes[0].plot(df.index, df["EMA_20"], label="EMA 20")
axes[0].plot(df.index, df["EMA_50"], label="EMA 50")
axes[0].set_title("BTC/EUR Price with EMAs")
axes[0].legend(loc="upper left")
axes[0].grid(True, alpha=0.3)

# Volume panel
axes[1].bar(df.index, df["Volume"], alpha=0.7, color="steelblue")
axes[1].set_title("Volume")
axes[1].grid(True, alpha=0.3)

# RSI panel
axes[2].plot(df.index, df["RSI"], color="purple")
axes[2].axhline(y=70, color="red", linestyle="--", alpha=0.5)
axes[2].axhline(y=30, color="green", linestyle="--", alpha=0.5)
axes[2].fill_between(df.index, 30, 70, alpha=0.1)
axes[2].set_title("RSI (14)")
axes[2].set_ylim(0, 100)
axes[2].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

Integration with Jupyter Notebooks

The get_backtest_data method is particularly useful in Jupyter notebooks for interactive analysis:

# In a Jupyter notebook cell
from investing_algorithm_framework import create_app, BacktestDateRange
from datetime import datetime, timezone

# Setup
app = create_app()
# ... configure app ...

# Get data for analysis
data = app.get_backtest_data(
    strategy=my_strategy,
    backtest_date_range=BacktestDateRange(
        start_date=datetime(2024, 1, 1, tzinfo=timezone.utc),
        end_date=datetime(2024, 12, 31, tzinfo=timezone.utc)
    )
)

# Explore in notebook
df = data["btc_data"]
df.describe()

Comparing Backtest Results with Raw Data

A powerful use case is comparing your backtest results with the underlying data:

# Run backtest
backtest = app.run_vector_backtest(
    strategy=my_strategy,
    backtest_date_range=backtest_range,
    initial_amount=10000
)

# Get the same data used in backtest
data = app.get_backtest_data(
    strategy=my_strategy,
    backtest_date_range=backtest_range
)

# Get trades from backtest
trades = backtest.get_trades()

# Plot trades on price chart
df = data["btc_data"]

fig, ax = plt.subplots(figsize=(14, 7))
ax.plot(df.index, df["Close"], label="Price", alpha=0.7)

for trade in trades:
    # Mark entry
    ax.axvline(x=trade.opened_at, color="green", alpha=0.3, linestyle="--")
    # Mark exit
    if trade.closed_at:
        ax.axvline(x=trade.closed_at, color="red", alpha=0.3, linestyle="--")

ax.set_title("Backtest Trades on Price Chart")
ax.legend()
plt.show()

Best Practices

1. Use Consistent Warmup Windows

Ensure your visualization uses the same warmup_window as your strategy:

# In your strategy
data_sources = [
    DataSource(
        identifier="btc_data",
        warmup_window=200,  # Enough for 200-period indicators
        ...
    )
]

2. Handle Missing Data Appropriately

The fill_missing_data parameter controls whether gaps in the data are filled:

# Fill missing data (default)
data = app.get_backtest_data(
    strategy=my_strategy,
    backtest_date_range=backtest_range,
    fill_missing_data=True
)

# Keep raw data with gaps (for data quality analysis)
data = app.get_backtest_data(
    strategy=my_strategy,
    backtest_date_range=backtest_range,
    fill_missing_data=False
)

3. Cache Data for Repeated Analysis

If you're doing multiple analyses on the same data:

# Fetch once
data = app.get_backtest_data(
    strategy=my_strategy,
    backtest_date_range=backtest_range
)

# Save to disk for later use
data["btc_data"].to_csv("btc_backtest_data.csv")

# Or use pickle for faster loading
import pickle
with open("backtest_data.pkl", "wb") as f:
    pickle.dump(data, f)

Error Handling

The method will raise an OperationalException if:

No data sources are defined in the strategy
Data cannot be retrieved for a data source

from investing_algorithm_framework import OperationalException

try:
    data = app.get_backtest_data(
        strategy=my_strategy,
        backtest_date_range=backtest_range
    )
except OperationalException as e:
    print(f"Error retrieving data: {e}")

Next Steps

Learn about Data Sources to configure your data inputs
Explore Backtesting to run full strategy tests
Check out Trading Strategies to build your trading logic

sidebar_position: 3​

Backtest Data

Overview​

The get_backtest_data Method​

Method Signature​

Parameters​

Returns​

Why This Matters for Quant Developers​

1. Warmup Window Handling​

2. Consistent Data for Visualization​

3. Debugging and Validation​

Basic Usage​

Simple Example​

Multiple Data Sources​

Visualization Examples​

Plotting Price with Moving Averages​

Visualizing Buy/Sell Signals​

Creating a Multi-Panel Chart​

Integration with Jupyter Notebooks​

Comparing Backtest Results with Raw Data​

Best Practices​

1. Use Consistent Warmup Windows​

2. Handle Missing Data Appropriately​

3. Cache Data for Repeated Analysis​

Error Handling​

Next Steps​