
Overview

The HybridCryptoPredictor class intelligently combines two specialized models:
  • XGBoost: Excels at short-term predictions (1-72 hours)
  • Prophet: Excels at medium/long-term predictions (1 week - 1 month)
The hybrid model automatically selects the best predictor for the forecast horizon and builds a weighted ensemble of the two where their ranges overlap.
Best for: all time horizons, with automatic model selection.

Constructor

from models.hybrid_model import HybridCryptoPredictor

predictor = HybridCryptoPredictor()

Parameters

No parameters required. The constructor automatically initializes:
  • XGBoostCryptoPredictor() with default parameters
  • ProphetCryptoPredictor() with default parameters
To use custom hyperparameters for the underlying models, you’ll need to modify them after initialization:
predictor = HybridCryptoPredictor()
predictor.xgboost.model.n_estimators = 300
predictor.prophet.model.changepoint_prior_scale = 0.3

Methods

train()

Trains both XGBoost and Prophet models on the same dataset.
training_info = predictor.train(df)

print("XGBoost Metrics:", training_info['xgboost'])
print("Prophet Metrics:", training_info['prophet'])
print(f"Trained on {training_info['data_points']} data points")
Parameters

df (pd.DataFrame, required)
Historical OHLCV data with:
  • Datetime index
  • Required columns: open, high, low, close
  • Optional: volume
  • Minimum 100 data points recommended

Returns

Dict
Training information for both models:
{
    'xgboost': {                     # XGBoost training metrics
        'train_mae': float,
        'test_mae': float,
        'train_rmse': float,
        'test_rmse': float,
        'train_mape': float,
        'test_mape': float,
        'train_direction_accuracy': float,
        'test_direction_accuracy': float
    },
    'prophet': {                     # Prophet training metrics
        'mae': float,
        'rmse': float,
        'mape': float,
        'direction_accuracy': float,
        'training_points': int
    },
    'data_points': int               # Total points used for training
}
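Before calling train(), it can help to confirm the input meets the requirements listed above. A minimal validation sketch (validate_training_data is a hypothetical helper for illustration, not part of the library):

```python
import numpy as np
import pandas as pd

REQUIRED_COLUMNS = {"open", "high", "low", "close"}

def validate_training_data(df: pd.DataFrame, min_points: int = 100) -> list:
    """Return a list of problems found; an empty list means the frame looks usable."""
    problems = []
    if not isinstance(df.index, pd.DatetimeIndex):
        problems.append("index is not a DatetimeIndex")
    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        problems.append(f"missing required columns: {sorted(missing)}")
    if len(df) < min_points:
        problems.append(f"only {len(df)} rows; at least {min_points} recommended")
    return problems

# Synthetic frame that satisfies the documented requirements
idx = pd.date_range("2026-01-01", periods=120, freq="h")
rng = np.random.default_rng(0)
prices = 40000 + rng.normal(0, 100, 120).cumsum()
df = pd.DataFrame({"open": prices, "high": prices + 50,
                   "low": prices - 50, "close": prices}, index=idx)
print(validate_training_data(df))  # -> []
```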

predict_future()

Generates predictions using the optimal model(s) for the forecast horizon.
# Short-term (24 hours) - uses XGBoost
predictions = predictor.predict_future(df, periods=24)
print(f"Using: {predictions['recommended']}")
best = predictor.get_best_prediction(predictions)

# Medium-term (7 days) - uses weighted ensemble
predictions = predictor.predict_future(df, periods=168)
print(f"Using: {predictions['recommended']}")
print(f"Weights: {predictions['weights']}")

# Long-term (30 days) - uses Prophet
predictions = predictor.predict_future(df, periods=720)
print(f"Using: {predictions['recommended']}")
Parameters

df (pd.DataFrame, required)
Historical data used as context for the predictions.

periods (int, required)
Number of time periods (hours) to forecast. Selection logic:
  • periods <= 72: primarily XGBoost
  • 72 < periods <= 168: weighted ensemble (both models)
  • periods > 168: primarily Prophet
Returns

Dict[str, pd.DataFrame]
Dictionary containing predictions from the available models:
{
    'xgboost': pd.DataFrame,    # XGBoost predictions (if periods <= 168)
    'prophet': pd.DataFrame,    # Prophet predictions (if periods > 24)
    'hybrid': pd.DataFrame,     # Weighted ensemble (if both available)
    'recommended': str,         # 'xgboost', 'prophet', or 'hybrid'
    'weights': {                # Ensemble weights (if hybrid)
        'xgboost': float,       # 0.0 to 1.0
        'prophet': float        # 0.0 to 1.0
    }
}
DataFrame columns:
  • predicted_price: Point forecast
  • lower_bound: Lower confidence interval
  • upper_bound: Upper confidence interval
  • trend: Underlying trend (Prophet predictions only)
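As a sketch of how those columns can be consumed, the snippet below computes the relative width of the confidence band as a quick uncertainty gauge. The values are synthetic; only the column names come from the documentation above:

```python
import pandas as pd

# Synthetic prediction frame with the documented columns
pred = pd.DataFrame({
    "predicted_price": [42500.0, 42600.0],
    "lower_bound": [41200.0, 41250.0],
    "upper_bound": [43800.0, 43950.0],
}, index=pd.to_datetime(["2026-03-08 01:00", "2026-03-08 02:00"]))

# Band width relative to the point forecast: wider band = less certain forecast
band = (pred["upper_bound"] - pred["lower_bound"]) / pred["predicted_price"]
print(band.round(3).tolist())  # -> [0.061, 0.063]
```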

get_best_prediction()

Extracts the recommended prediction from the predictions dictionary.
predictions = predictor.predict_future(df, periods=100)
best = predictor.get_best_prediction(predictions)

print(best)
#                      predicted_price  lower_bound  upper_bound
# timestamp                           
# 2026-03-08 01:00:00      42500.32    41200.15    43800.49
# ...
Parameters

predictions (Dict, required)
Dictionary returned by predict_future().

Returns

pd.DataFrame
The recommended prediction DataFrame based on forecast horizon:
  • Returns hybrid if available
  • Otherwise returns the model specified in predictions['recommended']
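The documented behavior amounts to a two-step lookup; a minimal sketch of it (not the library source):

```python
def best_prediction(predictions: dict):
    """Prefer the 'hybrid' frame when present, else follow the 'recommended' key."""
    if "hybrid" in predictions:
        return predictions["hybrid"]
    return predictions[predictions["recommended"]]

# Toy dictionaries standing in for predict_future() output
print(best_prediction({"xgboost": "xgb_frame", "recommended": "xgboost"}))
# -> xgb_frame
print(best_prediction({"xgboost": "xgb_frame", "prophet": "prophet_frame",
                       "hybrid": "hybrid_frame", "recommended": "hybrid"}))
# -> hybrid_frame
```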

get_training_info()

Returns training metrics for both models.
info = predictor.get_training_info()
print(f"XGBoost Test MAPE: {info['xgboost']['test_mape']:.2f}%")
print(f"Prophet MAPE: {info['prophet']['mape']:.2f}%")
Returns

Dict
The same dictionary returned by the train() method.

Ensemble Weighting Strategy

The hybrid model uses dynamic weighting based on forecast horizon:
# Weight calculation (from source code)
xgb_weight = max(0, 1 - (periods / 168))
prophet_weight = 1 - xgb_weight

hybrid_prediction = (
    xgboost_prediction * xgb_weight +
    prophet_prediction * prophet_weight
)
Examples:
Periods   Horizon   XGBoost Weight   Prophet Weight   Strategy
24        1 day     0.86             0.14             Mostly XGBoost
72        3 days    0.57             0.43             Balanced
168       1 week    0.00             1.00             All Prophet
720       30 days   0.00             1.00             All Prophet
The weighting smoothly transitions from XGBoost to Prophet as the forecast horizon extends, combining the strengths of both models in the overlap zone.
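The table values follow directly from the weight formula; a quick runnable check:

```python
def ensemble_weights(periods: int):
    """(xgboost_weight, prophet_weight) for a forecast horizon given in hours."""
    xgb = max(0.0, 1 - periods / 168)
    return xgb, 1 - xgb

# Reproduce the table rows above
for periods in (24, 72, 168, 720):
    xgb, prophet = ensemble_weights(periods)
    print(f"{periods:>4}h  XGBoost={xgb:.2f}  Prophet={prophet:.2f}")
```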

Complete Example

import pandas as pd
from models.hybrid_model import HybridCryptoPredictor
import matplotlib.pyplot as plt

# Load historical data
df = pd.read_csv('btc_hourly.csv', index_col='timestamp', parse_dates=True)
print(f"Loaded {len(df)} hours of data")

# Initialize hybrid predictor
predictor = HybridCryptoPredictor()

# Train both models
print("\nTraining both models...")
training_info = predictor.train(df)

print("\n" + "="*50)
print("TRAINING RESULTS")
print("="*50)
print(f"\nXGBoost Performance:")
print(f"  Test MAPE: {training_info['xgboost']['test_mape']:.2f}%")
print(f"  Test Direction Accuracy: {training_info['xgboost']['test_direction_accuracy']:.2f}%")

print(f"\nProphet Performance:")
print(f"  MAPE: {training_info['prophet']['mape']:.2f}%")
print(f"  Direction Accuracy: {training_info['prophet']['direction_accuracy']:.2f}%")

# Short-term prediction (24 hours)
print("\n" + "="*50)
print("SHORT-TERM FORECAST (24 hours)")
print("="*50)

short_pred = predictor.predict_future(df, periods=24)
short_best = predictor.get_best_prediction(short_pred)

print(f"Recommended model: {short_pred['recommended']}")
print(f"\nFirst 5 predictions:")
print(short_best.head())

current_price = df['close'].iloc[-1]
future_price = short_best['predicted_price'].iloc[-1]
print(f"\nCurrent price: ${current_price:.2f}")
print(f"24h prediction: ${future_price:.2f}")
print(f"Expected change: {((future_price - current_price) / current_price * 100):+.2f}%")

# Medium-term prediction (7 days / 168 hours)
print("\n" + "="*50)
print("MEDIUM-TERM FORECAST (7 days)")
print("="*50)

medium_pred = predictor.predict_future(df, periods=168)
medium_best = predictor.get_best_prediction(medium_pred)

print(f"Recommended model: {medium_pred['recommended']}")

if 'weights' in medium_pred:
    print(f"Ensemble weights:")
    print(f"  XGBoost: {medium_pred['weights']['xgboost']:.2%}")
    print(f"  Prophet: {medium_pred['weights']['prophet']:.2%}")

future_price_7d = medium_best['predicted_price'].iloc[-1]
print(f"\n7-day prediction: ${future_price_7d:.2f}")
print(f"Expected change: {((future_price_7d - current_price) / current_price * 100):+.2f}%")
print(f"Confidence range: ${medium_best['lower_bound'].iloc[-1]:.2f} - ${medium_best['upper_bound'].iloc[-1]:.2f}")

# Long-term prediction (30 days / 720 hours)
print("\n" + "="*50)
print("LONG-TERM FORECAST (30 days)")
print("="*50)

long_pred = predictor.predict_future(df, periods=720)
long_best = predictor.get_best_prediction(long_pred)

print(f"Recommended model: {long_pred['recommended']}")

future_price_30d = long_best['predicted_price'].iloc[-1]
print(f"\n30-day prediction: ${future_price_30d:.2f}")
print(f"Expected change: {((future_price_30d - current_price) / current_price * 100):+.2f}%")
print(f"Confidence range: ${long_best['lower_bound'].iloc[-1]:.2f} - ${long_best['upper_bound'].iloc[-1]:.2f}")

if 'trend' in long_best.columns:
    trend_change = long_best['trend'].iloc[-1] - long_best['trend'].iloc[0]
    print(f"Trend direction: {'Bullish ↑' if trend_change > 0 else 'Bearish ↓'}")

# Save all predictions
short_best.to_csv('hybrid_predictions_24h.csv')
medium_best.to_csv('hybrid_predictions_7d.csv')
long_best.to_csv('hybrid_predictions_30d.csv')

print("\nPredictions saved to CSV files.")

# Optional: Plot comparison
fig, axes = plt.subplots(3, 1, figsize=(12, 10))

# Plot 24h
axes[0].plot(df.index[-168:], df['close'].iloc[-168:], label='Historical', linewidth=2)
axes[0].plot(short_best.index, short_best['predicted_price'], label='Predicted (XGBoost)', color='orange', linewidth=2)
if 'lower_bound' in short_best.columns:
    axes[0].fill_between(short_best.index, short_best['lower_bound'], short_best['upper_bound'], alpha=0.2, color='orange')
axes[0].set_title('24-Hour Forecast (XGBoost)', fontsize=12, fontweight='bold')
axes[0].legend()
axes[0].grid(alpha=0.3)

# Plot 7d
axes[1].plot(df.index[-168:], df['close'].iloc[-168:], label='Historical', linewidth=2)
axes[1].plot(medium_best.index, medium_best['predicted_price'], label='Predicted (Hybrid)', color='purple', linewidth=2)
if 'lower_bound' in medium_best.columns:
    axes[1].fill_between(medium_best.index, medium_best['lower_bound'], medium_best['upper_bound'], alpha=0.2, color='purple')
axes[1].set_title('7-Day Forecast (Hybrid Ensemble)', fontsize=12, fontweight='bold')
axes[1].legend()
axes[1].grid(alpha=0.3)

# Plot 30d
axes[2].plot(df.index[-168:], df['close'].iloc[-168:], label='Historical', linewidth=2)
axes[2].plot(long_best.index, long_best['predicted_price'], label='Predicted (Prophet)', color='green', linewidth=2)
if 'lower_bound' in long_best.columns:
    axes[2].fill_between(long_best.index, long_best['lower_bound'], long_best['upper_bound'], alpha=0.2, color='green')
axes[2].set_title('30-Day Forecast (Prophet)', fontsize=12, fontweight='bold')
axes[2].legend()
axes[2].grid(alpha=0.3)

plt.tight_layout()
plt.savefig('hybrid_predictions_comparison.png', dpi=300, bbox_inches='tight')
print("Comparison plot saved to hybrid_predictions_comparison.png")

Decision Logic

The hybrid model automatically selects the best approach:
# Pseudocode of the internal selection logic
if periods <= 24:
    # Very short-term: XGBoost only
    return {
        'xgboost': xgboost_predictions,
        'recommended': 'xgboost'
    }

elif periods <= 72:
    # Short-term: XGBoost is primary, Prophet available for comparison
    return {
        'xgboost': xgboost_predictions,
        'prophet': prophet_predictions,
        'hybrid': weighted_ensemble,
        'recommended': 'xgboost'
    }

elif periods <= 168:
    # Medium-term: balanced ensemble, both models weighted by horizon
    return {
        'xgboost': xgboost_predictions,
        'prophet': prophet_predictions,
        'hybrid': weighted_ensemble,
        'recommended': 'hybrid'
    }

else:  # periods > 168
    # Long-term: Prophet only
    return {
        'prophet': prophet_predictions,
        'recommended': 'prophet'
    }
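The thresholds above condense into a small helper; a sketch mirroring the documented logic, not the library code:

```python
def recommended_model(periods: int) -> str:
    """Which model the hybrid predictor recommends for a horizon given in hours."""
    if periods <= 72:
        return "xgboost"   # short-term: XGBoost is primary
    if periods <= 168:
        return "hybrid"    # medium-term: weighted ensemble
    return "prophet"       # long-term: Prophet only

print([recommended_model(p) for p in (24, 72, 100, 168, 720)])
# -> ['xgboost', 'xgboost', 'hybrid', 'hybrid', 'prophet']
```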

Key Characteristics

Strengths:
  • Optimal accuracy across all time horizons
  • Automatic model selection
  • Smooth transitions between models (no abrupt changes)
  • Leverages strengths of both algorithms
  • Single API for all forecasting needs
  • Confidence intervals from both models
Limitations:
  • Slower training (trains two models)
  • Higher memory usage (stores two models)
  • Ensemble weights are heuristic (not learned)
  • Cannot customize individual model parameters easily
Typical Performance:
  • 1-24 hours: MAPE 2-5% (XGBoost)
  • 3-7 days: MAPE 3-7% (Hybrid)
  • 7-30 days: MAPE 5-12% (Prophet)
  • Direction Accuracy: 55-65%
When to Use Hybrid:
  • Need predictions across multiple time horizons
  • Want automatic model selection
  • Prefer ensemble approach for robustness
  • Building production forecasting systems
  • Comparing short-term vs long-term outlooks
When NOT to Use Hybrid:
  • Only need one specific time horizon (use specialized model)
  • Limited computational resources (use single model)
  • Need to fine-tune hyperparameters extensively
  • Require real-time predictions (<100ms)
