
Overview

The HybridCryptoPredictor class intelligently combines two specialized models:
  • XGBoost: Excels at short-term predictions (1-72 hours)
  • Prophet: Excels at medium/long-term predictions (1 week - 1 month)
The hybrid model automatically selects the best predictor for the forecast horizon and builds a weighted ensemble of the two where their ranges overlap.
Best for: all time horizons, with automatic model selection.

Constructor

from models.hybrid_model import HybridCryptoPredictor

predictor = HybridCryptoPredictor()

Parameters

No parameters required. The constructor automatically initializes:
  • XGBoostCryptoPredictor() with default parameters
  • ProphetCryptoPredictor() with default parameters
To use custom hyperparameters for the underlying models, you’ll need to modify them after initialization:
predictor = HybridCryptoPredictor()
predictor.xgboost.model.n_estimators = 300
predictor.prophet.model.changepoint_prior_scale = 0.3

Methods

train()

Trains both XGBoost and Prophet models on the same dataset.
training_info = predictor.train(df)

print("XGBoost Metrics:", training_info['xgboost'])
print("Prophet Metrics:", training_info['prophet'])
print(f"Trained on {training_info['data_points']} data points")
Parameters

df (pd.DataFrame, required)
Historical OHLCV data with:
  • Datetime index
  • Required columns: open, high, low, close
  • Optional: volume
  • Minimum 100 data points recommended

Returns

Dict
Training information for both models:
{
    'xgboost': {                     # XGBoost training metrics
        'train_mae': float,
        'test_mae': float,
        'train_rmse': float,
        'test_rmse': float,
        'train_mape': float,
        'test_mape': float,
        'train_direction_accuracy': float,
        'test_direction_accuracy': float
    },
    'prophet': {                     # Prophet training metrics
        'mae': float,
        'rmse': float,
        'mape': float,
        'direction_accuracy': float,
        'training_points': int
    },
    'data_points': int               # Total points used for training
}
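Before calling train(), it can help to confirm the input meets the requirements listed above. A minimal validation sketch (validate_training_data is a hypothetical helper for illustration, not part of the library):

```python
import numpy as np
import pandas as pd

REQUIRED_COLUMNS = {"open", "high", "low", "close"}

def validate_training_data(df: pd.DataFrame, min_points: int = 100) -> list:
    """Return a list of problems found; an empty list means the frame looks usable."""
    problems = []
    if not isinstance(df.index, pd.DatetimeIndex):
        problems.append("index is not a DatetimeIndex")
    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        problems.append(f"missing required columns: {sorted(missing)}")
    if len(df) < min_points:
        problems.append(f"only {len(df)} rows; at least {min_points} recommended")
    return problems

# Synthetic frame that satisfies the documented requirements
idx = pd.date_range("2026-01-01", periods=120, freq="h")
rng = np.random.default_rng(0)
prices = 40000 + rng.normal(0, 100, 120).cumsum()
df = pd.DataFrame({"open": prices, "high": prices + 50,
                   "low": prices - 50, "close": prices}, index=idx)
print(validate_training_data(df))  # -> []
```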

predict_future()

Generates predictions using the optimal model(s) for the forecast horizon.
# Short-term (24 hours) - uses XGBoost
predictions = predictor.predict_future(df, periods=24)
print(f"Using: {predictions['recommended']}")
best = predictor.get_best_prediction(predictions)

# Medium-term (7 days) - uses weighted ensemble
predictions = predictor.predict_future(df, periods=168)
print(f"Using: {predictions['recommended']}")
print(f"Weights: {predictions['weights']}")

# Long-term (30 days) - uses Prophet
predictions = predictor.predict_future(df, periods=720)
print(f"Using: {predictions['recommended']}")
Parameters

df (pd.DataFrame, required)
Historical data used as context for the predictions.

periods (int, required)
Number of time periods (hours) to forecast. Selection logic:
  • periods <= 72: primarily XGBoost
  • 72 < periods <= 168: weighted ensemble (both models)
  • periods > 168: primarily Prophet
Returns

Dict[str, pd.DataFrame]
Dictionary containing predictions from the available models:
{
    'xgboost': pd.DataFrame,    # XGBoost predictions (if periods <= 168)
    'prophet': pd.DataFrame,    # Prophet predictions (if periods > 24)
    'hybrid': pd.DataFrame,     # Weighted ensemble (if both available)
    'recommended': str,         # 'xgboost', 'prophet', or 'hybrid'
    'weights': {                # Ensemble weights (if hybrid)
        'xgboost': float,       # 0.0 to 1.0
        'prophet': float        # 0.0 to 1.0
    }
}
DataFrame columns:
  • predicted_price: Point forecast
  • lower_bound: Lower confidence interval
  • upper_bound: Upper confidence interval
  • trend: Underlying trend (Prophet predictions only)
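As a sketch of how those columns can be consumed, the snippet below computes the relative width of the confidence band as a quick uncertainty gauge. The values are synthetic; only the column names come from the documentation above:

```python
import pandas as pd

# Synthetic prediction frame with the documented columns
pred = pd.DataFrame({
    "predicted_price": [42500.0, 42600.0],
    "lower_bound": [41200.0, 41250.0],
    "upper_bound": [43800.0, 43950.0],
}, index=pd.to_datetime(["2026-03-08 01:00", "2026-03-08 02:00"]))

# Band width relative to the point forecast: wider band = less certain forecast
band = (pred["upper_bound"] - pred["lower_bound"]) / pred["predicted_price"]
print(band.round(3).tolist())  # -> [0.061, 0.063]
```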

get_best_prediction()

Extracts the recommended prediction from the predictions dictionary.
predictions = predictor.predict_future(df, periods=100)
best = predictor.get_best_prediction(predictions)

print(best)
#                      predicted_price  lower_bound  upper_bound
# timestamp                           
# 2026-03-08 01:00:00      42500.32    41200.15    43800.49
# ...
Parameters

predictions (Dict, required)
Dictionary returned by predict_future().

Returns

pd.DataFrame
The recommended prediction DataFrame based on forecast horizon:
  • Returns hybrid if available
  • Otherwise returns the model specified in predictions['recommended']
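The documented behavior amounts to a two-step lookup; a minimal sketch of it (not the library source):

```python
def best_prediction(predictions: dict):
    """Prefer the 'hybrid' frame when present, else follow the 'recommended' key."""
    if "hybrid" in predictions:
        return predictions["hybrid"]
    return predictions[predictions["recommended"]]

# Toy dictionaries standing in for predict_future() output
print(best_prediction({"xgboost": "xgb_frame", "recommended": "xgboost"}))
# -> xgb_frame
print(best_prediction({"xgboost": "xgb_frame", "prophet": "prophet_frame",
                       "hybrid": "hybrid_frame", "recommended": "hybrid"}))
# -> hybrid_frame
```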

get_training_info()

Returns training metrics for both models.
info = predictor.get_training_info()
print(f"XGBoost Test MAPE: {info['xgboost']['test_mape']:.2f}%")
print(f"Prophet MAPE: {info['prophet']['mape']:.2f}%")
Returns

Dict
The same dictionary returned by the train() method.

Ensemble Weighting Strategy

The hybrid model uses dynamic weighting based on forecast horizon:
# Weight calculation (from source code)
xgb_weight = max(0, 1 - (periods / 168))
prophet_weight = 1 - xgb_weight

hybrid_prediction = (
    xgboost_prediction * xgb_weight +
    prophet_prediction * prophet_weight
)
Examples:
Periods   Horizon   XGBoost Weight   Prophet Weight   Strategy
24        1 day     0.86             0.14             Mostly XGBoost
72        3 days    0.57             0.43             Balanced
168       1 week    0.00             1.00             All Prophet
720       30 days   0.00             1.00             All Prophet
The weighting smoothly transitions from XGBoost to Prophet as the forecast horizon extends, combining the strengths of both models in the overlap zone.
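The table values follow directly from the weight formula; a quick runnable check:

```python
def ensemble_weights(periods: int):
    """(xgboost_weight, prophet_weight) for a forecast horizon given in hours."""
    xgb = max(0.0, 1 - periods / 168)
    return xgb, 1 - xgb

# Reproduce the table rows above
for periods in (24, 72, 168, 720):
    xgb, prophet = ensemble_weights(periods)
    print(f"{periods:>4}h  XGBoost={xgb:.2f}  Prophet={prophet:.2f}")
```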

Complete Example

import pandas as pd
from models.hybrid_model import HybridCryptoPredictor
import matplotlib.pyplot as plt

# Load historical data
df = pd.read_csv('btc_hourly.csv', index_col='timestamp', parse_dates=True)
print(f"Loaded {len(df)} hours of data")

# Initialize hybrid predictor
predictor = HybridCryptoPredictor()

# Train both models
print("\nTraining both models...")
training_info = predictor.train(df)

print("\n" + "="*50)
print("TRAINING RESULTS")
print("="*50)
print(f"\nXGBoost Performance:")
print(f"  Test MAPE: {training_info['xgboost']['test_mape']:.2f}%")
print(f"  Test Direction Accuracy: {training_info['xgboost']['test_direction_accuracy']:.2f}%")

print(f"\nProphet Performance:")
print(f"  MAPE: {training_info['prophet']['mape']:.2f}%")
print(f"  Direction Accuracy: {training_info['prophet']['direction_accuracy']:.2f}%")

# Short-term prediction (24 hours)
print("\n" + "="*50)
print("SHORT-TERM FORECAST (24 hours)")
print("="*50)

short_pred = predictor.predict_future(df, periods=24)
short_best = predictor.get_best_prediction(short_pred)

print(f"Recommended model: {short_pred['recommended']}")
print(f"\nFirst 5 predictions:")
print(short_best.head())

current_price = df['close'].iloc[-1]
future_price = short_best['predicted_price'].iloc[-1]
print(f"\nCurrent price: ${current_price:.2f}")
print(f"24h prediction: ${future_price:.2f}")
print(f"Expected change: {((future_price - current_price) / current_price * 100):+.2f}%")

# Medium-term prediction (7 days / 168 hours)
print("\n" + "="*50)
print("MEDIUM-TERM FORECAST (7 days)")
print("="*50)

medium_pred = predictor.predict_future(df, periods=168)
medium_best = predictor.get_best_prediction(medium_pred)

print(f"Recommended model: {medium_pred['recommended']}")

if 'weights' in medium_pred:
    print(f"Ensemble weights:")
    print(f"  XGBoost: {medium_pred['weights']['xgboost']:.2%}")
    print(f"  Prophet: {medium_pred['weights']['prophet']:.2%}")

future_price_7d = medium_best['predicted_price'].iloc[-1]
print(f"\n7-day prediction: ${future_price_7d:.2f}")
print(f"Expected change: {((future_price_7d - current_price) / current_price * 100):+.2f}%")
print(f"Confidence range: ${medium_best['lower_bound'].iloc[-1]:.2f} - ${medium_best['upper_bound'].iloc[-1]:.2f}")

# Long-term prediction (30 days / 720 hours)
print("\n" + "="*50)
print("LONG-TERM FORECAST (30 days)")
print("="*50)

long_pred = predictor.predict_future(df, periods=720)
long_best = predictor.get_best_prediction(long_pred)

print(f"Recommended model: {long_pred['recommended']}")

future_price_30d = long_best['predicted_price'].iloc[-1]
print(f"\n30-day prediction: ${future_price_30d:.2f}")
print(f"Expected change: {((future_price_30d - current_price) / current_price * 100):+.2f}%")
print(f"Confidence range: ${long_best['lower_bound'].iloc[-1]:.2f} - ${long_best['upper_bound'].iloc[-1]:.2f}")

if 'trend' in long_best.columns:
    trend_change = long_best['trend'].iloc[-1] - long_best['trend'].iloc[0]
    print(f"Trend direction: {'Bullish ↑' if trend_change > 0 else 'Bearish ↓'}")

# Save all predictions
short_best.to_csv('hybrid_predictions_24h.csv')
medium_best.to_csv('hybrid_predictions_7d.csv')
long_best.to_csv('hybrid_predictions_30d.csv')

print("\nPredictions saved to CSV files.")

# Optional: Plot comparison
fig, axes = plt.subplots(3, 1, figsize=(12, 10))

# Plot 24h
axes[0].plot(df.index[-168:], df['close'].iloc[-168:], label='Historical', linewidth=2)
axes[0].plot(short_best.index, short_best['predicted_price'], label='Predicted (XGBoost)', color='orange', linewidth=2)
if 'lower_bound' in short_best.columns:
    axes[0].fill_between(short_best.index, short_best['lower_bound'], short_best['upper_bound'], alpha=0.2, color='orange')
axes[0].set_title('24-Hour Forecast (XGBoost)', fontsize=12, fontweight='bold')
axes[0].legend()
axes[0].grid(alpha=0.3)

# Plot 7d
axes[1].plot(df.index[-168:], df['close'].iloc[-168:], label='Historical', linewidth=2)
axes[1].plot(medium_best.index, medium_best['predicted_price'], label='Predicted (Hybrid)', color='purple', linewidth=2)
if 'lower_bound' in medium_best.columns:
    axes[1].fill_between(medium_best.index, medium_best['lower_bound'], medium_best['upper_bound'], alpha=0.2, color='purple')
axes[1].set_title('7-Day Forecast (Hybrid Ensemble)', fontsize=12, fontweight='bold')
axes[1].legend()
axes[1].grid(alpha=0.3)

# Plot 30d
axes[2].plot(df.index[-168:], df['close'].iloc[-168:], label='Historical', linewidth=2)
axes[2].plot(long_best.index, long_best['predicted_price'], label='Predicted (Prophet)', color='green', linewidth=2)
if 'lower_bound' in long_best.columns:
    axes[2].fill_between(long_best.index, long_best['lower_bound'], long_best['upper_bound'], alpha=0.2, color='green')
axes[2].set_title('30-Day Forecast (Prophet)', fontsize=12, fontweight='bold')
axes[2].legend()
axes[2].grid(alpha=0.3)

plt.tight_layout()
plt.savefig('hybrid_predictions_comparison.png', dpi=300, bbox_inches='tight')
print("Comparison plot saved to hybrid_predictions_comparison.png")

Decision Logic

The hybrid model automatically selects the best approach:
# Pseudocode of the internal selection logic
if periods <= 24:
    # Very short-term: XGBoost only
    return {
        'xgboost': xgboost_predictions,
        'recommended': 'xgboost'
    }

elif periods <= 72:
    # Short-term: XGBoost is primary, Prophet available for comparison
    return {
        'xgboost': xgboost_predictions,
        'prophet': prophet_predictions,
        'hybrid': weighted_ensemble,
        'recommended': 'xgboost'
    }

elif periods <= 168:
    # Medium-term: balanced ensemble, both models weighted by horizon
    return {
        'xgboost': xgboost_predictions,
        'prophet': prophet_predictions,
        'hybrid': weighted_ensemble,
        'recommended': 'hybrid'
    }

else:  # periods > 168
    # Long-term: Prophet only
    return {
        'prophet': prophet_predictions,
        'recommended': 'prophet'
    }
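The thresholds above condense into a small helper; a sketch mirroring the documented logic, not the library code:

```python
def recommended_model(periods: int) -> str:
    """Which model the hybrid predictor recommends for a horizon given in hours."""
    if periods <= 72:
        return "xgboost"   # short-term: XGBoost is primary
    if periods <= 168:
        return "hybrid"    # medium-term: weighted ensemble
    return "prophet"       # long-term: Prophet only

print([recommended_model(p) for p in (24, 72, 100, 168, 720)])
# -> ['xgboost', 'xgboost', 'hybrid', 'hybrid', 'prophet']
```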

Key Characteristics

Strengths:
  • Optimal accuracy across all time horizons
  • Automatic model selection
  • Smooth transitions between models (no abrupt changes)
  • Leverages strengths of both algorithms
  • Single API for all forecasting needs
  • Confidence intervals from both models
Limitations:
  • Slower training (trains two models)
  • Higher memory usage (stores two models)
  • Ensemble weights are heuristic (not learned)
  • Cannot customize individual model parameters easily
Typical Performance:
  • 1-24 hours: MAPE 2-5% (XGBoost)
  • 3-7 days: MAPE 3-7% (Hybrid)
  • 7-30 days: MAPE 5-12% (Prophet)
  • Direction Accuracy: 55-65%
When to Use Hybrid:
  • Need predictions across multiple time horizons
  • Want automatic model selection
  • Prefer ensemble approach for robustness
  • Building production forecasting systems
  • Comparing short-term vs long-term outlooks
When NOT to Use Hybrid:
  • Only need one specific time horizon (use specialized model)
  • Limited computational resources (use single model)
  • Need to fine-tune hyperparameters extensively
  • Require real-time predictions (<100ms)
