Overview
The HybridCryptoPredictor class intelligently combines two specialized models:
- XGBoost: Excels at short-term predictions (1-72 hours)
- Prophet: Excels at medium/long-term predictions (1 week - 1 month)
The hybrid model automatically selects the best predictor based on forecast horizon and creates weighted ensemble predictions when both models overlap.
Best for: All time horizons with automatic model selection
Constructor
from models.hybrid_model import HybridCryptoPredictor
predictor = HybridCryptoPredictor()
Parameters
No parameters required. The constructor automatically initializes:
- XGBoostCryptoPredictor() with default parameters
- ProphetCryptoPredictor() with default parameters
To use custom hyperparameters for the underlying models, modify them after initialization:
predictor = HybridCryptoPredictor()
predictor.xgboost.model.n_estimators = 300
predictor.prophet.model.changepoint_prior_scale = 0.3
Methods
train()
Trains both XGBoost and Prophet models on the same dataset.
training_info = predictor.train(df)
print("XGBoost Metrics:", training_info['xgboost'])
print("Prophet Metrics:", training_info['prophet'])
print(f"Trained on {training_info['data_points']} data points")
df (pd.DataFrame): Historical OHLCV data with:
- Datetime index
- Required columns: open, high, low, close
- Optional: volume
- Minimum 100 data points recommended
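A small pre-flight check along these lines can catch malformed input before training; this is a sketch, and `validate_ohlcv` is not part of the library:

```python
import pandas as pd

REQUIRED_COLUMNS = {"open", "high", "low", "close"}

def validate_ohlcv(df: pd.DataFrame, min_points: int = 100) -> list:
    """Return a list of problems with an OHLCV frame (empty list = OK)."""
    problems = []
    if not isinstance(df.index, pd.DatetimeIndex):
        problems.append("index is not a DatetimeIndex")
    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        problems.append(f"missing columns: {sorted(missing)}")
    if len(df) < min_points:
        problems.append(f"only {len(df)} rows; {min_points}+ recommended")
    return problems
```

Calling this before `predictor.train(df)` turns a cryptic training failure into an explicit error message.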
Training information for both models:
{
    'xgboost': {  # XGBoost training metrics
        'train_mae': float,
        'test_mae': float,
        'train_rmse': float,
        'test_rmse': float,
        'train_mape': float,
        'test_mape': float,
        'train_direction_accuracy': float,
        'test_direction_accuracy': float
    },
    'prophet': {  # Prophet training metrics
        'mae': float,
        'rmse': float,
        'mape': float,
        'direction_accuracy': float,
        'training_points': int
    },
    'data_points': int  # Total points used for training
}
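The MAE, RMSE, MAPE, and direction-accuracy figures above follow their conventional definitions. A minimal reference implementation (for understanding the numbers, not the library's own code):

```python
import math

def mae(actual, predicted):
    # Mean absolute error, in price units
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    # Root mean squared error; penalizes large misses more than MAE
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    # Mean absolute percentage error, reported in percent
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def direction_accuracy(actual, predicted):
    # Percent of steps where predicted and actual price moves share a sign
    hits = sum(
        (actual[i] - actual[i - 1]) * (predicted[i] - predicted[i - 1]) > 0
        for i in range(1, len(actual))
    )
    return 100 * hits / (len(actual) - 1)
```

Direction accuracy is often the most relevant metric for trading use cases, since a forecast can have low MAPE yet still call the direction of the next move wrong.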
predict_future()
Generates predictions using the optimal model(s) for the forecast horizon.
# Short-term (24 hours) - uses XGBoost
predictions = predictor.predict_future(df, periods=24)
print(f"Using: {predictions['recommended']}")
best = predictor.get_best_prediction(predictions)
# Medium-term (7 days) - uses weighted ensemble
predictions = predictor.predict_future(df, periods=168)
print(f"Using: {predictions['recommended']}")
print(f"Weights: {predictions['weights']}")
# Long-term (30 days) - uses Prophet
predictions = predictor.predict_future(df, periods=720)
print(f"Using: {predictions['recommended']}")
df (pd.DataFrame): Historical data used for context in predictions.
periods (int): Number of time periods (hours) to forecast.
Selection logic:
- periods <= 72: Primarily XGBoost
- 72 < periods <= 168: Weighted ensemble (both models)
- periods > 168: Primarily Prophet
Dictionary containing predictions from available models:
{
    'xgboost': pd.DataFrame,  # XGBoost predictions (if periods <= 168)
    'prophet': pd.DataFrame,  # Prophet predictions (if periods > 24)
    'hybrid': pd.DataFrame,   # Weighted ensemble (if both available)
    'recommended': str,       # 'xgboost', 'prophet', or 'hybrid'
    'weights': {              # Ensemble weights (if hybrid)
        'xgboost': float,     # 0.0 to 1.0
        'prophet': float      # 0.0 to 1.0
    }
}
DataFrame columns:
- predicted_price: Point forecast
- lower_bound: Lower confidence interval
- upper_bound: Upper confidence interval
- trend: Underlying trend (Prophet predictions only)
get_best_prediction()
Extracts the recommended prediction from the predictions dictionary.
predictions = predictor.predict_future(df, periods=100)
best = predictor.get_best_prediction(predictions)
print(best)
# predicted_price lower_bound upper_bound
# timestamp
# 2026-03-08 01:00:00 42500.32 41200.15 43800.49
# ...
predictions (dict): Dictionary returned by predict_future().
The recommended prediction DataFrame based on forecast horizon:
- Returns hybrid if available
- Otherwise returns the model specified in predictions['recommended']
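The fallback behaves roughly like this (a sketch of the selection rule, not the library's exact code; `pick_best` is a hypothetical name):

```python
def pick_best(predictions: dict):
    # Prefer the ensemble when it exists, otherwise defer to the
    # model named under 'recommended'.
    if "hybrid" in predictions:
        return predictions["hybrid"]
    return predictions[predictions["recommended"]]
```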
get_training_info()
Returns training metrics for both models.
info = predictor.get_training_info()
print(f"XGBoost Test MAPE: {info['xgboost']['test_mape']:.2f}%")
print(f"Prophet MAPE: {info['prophet']['mape']:.2f}%")
Same dictionary returned by train() method.
Ensemble Weighting Strategy
The hybrid model uses dynamic weighting based on forecast horizon:
# Weight calculation (from source code)
xgb_weight = max(0, 1 - (periods / 168))
prophet_weight = 1 - xgb_weight
hybrid_prediction = (
    xgboost_prediction * xgb_weight +
    prophet_prediction * prophet_weight
)
Examples:
| Periods | Horizon | XGBoost Weight | Prophet Weight | Strategy |
|---|---|---|---|---|
| 24 | 1 day | 0.86 | 0.14 | Mostly XGBoost |
| 72 | 3 days | 0.57 | 0.43 | Balanced |
| 168 | 1 week | 0.00 | 1.00 | All Prophet |
| 720 | 30 days | 0.00 | 1.00 | All Prophet |
The weighting smoothly transitions from XGBoost to Prophet as the forecast horizon extends, combining the strengths of both models in the overlap zone.
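The weight formula can be checked directly against the table above:

```python
def ensemble_weights(periods: int):
    # Linear fade from XGBoost to Prophet over the first 168 hours
    xgb = max(0.0, 1 - periods / 168)
    return xgb, 1 - xgb

for periods in (24, 72, 168, 720):
    xgb_w, prophet_w = ensemble_weights(periods)
    print(f"{periods:>4}h  xgb={xgb_w:.2f}  prophet={prophet_w:.2f}")
```

Running this reproduces the 0.86 / 0.57 / 0.00 weight columns from the table.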
Complete Example
import pandas as pd
from models.hybrid_model import HybridCryptoPredictor
import matplotlib.pyplot as plt
# Load historical data
df = pd.read_csv('btc_hourly.csv', index_col='timestamp', parse_dates=True)
print(f"Loaded {len(df)} hours of data")
# Initialize hybrid predictor
predictor = HybridCryptoPredictor()
# Train both models
print("\nTraining both models...")
training_info = predictor.train(df)
print("\n" + "="*50)
print("TRAINING RESULTS")
print("="*50)
print(f"\nXGBoost Performance:")
print(f" Test MAPE: {training_info['xgboost']['test_mape']:.2f}%")
print(f" Test Direction Accuracy: {training_info['xgboost']['test_direction_accuracy']:.2f}%")
print(f"\nProphet Performance:")
print(f" MAPE: {training_info['prophet']['mape']:.2f}%")
print(f" Direction Accuracy: {training_info['prophet']['direction_accuracy']:.2f}%")
# Short-term prediction (24 hours)
print("\n" + "="*50)
print("SHORT-TERM FORECAST (24 hours)")
print("="*50)
short_pred = predictor.predict_future(df, periods=24)
short_best = predictor.get_best_prediction(short_pred)
print(f"Recommended model: {short_pred['recommended']}")
print(f"\nFirst 5 predictions:")
print(short_best.head())
current_price = df['close'].iloc[-1]
future_price = short_best['predicted_price'].iloc[-1]
print(f"\nCurrent price: ${current_price:.2f}")
print(f"24h prediction: ${future_price:.2f}")
print(f"Expected change: {((future_price - current_price) / current_price * 100):+.2f}%")
# Medium-term prediction (7 days / 168 hours)
print("\n" + "="*50)
print("MEDIUM-TERM FORECAST (7 days)")
print("="*50)
medium_pred = predictor.predict_future(df, periods=168)
medium_best = predictor.get_best_prediction(medium_pred)
print(f"Recommended model: {medium_pred['recommended']}")
if 'weights' in medium_pred:
    print("Ensemble weights:")
    print(f"  XGBoost: {medium_pred['weights']['xgboost']:.2%}")
    print(f"  Prophet: {medium_pred['weights']['prophet']:.2%}")
future_price_7d = medium_best['predicted_price'].iloc[-1]
print(f"\n7-day prediction: ${future_price_7d:.2f}")
print(f"Expected change: {((future_price_7d - current_price) / current_price * 100):+.2f}%")
print(f"Confidence range: ${medium_best['lower_bound'].iloc[-1]:.2f} - ${medium_best['upper_bound'].iloc[-1]:.2f}")
# Long-term prediction (30 days / 720 hours)
print("\n" + "="*50)
print("LONG-TERM FORECAST (30 days)")
print("="*50)
long_pred = predictor.predict_future(df, periods=720)
long_best = predictor.get_best_prediction(long_pred)
print(f"Recommended model: {long_pred['recommended']}")
future_price_30d = long_best['predicted_price'].iloc[-1]
print(f"\n30-day prediction: ${future_price_30d:.2f}")
print(f"Expected change: {((future_price_30d - current_price) / current_price * 100):+.2f}%")
print(f"Confidence range: ${long_best['lower_bound'].iloc[-1]:.2f} - ${long_best['upper_bound'].iloc[-1]:.2f}")
if 'trend' in long_best.columns:
    trend_change = long_best['trend'].iloc[-1] - long_best['trend'].iloc[0]
    print(f"Trend direction: {'Bullish ↑' if trend_change > 0 else 'Bearish ↓'}")
# Save all predictions
short_best.to_csv('hybrid_predictions_24h.csv')
medium_best.to_csv('hybrid_predictions_7d.csv')
long_best.to_csv('hybrid_predictions_30d.csv')
print("\nPredictions saved to CSV files.")
# Optional: Plot comparison
fig, axes = plt.subplots(3, 1, figsize=(12, 10))
# Plot 24h
axes[0].plot(df.index[-168:], df['close'].iloc[-168:], label='Historical', linewidth=2)
axes[0].plot(short_best.index, short_best['predicted_price'], label='Predicted (XGBoost)', color='orange', linewidth=2)
if 'lower_bound' in short_best.columns:
    axes[0].fill_between(short_best.index, short_best['lower_bound'], short_best['upper_bound'], alpha=0.2, color='orange')
axes[0].set_title('24-Hour Forecast (XGBoost)', fontsize=12, fontweight='bold')
axes[0].legend()
axes[0].grid(alpha=0.3)
# Plot 7d
axes[1].plot(df.index[-168:], df['close'].iloc[-168:], label='Historical', linewidth=2)
axes[1].plot(medium_best.index, medium_best['predicted_price'], label='Predicted (Hybrid)', color='purple', linewidth=2)
if 'lower_bound' in medium_best.columns:
    axes[1].fill_between(medium_best.index, medium_best['lower_bound'], medium_best['upper_bound'], alpha=0.2, color='purple')
axes[1].set_title('7-Day Forecast (Hybrid Ensemble)', fontsize=12, fontweight='bold')
axes[1].legend()
axes[1].grid(alpha=0.3)
# Plot 30d
axes[2].plot(df.index[-168:], df['close'].iloc[-168:], label='Historical', linewidth=2)
axes[2].plot(long_best.index, long_best['predicted_price'], label='Predicted (Prophet)', color='green', linewidth=2)
if 'lower_bound' in long_best.columns:
    axes[2].fill_between(long_best.index, long_best['lower_bound'], long_best['upper_bound'], alpha=0.2, color='green')
axes[2].set_title('30-Day Forecast (Prophet)', fontsize=12, fontweight='bold')
axes[2].legend()
axes[2].grid(alpha=0.3)
plt.tight_layout()
plt.savefig('hybrid_predictions_comparison.png', dpi=300, bbox_inches='tight')
print("Comparison plot saved to hybrid_predictions_comparison.png")
Decision Logic
The hybrid model automatically selects the best approach:
# Pseudocode of internal logic
if periods <= 24:
    # Very short-term: XGBoost only
    return xgboost_predictions
elif 24 < periods <= 72:
    # Short-term: XGBoost is primary; Prophet available for comparison
    return {
        'xgboost': xgboost_predictions,
        'prophet': prophet_predictions,
        'hybrid': weighted_ensemble,
        'recommended': 'xgboost'
    }
elif 72 < periods <= 168:
    # Medium-term: both models weighted by horizon
    return {
        'xgboost': xgboost_predictions,
        'prophet': prophet_predictions,
        'hybrid': weighted_ensemble,
        'recommended': 'hybrid'
    }
else:  # periods > 168
    # Long-term: Prophet only
    return prophet_predictions
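The same branching can be written as a standalone function for reference (an illustration of the thresholds above, not the library's internals):

```python
def recommended_model(periods: int) -> str:
    # Mirrors the horizon thresholds in the decision logic above
    if periods <= 72:
        return "xgboost"   # short-term: gradient-boosted trees are primary
    if periods <= 168:
        return "hybrid"    # medium-term: weighted ensemble of both models
    return "prophet"       # long-term: trend/seasonality model takes over
```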
Key Characteristics
Strengths:
- Optimal accuracy across all time horizons
- Automatic model selection
- Smooth transitions between models (no abrupt changes)
- Leverages strengths of both algorithms
- Single API for all forecasting needs
- Confidence intervals from both models
Limitations:
- Slower training (trains two models)
- Higher memory usage (stores two models)
- Ensemble weights are heuristic (not learned)
- Cannot customize individual model parameters easily
Typical Performance:
- 1-24 hours: MAPE 2-5% (XGBoost)
- 3-7 days: MAPE 3-7% (Hybrid)
- 7-30 days: MAPE 5-12% (Prophet)
- Direction Accuracy: 55-65%
When to Use Hybrid:
- Need predictions across multiple time horizons
- Want automatic model selection
- Prefer ensemble approach for robustness
- Building production forecasting systems
- Comparing short-term vs long-term outlooks
When NOT to Use Hybrid:
- Only need one specific time horizon (use specialized model)
- Limited computational resources (use single model)
- Need to fine-tune hyperparameters extensively
- Require real-time predictions (<100ms)