Overview
The Hybrid model intelligently combines XGBoost (short-term) and Prophet (long-term) to provide the best prediction for any time horizon. It automatically selects or blends models based on the forecast period.
Best for: All-purpose predictions from 1 hour to 1 month
Source: source/models/hybrid_model.py
How It Works
Train Both Models
Trains both XGBoost and Prophet on your historical data
Evaluate Horizon
Determines prediction timeframe (short vs long term)
Select or Blend
≤24 hours: XGBoost only
24-72 hours: Both models with weighted ensemble
>72 hours: Prophet only or ensemble
Return Best Prediction
Provides the optimal prediction with confidence intervals
Model Selection Logic
The hybrid model uses different strategies based on prediction horizon:
```python
if periods <= 24:
    # Short-term: XGBoost only
    use_model = "xgboost"
elif 24 < periods <= 72:
    # Medium-term: ensemble both models
    xgb_weight = 1 - (periods / 168)
    prophet_weight = 1 - xgb_weight
    use_model = "hybrid"
else:  # periods > 72
    # Long-term: Prophet dominant
    use_model = "hybrid"  # or "prophet", depending on configuration
```
The weighting formula gradually transitions from XGBoost to Prophet as the horizon increases.
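As a quick check of that formula, the weights can be computed for a few horizons. The `ensemble_weights` helper below is illustrative only, not part of the library API; it mirrors the formula shown above:

```python
# Illustrative helper mirroring the weighting formula above;
# not part of the HybridCryptoPredictor API.
def ensemble_weights(periods: int) -> dict:
    xgb = max(0.0, 1 - periods / 168)
    return {'xgboost': round(xgb, 2), 'prophet': round(1 - xgb, 2)}

for p in (24, 48, 72, 96, 168):
    print(p, ensemble_weights(p))
# e.g. 48 hours -> {'xgboost': 0.71, 'prophet': 0.29}
```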
Usage Example
```python
import pandas as pd

from models.hybrid_model import HybridCryptoPredictor

# Initialize hybrid model
predictor = HybridCryptoPredictor()

# Load historical data
df = pd.read_csv('btc_hourly.csv', index_col='timestamp', parse_dates=True)

# Train both models at once
training_info = predictor.train(df)
print("XGBoost metrics:", training_info['xgboost'])
print("Prophet metrics:", training_info['prophet'])
print(f"Trained on {training_info['data_points']} data points")

# Predict next 48 hours (will use ensemble)
predictions = predictor.predict_future(df, periods=48)
print(f"Recommended model: {predictions['recommended']}")
print(f"Weights: {predictions.get('weights', 'N/A')}")

# Get best prediction
best = predictor.get_best_prediction(predictions)
print(best[['predicted_price', 'lower_bound', 'upper_bound']])
```
Prediction Output
The predict_future() method returns a dictionary with multiple predictions:
```python
{
    'xgboost': pd.DataFrame,   # XGBoost predictions (if applicable)
    'prophet': pd.DataFrame,   # Prophet predictions (if applicable)
    'hybrid': pd.DataFrame,    # Ensemble predictions (if applicable)
    'recommended': str,        # Which model to use: 'xgboost', 'prophet', or 'hybrid'
    'weights': {               # Ensemble weights (if hybrid)
        'xgboost': float,
        'prophet': float,
    },
}
```
DataFrame Structure
Each prediction DataFrame contains:
| Column | Source | Description |
| --- | --- | --- |
| predicted_price | Both | Point estimate |
| lower_bound | Both | Lower confidence bound (95%) |
| upper_bound | Both | Upper confidence bound (95%) |
| trend | Prophet only | Trend component |
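A defensive consumer can verify that a returned frame carries the expected columns before using it. This is a minimal sketch; `missing_columns` is a hypothetical helper, not part of the library:

```python
import pandas as pd

# Columns documented for every prediction DataFrame
REQUIRED_COLUMNS = ['predicted_price', 'lower_bound', 'upper_bound']

def missing_columns(frame: pd.DataFrame) -> list:
    """Return any required prediction columns absent from the frame."""
    return [c for c in REQUIRED_COLUMNS if c not in frame.columns]

# A deliberately incomplete frame for demonstration
demo = pd.DataFrame({'predicted_price': [100.0], 'lower_bound': [95.0]})
print(missing_columns(demo))  # ['upper_bound']
```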
For periods between 24 and 72 hours, the hybrid model blends predictions:

```python
# Calculate weights based on horizon
xgb_weight = max(0, 1 - (periods / 168))
prophet_weight = 1 - xgb_weight

# Weighted average of predictions
hybrid_price = (xgb_pred * xgb_weight) + (prophet_pred * prophet_weight)
```
Example Weights
| Periods | Hours | XGBoost Weight | Prophet Weight | Strategy |
| --- | --- | --- | --- | --- |
| 12 | 12h | 93% | 7% | Mostly XGBoost |
| 24 | 1d | 86% | 14% | Mostly XGBoost |
| 48 | 2d | 71% | 29% | XGBoost dominant |
| 72 | 3d | 57% | 43% | Balanced |
| 96 | 4d | 43% | 57% | Prophet dominant |
| 168 | 7d | 0% | 100% | Pure Prophet |
The weighting smoothly transitions from XGBoost to Prophet, avoiding sudden jumps in predictions.
Training Process
Sequential Training
The hybrid model trains both sub-models:
```python
def train(self, df: pd.DataFrame) -> Dict:
    # 1. Train XGBoost (80/20 split)
    xgb_metrics = self.xgboost.train(df, train_size=0.8)

    # 2. Train Prophet (all data)
    prophet_metrics = self.prophet.train(df)

    # 3. Return combined metrics
    return {
        'xgboost': xgb_metrics,
        'prophet': prophet_metrics,
        'data_points': len(df),
    }
```
Training Time
XGBoost: 5-15 seconds (1000 data points)
Prophet: 10-30 seconds (1000 data points)
Total: ~15-45 seconds for full hybrid training
Training happens once. After training, predictions are fast regardless of horizon.
Getting the Best Prediction
Use get_best_prediction() to automatically select the optimal forecast:
```python
# Predict multiple horizons
short_term = predictor.predict_future(df, periods=12)
medium_term = predictor.predict_future(df, periods=48)
long_term = predictor.predict_future(df, periods=168)

# Get best for each
best_short = predictor.get_best_prediction(short_term)    # Uses XGBoost
best_medium = predictor.get_best_prediction(medium_term)  # Uses Hybrid
best_long = predictor.get_best_prediction(long_term)      # Uses Prophet
```
This method implements the selection logic:
```python
def get_best_prediction(self, predictions: Dict) -> pd.DataFrame:
    if 'hybrid' in predictions:
        return predictions['hybrid']
    elif predictions.get('recommended') == 'xgboost':
        return predictions['xgboost']
    else:
        return predictions['prophet']
```
Confidence Intervals
The hybrid model preserves confidence intervals from both models:
```python
# XGBoost intervals (statistical estimation)
from models.xgboost_model import create_prediction_intervals
xgb_with_intervals = create_prediction_intervals(xgb_predictions)

# Prophet intervals (native to Prophet)
prophet_predictions  # Already includes lower_bound and upper_bound

# Hybrid intervals (weighted average of both)
hybrid['lower_bound'] = (
    xgb['lower_bound'] * xgb_weight +
    prophet['lower_bound'] * prophet_weight
)
hybrid['upper_bound'] = (
    xgb['upper_bound'] * xgb_weight +
    prophet['upper_bound'] * prophet_weight
)
```
Confidence intervals widen as the prediction horizon increases, reflecting greater uncertainty.
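The widening effect can be illustrated with synthetic data. The numbers below are assumed for demonstration, not model output; the band around the point estimate grows roughly with the square root of the horizon:

```python
import numpy as np
import pandas as pd

# Synthetic 48-hour forecast frame: the confidence band widens with the
# horizon, as real forecast uncertainty does. All values are made up.
horizon = np.arange(1, 49)               # 48 hourly steps
price = 50_000 + 10.0 * horizon          # hypothetical point forecasts
spread = 200.0 * np.sqrt(horizon)        # half-width grows ~sqrt(h)

frame = pd.DataFrame({
    'predicted_price': price,
    'lower_bound': price - spread,
    'upper_bound': price + spread,
})

width = frame['upper_bound'] - frame['lower_bound']
print(width.iloc[0], width.iloc[-1])  # the band at h=48 is ~7x the band at h=1
```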
Use Cases by Horizon
1-24 Hours

Recommended: XGBoost only

```python
predictions = predictor.predict_future(df, periods=24)
# predictions['recommended'] == 'xgboost'
```

Why: XGBoost has superior accuracy for short-term predictions using recent price action and technical indicators.

1-3 Days

Recommended: Hybrid ensemble

```python
predictions = predictor.predict_future(df, periods=48)
# predictions['recommended'] == 'hybrid'
# predictions['weights'] == {'xgboost': 0.71, 'prophet': 0.29}
```

Why: Combines XGBoost’s short-term accuracy with Prophet’s trend detection.

1 Week+

Recommended: Prophet only

```python
predictions = predictor.predict_future(df, periods=168)
# predictions['recommended'] == 'prophet'
```

Why: Prophet excels at long-term trends and seasonality; XGBoost error accumulates over time.
Advanced: Accessing All Predictions
Inspect predictions from both models individually:
```python
predictions = predictor.predict_future(df, periods=48)

# Access individual model predictions
if 'xgboost' in predictions:
    xgb_df = predictions['xgboost']
    print("XGBoost prediction:", xgb_df['predicted_price'].iloc[0])

if 'prophet' in predictions:
    prophet_df = predictions['prophet']
    print("Prophet prediction:", prophet_df['predicted_price'].iloc[0])
    print("Prophet trend:", prophet_df['trend'].iloc[0])

if 'hybrid' in predictions:
    hybrid_df = predictions['hybrid']
    print("Hybrid prediction:", hybrid_df['predicted_price'].iloc[0])
```
Comparison Example
```python
import matplotlib.pyplot as plt

# Get predictions for 72 hours
predictions = predictor.predict_future(df, periods=72)
xgb = predictions['xgboost']['predicted_price']
prophet = predictions['prophet']['predicted_price']
hybrid = predictions['hybrid']['predicted_price']

# Plot all three
plt.figure(figsize=(12, 6))
plt.plot(xgb.index, xgb.values, label='XGBoost', alpha=0.7)
plt.plot(prophet.index, prophet.values, label='Prophet', alpha=0.7)
plt.plot(hybrid.index, hybrid.values, label='Hybrid', linewidth=2)
plt.legend()
plt.title('Model Comparison: 72-hour Forecast')
plt.show()
```
| Metric | XGBoost | Prophet | Hybrid |
| --- | --- | --- | --- |
| Short-term (1-24h) | ⭐⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐⭐ |
| Medium-term (1-3d) | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| Long-term (1w+) | ⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| Training Speed | Fast | Slow | Slow |
| Prediction Speed | Slow (iterative) | Fast | Fast |
| Confidence Intervals | Estimated | Native | Both |
| Interpretability | Low | High | Medium |
Retrieve detailed training metrics:
```python
training_info = predictor.train(df)

# XGBoost metrics
print("\nXGBoost Performance:")
print(f"  Test MAPE: {training_info['xgboost']['test_mape']:.2f}%")
print(f"  Direction Accuracy: {training_info['xgboost']['test_direction_accuracy']:.2f}%")

# Prophet metrics
print("\nProphet Performance:")
print(f"  MAPE: {training_info['prophet']['mape']:.2f}%")
print(f"  Direction Accuracy: {training_info['prophet']['direction_accuracy']:.2f}%")

# Data info
print(f"\nTrained on {training_info['data_points']} data points")

# Get detailed info
detailed = predictor.get_training_info()
```
Best Practices
Always Check Recommended Model

```python
predictions = predictor.predict_future(df, periods=48)
print(f"Use: {predictions['recommended']}")

# Use the recommended prediction
best = predictor.get_best_prediction(predictions)
```

Inspect Ensemble Weights

```python
if 'weights' in predictions:
    w = predictions['weights']
    print(f"XGBoost: {w['xgboost']:.0%}, Prophet: {w['prophet']:.0%}")
```

Sanity-Check Predictions

```python
best = predictor.get_best_prediction(predictions)

# Check that the prediction is within reasonable bounds
current_price = df['close'].iloc[-1]
upper = best['upper_bound'].iloc[0]
lower = best['lower_bound'].iloc[0]
if lower <= current_price <= upper:
    print("Prediction appears reasonable")
```

Retrain Regularly

```python
# Retrain with latest data
new_df = fetch_latest_data()  # Your data fetching function
predictor.train(new_df)

# Recommended: Retrain daily for production systems
```
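A daily cadence can be tracked with a small helper. The `RetrainScheduler` class below is an illustrative sketch, not part of `HybridCryptoPredictor`:

```python
from datetime import datetime, timedelta

class RetrainScheduler:
    """Tracks when the model was last trained and whether a retrain
    is due. Illustrative only; not part of the library."""

    def __init__(self, interval: timedelta = timedelta(days=1)):
        self.interval = interval
        self.last_trained = None

    def due(self, now: datetime = None) -> bool:
        now = now or datetime.now()
        return self.last_trained is None or now - self.last_trained >= self.interval

    def mark_trained(self, now: datetime = None) -> None:
        self.last_trained = now or datetime.now()

scheduler = RetrainScheduler()
print(scheduler.due())   # True: never trained
scheduler.mark_trained()
print(scheduler.due())   # False: just trained
```

In a production loop you would call `predictor.train(fetch_latest_data())` whenever `scheduler.due()` returns True, then `scheduler.mark_trained()`.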
Advantages of Hybrid Approach
Optimal for All Horizons: No need to manually choose models
Smooth Transitions: Weighted ensemble avoids prediction jumps
Best of Both Worlds: Combines XGBoost’s accuracy with Prophet’s trends
Confidence Intervals: Provides uncertainty estimates from both models
Automatic Selection: Intelligent model routing based on horizon
Limitations
Training Time: Must train both models (2x training time)
Memory Usage: Stores two models in memory
Complexity: More complex than single-model approach
Ensemble Zone: May not always improve over best individual model
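The last caveat is easy to demonstrate with toy numbers (assumed for illustration, not real model output): blending an accurate forecaster with a biased one can score worse than the accurate one alone.

```python
import numpy as np

# Toy example of the "ensemble zone" caveat. All values are made up.
actual = np.array([100.0, 102.0, 104.0])
model_a = actual + 0.5    # accurate model, small constant error
model_b = actual + 3.0    # biased model, large constant error

def mape(pred: np.ndarray) -> float:
    """Mean absolute percentage error against `actual`."""
    return float(np.mean(np.abs((actual - pred) / actual)) * 100)

blend = 0.5 * model_a + 0.5 * model_b
print(f"A: {mape(model_a):.2f}%  B: {mape(model_b):.2f}%  blend: {mape(blend):.2f}%")
# The 50/50 blend's error lands between the two, i.e. above the better model's.
```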
When to Use Hybrid vs Individual Models
Use Hybrid
Variable prediction horizons
Want automatic model selection
Need confidence intervals
Production systems
General-purpose forecasting
Use Individual Models
Fixed short-term horizon (use XGBoost)
Fixed long-term horizon (use Prophet)
Memory constraints
Faster training needed
Research and experimentation
Quick Start Template
```python
import pandas as pd

from models.hybrid_model import HybridCryptoPredictor

# 1. Initialize
predictor = HybridCryptoPredictor()

# 2. Load data
df = pd.read_csv('data.csv', index_col='timestamp', parse_dates=True)

# 3. Train
print("Training models...")
metrics = predictor.train(df)
print("✓ Training complete")

# 4. Predict
periods = 48  # 2 days
predictions = predictor.predict_future(df, periods=periods)

# 5. Get best prediction
best = predictor.get_best_prediction(predictions)

# 6. Display results
print(f"\nForecast for next {periods} hours:")
print(f"Model used: {predictions['recommended']}")
print(f"Next price: ${best['predicted_price'].iloc[0]:,.2f}")
print(f"Range: ${best['lower_bound'].iloc[0]:,.2f} - ${best['upper_bound'].iloc[0]:,.2f}")
```
Next Steps
XGBoost Model Deep dive into short-term predictions
Prophet Model Deep dive into long-term forecasting
Model Comparison Detailed comparison of all three models
API Reference Complete API documentation