Hybrid Model

Overview

The Hybrid model combines XGBoost (short-term) and Prophet (long-term) behind a single interface that covers any forecast horizon. It selects one model or blends both, depending on the forecast period.

Best for: All-purpose predictions from 1 hour to 1 month

Source: source/models/hybrid_model.py

How It Works

Train Both Models

Trains XGBoost and Prophet on your historical data, one after the other

Evaluate Horizon

Determines prediction timeframe (short vs long term)

Select or Blend

≤24 hours: XGBoost only
24-72 hours: Both models with weighted ensemble
>72 hours: Prophet only or ensemble

Return Best Prediction

Provides the optimal prediction with confidence intervals

Model Selection Logic

The hybrid model uses different strategies based on prediction horizon:

if periods <= 24:
    # Short-term: XGBoost only
    use_model = "xgboost"

elif periods <= 72:
    # Medium-term: ensemble of both models
    xgb_weight = max(0, 1 - (periods / 168))
    prophet_weight = 1 - xgb_weight
    use_model = "hybrid"

else:  # periods > 72
    # Long-term: Prophet dominant
    use_model = "prophet"  # or "hybrid"; this page does not specify which applies when

The XGBoost weight falls linearly as the horizon increases and reaches zero at 168 hours (7 days), at which point Prophet carries the full weight.
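The routing above can be wrapped in a small standalone function for experimentation. This is a minimal sketch, not the code in hybrid_model.py: the name `select_strategy` is hypothetical, the thresholds (24, 72, 168) are taken from this page, and the long-term branch simply assumes Prophet alone, since the page leaves the choice between "hybrid" and "prophet" open.

```python
from typing import Dict, Tuple

SHORT_TERM_MAX = 24    # hours; at or below this, XGBoost alone
MEDIUM_TERM_MAX = 72   # hours; at or below this, blend both models
FULL_PROPHET_AT = 168  # hours; the XGBoost weight reaches zero here


def select_strategy(periods: int) -> Tuple[str, Dict[str, float]]:
    """Return (model name, weights) for a forecast horizon in hours.

    Hypothetical helper that mirrors the thresholds documented above.
    """
    if periods <= 0:
        raise ValueError("periods must be a positive number of hours")
    if periods <= SHORT_TERM_MAX:
        return "xgboost", {"xgboost": 1.0, "prophet": 0.0}
    if periods <= MEDIUM_TERM_MAX:
        xgb_weight = max(0.0, 1 - periods / FULL_PROPHET_AT)
        return "hybrid", {"xgboost": xgb_weight, "prophet": 1 - xgb_weight}
    # Assumption: long horizons fall back to Prophet alone.
    return "prophet", {"xgboost": 0.0, "prophet": 1.0}


for hours in (12, 48, 168):
    model, weights = select_strategy(hours)
    print(hours, model, {k: round(v, 2) for k, v in weights.items()})
```

Keeping the thresholds as named constants makes it easy to experiment with different cut-over points without touching the branching logic.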

Usage Example

import pandas as pd
from models.hybrid_model import HybridCryptoPredictor

# Initialize hybrid model
predictor = HybridCryptoPredictor()

# Load historical data
df = pd.read_csv('btc_hourly.csv', index_col='timestamp', parse_dates=True)

# Train both models at once
training_info = predictor.train(df)

print("XGBoost metrics:", training_info['xgboost'])
print("Prophet metrics:", training_info['prophet'])
print(f"Trained on {training_info['data_points']} data points")

# Predict next 48 hours (will use ensemble)
predictions = predictor.predict_future(df, periods=48)

print(f"Recommended model: {predictions['recommended']}")
print(f"Weights: {predictions.get('weights', 'N/A')}")

# Get best prediction
best = predictor.get_best_prediction(predictions)
print(best[['predicted_price', 'lower_bound', 'upper_bound']])

Prediction Output

The predict_future() method returns a dictionary with multiple predictions:

{
    'xgboost': pd.DataFrame,      # XGBoost predictions (if applicable)
    'prophet': pd.DataFrame,      # Prophet predictions (if applicable)
    'hybrid': pd.DataFrame,       # Ensemble predictions (if applicable)
    'recommended': str,           # Which model to use: 'xgboost', 'prophet', or 'hybrid'
    'weights': {                  # Ensemble weights (if hybrid)
        'xgboost': float,
        'prophet': float
    }
}
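Because the model and weight entries are present only "if applicable", code that consumes this dictionary is safer when it treats them as optional. The sketch below shows one way to summarize a result defensively. `summarize_result` is a hypothetical helper, and the sample dictionary uses placeholder strings instead of real DataFrames so that it runs without a trained model.

```python
from typing import Any, Dict


def summarize_result(predictions: Dict[str, Any]) -> str:
    """Build a one-line summary of a predict_future()-style result dict."""
    recommended = predictions.get("recommended", "unknown")
    available = [k for k in ("xgboost", "prophet", "hybrid") if k in predictions]
    weights = predictions.get("weights")
    if weights:
        weight_text = ", ".join(f"{name}={value:.0%}" for name, value in weights.items())
    else:
        weight_text = "n/a"
    return f"recommended={recommended}; available={available}; weights={weight_text}"


# Placeholder values stand in for DataFrames in this illustration.
sample = {
    "xgboost": "<DataFrame>",
    "prophet": "<DataFrame>",
    "hybrid": "<DataFrame>",
    "recommended": "hybrid",
    "weights": {"xgboost": 0.71, "prophet": 0.29},
}
print(summarize_result(sample))
```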

DataFrame Structure

Each prediction DataFrame contains:

Column	Source	Description
`predicted_price`	Both	Point estimate
`lower_bound`	Both	Lower confidence bound (95%)
`upper_bound`	Both	Upper confidence bound (95%)
`trend`	Prophet only	Trend component

Ensemble Weighting Formula

For horizons of more than 24 and up to 72 hours, the hybrid model blends predictions:

# Calculate weights based on horizon
xgb_weight = max(0, 1 - (periods / 168))
prophet_weight = 1 - xgb_weight

# Weighted average of predictions
hybrid_price = (xgb_pred * xgb_weight) + (prophet_pred * prophet_weight)

Example Weights

Periods	Horizon	XGBoost Weight	Prophet Weight	Strategy
12	12h	93%	7%	Mostly XGBoost
24	1d	86%	14%	Mostly XGBoost
48	2d	71%	29%	XGBoost dominant
72	3d	57%	43%	Balanced
96	4d	43%	57%	Prophet dominant
168	7d	0%	100%	Pure Prophet

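Within the ensemble range the weights change linearly with the horizon, so the blended prediction shifts gradually from XGBoost toward Prophet. The table lists the formula's value at each horizon; under the selection logic, horizons of 24 hours or less use XGBoost alone regardless of the computed weight.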
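The percentages in the table can be reproduced directly from the formula. A short sketch, where `ensemble_weights` is a hypothetical helper name and 168 is the divisor given above:

```python
from typing import Tuple


def ensemble_weights(periods: int) -> Tuple[float, float]:
    """Return (xgb_weight, prophet_weight) from the weighting formula."""
    xgb_weight = max(0.0, 1 - periods / 168)
    return xgb_weight, 1 - xgb_weight


for periods in (12, 24, 48, 72, 96, 168):
    xgb_w, prophet_w = ensemble_weights(periods)
    print(f"{periods:>4}h  XGBoost {xgb_w:5.0%}  Prophet {prophet_w:5.0%}")
```

The `max(0.0, ...)` clamp matters only beyond 168 hours, where the raw expression would otherwise go negative.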

Training Process

Sequential Training

The hybrid model trains both sub-models:

def train(self, df: pd.DataFrame) -> Dict:
    # 1. Train XGBoost (80/20 split)
    xgb_metrics = self.xgboost.train(df, train_size=0.8)
    
    # 2. Train Prophet (all data)
    prophet_metrics = self.prophet.train(df)
    
    # 3. Return combined metrics
    return {
        'xgboost': xgb_metrics,
        'prophet': prophet_metrics,
        'data_points': len(df)
    }

Training Time

XGBoost: 5-15 seconds (1000 data points)
Prophet: 10-30 seconds (1000 data points)
Total: ~15-45 seconds for full hybrid training

Training runs once per dataset. After training, predict_future() can be called repeatedly for any horizon without retraining.

Getting the Best Prediction

Use get_best_prediction() to automatically select the optimal forecast:

# Predict multiple horizons
short_term = predictor.predict_future(df, periods=12)
medium_term = predictor.predict_future(df, periods=48) 
long_term = predictor.predict_future(df, periods=168)

# Get best for each
best_short = predictor.get_best_prediction(short_term)   # Uses XGBoost
best_medium = predictor.get_best_prediction(medium_term) # Uses Hybrid
best_long = predictor.get_best_prediction(long_term)     # Uses Prophet

This method implements the selection logic:

def get_best_prediction(self, predictions: Dict) -> pd.DataFrame:
    if 'hybrid' in predictions:
        return predictions['hybrid']
    elif predictions.get('recommended') == 'xgboost':
        return predictions['xgboost']
    else:
        return predictions['prophet']
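One consequence of this order of checks is that the 'hybrid' entry wins whenever it is present, whatever 'recommended' says. The standalone sketch below copies the logic into a free function (without `self`) and feeds it placeholder strings rather than DataFrames, purely to show which branch fires. The three sample dictionaries are made up for illustration.

```python
from typing import Any, Dict


def get_best_prediction(predictions: Dict[str, Any]) -> Any:
    """Free-function copy of the selection logic shown above."""
    if "hybrid" in predictions:
        return predictions["hybrid"]
    elif predictions.get("recommended") == "xgboost":
        return predictions["xgboost"]
    else:
        return predictions["prophet"]


short = {"xgboost": "xgb forecast", "recommended": "xgboost"}
blended = {"xgboost": "xgb forecast", "prophet": "prophet forecast",
           "hybrid": "blended forecast", "recommended": "hybrid"}
long_only = {"prophet": "prophet forecast", "recommended": "prophet"}

print(get_best_prediction(short))      # xgb forecast
print(get_best_prediction(blended))    # blended forecast
print(get_best_prediction(long_only))  # prophet forecast
```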

Confidence Intervals

The hybrid model preserves confidence intervals from both models:

# XGBoost intervals (statistical estimation)
from models.xgboost_model import create_prediction_intervals
xgb_with_intervals = create_prediction_intervals(xgb_predictions)

# Prophet intervals (native to Prophet)
prophet_predictions  # Already includes lower_bound and upper_bound

# Hybrid intervals (weighted average of both)
hybrid['lower_bound'] = (
    xgb['lower_bound'] * xgb_weight + 
    prophet['lower_bound'] * prophet_weight
)
hybrid['upper_bound'] = (
    xgb['upper_bound'] * xgb_weight + 
    prophet['upper_bound'] * prophet_weight
)

Confidence intervals widen as the prediction horizon increases, reflecting greater uncertainty.
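To make the weighted average concrete, the sketch below blends one pair of intervals by hand. The prices are made-up illustration values and `blend` is a hypothetical helper. Averaging bounds this way keeps the blended interval between the two input intervals, but it is a heuristic and is not guaranteed to retain the nominal 95% coverage.

```python
def blend(xgb_value: float, prophet_value: float, xgb_weight: float) -> float:
    """Weighted average of two values; the Prophet weight is 1 - xgb_weight."""
    return xgb_value * xgb_weight + prophet_value * (1 - xgb_weight)


xgb_weight = max(0.0, 1 - 48 / 168)  # weight for a 48-hour horizon

# Made-up numbers: (lower_bound, predicted_price, upper_bound)
xgb = (99_000.0, 100_000.0, 101_000.0)
prophet = (96_000.0, 102_000.0, 108_000.0)

# Blend each component (lower, point estimate, upper) with the same weight.
lower, price, upper = (blend(x, p, xgb_weight) for x, p in zip(xgb, prophet))
print(f"blended: {price:,.0f} (range {lower:,.0f} to {upper:,.0f})")
```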

Use Cases by Horizon

1-24 Hours

Recommended: XGBoost only

predictions = predictor.predict_future(df, periods=24)
# predictions['recommended'] == 'xgboost'

Why: XGBoost has superior accuracy for short-term predictions using recent price action and technical indicators.

1-3 Days

Recommended: Hybrid ensemble

predictions = predictor.predict_future(df, periods=48)
# predictions['recommended'] == 'hybrid'
# predictions['weights'] == {'xgboost': 0.71, 'prophet': 0.29}

Why: Combines XGBoost’s short-term accuracy with Prophet’s trend detection.

1 Week+

Recommended: Prophet only

predictions = predictor.predict_future(df, periods=168)
# predictions['recommended'] == 'prophet'

Why: Prophet excels at long-term trends and seasonality. XGBoost error accumulates over time.

Advanced: Accessing All Predictions

Inspect predictions from both models individually:

predictions = predictor.predict_future(df, periods=48)

# Access individual model predictions
if 'xgboost' in predictions:
    xgb_df = predictions['xgboost']
    print("XGBoost prediction:", xgb_df['predicted_price'].iloc[0])

if 'prophet' in predictions:
    prophet_df = predictions['prophet']
    print("Prophet prediction:", prophet_df['predicted_price'].iloc[0])
    print("Prophet trend:", prophet_df['trend'].iloc[0])

if 'hybrid' in predictions:
    hybrid_df = predictions['hybrid']
    print("Hybrid prediction:", hybrid_df['predicted_price'].iloc[0])

Comparison Example

import matplotlib.pyplot as plt

# Get predictions for 72 hours
predictions = predictor.predict_future(df, periods=72)

xgb = predictions['xgboost']['predicted_price']
prophet = predictions['prophet']['predicted_price']
hybrid = predictions['hybrid']['predicted_price']

# Plot all three
plt.figure(figsize=(12, 6))
plt.plot(xgb.index, xgb.values, label='XGBoost', alpha=0.7)
plt.plot(prophet.index, prophet.values, label='Prophet', alpha=0.7)
plt.plot(hybrid.index, hybrid.values, label='Hybrid', linewidth=2)
plt.legend()
plt.title('Model Comparison: 72-hour Forecast')
plt.show()
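A plot shows disagreement between the models visually; a single number can complement it. The sketch below computes the mean absolute percentage difference between two forecasts. `mean_abs_pct_diff` is a hypothetical helper, the price lists are made up, and with real results you could pass `predictions['xgboost']['predicted_price'].tolist()` and the Prophet equivalent.

```python
from typing import Sequence


def mean_abs_pct_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean of |a_i - b_i| / |b_i|, expressed as a percentage."""
    if len(a) != len(b):
        raise ValueError("both forecasts must have the same length")
    if len(a) == 0:
        raise ValueError("forecasts must not be empty")
    return sum(abs(x - y) / abs(y) for x, y in zip(a, b)) / len(a) * 100


# Made-up hourly forecasts from two models.
xgb_prices = [100_500.0, 100_900.0, 101_400.0]
prophet_prices = [100_000.0, 100_000.0, 100_000.0]

gap = mean_abs_pct_diff(xgb_prices, prophet_prices)
print(f"Average disagreement: {gap:.2f}%")
```

A large gap in the ensemble range is a hint to look at each model's forecast individually before relying on the blend.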

Performance Characteristics

Metric	XGBoost	Prophet	Hybrid
Short-term (1-24h)	⭐⭐⭐⭐⭐	⭐⭐	⭐⭐⭐⭐⭐
Medium-term (1-3d)	⭐⭐⭐⭐	⭐⭐⭐	⭐⭐⭐⭐⭐
Long-term (1w+)	⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐
Training Speed	Fast	Slow	Slow
Prediction Speed	Slow (iterative)	Fast	Fast
Confidence Intervals	Estimated	Native	Both
Interpretability	Low	High	Medium

Training Information

Retrieve detailed training metrics:

training_info = predictor.train(df)

# XGBoost metrics
print("\nXGBoost Performance:")
print(f"  Test MAPE: {training_info['xgboost']['test_mape']:.2f}%")
print(f"  Direction Accuracy: {training_info['xgboost']['test_direction_accuracy']:.2f}%")

# Prophet metrics  
print("\nProphet Performance:")
print(f"  MAPE: {training_info['prophet']['mape']:.2f}%")
print(f"  Direction Accuracy: {training_info['prophet']['direction_accuracy']:.2f}%")

# Data info
print(f"\nTrained on {training_info['data_points']} data points")

# Get detailed info
detailed = predictor.get_training_info()

Best Practices

Always Check Recommended Model

predictions = predictor.predict_future(df, periods=48)
print(f"Use: {predictions['recommended']}")

# Use the recommended prediction
best = predictor.get_best_prediction(predictions)

Inspect Ensemble Weights

if 'weights' in predictions:
    w = predictions['weights']
    print(f"XGBoost: {w['xgboost']:.0%}, Prophet: {w['prophet']:.0%}")

Use Confidence Intervals

best = predictor.get_best_prediction(predictions)

# Check that the current price falls inside the first predicted interval
current_price = df['close'].iloc[-1]
upper = best['upper_bound'].iloc[0]
lower = best['lower_bound'].iloc[0]

if lower <= current_price <= upper:
    print("Prediction appears reasonable")
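Another simple sanity check is the relative width of the interval: a very wide band suggests the forecast carries little information. A minimal sketch, where `interval_width_pct` and the 10% threshold are illustrative assumptions rather than part of the library, and the bounds are made-up values:

```python
def interval_width_pct(lower: float, upper: float, reference_price: float) -> float:
    """Width of the interval as a percentage of a reference price."""
    if reference_price <= 0:
        raise ValueError("reference_price must be positive")
    if upper < lower:
        raise ValueError("upper must not be below lower")
    return (upper - lower) / reference_price * 100


MAX_WIDTH_PCT = 10.0  # illustrative threshold; tune for your asset

width = interval_width_pct(lower=98_000.0, upper=103_000.0, reference_price=100_000.0)
if width > MAX_WIDTH_PCT:
    print(f"Interval is wide ({width:.1f}% of price); treat the forecast with caution")
else:
    print(f"Interval width is {width:.1f}% of price")
```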

Retrain Periodically

# Retrain with latest data
new_df = fetch_latest_data()  # Your data fetching function
predictor.train(new_df)

# Recommended: Retrain daily for production systems
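A daily retraining schedule can be enforced with a small staleness check. The sketch below uses only the standard library. `needs_retrain` and the idea of tracking the last training time yourself are assumptions, since this page does not say whether the predictor stores a training timestamp.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def needs_retrain(last_trained: Optional[datetime], now: datetime,
                  max_age: timedelta = timedelta(days=1)) -> bool:
    """Return True if the model was never trained or is at least max_age old."""
    if last_trained is None:
        return True
    return now - last_trained >= max_age


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
print(needs_retrain(None, now))                       # True
print(needs_retrain(now - timedelta(hours=6), now))   # False
print(needs_retrain(now - timedelta(hours=30), now))  # True
```

Passing `now` explicitly, instead of calling `datetime.now()` inside the function, keeps the check easy to test.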

Advantages of Hybrid Approach

Optimal for All Horizons: No need to manually choose models
Smooth Transitions: Weighted ensemble avoids prediction jumps
Best of Both Worlds: Combines XGBoost’s accuracy with Prophet’s trends
Confidence Intervals: Provides uncertainty estimates from both models
Automatic Selection: Intelligent model routing based on horizon

Limitations

Training Time: Must train both models, so training takes as long as XGBoost and Prophet combined
Memory Usage: Stores two models in memory
Complexity: More complex than single-model approach
Ensemble Zone: The blended prediction may not always beat the better individual model

When to Use Hybrid vs Individual Models

Use Hybrid

Variable prediction horizons
Want automatic model selection
Need confidence intervals
Production systems
General-purpose forecasting

Use Individual Models

Fixed short-term horizon (use XGBoost)
Fixed long-term horizon (use Prophet)
Memory constraints
Faster training needed
Research and experimentation

Quick Start Template

from models.hybrid_model import HybridCryptoPredictor
import pandas as pd

# 1. Initialize
predictor = HybridCryptoPredictor()

# 2. Load data
df = pd.read_csv('data.csv', index_col='timestamp', parse_dates=True)

# 3. Train
print("Training models...")
metrics = predictor.train(df)
print("✓ Training complete")

# 4. Predict
periods = 48  # 2 days
predictions = predictor.predict_future(df, periods=periods)

# 5. Get best prediction
best = predictor.get_best_prediction(predictions)

# 6. Display results
print(f"\nForecast for next {periods} hours:")
print(f"Model used: {predictions['recommended']}")
print(f"Next price: ${best['predicted_price'].iloc[0]:,.2f}")
print(f"Range: ${best['lower_bound'].iloc[0]:,.2f} - ${best['upper_bound'].iloc[0]:,.2f}")

Next Steps

XGBoost Model

Deep dive into short-term predictions

Prophet Model

Deep dive into long-term forecasting

Model Comparison

Detailed comparison of all three models

API Reference

Complete API documentation
