
Overview

CryptoView Pro employs three specialized machine learning models, each optimized for different forecasting horizons. The system intelligently selects the best model based on the prediction timeframe, or combines them using a hybrid approach.

XGBoost

1-72 hours
Fast, accurate short-term predictions using gradient boosting

Prophet

1 week - 1 year
Trend and seasonality detection for long-term forecasts

Hybrid

Adaptive
Weighted ensemble combining both models

Model Selection Logic

The system recommends models based on forecast horizon:
# Recommendation logic
if forecast_hours <= 72:
    recommended_model = "XGBoost"  # Short-term precision
elif forecast_hours <= 720:  # 30 days
    recommended_model = "Hybrid"   # Balanced approach
else:
    recommended_model = "Prophet"  # Long-term trends
Users can override the recommendation and manually select any model through the sidebar.
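Wrapped as a standalone helper (the function name is illustrative; the thresholds come from the snippet above):

```python
def recommend_model(forecast_hours: int) -> str:
    """Map a forecast horizon in hours to the suggested model."""
    if forecast_hours <= 72:
        return "XGBoost"   # Short-term precision
    elif forecast_hours <= 720:  # 30 days
        return "Hybrid"    # Balanced approach
    return "Prophet"       # Long-term trends
```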

XGBoost Model

Location: models/xgboost_model.py

Architecture

XGBoost (eXtreme Gradient Boosting) uses an ensemble of decision trees trained sequentially, with each tree correcting errors from previous trees.
Best for: Intraday and swing trading (1-72 hours)
Strengths: Fast training, handles non-linear patterns, robust to outliers
Limitations: Performance degrades beyond 3 days

Model Configuration

models/xgboost_model.py
class XGBoostCryptoPredictor:
    def __init__(self, 
                 n_estimators: int = 200,      # Number of trees
                 learning_rate: float = 0.07,   # Step size shrinkage
                 max_depth: int = 6,            # Tree depth
                 subsample: float = 0.8,        # Row sampling
                 colsample_bytree: float = 0.8): # Column sampling
        
        self.model = xgb.XGBRegressor(
            n_estimators=n_estimators,
            learning_rate=learning_rate,
            max_depth=max_depth,
            subsample=subsample,
            colsample_bytree=colsample_bytree,
            objective='reg:squarederror',  # MSE loss
            random_state=42,
            n_jobs=-1  # Use all CPU cores
        )

Feature Engineering

XGBoost’s strength lies in its 60+ engineered features:
Multi-timeframe percentage changes:
df['return_1'] = df['close'].pct_change(1)      # 1-hour return
df['return_4'] = df['close'].pct_change(4)      # 4-hour return
df['return_24'] = df['close'].pct_change(24)    # 24-hour return
df['return_168'] = df['close'].pct_change(168)  # 7-day return
Simple moving averages and price ratios:
for window in [7, 14, 30, 50]:
    df[f'ma_{window}'] = df['close'].rolling(window=window).mean()
    df[f'price_to_ma_{window}'] = df['close'] / df[f'ma_{window}']
Weighted averages giving more importance to recent prices:
for span in [12, 26, 50]:
    df[f'ema_{span}'] = df['close'].ewm(span=span, adjust=False).mean()
Rolling standard deviation of returns:
for window in [7, 14, 30]:
    df[f'volatility_{window}'] = df['return_1'].rolling(window=window).std()
Price change over fixed periods:
df['momentum_7'] = df['close'] - df['close'].shift(7)
df['momentum_14'] = df['close'] - df['close'].shift(14)
Volatility bands and price position:
df['bb_middle'] = df['close'].rolling(window=20).mean()
bb_std = df['close'].rolling(window=20).std()
df['bb_upper'] = df['bb_middle'] + (bb_std * 2)
df['bb_lower'] = df['bb_middle'] - (bb_std * 2)
df['bb_position'] = (df['close'] - df['bb_lower']) / (df['bb_upper'] - df['bb_lower'])
Trading volume patterns:
df['volume_ma_7'] = df['volume'].rolling(window=7).mean()
df['volume_ratio'] = df['volume'] / df['volume_ma_7']
df['volume_change'] = df['volume'].pct_change(1)
Intrabar price relationships:
df['high_low_ratio'] = df['high'] / df['low']
df['close_open_ratio'] = df['close'] / df['open']
Time-based cyclical patterns:
df['hour'] = df.index.hour              # Hour of day (0-23)
df['day_of_week'] = df.index.dayofweek  # Day of week (0-6)
df['day_of_month'] = df.index.day       # Day of month (1-31)
df['month'] = df.index.month            # Month (1-12)
Momentum oscillator features:
df['rsi_normalized'] = df['rsi'] / 100
df['rsi_oversold'] = (df['rsi'] < 30).astype(int)
df['rsi_overbought'] = (df['rsi'] > 70).astype(int)
Trend following features:
df['macd_diff'] = df['macd'] - df['macd_signal']
df['macd_positive'] = (df['macd_diff'] > 0).astype(int)
Past price values:
for lag in [1, 2, 3, 7, 14]:
    df[f'close_lag_{lag}'] = df['close'].shift(lag)
Total: 60+ features capturing price dynamics, trends, volatility, and market microstructure.
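As a condensed, self-contained sketch, here are a few of the features above computed on synthetic hourly data (the real pipeline operates on full OHLCV exchange data):

```python
import numpy as np
import pandas as pd

# Toy hourly close/volume frame standing in for real exchange data
idx = pd.date_range("2024-01-01", periods=200, freq="h")
rng = np.random.default_rng(42)
close = pd.Series(100 + rng.normal(0, 1, 200).cumsum(), index=idx)
df = pd.DataFrame({"close": close,
                   "volume": rng.integers(1_000, 5_000, 200)}, index=idx)

# A few of the features described above
df["return_1"] = df["close"].pct_change(1)
df["ma_7"] = df["close"].rolling(window=7).mean()
df["price_to_ma_7"] = df["close"] / df["ma_7"]
df["volatility_7"] = df["return_1"].rolling(window=7).std()
df["hour"] = df.index.hour

# Drop warm-up rows where rolling windows are not yet filled
features = df.dropna()
```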

Training Process

1

Data Preparation

# Time series split (respects temporal order)
X_train, X_test, y_train, y_test = prepare_data(df, train_size=0.8)

# Feature scaling with MinMaxScaler
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
2

Model Training

model.fit(
    X_train_scaled, y_train,
    eval_set=[(X_test_scaled, y_test)],  # Validation during training
    verbose=False
)
3

Evaluation Metrics

Multiple metrics assess model quality:
  • MAE (Mean Absolute Error): Average prediction error in dollars
  • RMSE (Root Mean Squared Error): Penalizes large errors
  • MAPE (Mean Absolute Percentage Error): Error as percentage
  • Direction Accuracy: % of correct up/down predictions
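As a sketch, all four metrics can be computed from arrays of actual and predicted prices (the function name is illustrative):

```python
import numpy as np

def evaluate(actual: np.ndarray, predicted: np.ndarray) -> dict:
    """Compute the four metrics described above."""
    errors = actual - predicted
    mae = np.mean(np.abs(errors))
    rmse = np.sqrt(np.mean(errors ** 2))
    mape = np.mean(np.abs(errors / actual)) * 100
    # Direction accuracy: fraction of steps where the predicted
    # up/down move matches the actual move
    direction = np.mean(
        np.sign(np.diff(actual)) == np.sign(np.diff(predicted))
    ) * 100
    return {"mae": mae, "rmse": rmse, "mape": mape,
            "direction_accuracy": direction}

metrics = evaluate(np.array([100.0, 102.0, 101.0, 103.0]),
                   np.array([101.0, 101.5, 101.5, 102.0]))
```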

Prediction Generation

Recursive multi-step forecasting:
def predict_future(df, periods=24):
    predictions = []
    df_work = df.copy()
    
    for i in range(periods):
        # 1. Create features from current data
        df_features = create_features(df_work)
        
        # 2. Get last row and scale
        last_row = df_features[feature_columns].iloc[-1:]
        last_row_scaled = scaler.transform(last_row)
        
        # 3. Predict next price
        pred = model.predict(last_row_scaled)[0]
        predictions.append(pred)
        
        # 4. Add prediction to data for next iteration
        new_row = create_synthetic_row(pred)
        df_work = pd.concat([df_work, new_row])
    
    return predictions
Recursive forecasting can accumulate errors over time. This is why XGBoost is best for horizons ≤72 hours.

Confidence Intervals

Prediction uncertainty is estimated using statistical methods:
def create_prediction_intervals(predictions, confidence=0.95):
    std_estimate = predictions['predicted_price'].std()
    z_score = stats.norm.ppf((1 + confidence) / 2)  # 1.96 for 95%
    margin = z_score * std_estimate
    
    predictions['lower_bound'] = predictions['predicted_price'] - margin
    predictions['upper_bound'] = predictions['predicted_price'] + margin
    
    return predictions

Feature Importance

XGBoost tracks which features are most predictive:
importance_df = predictor.get_feature_importance()
# Returns: DataFrame sorted by importance
# Top features typically: close_lag_1, ema_12, volume_ratio, rsi
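The body of get_feature_importance is not shown in the source; a plausible sketch, assuming the model exposes the scikit-learn-style `feature_importances_` attribute (which `xgb.XGBRegressor` does), might look like:

```python
import numpy as np
import pandas as pd

def get_feature_importance(model, feature_columns):
    """Return a DataFrame of features sorted by importance (descending)."""
    return (pd.DataFrame({"feature": feature_columns,
                          "importance": model.feature_importances_})
            .sort_values("importance", ascending=False)
            .reset_index(drop=True))

# Stand-in object so the sketch runs without a trained model
class _DummyModel:
    feature_importances_ = np.array([0.1, 0.6, 0.3])

top = get_feature_importance(_DummyModel(), ["ma_7", "close_lag_1", "rsi"])
```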

Prophet Model

Location: models/prophet_model.py

Architecture

Prophet (developed by Meta/Facebook) decomposes time series into: y(t) = trend(t) + seasonality(t) + holidays(t) + error(t)
Best for: Medium to long-term forecasting (1 week - 1 year)
Strengths: Handles trends, seasonality, missing data, outliers
Limitations: Less accurate for short-term fluctuations

Model Configuration

models/prophet_model.py
class ProphetCryptoPredictor:
    def __init__(self, 
                 changepoint_prior_scale: float = 0.5,
                 seasonality_prior_scale: float = 10,
                 interval_width: float = 0.95):
        
        self.model = Prophet(
            changepoint_prior_scale=changepoint_prior_scale,  # Trend flexibility
            seasonality_prior_scale=seasonality_prior_scale,  # Seasonality strength
            interval_width=interval_width,    # Confidence intervals
            daily_seasonality=True,           # 24-hour patterns
            weekly_seasonality=True,          # 7-day patterns
            yearly_seasonality=False,         # Not relevant for crypto
            seasonality_mode='multiplicative' # Scales with trend
        )

Key Parameters Explained

changepoint_prior_scale (float, default: 0.5)
Controls trend flexibility. Higher values allow more dramatic trend changes, suitable for volatile crypto markets.
  • Low (0.001-0.05): Smooth, stable trends
  • Medium (0.05-0.5): Moderate flexibility
  • High (0.5-1.0): Highly flexible, follows volatility
seasonality_prior_scale (float, default: 10)
Strength of seasonal components. Higher values detect stronger daily/weekly patterns.
  • Crypto markets have weak seasonality compared to traditional markets
  • Value of 10 balances pattern detection with noise filtering
seasonality_mode (string, default: "multiplicative")
How seasonality scales with the trend:
  • Additive: Seasonality has constant amplitude
  • Multiplicative: Seasonality scales proportionally with price (better for crypto)
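A quick numeric illustration with a hypothetical seasonal effect of +$5 (additive) versus ×1.05 (multiplicative): the additive swing stays at $5 at every price level, while the multiplicative swing grows with the trend.

```python
trend = [100.0, 1_000.0, 10_000.0]  # hypothetical trend levels

additive = [t + 5 for t in trend]           # constant $5 seasonal swing
multiplicative = [t * 1.05 for t in trend]  # swing scales with price
```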

Data Preparation

Prophet requires specific DataFrame format:
def prepare_data(df):
    prophet_df = pd.DataFrame({
        'ds': df.index,        # Datetime column (required name)
        'y': df['close']       # Target variable (required name)
    })
    return prophet_df.dropna()

Training Process

1

Data Validation

Prophet requires a minimum of 100 data points and no missing values in the target.
2

Model Fitting

prophet_df = prepare_data(df)
model.fit(prophet_df)  # Automatically detects changepoints and seasonality
Prophet’s Stan-based backend performs MAP (Maximum A Posteriori) estimation.
3

In-Sample Evaluation

forecast = model.predict(prophet_df)

# Metrics calculated on training data
mae = np.mean(np.abs(actual - forecast['yhat']))
direction_accuracy = np.mean(
    np.sign(np.diff(actual)) == np.sign(np.diff(forecast['yhat'].values))
)

Prediction Generation

Prophet makes predictions all at once (not recursively):
def predict_future(periods, freq='H'):
    # 1. Create future datetime index
    future = model.make_future_dataframe(periods=periods, freq=freq)
    
    # 2. Generate forecast
    forecast = model.predict(future)
    
    # 3. Extract future predictions only
    future_forecast = forecast[forecast['ds'] > last_train_date]
    
    return pd.DataFrame({
        'timestamp': future_forecast['ds'],
        'predicted_price': future_forecast['yhat'],
        'lower_bound': future_forecast['yhat_lower'],  # Built-in CI
        'upper_bound': future_forecast['yhat_upper'],
        'trend': future_forecast['trend']              # Isolated trend
    })

Forecast Components

Prophet provides interpretable decomposition:
  • yhat: Final prediction (trend + seasonality)
  • trend: Long-term direction
  • daily: Daily seasonal component
  • weekly: Weekly seasonal component
  • yhat_lower/upper: Uncertainty intervals

Backtesting

Time series cross-validation:
def backtest_prophet(df, predictor, test_periods=168):
    # 1. Split data
    train_df = df.iloc[:-test_periods]
    test_df = df.iloc[-test_periods:]
    
    # 2. Train on historical data
    predictor.train(train_df)
    
    # 3. Predict test period
    predictions = predictor.predict_future(periods=test_periods)
    
    # 4. Compare with actual
    actual = test_df['close'].values
    predicted = predictions['predicted_price'].values
    
    # 5. Calculate metrics
    return {
        'mae': np.mean(np.abs(actual - predicted)),
        'direction_accuracy': np.mean(
            np.sign(np.diff(actual)) == np.sign(np.diff(predicted))
        )
    }

Hybrid Model

Location: models/hybrid_model.py

Concept

The Hybrid model intelligently combines XGBoost and Prophet predictions using dynamic weighting based on forecast horizon.
Best for: All time horizons
Strengths: Adaptive, leverages both models’ strengths
Approach: Ensemble learning with time-based weights

Architecture

models/hybrid_model.py
class HybridCryptoPredictor:
    def __init__(self):
        self.xgboost = XGBoostCryptoPredictor()
        self.prophet = ProphetCryptoPredictor()
        self.trained = False

Training Process

Both models train independently:
def train(self, df):
    # Train XGBoost
    xgb_metrics = self.xgboost.train(df, train_size=0.8)
    
    # Train Prophet
    prophet_metrics = self.prophet.train(df)
    
    return {
        'xgboost': xgb_metrics,
        'prophet': prophet_metrics,
        'data_points': len(df)
    }

Dynamic Weighting Algorithm

Weight calculation based on forecast horizon:
def predict_future(self, df, periods):
    predictions = {}
    
    # XGBoost for short-term (≤72h)
    if periods <= 72:
        predictions['xgboost'] = self.xgboost.predict_future(df, periods)
        predictions['recommended'] = 'xgboost'
    
    # Prophet for medium/long-term (>24h)
    if periods > 24:
        predictions['prophet'] = self.prophet.predict_future(periods, freq='H')
        if periods > 72:
            predictions['recommended'] = 'prophet'
    
    # Create weighted ensemble if both available
    if 'xgboost' in predictions and 'prophet' in predictions:
        # Weight calculation: XGBoost fades out linearly by 168h
        xgb_weight = max(0, 1 - (periods / 168))
        prophet_weight = 1 - xgb_weight
        
        # Weighted average of both price series
        combined = predictions['xgboost'].copy()
        combined['predicted_price'] = (
            predictions['xgboost']['predicted_price'].values * xgb_weight +
            predictions['prophet']['predicted_price'].values * prophet_weight
        )
        
        predictions['hybrid'] = combined
        predictions['weights'] = {'xgboost': xgb_weight, 'prophet': prophet_weight}
        predictions['recommended'] = 'hybrid'
    
    return predictions

Weighting Examples

24 Hours

XGBoost: 86%
Prophet: 14%
Short-term, favor XGBoost

72 Hours

XGBoost: 57%
Prophet: 43%
Balanced ensemble

7 Days (168h)

XGBoost: 0%
Prophet: 100%
Transition complete
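These percentages follow directly from the weight formula used in the hybrid model; a tiny standalone helper (name illustrative) reproduces them:

```python
def ensemble_weights(periods: int) -> dict:
    """Linear horizon-based weights: XGBoost fades out by 168h (7 days)."""
    xgb_weight = max(0.0, 1 - periods / 168)
    return {"xgboost": xgb_weight, "prophet": 1 - xgb_weight}
```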

Confidence Interval Blending

Uncertainty bounds are also weighted:
combined['lower_bound'] = (
    xgb['lower_bound'] * xgb_weight + 
    prophet['lower_bound'] * prophet_weight
)

combined['upper_bound'] = (
    xgb['upper_bound'] * xgb_weight + 
    prophet['upper_bound'] * prophet_weight
)

Advantages

Seamless Transition: Smoothly transitions from XGBoost to Prophet as horizon increases
Best of Both: Captures short-term patterns AND long-term trends
Reduced Error: Ensemble typically outperforms individual models
Robust: If one model fails, the other provides backup

Model Comparison

XGBoost
Training Time: Fast (1-2 seconds)
Prediction Time: Fast (milliseconds)
Memory Usage: Moderate
Accuracy: High for short-term
Interpretability: Medium (feature importance)
Prophet
Training Time: Moderate (5-10 seconds)
Prediction Time: Fast (milliseconds)
Memory Usage: Low
Accuracy: High for long-term
Interpretability: High (decomposable components)
Hybrid
Training Time: Moderate (sum of both)
Prediction Time: Fast (minimal overhead)
Memory Usage: High (both models loaded)
Accuracy: Best overall
Interpretability: Medium (blended)

Performance Metrics

All models track these metrics:

MAE

Mean Absolute Error
Average prediction error in dollars
Lower is better

RMSE

Root Mean Squared Error
Emphasizes large errors
Lower is better

MAPE

Mean Absolute Percentage Error
Error as percentage of actual price
Lower is better

Direction Accuracy

Directional Prediction Accuracy
% of correct up/down predictions
Higher is better

Next Steps

Technical Indicators

Learn about the indicators that feed into the models

Architecture

Understand how models fit into the overall system
