The Polymarket Bot’s prediction quality and risk-adjusted returns are highly sensitive to a handful of critical parameters. This guide walks through the tuning process, explains parameter interactions, and provides practical recommendations based on market behavior.

Tuning Philosophy

Golden rule: Tune on backtests, validate on live data, never optimize on the same data twice.
The bot uses a multi-model system (Black-Scholes + momentum + reversion) combined in logit space. Each component has tunable parameters, and they interact in non-obvious ways:
  1. EWMA lambda controls volatility estimation smoothness
  2. Logit weights control how momentum and reversion adjust base probability
  3. Abstention thresholds control when the model has enough edge to trade
  4. Risk parameters control position sizing and drawdown protection

EWMA Lambda (Volatility Estimation)

What It Does

The EWMA lambda parameter (engine.ewma.lambda) controls the decay rate for exponentially weighted volatility estimation. Higher lambda = more weight on the prior estimate = smoother volatility that reacts slowly to shocks. Formula: variance = lambda * variance + (1 - lambda) * (r^2 / dt)
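The update above can be sketched in a few lines (a standalone illustration, not the bot's actual implementation):

```javascript
// EWMA variance update: higher lambda keeps more of the prior estimate,
// so the volatility estimate is smoother and reacts more slowly to shocks.
function ewmaUpdate(variance, r, dt, lambda = 0.94) {
  return lambda * variance + (1 - lambda) * (r * r) / dt;
}

// A large return (r^2 above the current variance) moves the fast
// estimate (lambda=0.90) further than the slow one (lambda=0.96):
console.log(ewmaUpdate(0.01, 0.2, 1, 0.90)); // ≈ 0.0130
console.log(ewmaUpdate(0.01, 0.2, 1, 0.96)); // ≈ 0.0112
```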

Default Value

"engine": {
  "ewma": {
    "lambda": 0.94
  }
}

Tuning Range

| Lambda | Behavior | When to Use |
|---|---|---|
| 0.90-0.92 | Fast adaptation, noisy | High-frequency volatility regime (flash crashes, major news) |
| 0.93-0.94 | Balanced (default) | Normal market conditions |
| 0.95-0.96 | Slow adaptation, smooth | Low-frequency trends, stable markets |

Tuning Process

  1. Collect tick data over at least 100 intervals (8+ hours)
  2. Compute realized volatility per interval
  3. Run backtests with lambda in [0.90, 0.92, 0.94, 0.96]
  4. Evaluate:
    • Brier Score (calibration quality)
    • Sharpe Ratio (risk-adjusted returns)
    • Volatility outlier frequency (how often sigma exceeds sigmaMultiplier * meanSigma)

Example

# Backtest with lambda=0.92 (faster adaptation)
node src/backtest.js --lambda 0.92 --data data/ticks-2026-03-04.jsonl

# Compare Brier Score and Sharpe
node src/reporter/daily.js 2026-03-04

Start with 0.94. If you see frequent abstentions due to anomalous_regime during normal market conditions, increase lambda to 0.96. If the model misses rapid volatility shifts, decrease to 0.92.

Logit Weights (Momentum & Reversion)

What They Do

The logit weights control how momentum ROC and mean reversion signals adjust the Black-Scholes base probability in log-odds space. Formula: finalProb = sigmoid(logit(baseProb) + w_momentum * ROC + w_reversion * deviation) See src/engine/predictor.js:128-132.
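A minimal sketch of the combination (helper names are illustrative; see the predictor source for the real code):

```javascript
const logit = (p) => Math.log(p / (1 - p));
const sigmoid = (x) => 1 / (1 + Math.exp(-x));

// Adjust the Black-Scholes base probability in log-odds space.
function finalProb(baseProb, roc, deviation, wMomentum = 150, wReversion = 80) {
  return sigmoid(logit(baseProb) + wMomentum * roc + wReversion * deviation);
}

// With zero signals the base probability passes through unchanged;
// a small positive ROC (0.001) nudges a 50% base upward:
console.log(finalProb(0.5, 0, 0));     // 0.5
console.log(finalProb(0.5, 0.001, 0)); // ≈ 0.537
```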

Default Values

"engine": {
  "prediction": {
    "logitMomentumWeight": 150,
    "logitReversionWeight": 80
  }
}

Parameter Interaction

These weights are not independent. The ratio between them determines the model’s behavior:
| Ratio | Behavior | Risk |
|---|---|---|
| w_momentum / w_reversion > 2.5 | Trend-following (chases momentum) | Misses reversals, gets whipsawed |
| w_momentum / w_reversion ≈ 2.0 | Balanced (default: 150/80 = 1.875) | Good for mixed markets |
| w_momentum / w_reversion < 1.5 | Mean-reversion (fades moves) | Misses breakouts, fights trends |

Tuning Range

| Parameter | Range | Default |
|---|---|---|
| logitMomentumWeight | 50 - 300 | 150 |
| logitReversionWeight | 40 - 150 | 80 |

Tuning Process

  1. Fix one weight, sweep the other
    # Fix reversion at 80, sweep momentum from 100 to 200 in steps of 20
    for w in 100 120 140 160 180 200; do
      node src/backtest.js --momentum-weight $w --reversion-weight 80
    done
    
  2. Evaluate on held-out intervals:
    • Brier Score (lower is better)
    • Log Loss (lower is better)
    • Early 1m accuracy (higher is better, target >80%)
    • EV per trade (higher is better)
  3. Check calibration:
    node src/reporter/daily.js 2026-03-04 | grep "Murphy"
    
    Look for low resolution loss (MRES) and low calibration loss (MCAL).
A tuned configuration might then look like:
{
  "logitMomentumWeight": 180,
  "logitReversionWeight": 70
}
Do not tune on recent live data. You will overfit. Use data from at least 7 days ago, tune on 70% of intervals, validate on 30%.

Abstention Thresholds

The abstention system prevents trading when the model has no edge. There are 6 configurable conditions:

Critical Thresholds

engine.abstention.deadZone (number, default: 0.10)
Most impactful parameter. Controls the minimum edge before making a prediction.
  • Decrease (0.05-0.08) → Trade more often, accept smaller edges, higher risk of false signals
  • Increase (0.12-0.15) → Trade less often, require stronger edges, miss opportunities
Tuning goal: Maximize (EV per trade) * (trade frequency)
engine.abstention.minEV (number, default: 0.05)
Minimum expected value to place a bet. Works in conjunction with minMargin. Formula: EV = (p / q) - 1
  • Decrease (0.03) → Accept smaller edges, more trades
  • Increase (0.08-0.10) → Only trade high-EV opportunities
engine.abstention.minMargin (number, default: 0.15)
Minimum edge in percentage points (|p - q| >= minMargin).
  • Decrease (0.10-0.12) → Trade on smaller edges
  • Increase (0.18-0.20) → Require wider edges, fewer trades
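Taken together, the three thresholds act as a gate. The sketch below assumes the dead zone is measured against a 50% base probability and that p and q are the model and market probabilities; the real checks live in the bot's abstention module:

```javascript
// Gate a prediction through the three edge checks (sketch; 'insufficient_ev'
// is an illustrative reason name, the others match the bot's history log).
function shouldTrade(p, q, { deadZone = 0.10, minEV = 0.05, minMargin = 0.15 } = {}) {
  if (Math.abs(p - 0.5) < deadZone) return { trade: false, reason: 'dead_zone' };
  if (Math.abs(p - q) < minMargin) return { trade: false, reason: 'insufficient_margin' };
  const ev = p / q - 1; // EV = (p / q) - 1
  if (ev < minEV) return { trade: false, reason: 'insufficient_ev' };
  return { trade: true, ev };
}

console.log(shouldTrade(0.72, 0.55)); // trades: margin 0.17, EV ≈ 0.31
console.log(shouldTrade(0.55, 0.50)); // { trade: false, reason: 'dead_zone' }
```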

Tuning Process

  1. Start with defaults (deadZone=0.10, minEV=0.05, minMargin=0.15)
  2. Collect 200+ intervals of live data
  3. Analyze abstention reasons:
    grep '"abstained":true' data/history.json | jq .reason | sort | uniq -c
    
  4. Compute realized performance by condition:
    • If many insufficient_margin abstentions and high Brier Score → increase minMargin
    • If few trades and model is well-calibrated → decrease deadZone or minMargin
    • If many trades with negative EV → increase minEV

Example Analysis

# Count abstention reasons
$ grep '"abstained":true' data/history.json | jq -r .reason | sort | uniq -c
  12 anomalous_regime
  45 dead_zone
   3 insufficient_data
  18 insufficient_margin
   5 cold_streak

# Interpretation:
# - 45 dead_zone: Base probability near 50%, no directional edge
# - 18 insufficient_margin: Model has edge but not enough to overcome noise
# - 12 anomalous_regime: High volatility periods

# If accuracy on traded intervals is >85%, consider reducing deadZone to 0.08
# If accuracy is <75%, increase minMargin to 0.18
Conservative tuning: Start with deadZone=0.12, minEV=0.06, minMargin=0.18. This will trade less often but with higher conviction. Lower thresholds as your model proves calibration quality.

Risk Parameters

Brier Tiers (Kelly Fraction)

The bot uses dynamic Kelly fractions based on calibration quality:
"risk": {
  "brierTiers": [
    { "maxBrier": null, "minPredictions": 0, "maxPredictions": 100, "alpha": 0 },
    { "maxBrier": 1.0, "minPredictions": 100, "alpha": 0.10 },
    { "maxBrier": 0.26, "minPredictions": 100, "alpha": 0.20 },
    { "maxBrier": 0.22, "minPredictions": 100, "alpha": 0.25 },
    { "maxBrier": 0.18, "minPredictions": 100, "alpha": 0.40 }
  ]
}
Interpretation:
  • Tier 0: No trading until 100 predictions collected
  • Tier 1: Poor calibration (Brier > 0.26) → 10% Kelly (very conservative)
  • Tier 2: Decent calibration (0.22-0.26) → 20% Kelly
  • Tier 3: Good calibration (0.18-0.22) → 25% Kelly
  • Tier 4: Excellent calibration (< 0.18) → 40% Kelly
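The tier lookup can be sketched as follows (assumes later, stricter tiers override earlier ones when both match; illustrative, not the bot's source):

```javascript
// Default tiers from the config above.
const brierTiers = [
  { maxBrier: null, minPredictions: 0, maxPredictions: 100, alpha: 0 },
  { maxBrier: 1.0, minPredictions: 100, alpha: 0.10 },
  { maxBrier: 0.26, minPredictions: 100, alpha: 0.20 },
  { maxBrier: 0.22, minPredictions: 100, alpha: 0.25 },
  { maxBrier: 0.18, minPredictions: 100, alpha: 0.40 },
];

// Return the Kelly fraction for a given Brier score and prediction count.
function kellyAlpha(brier, nPredictions, tiers = brierTiers) {
  let alpha = 0;
  for (const t of tiers) {
    const countOk = nPredictions >= t.minPredictions &&
      (t.maxPredictions === undefined || nPredictions <= t.maxPredictions);
    const brierOk = t.maxBrier === null || brier <= t.maxBrier;
    if (countOk && brierOk) alpha = t.alpha; // stricter tiers override
  }
  return alpha;
}

console.log(kellyAlpha(0.20, 250)); // 0.25 (good-calibration tier)
console.log(kellyAlpha(0.50, 50));  // 0 (still collecting predictions)
```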

Tuning Alpha Values

Do not increase alpha beyond 0.50 (half-Kelly). Full Kelly is aggressive and can lead to large drawdowns. Even 50% Kelly assumes your edge estimate is perfectly accurate.
If your model achieves Brier < 0.18 consistently:
  • Conservative: Keep alpha at 0.40 (default tier 4)
  • Moderate: Increase to 0.45
  • Aggressive: Increase to 0.50 (the half-Kelly ceiling), monitoring drawdown closely

Max Bet Percentage

"risk": {
  "maxBetPct": 0.05
}
Caps individual bets at 5% of bankroll regardless of the Kelly calculation.

Tuning:
  • Conservative: 0.03 (3% max bet)
  • Default: 0.05 (5% max bet)
  • Aggressive: 0.08 (8% max bet, not recommended)
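For a binary market priced at q with win probability p, the full-Kelly fraction is (p - q) / (1 - q). A sketch of sizing with the alpha scaling and the maxBetPct cap (illustrative, not the bot's source):

```javascript
// Scale the full-Kelly fraction by alpha, then cap at maxBetPct.
function betSize(bankroll, p, q, alpha = 0.25, maxBetPct = 0.05) {
  const kelly = Math.max(0, (p - q) / (1 - q)); // full-Kelly fraction
  const fraction = Math.min(alpha * kelly, maxBetPct);
  return bankroll * fraction;
}

// Quarter-Kelly of f* = 0.333 is 8.3%, so the 5% cap binds:
console.log(betSize(1000, 0.70, 0.55)); // 50
// Smaller edge, cap does not bind:
console.log(betSize(1000, 0.60, 0.55)); // ≈ 27.78
```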

Drawdown Thresholds

"risk": {
  "drawdown": {
    "yellowPct": 0.10,
    "redPct": 0.20,
    "criticalPct": 0.30
  }
}
Effects:
  • Yellow (10%): Warning only, no bet sizing impact
  • Red (20%): Alpha multiplier = 0.5 (half-size bets)
  • Critical (30%): Trading suspended
Tuning:
  • If you hit red frequently: Lower alpha tiers or increase abstention thresholds
  • If you never hit yellow: Your bet sizing may be too conservative (missing edge)
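The band logic maps to a simple bet-size multiplier (illustrative sketch of the effects above):

```javascript
// Map current drawdown from the equity peak to an alpha multiplier.
// yellowPct carries no sizing impact (warning only).
function drawdownMultiplier(peak, equity,
    { yellowPct = 0.10, redPct = 0.20, criticalPct = 0.30 } = {}) {
  const dd = (peak - equity) / peak;
  if (dd >= criticalPct) return 0;   // critical: trading suspended
  if (dd >= redPct) return 0.5;      // red: half-size bets
  return 1;                          // green/yellow: full size
}

console.log(drawdownMultiplier(1000, 780)); // 22% drawdown → 0.5
```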

Practical Tuning Workflow

Phase 1: Collect Baseline Data (Days 1-7)

Run the bot with default parameters for at least 7 days.
{
  "engine": {
    "ewma": { "lambda": 0.94 },
    "prediction": {
      "logitMomentumWeight": 150,
      "logitReversionWeight": 80
    },
    "abstention": {
      "deadZone": 0.10,
      "minEV": 0.05,
      "minMargin": 0.15
    }
  }
}
Goal: Establish baseline Brier Score, accuracy, trade frequency, and EV.

Phase 2: Analyze Performance (Day 8)

Generate daily reports and identify weaknesses:
for day in 2026-03-01 2026-03-02 2026-03-03 2026-03-04 2026-03-05 2026-03-06 2026-03-07; do
  node src/reporter/daily.js $day
done
Key metrics:
  1. Brier Score: Should be < 0.22 to justify trading
  2. Early 1m accuracy: Target > 80%
  3. Trade frequency: 40-60% of intervals (not too selective, not too loose)
  4. EV per trade: Target > 0.05 (5%)
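Brier Score (metric 1) is the mean squared error of the predicted probabilities against the 0/1 outcomes; a quick sketch for checking it by hand (illustrative):

```javascript
// Mean squared error between predicted probabilities and 0/1 outcomes.
// Always predicting 50% scores 0.25; lower is better.
function brierScore(predictions) {
  const sum = predictions.reduce((s, { p, outcome }) => s + (p - outcome) ** 2, 0);
  return sum / predictions.length;
}

console.log(brierScore([
  { p: 0.8, outcome: 1 },
  { p: 0.7, outcome: 1 },
  { p: 0.3, outcome: 0 },
  { p: 0.6, outcome: 0 },
])); // ≈ 0.145
```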

Phase 3: Tune One Dimension (Days 8-14)

Pick one parameter group to tune:
Problem: Poor calibration → Tune: Logit weights
  1. Try reducing logitMomentumWeight to 120 (less aggressive)
  2. Try reducing logitReversionWeight to 60 (less mean reversion)
  3. Monitor Brier Score daily

Phase 4: Validate (Days 15-21)

Run with tuned parameters for 7 more days. Do not look at results daily (avoid overfitting). After 7 days, compare:
  • Brier Score (should improve by 0.02-0.05)
  • Sharpe Ratio (should improve)
  • Drawdown (should be shallower)
If metrics degrade, revert to baseline.

Advanced: Multi-Parameter Optimization

Once you have 30+ days of data, consider grid search:
# Grid search over momentum/reversion weights
for momentum in 120 140 160 180; do
  for reversion in 60 80 100; do
    node src/backtest.js \
      --momentum-weight $momentum \
      --reversion-weight $reversion \
      --data data/history-2026-03.json \
      --output results-${momentum}-${reversion}.json
  done
done

# Rank by Sharpe Ratio
jq -s 'sort_by(.sharpe) | reverse | .[0:5]' results-*.json
Overfitting risk is real. Always validate on out-of-sample data. A configuration that performs 2% better on training data but 5% worse on validation data is not an improvement.

Quick Reference

Conservative (High Accuracy, Low Frequency)

{
  "engine": {
    "ewma": { "lambda": 0.96 },
    "prediction": {
      "logitMomentumWeight": 120,
      "logitReversionWeight": 80
    },
    "abstention": {
      "deadZone": 0.12,
      "minEV": 0.07,
      "minMargin": 0.18
    }
  },
  "risk": {
    "maxBetPct": 0.03,
    "brierTiers": [
      { "maxBrier": null, "minPredictions": 0, "maxPredictions": 150, "alpha": 0 },
      { "maxBrier": 0.22, "minPredictions": 150, "alpha": 0.15 },
      { "maxBrier": 0.18, "minPredictions": 150, "alpha": 0.25 }
    ]
  }
}

Aggressive (Higher Frequency, Moderate Accuracy)

{
  "engine": {
    "ewma": { "lambda": 0.92 },
    "prediction": {
      "logitMomentumWeight": 180,
      "logitReversionWeight": 70
    },
    "abstention": {
      "deadZone": 0.08,
      "minEV": 0.04,
      "minMargin": 0.12
    }
  },
  "risk": {
    "maxBetPct": 0.06,
    "brierTiers": [
      { "maxBrier": null, "minPredictions": 0, "maxPredictions": 100, "alpha": 0 },
      { "maxBrier": 0.26, "minPredictions": 100, "alpha": 0.15 },
      { "maxBrier": 0.22, "minPredictions": 100, "alpha": 0.30 },
      { "maxBrier": 0.18, "minPredictions": 100, "alpha": 0.50 }
    ]
  }
}

Next Steps

Config Reference

Full reference for all configuration parameters.

Environment Setup

Set up data directories and logging for production.
