Iron Condor Backtest on ES Futures: +1539% in 20 Years — VRP, Greeks, and Honest Limits

Quant Trading

A full Iron Condor backtest on ES Futures from 2005 to 2026: how to reconstruct a volatility surface without options data, how to model the VRP, and what the five Greeks actually mean for a short-vol position. Includes five reinvestment scenarios with real metrics (CAGR, Sharpe 1.68, Sortino 1.46, max drawdown −9.4%), an honest breakdown of the model's limits (flat skew, fixed VRP), and a description of how the strategy runs in production with dynamic delta hedging, vega hedging, and charm monitoring.

Antoine Perrin

CEO - CodeMarketLabs

2026-05-01

In 2005, a systematic options strategy was already quietly compounding on ES Futures. Twenty years later, the numbers for the 50% SPX + Condor allocation tell a clear story: +1539% total return, a Sharpe ratio of 1.14, and a maximum drawdown of −28.4%, against the S&P 500's −57% during the GFC. This article walks through exactly how this Iron Condor backtest was built: the volatility surface reconstruction, the signal logic, the reinvestment scenarios, and, critically, where the model is too generous with itself.

What this article covers

  • What the Volatility Risk Premium (VRP) is and why IV exceeds RV more than 70% of the time on equity indices.
  • Which market regimes favor short-volatility strategies — and when to stay flat or delta-hedge.
  • The five Greeks that matter for an Iron Condor: delta, gamma, theta, vega, and charm.
  • Full backtest results on ES Futures (2005–2026): CAGR, Sharpe, Sortino, max drawdown across five portfolio configurations.
  • The honest limits of the model: flat skew, fixed VRP, no term structure — what would change with real options data.
  • How the strategy works in production: systematic entry signals, delta hedging via ES futures, vega hedging, and charm monitoring.

1. The Volatility Risk Premium: The Edge Behind the Strategy

The Volatility Risk Premium (VRP) is the difference between implied volatility and subsequently realized volatility. On equity indices, implied vol — the vol priced into options — has historically been higher than the vol that actually materialized. The VIX has averaged around 19–20% over the past 30 years, while the realized volatility of the S&P 500 has averaged closer to 15–16%. That gap of roughly 3 to 5 vol points is the VRP. It is the risk premium that option buyers pay for protection, and that option sellers collect in exchange for bearing tail risk.
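
To make the definition concrete, here is a minimal sketch of how a single VRP observation could be measured: take the implied vol quoted at entry and subtract the close-to-close volatility realized over the following window. The function names, the 252-day annualization, and the sample inputs are assumptions for illustration, not the notebook's code.

```python
import math

def realized_vol(closes, periods_per_year=252):
    """Annualized close-to-close realized volatility from a list of prices."""
    rets = [math.log(b / a) for a, b in zip(closes, closes[1:])]
    mean = sum(rets) / len(rets)
    var = sum((r - mean) ** 2 for r in rets) / (len(rets) - 1)
    return math.sqrt(var * periods_per_year)

def vrp(implied_vol, closes_after):
    """VRP in vol units: IV quoted at entry minus the vol realized afterwards."""
    return implied_vol - realized_vol(closes_after)

# Hypothetical example: IV of 19% at entry, then about one month of daily closes.
closes = [5000 * (1.0 + 0.008 * (-1) ** i) for i in range(22)]
print(round(vrp(0.19, closes), 4))
```

A positive result means the option seller was paid for more volatility than actually materialized; a negative result is one of the inversions described below.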

The VRP is not constant. It compresses in calm regimes (2013–2017, late 2024) and explodes during stress — in March 2020, the VIX hit 85 while realized vol over the same 30 days was closer to 60%. In April 2025, the market rallied 7–10% in a single session — a move the implied vol was not pricing. That is exactly the kind of regime where this strategy loses. When the VRP is positive and stable, selling options collects premium consistently. When it collapses or inverts — as it can during fast, sharp dislocations — short volatility positions bleed.

On the chart below, green zones show periods where VRP was positive — the structural edge of the strategy. Red zones show inversions: the realized vol exploded past the implied vol, and any short-vol position was on the wrong side of the trade.

Chart: historical VRP on S&P 500 futures

2. Market Regimes: When to Sell and When to Stay Flat

Not all market environments are favorable for selling volatility. Three regimes matter: range markets with low realized vol, where the Iron Condor collects theta without being touched; trending bull markets, where the strategy needs active delta hedging to stay directionally neutral; and crisis or vol-spike regimes, where selling should stop entirely — or positions should be closed.

A simple heuristic: use VIX percentile over the trailing 12 months. If VIX is above the 50th percentile, the premium is sufficient to justify opening positions. Below the 25th percentile, the credit collected is too thin relative to the gamma risk taken. In 2017, the VIX traded below 10 for extended stretches — a regime where the VRP was structurally compressed and most short-vol strategies had minimal edge.
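
The percentile heuristic can be sketched in a few lines. The thresholds come from the paragraph above; the label for the zone between the 25th and 50th percentile ("neutral") is an assumption, since the article does not say what to do there, and `vix_history` is assumed to hold roughly 12 months of daily VIX closes.

```python
def percentile_rank(history, value):
    """Share of observations in `history` that are <= value, in percent."""
    return 100.0 * sum(1 for x in history if x <= value) / len(history)

def entry_regime(vix_history, vix_today, open_above=50.0, flat_below=25.0):
    """Classify today's VIX against its trailing history."""
    rank = percentile_rank(vix_history, vix_today)
    if rank > open_above:
        return "open"        # premium judged sufficient to sell
    if rank < flat_below:
        return "stay_flat"   # credit too thin for the gamma risk
    return "neutral"         # assumption: left to discretion

# Hypothetical trailing window of VIX closes.
history = [float(v) for v in range(10, 30)]
print(entry_regime(history, 25.0), entry_regime(history, 12.0))
```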

Chart: market regimes on S&P 500 futures

3. The Greeks: Five Numbers That Define Your Risk

Selling an Iron Condor creates a specific risk profile that can be fully described by five Greeks. Understanding each one is not optional — it is the difference between managing the position and being surprised by it.

  • Delta (~0 at entry) measures directional exposure. The goal of the Iron Condor is to stay delta-neutral: at entry, the short call spread and the short put spread offset each other. Delta is the number to watch and rebalance continuously, via long or short ES futures, as the market moves.
  • Gamma (negative) is the enemy. It measures how fast your delta changes when the underlying moves. Being short gamma means every move against you accelerates your losses, so when gamma becomes too large, any significant move in the underlying generates outsized P&L damage.
  • Theta (positive) is the reward. Each day that passes, the options you sold lose time value, and that decay flows into your account as profit. Theta is the reason the strategy works in range markets: time is on your side.
  • Vega (negative) is the second major risk. If implied vol spikes after you sold options, the mark-to-market value of your short position deteriorates. A VIX move from 15 to 30 in a single week can wipe out weeks of theta collection.
  • Charm, the one Greek rarely discussed, measures how your delta drifts as time passes, even with no market movement. If you are short a call spread and the spot is close to your short strike, charm pushes your delta more negative every day. That directional drift is invisible on a static Greek snapshot but accumulates into a real risk as expiration approaches.
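
As an illustration, the five Greeks of a short Iron Condor can be estimated by bumping a pricing function. The sketch below assumes Black-76 pricing on the future with a flat vol and zero rates, and uses finite differences rather than closed-form Greeks; the strikes, vol, and bump sizes are hypothetical.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black76(F, K, T, sigma, kind, r=0.0):
    """Black-76 price of a European option on a future."""
    sd = sigma * math.sqrt(T)
    d1 = (math.log(F / K) + 0.5 * sd * sd) / sd
    d2 = d1 - sd
    df = math.exp(-r * T)
    if kind == "call":
        return df * (F * norm_cdf(d1) - K * norm_cdf(d2))
    return df * (K * norm_cdf(-d2) - F * norm_cdf(-d1))

def condor_value(F, T, sigma, strikes):
    """Value to the holder of a short iron condor (negative: premium owed)."""
    k_lp, k_sp, k_sc, k_lc = strikes  # long put, short put, short call, long call
    return (black76(F, k_lp, T, sigma, "put") - black76(F, k_sp, T, sigma, "put")
            - black76(F, k_sc, T, sigma, "call") + black76(F, k_lc, T, sigma, "call"))

def condor_greeks(F, T, sigma, strikes, h=1.0, dv=0.0001, dt=1.0 / 365.0):
    """Finite-difference Greeks, in index points per condor."""
    v = lambda f=F, t=T, s=sigma: condor_value(f, t, s, strikes)
    delta = lambda t: (v(f=F + h, t=t) - v(f=F - h, t=t)) / (2 * h)
    return {
        "delta": delta(T),
        "gamma": (v(f=F + h) - 2 * v() + v(f=F - h)) / (h * h),
        "vega": (v(s=sigma + dv) - v(s=sigma - dv)) / (2 * dv) / 100.0,  # per vol point
        "theta": v(t=T - dt) - v(),          # P&L from one calendar day passing
        "charm": delta(T - dt) - delta(T),   # delta drift from one calendar day passing
    }

greeks = condor_greeks(5000.0, 30.0 / 365.0, 0.18, (4700.0, 4750.0, 5250.0, 5300.0))
for name, value in greeks.items():
    print(f"{name:>6}: {value:+.5f}")
```

With the spot centered between the short strikes, the signs should match the list above: delta near zero, gamma and vega negative, theta positive. Re-running with `F` close to the short call strike is one way to see charm and delta diverge from their entry values.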

4. The Iron Condor Backtest: 20 Years on ES Futures

The backtest runs from January 2005 to April 2026 on ES Futures (E-mini S&P 500, $50 multiplier). Every third Friday, a new Iron Condor is opened: a short put spread and a short call spread placed one standard deviation away from the spot price. Wing width is 50 points per side. The position is closed when it reaches 100% of the credit received (take profit). Starting capital is $1,000,000. No stop-loss is applied in the base configuration.
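
A minimal sketch of the strike placement described above. The one-standard-deviation move is taken here as spot × vol × √(days/365), and short strikes are rounded outward to a 5-point grid; both the day count and the strike grid are assumptions, not details confirmed by the backtest.

```python
import math

def condor_strikes(spot, vol, days_to_expiry, wing_width=50.0, tick=5.0):
    """Return (long put, short put, short call, long call) strikes."""
    move = spot * vol * math.sqrt(days_to_expiry / 365.0)   # one-sigma move in points
    short_put = math.floor((spot - move) / tick) * tick     # round away from spot
    short_call = math.ceil((spot + move) / tick) * tick     # round away from spot
    return (short_put - wing_width, short_put, short_call, short_call + wing_width)

# Hypothetical entry: spot 5000, 18% vol, 30 days to expiry.
print(condor_strikes(5000.0, 0.18, 30))
```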

Five portfolio scenarios are compared: the raw Condor cash equity, SPX buy-and-hold, the Condor with profits reinvested in SPX, the Condor with profits compounding at the risk-free rate (IRX), and a 50/50 split between SPX and the Condor.

Chart: Iron Condor backtest results

The raw Condor cash delivers a Sharpe of 1.38 and a Sortino of 1.22 — exceptional risk-adjusted numbers — but a CAGR of only 9.4%. Compounding at the risk-free rate (Condor + IRX) improves the Sharpe to 1.68 and Sortino to 1.46, with a max drawdown of just −9.4%. The most compelling allocation for a practitioner is 50% SPX + Condor: CAGR 14.1%, Sharpe 1.14, Sortino 1.18, max drawdown −28.4%, total return +1539% — against the SPX's +496% with a −57% drawdown.
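
For readers who want to reproduce this kind of table on their own equity curve, here is one common way to compute the five metrics. The conventions are assumptions (daily data, 252 periods per year, sample standard deviation for Sharpe, downside deviation against zero for Sortino) and may differ from those used to produce the figures above.

```python
import math

def performance(equity, periods_per_year=252, rf_per_period=0.0):
    """CAGR, Sharpe, Sortino, max drawdown and total return from an equity curve."""
    rets = [b / a - 1.0 for a, b in zip(equity, equity[1:])]
    excess = [r - rf_per_period for r in rets]
    n = len(excess)
    mean = sum(excess) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in excess) / (n - 1))
    downside = math.sqrt(sum(min(x, 0.0) ** 2 for x in excess) / n)
    peak, max_dd = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)                    # running high-water mark
        max_dd = min(max_dd, value / peak - 1.0)   # deepest fall from that mark
    years = n / periods_per_year
    ann = math.sqrt(periods_per_year)
    return {
        "total_return": equity[-1] / equity[0] - 1.0,
        "cagr": (equity[-1] / equity[0]) ** (1.0 / years) - 1.0,
        "sharpe": mean / std * ann if std > 0 else float("nan"),
        "sortino": mean / downside * ann if downside > 0 else float("nan"),
        "max_drawdown": max_dd,
    }

# Hypothetical four-point equity curve, treated as one year of three periods.
print(performance([100.0, 110.0, 99.0, 120.0], periods_per_year=3))
```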

5. The Volatility Surface: How Options Are Priced in the Backtest

Backtesting an options strategy requires implied volatility data for every strike and every day. That data does not exist for free over 20 years. The solution used here is to reconstruct the surface synthetically: fit a GARCH(1,1) model with Student-t innovations on the daily return series to extract conditional volatility estimates, then generate an SVI (Stochastic Volatility Inspired) smile for each date using regime-dependent parameters. The GARCH model mean-reverts faster than a rolling window — after a spike, it decays toward the long-run mean in days rather than weeks, which makes it a better proxy for what the market would have implied. The Student-t distribution captures the fat tails of daily equity returns, producing better-calibrated vol estimates during the tail events that dominate options P&L.
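
The mean-reversion behavior is easiest to see in the GARCH(1,1) variance recursion itself. The sketch below only filters a return series with given parameters; it does not fit them. In practice `omega`, `alpha`, and `beta` would come from a maximum-likelihood fit with Student-t innovations, and the values used here are hypothetical.

```python
import math

def garch_vol_path(returns, omega, alpha, beta, periods_per_year=252):
    """Annualized conditional vol for each day, using information up to the prior day."""
    var = omega / (1.0 - alpha - beta)   # start at the unconditional (long-run) variance
    vols = []
    for r in returns:
        vols.append(math.sqrt(var * periods_per_year))
        var = omega + alpha * r * r + beta * var   # GARCH(1,1) update for the next day
    return vols

# Hypothetical parameters: long-run daily vol of 1%, persistence alpha + beta = 0.95.
returns = [0.0, 0.0, 0.0, -0.05, 0.0, 0.0, 0.0]
for day, vol in enumerate(garch_vol_path(returns, 5e-6, 0.10, 0.85)):
    print(day, round(vol, 4))
```

The day after the −5% return the conditional vol jumps, then decays geometrically at rate `alpha + beta` toward the long-run level, which is the fast mean reversion described above.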

The SVI parameterization maps log-moneyness to implied variance through five parameters: level (a), slope and curvature (b), correlation between spot and vol (ρ, negative for equity indices — the leverage effect), smile center shift (m), and smoothness (σ). In a crisis regime (GARCH vol > 40%), ρ is pushed to −0.99, the put wing steepens sharply, and the smile minimum shifts into OTM call territory — matching what was actually observed in March 2020 SPX surfaces. In calm regimes, all parameters relax toward a shallow, centered smile.
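
In its raw form, SVI gives total implied variance as w(k) = a + b·(ρ·(k − m) + √((k − m)² + σ²)), where k is log-moneyness. The sketch below implements that formula and the location of the smile minimum; the parameter values are hypothetical and are not the regime settings used in the notebook.

```python
import math

def svi_total_variance(k, a, b, rho, m, sigma):
    """Raw SVI: total implied variance at log-moneyness k."""
    return a + b * (rho * (k - m) + math.sqrt((k - m) ** 2 + sigma ** 2))

def svi_implied_vol(k, T, a, b, rho, m, sigma):
    """Annualized implied vol for maturity T (in years)."""
    return math.sqrt(svi_total_variance(k, a, b, rho, m, sigma) / T)

def svi_min_location(rho, m, sigma):
    """Log-moneyness where the smile reaches its minimum."""
    return m - rho * sigma / math.sqrt(1.0 - rho * rho)

# Hypothetical one-month smile with a negative spot/vol correlation.
params = dict(a=0.002, b=0.05, rho=-0.7, m=0.0, sigma=0.1)
for k in (-0.10, -0.05, 0.0, 0.05, 0.10):
    print(k, round(svi_implied_vol(k, 30.0 / 365.0, **params), 4))
print("smile minimum at k =", round(svi_min_location(-0.7, 0.0, 0.1), 4))
```

With ρ < 0 the minimum sits at positive log-moneyness, which is why a more negative ρ pushes the smile minimum further into OTM call territory while steepening the put wing.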

Chart: skew modeling

6. Honest Limits: Where the Backtest Is Too Favorable

Two limits are worth being explicit about. First, the volatility skew in the model is too flat. The SVI parameters are set by hand to match the qualitative shape of real surfaces, not calibrated on actual market data. In practice, put skew on ES is significantly steeper than what this model produces — especially in moderate-stress regimes. A steeper put wing means the put spread premium is higher but so is the cost of the long put leg. The net effect on credit collected is ambiguous, but the mark-to-market behavior in a selloff will be worse than the backtest suggests, because the short put will re-price faster than the model assumes.

Second, the VRP is set to zero. In reality, the VRP is positive on average — roughly 3 vol points over the long run. Setting it to zero is conservative: it means option prices in the backtest are lower than what the market would actually have offered. Adding 250 basis points of VRP would make the backtest look better but would also be wrong, because the VRP is variable and sometimes negative. The honest choice is to leave it at zero and acknowledge that real-world results will likely be better in calm regimes and worse in stress.

7. How the Strategy Works in Production

In production, the strategy runs on three additional layers that the backtest does not capture. First, a systematic entry signal: positions are only opened when implied vol exceeds realized vol by a meaningful margin, and when the market regime is classified as stable (no risk-off signal, no strong trend). Second, dynamic delta hedging: the delta of the book is monitored continuously and rebalanced via ES futures to stay near zero. When a call spread is being tested by the market, the put spread strikes are rolled up to collect additional credit and restore delta neutrality — and vice versa. Third, vega hedging: when implied vol drops sharply after entry, locking in a large portion of the VRP, the position can be partially hedged by buying VIX futures to prevent a sudden vol reversal from reversing the gains. The infrastructure runs on Python with the Interactive Brokers API, using a standalone Black-Scholes IV solver decoupled from the backtesting stack.
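
The delta-hedging layer reduces to a simple rule: express the book's delta in futures equivalents and trade the offsetting number of ES contracts once it leaves a tolerance band. The sketch below is an assumption about how such a rule could look, not the production logic; the band width and the function name are hypothetical, and it relies on ES options and ES futures sharing the same $50 multiplier.

```python
def hedge_order(option_delta_per_condor, n_condors, futures_held=0, band=0.5):
    """Number of ES futures to buy (+) or sell (-) to bring the book back near zero delta."""
    book_delta = option_delta_per_condor * n_condors + futures_held
    if abs(book_delta) < band:
        return 0                  # inside the band: do not trade
    return -round(book_delta)     # offset the delta with whole contracts

# Hypothetical book: 12 condors, each with a delta of -0.12 after a rally.
print(hedge_order(-0.12, 12))
```

A band avoids paying the bid/ask spread on every small fluctuation; tightening it reduces directional risk at the cost of more frequent trades.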

What the Notebook contains

  • Complete QuantSkewEngine class: GARCH fit, regime-dependent SVI surface generation, strike interpolation.
  • Full Iron Condor backtest engine: signal generation, option pricing, group-level TP/SL, equity curve output.
  • Five reinvestment scenarios with metrics: CAGR, Sharpe, Sortino, max drawdown, total return.
  • VRP and skew sensitivity analysis: run at vrp_bps = 0, 150, 300 and modified skew curvature.
  • Skew timeline visualization: ATM vol (log scale) + SVI smile at six key market dates.

Why use an Iron Condor rather than a short straddle?

A short straddle has unlimited loss potential on both sides. An Iron Condor caps the maximum loss at the wing width minus the credit received — in this backtest, 50 points × $50 multiplier per contract minus the premium. That defined risk is what makes position sizing tractable and prevents a single extreme move from being catastrophic. A straddle requires you to have a view on direction or to hedge delta perfectly in real time; an Iron Condor gives you a buffer before delta becomes a problem.
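
The capped loss is easy to verify from the expiration payoff. This sketch assumes a hypothetical credit of 8 index points and the strike layout (long put, short put, short call, long call); only the 50-point wings and the $50 multiplier come from the backtest description.

```python
def condor_pnl_at_expiry(settle, strikes, credit, multiplier=50.0):
    """Dollar P&L per short iron condor at expiration, with credit in index points."""
    k_lp, k_sp, k_sc, k_lc = strikes
    put_spread = max(k_sp - settle, 0.0) - max(k_lp - settle, 0.0)     # owed on the put side
    call_spread = max(settle - k_sc, 0.0) - max(settle - k_lc, 0.0)    # owed on the call side
    return (credit - put_spread - call_spread) * multiplier

strikes = (4690.0, 4740.0, 5260.0, 5310.0)
for settle in (4000.0, 4730.0, 5000.0, 6000.0):
    print(settle, condor_pnl_at_expiry(settle, strikes, credit=8.0))
```

However far the settlement price lands from the strikes, the loss stops at (50 − credit) × $50, because only one of the two spreads can finish in the money.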

Why not sell single-stock options instead of index options?

On single stocks, the VRP is less structural. Individual names can gap 15–20% on earnings, FDA decisions, or unexpected news — moves that are not priced into the implied vol even if IV is elevated. On equity indices, idiosyncratic risks diversify away, and the structural demand for put protection from institutional investors creates a persistent, exploitable premium. The realized vol of the S&P 500 is much more predictable than any individual stock.

What happens when a strike gets tested?

If the underlying moves toward a short strike, the delta of the position becomes directional. The first response is to rebalance delta by selling or buying ES futures. If the short strike is breached or close to being breached, the opposite spread can be rolled toward the spot — collecting additional credit and partially restoring delta neutrality. In extreme cases, the position can evolve into a butterfly structure, which limits the maximum loss well below the theoretical maximum of the original condor.

How do you size positions?

A simple risk-based approach: target a maximum loss per position of 3% of total capital. With a wing width of 50 points and a $50 multiplier, the maximum loss per contract is $2,500 (before premium). 3% of $1,000,000 is $30,000, which gives 12 contracts. This sizing ensures that even a full loss on a single Iron Condor, which means one of the two spreads finishing fully in the money at expiration, does not exceed 3% of the portfolio. Both spreads cannot expire in the money at the same time, so the loss is never doubled.
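
The same arithmetic as a small helper. The optional `credit` argument is an addition for illustration: the article sizes before premium, which is the more conservative choice.

```python
import math

def contracts_for_risk(capital, risk_fraction, wing_width, multiplier=50.0, credit=0.0):
    """Largest whole number of condors whose maximum loss fits the risk budget."""
    max_loss = (wing_width - credit) * multiplier   # dollars per condor
    budget = capital * risk_fraction
    return math.floor(budget / max_loss + 1e-9)     # small epsilon guards against float noise

print(contracts_for_risk(1_000_000, 0.03, 50.0))              # sizing before premium
print(contracts_for_risk(1_000_000, 0.03, 50.0, credit=8.0))  # hypothetical 8-point credit
```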

What is charm monitoring and why does it matter?

Charm is the rate of change of delta with respect to time. As expiration approaches, even with no movement in the underlying, the delta of an option drifts. For a short call spread close to the spot, charm pushes the delta more negative each day — meaning the position becomes increasingly short the market without any market move occurring. Monitoring the ratio of theta collected per unit of charm accumulated tells you whether the time decay you are earning is worth the directional drift you are accumulating. If charm becomes too large relative to theta, the position should be closed or hedged before expiration.

Is this strategy suitable for a retail account?

ES options require margin and are not available in all account types. The closest equivalents are SPX options (cash-settled, European exercise, though their $100 multiplier gives them roughly twice the notional of an ES contract) or, on a smaller scale, SPY options. The margin requirements, position sizing, and delta hedging mechanics are the same in principle. The main practical constraint is that dynamic delta hedging via futures requires a futures-enabled account and sufficient capital to make the hedge sizes meaningful.
