Live ≠ Backtest: The Day My 'Perfect' Strategy Met the Real World

In 2019 I went live with a backtest that promised +840%. Six months later I had lost 18%. Here's why backtests miss what matters most — and how we deal with it.

Dominic Tschan
April 3, 2026 · 8 min read

In 2019 I built a backtest that looked like the holy grail.

Bitcoin daily data, six years. Simple rules: buy when X, sell when Y. The equity curve was beautiful — smooth, upward, forgiving drawdowns. +840% over six years, max drawdown -28%. I checked it three times. The math was clean. The logic was clean. The story was clean.

I went live on January 15, 2019 with $5,000.

Six months later, I opened my Bybit dashboard on a Tuesday morning. $4,103.

My bot had not broken. The signals had fired exactly as the backtest said they would. I had held my discipline, executed every trade, never overridden the system. And I had lost 18% in six months while the backtest said I should be sitting at around $7,200.

I stared at the screen for about an hour that morning. I remember the feeling specifically: not panic, not anger. Quiet confusion. The strategy hadn't failed. The backtest hadn't exactly lied. But somewhere between the spreadsheet and the exchange, roughly $3,100 of "profit" had silently evaporated.

This article is about where it went.


The Six Things Backtests Don't See

The gap between backtest and live isn't one big leak. It's six small ones, compounding.

1. Slippage — the price you see isn't the price you get

Backtest: BTC closes at $50,000. The bot buys at $50,000. Done.

Reality, my 2019 log: price shown $50,000, order filled at $50,047. Next trade: shown $50,000, filled at $50,089. Twelve trades in, I notice my average fill price is running about 0.12% above what the chart shows.

That's a tiny number. It doesn't feel like anything. But run 60 trades a year at 0.12% average drag, double it for the exit slippage, and you've quietly paid roughly 14% per year (60 × 2 × 0.12% = 14.4% of the capital traded) for the privilege of executing your backtest.

My bot was. So was I.
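The slippage arithmetic can be checked in a few lines. This is a minimal sketch, assuming every trade deploys the full account and every fill (entry and exit) slips by the same 0.12%. The function names are illustrative and are not part of any bot described here.

```python
def simple_slippage_drag(trades_per_year: int,
                         slippage_per_fill: float,
                         fills_per_trade: int = 2) -> float:
    """Yearly slippage cost as a plain sum (no compounding).

    Assumption: each trade uses the whole account, so the per-fill
    slippage applies to all capital on every entry and exit.
    """
    return trades_per_year * fills_per_trade * slippage_per_fill


def compounded_slippage_drag(trades_per_year: int,
                             slippage_per_fill: float,
                             fills_per_trade: int = 2) -> float:
    """Same cost, but each fill shrinks the capital used by the next one."""
    fills = trades_per_year * fills_per_trade
    remaining = (1.0 - slippage_per_fill) ** fills
    return 1.0 - remaining


# Figures from the text: 60 trades a year, 0.12% average slippage per fill.
simple = simple_slippage_drag(60, 0.0012)
compounded = compounded_slippage_drag(60, 0.0012)
print(f"simple sum: {simple:.1%}, compounded: {compounded:.1%}")
```

The plain sum gives 14.4%. The compounded figure comes out slightly lower because each fill acts on a slightly smaller balance. Either way, a backtest that fills at the chart price is missing a double-digit yearly cost under these assumptions.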

2. Outages — exchanges break

I had a sell signal on May 19, 2022 at 3:47 AM Swiss time. BTC was dropping fast — down 8% in 40 minutes. My bot tried to execute. Bybit's API returned a timeout.

It tried again. Timeout. And again. Timeout.

The API came back online 47 minutes later. Price had dropped another 6%. My sell finally filled — at a price the backtest never saw, because the backtest assumed instant fills on every signal.

Solana has had three significant full-network outages. Binance has had several rolling restarts during high-volatility windows. Coinbase melted down completely on multiple 2021 spike days. A backtest assumes you can always trade. In reality you sometimes can't, usually at the worst moment.
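One way to make a backtest less naive about outages is to replay a signal with a forced fill delay and measure the shortfall. The sketch below is an illustration only: the function names and the bar series are hypothetical, and a real exchange outage is messier than a fixed number of skipped bars.

```python
from typing import Optional, Sequence


def delayed_fill_price(prices: Sequence[float],
                       signal_index: int,
                       outage_bars: int) -> Optional[float]:
    """Price at which the order fills if the exchange is unreachable
    for `outage_bars` bars after the signal.

    Returns None when the data ends before the order could fill.
    """
    fill_index = signal_index + outage_bars
    if fill_index >= len(prices):
        return None
    return prices[fill_index]


def outage_cost(prices: Sequence[float],
                signal_index: int,
                outage_bars: int) -> Optional[float]:
    """Relative shortfall of a delayed SELL versus the instant fill
    that a standard backtest assumes (positive = worse price)."""
    instant = prices[signal_index]
    delayed = delayed_fill_price(prices, signal_index, outage_bars)
    if delayed is None:
        return None
    return (instant - delayed) / instant


# Hypothetical bars during a sell-off; the sell signal fires on bar 0.
bars = [100.0, 99.0, 97.5, 96.0, 94.0]
cost = outage_cost(bars, signal_index=0, outage_bars=4)
print(f"shortfall from the delayed fill: {cost:.1%}")
```

Running the same strategy with a few different `outage_bars` values on its worst historical days gives a rough sense of how much the instant-fill assumption flatters the equity curve.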

3. Taxes — the boring killer

Switzerland has a clean rule with a fuzzy edge. If you trade like a private investor, capital gains are tax-free. If you trade like a professional, gains are taxed as income — up to 40-something percent depending on canton.

The line between "investor" and "professional" is deliberately vague. The tax office looks at five criteria: holding period, trade frequency, leverage use, whether you're financing trades with debt, and relationship to your main income. Cross the wrong thresholds and suddenly your 50-trades-per-year bot turns you into a professional trader in the eyes of the Steueramt.

My 2019 backtest showed +840% pre-tax. A real-life version in Switzerland, if classified professional, would have been +504% post-tax. That's over 300 percentage points that no backtest spreadsheet ever showed.
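The tax haircut is simple arithmetic. The sketch below assumes a flat 40% rate applied once to the total gain, which is the rate implied by the +840% and +504% figures. Real income tax is progressive, varies by canton, and is assessed year by year, so treat this as an illustration only.

```python
def after_tax_return(pre_tax_return: float, tax_rate: float) -> float:
    """Return after a flat tax on gains.

    Assumptions: one flat rate, applied once to the total gain;
    losses are passed through unchanged (no loss offsetting modeled).
    """
    if pre_tax_return <= 0:
        return pre_tax_return
    return pre_tax_return * (1.0 - tax_rate)


pre_tax = 8.40        # +840% from the backtest
assumed_rate = 0.40   # assumption: flat rate implied by the article's numbers
post_tax = after_tax_return(pre_tax, assumed_rate)
lost_points = (pre_tax - post_tax) * 100

print(f"post-tax return: {post_tax:+.0%}, lost: {lost_points:.0f} percentage points")
```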

4. The future isn't the past

This is the one keeping me awake.

I tested my bot on 2013-2019 data. Bitcoin in that window had specific volatility structure — big swings, clear trends, retail-driven flows. The strategy fit that regime.

Then I went live in 2019. Within six months the market character changed: less trending, more range-bound, with a different liquidity profile. My strategy, perfectly calibrated to 2013-2019 Bitcoin, was now trading 2019-2020 Bitcoin. Same ticker. Different animal.

A backtest can only test one version of the world: the version that already happened. The future is a regime draw we haven't fully observed. No amount of backtest discipline fixes that.
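A regime shift can't be predicted, but it can at least be measured after the fact. The sketch below compares two windows using annualized realized volatility and a Kaufman-style efficiency ratio (net move divided by total path length). Choosing these two metrics is my assumption for illustration, not a description of how the 2019 strategy was analyzed, and the price series are synthetic.

```python
import math
from typing import Sequence


def efficiency_ratio(prices: Sequence[float]) -> float:
    """Net price change divided by the sum of absolute bar-to-bar changes.

    Close to 1: the window moved in a straight line (trending).
    Close to 0: lots of movement, little net progress (range-bound).
    """
    path = sum(abs(b - a) for a, b in zip(prices, prices[1:]))
    if path == 0:
        return 0.0
    return abs(prices[-1] - prices[0]) / path


def realized_volatility(prices: Sequence[float],
                        periods_per_year: int = 365) -> float:
    """Annualized standard deviation of log returns.

    Assumption: daily bars on a market that trades every day of the year.
    """
    rets = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    mean = sum(rets) / len(rets)
    var = sum((r - mean) ** 2 for r in rets) / (len(rets) - 1)
    return math.sqrt(var * periods_per_year)


# Synthetic windows: a steady climb versus a one-point back-and-forth.
trending = [100.0 + i for i in range(30)]
ranging = [100.0 + (i % 2) for i in range(30)]

for name, window in (("trending", trending), ("ranging", ranging)):
    print(name, round(efficiency_ratio(window), 3),
          round(realized_volatility(window), 3))
```

If the live window's numbers sit far from the range seen in the backtest period, that is a hint the strategy is trading a different animal than the one it was fitted to.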

5. You — the factor nobody models

Three months into my 2019 run, the bot was 22% down. I hadn't slept well in a week.

I opened my dashboard at 6 AM and found myself typing "should I pause the bot until things stabilize?" into a Google search. I almost did it. I talked myself down. A friend talked me further down. I didn't pause.

Two weeks later the bot had recovered half the drawdown. I told myself I'd made the right call.

Here's the part the backtest never modeled: I had become the strategy's biggest risk. If I had paused the bot that morning, I would have taken the drawdown AND missed the recovery AND probably not restarted until well after the bottom. A 22% drawdown would have become a 40% realized loss, purely because of one tired Tuesday morning.

Backtest assumed the bot ran untouched, no matter what. That's not a real assumption. You will, at some point, second-guess the bot. You will feel the drawdown in your chest and reach for the pause button. Your "automatic" strategy was always you-with-extra-steps.

This is why we keep the Watchdog — our most-tested bot, paper-tracking since 2026-04-28 under our all-paper policy — conservative enough that I can actually sleep through its drawdowns.

6. Disclosure decay — the transparency tax

Imagine a strategy that works because 400 retail traders don't know about it yet.

You publish the rules on a blog. Within a month, the 400 traders have read it. They build their own versions. Within three months, hundreds of small bots are firing on the same signal at the same time — and the edge that depended on being early-to-the-move is crowded out.

This is the specific tax I pay for publishing bot rules openly. Our Paper-tier bots have their full rule sets on their article pages — anyone can replicate them. I accept this as the cost of transparency, but I'm honest about what it costs: every transparent strategy is an edge on a slow countdown.

The Watchdog's exact thresholds stay private for exactly this reason. The strategy family is public; the specific DM+LD values that matter for execution are not.


The Gap, Visualized

This is what the gap looked like in my 2019 run. The smooth purple line below is what the backtest predicted month-by-month. The bumpy gold line is what actually hit my exchange account.

Same strategy. Same signals. Same fees (I thought). The divergence accumulates quietly: half a percent of slippage here, a missed execution there, a week of emotional hesitation that the backtest assumed never happened. Six months in, the account is down 18% while the backtest pointed to about +44% ($7,200 on a $5,000 start), a gap of roughly 62 percentage points.


Backtest predicted vs. what actually landed on the exchange

My $5,000 live run, 2019. Same strategy, same signals. 18% of real money disappeared.

What Real Quant Funds Have Learned the Hard Way

This isn't retail naivety. Professional funds with Nobel laureates have lost billions to the same six factors.

LTCM, 1998. Long-Term Capital Management had two Nobel economics laureates on staff, roughly $125 billion in balance-sheet positions (with a far larger derivatives notional on top), and 25 years of perfect convergence-trade backtests. Russia defaulted, correlations broke in ways no historical model had ever seen, and the fund collapsed in four months.

AQR factor funds, 2018-2020. Strategies with 30 years of academic backtest evidence sat in a multi-year drawdown. Cliff Asness's own post-mortem: factor correlations shifted in ways the historical record never showed.

The 2022 quant crypto wipeout. Several funds with strong 2019-2021 backtests blew up when cross-sector correlations changed during the FTX collapse.

The pattern is simple. Backtests describe what happened. Reality is what hasn't happened yet.


What We Do About It at BearBullRadar

We can't close the gap. We can manage it.

Step 1 — Backtest is the entrance exam, not the verdict. Every strategy has to pass our 3-test stack before it sees paper money: full-period return that beats just-holding the asset, passing walk-forward in 3 independent historical windows, plus robust results across nearby parameter values. About 95% of strategies fail at least one of the three.

Step 2 — Paper money for at least 45 days. Strategies that pass the entrance exam earn a paper slot. Real-time data, real signals, real Telegram alerts, real fee and timing simulation — virtual capital only. After 45 days we compare: did the live trades match what the backtest predicted?

Step 3 — 90 days minimum before real money. Even with live data matching backtest, real capital is a different conversation. We review broker selection, tax classification, capital sizing, and sign off manually. No bot graduates to real capital based on backtest alone.

This filter is slow. That's the point. Most retail traders skip Steps 2 and 3 entirely. The gap between their backtest and their actual portfolio is wide and quiet — mine certainly was in 2019.
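The walk-forward part of Step 1 can be sketched as a window splitter. This is a minimal illustration, assuming three consecutive, non-overlapping blocks with a 70/30 train/test split inside each. The function name, the split percentage, and the block layout are my assumptions and not the actual BearBullRadar implementation.

```python
from typing import List, Tuple


def walk_forward_windows(n_bars: int,
                         n_windows: int = 3,
                         train_percent: int = 70) -> List[Tuple[range, range]]:
    """Split `n_bars` into consecutive, non-overlapping blocks.

    Each block is cut into a training range followed by a test range,
    so parameters are always evaluated on data that comes after the
    data they were fitted on.
    """
    if n_windows < 1 or n_bars < 2 * n_windows:
        raise ValueError("not enough bars for the requested windows")
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be between 1 and 99")

    block = n_bars // n_windows
    windows = []
    for k in range(n_windows):
        start = k * block
        # The last block absorbs any leftover bars.
        end = n_bars if k == n_windows - 1 else start + block
        # Integer arithmetic avoids float rounding at the split point.
        split = start + (end - start) * train_percent // 100
        windows.append((range(start, split), range(split, end)))
    return windows


# About six years of daily bars.
windows = walk_forward_windows(2190)
for train, test in windows:
    print(f"train {train.start}-{train.stop - 1}, test {test.start}-{test.stop - 1}")
```

A strategy would then be fitted on each training range and scored only on the matching test range. Passing means the out-of-sample result holds up in all three windows, not just on the full-period curve.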


The Experiment Running Right Now

Four momentum bots are running in parallel: Tactician 2.0, Rotator, Tri-Rotator, and Hedge Hopper. Backtest predicts a specific ranking among them. Over the next 45 days, live data will either confirm or contradict that ranking.

If the backtest is right, great — our methodology is calibrated. If a bot surprises us, that's the interesting case. We'll publish the divergence and trace it back to which of the six factors above explains the gap.

Either way, we learn. That's the whole point of Step 2.
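Comparing a predicted ranking with a live ranking can be reduced to a single number with a Spearman rank correlation. The sketch below uses the four bot names from the text, but both rankings are made up for illustration and say nothing about the actual backtest or live order.

```python
from typing import Dict


def spearman_from_ranks(predicted: Dict[str, int],
                        observed: Dict[str, int]) -> float:
    """Spearman rank correlation for two rankings without ties.

    1.0 means identical order, -1.0 means fully reversed.
    Uses rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
    """
    if predicted.keys() != observed.keys():
        raise ValueError("both rankings must cover the same bots")
    n = len(predicted)
    if n < 2:
        raise ValueError("need at least two bots to compare rankings")
    d_squared = sum((predicted[name] - observed[name]) ** 2
                    for name in predicted)
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


# Hypothetical rankings (1 = best). Not the real predictions.
backtest_rank = {"Tactician 2.0": 1, "Rotator": 2,
                 "Tri-Rotator": 3, "Hedge Hopper": 4}
live_rank = {"Tactician 2.0": 2, "Rotator": 1,
             "Tri-Rotator": 3, "Hedge Hopper": 4}

rho = spearman_from_ranks(backtest_rank, live_rank)
print(f"rank agreement: {rho:.2f}")
```

With only four bots the statistic is coarse, so a single swap already moves it noticeably. It is a summary for the write-up, not a significance test.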


What This Means for You

When you see a Bot Card on /bots, the backtest stats tell you what the historical data says. Sharpe, Calmar, walk-forward, beat-rate — all backtest-derived. They are predictions, not promises.

When a bot has been live for 90+ days, we'll show the comparison: what the backtest predicted vs what actually happened. If our methodology is sound, those numbers will roughly agree. If they don't, that's news, and we'll publish it.

For strategies you're personally evaluating — ours, your own, anybody's — demand all three: backtest evidence, live performance, and honest disclosure of the gap when they differ.

A pretty backtest without live data is research. A pretty live record without backtest disclosure is luck. Both together, with the gap honestly reported, is signal.

Disclaimer: This is not financial advice. All backtests are based on historical data and do not guarantee future results. Only invest what you can afford to lose.

Dominic Tschan

MSc Physics, ETH Zurich · Physics teacher · Crypto investor · Bot builder

ETH physicist who tested 200+ trading strategies on 6 years of real market data. Runs 12 tier-labeled bots. 1 on real capital, 11 paper-tracked. Here I share everything: results, mistakes, and lessons.
