What we got wrong

Most sites show you winners. We show you the failures first.

Every retired strategy. Every combo that failed walk-forward. Every backtest that looked great until we ran the honest version. Documented, dated, explained. Because honest failure reporting is the only way you can trust our wins.

200+

Strategies tested

195

Ruled out

11

Public post-mortems

0

Real capital bots (since 2026-04-28)

How our three research stages fit together

🧪 BotLab

Our experimental playground. Weird, creative, sometimes ridiculous strategy ideas, backtested honestly. Most never left the lab.

🤖 /bots

The survivors. Strategies that passed every filter and now run live, on real capital or paper-tracked daily.

💀 Post-Mortems (here)

Bots we killed in production. They went live, disappointed, and got retired, with date, root cause, and lesson. Different from the Lab: these actually made it to real execution.

Downgraded 2026-04-28

All-Paper Policy: Every Bot Demoted From Real-Money to Paper-Tracking

Watchdog Tier 1 → 2. Adopted same evening as v2.1 multi-benchmark, before any audience could see the transition.

The claim

The Watchdog ran on a $6,000 real-money BTC allocation from 2026-04-12 to 2026-04-22, when it was paused due to a chart-structure concern at the BUY trigger ($78,880 at ascending-channel resistance). Six days later, the v2.1 multi-benchmark formal recompute showed Sharpe 0.35 + DD-ratio 1.38 vs S&P 500: all three v2.1 paths fail strict (Path 1 return-superior unverified, Path 2 risk-adjusted FAIL, Path 3 diversification sleeve FAIL with Sharpe below the 0.8 threshold). Same evening: adopted the all-paper policy. No bot graduates to real money until ≥6 months of forward-validated proof. Watchdog demoted Tier 1 → Tier 2. Tier 1 now contains only Basis Sentinel (Path 3) and Alpha Hunter (Path 1).

Root cause

Two compounding factors. (1) Watchdog's 'Tier 1' classification was a v1-era artifact: it cleared the per-trade W/L test (3.44:1) but failed Walk-Forward (25%). It ran on real money under user discretion: a historical decision, not a graduation under the validation framework. (2) The v2.1 multi-benchmark switch from BTC HODL → S&P 500 made the borderline obvious: against BTC HODL, Watchdog's drawdown protection was real (-34.7% vs HODL -77%); against S&P 500, Sharpe 0.35 and DD-ratio 1.38 mean it underperforms a passive index fund risk-adjusted. Calling it 'real-money-eligible' under v2.1 strict would have been special-pleading.

Lesson

Backtest validation tells you a strategy COULD work; it doesn't tell you it DOES work in the actual conditions of the next six months. BBR is two weeks old, so there is no forward track record for any bot. The honest move is to require ≥6 months of live paper-tracking with maintained Sharpe + bounded MaxDD + forward beat-rate vs S&P ≥50% before any real-money allocation. Real-Money Graduation Criteria are now codified on /methodology, with the first eligibility window on 2026-10-28. The framework now stands on its own: backtests choose candidates, forward results promote them.
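The graduation gate described above can be sketched as a single boolean check. This is a minimal illustration, not the codified criteria from /methodology: the record fields and the two threshold parameters are hypothetical, because the text fixes only the six-month minimum and the 50% forward beat-rate.

```python
from dataclasses import dataclass


@dataclass
class ForwardRecord:
    """Paper-tracking summary for one bot (all field names are hypothetical)."""
    months_tracked: float     # length of the live paper-tracking record
    sharpe: float             # forward Sharpe ratio over that record
    max_drawdown: float       # worst peak-to-trough loss as a negative fraction
    beat_rate_vs_spx: float   # share of forward windows that beat the S&P 500


def eligible_for_real_money(rec, min_sharpe, max_dd_floor):
    """Return True only if every graduation condition holds.

    min_sharpe and max_dd_floor are placeholders: the text asks for a
    "maintained Sharpe" and a "bounded MaxDD" without stating the numbers.
    """
    return (
        rec.months_tracked >= 6
        and rec.sharpe >= min_sharpe
        and rec.max_drawdown >= max_dd_floor
        and rec.beat_rate_vs_spx >= 0.50
    )
```

Every condition is joined with `and`, so a strong backtest or a single good metric cannot compensate for a short forward record.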

Current status

All 10 bots paper-tracking with $10,000 virtual capital each (Watchdog historical $6,000). Watchdog demoted to Tier 2 in bot-validation.ts and bots/page.tsx. Methodology page rewritten with Tier 1 redefinition + new Real-Money Graduation Criteria section. Memory updated. The framework was tightened, not loosened: Watchdog can earn Tier 1 back via forward-validated proof of Sharpe stability OR explicit S&P-uncorrelated crisis-alpha behavior.

See the v2.1 multi-benchmark framework + graduation criteria

Retired 2026-04-27

The Contrarian: Retired After WF-on-Params Showed No Config Beats HODL

Validated under BOT_VALIDATION_STANDARDS 2026-04-27. Retired same day.

The claim

The Contrarian inverted textbook RSI on BTC: LONG when RSI(14) crossed above 70 (textbook says SELL), CASH when RSI crossed below 30. The bot card claimed +711% over 8 years vs HODL +473%. Validation suite revealed two structural problems: (1) the deployed config (P=14, 70/30) was the WORST point in BOTH the period sweep AND the threshold sweep: adjacent (P=7, 75/25) would have returned +30,000% over 11.6y vs the deployed +2,122%. (2) When tested under WF-on-params with EVERY plausible config including the in-sample winner: NO config beat HODL OOS more than 38% of the time (target ≥60%). Per-trade W/L was 4.92:1 (HIGHEST in entire BBR suite) but couldn't compensate for the directional underperformance.
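The inverted-RSI rule can be sketched as a small state machine on top of an RSI series. This is an illustration under assumptions, not the bot's actual code: it uses Wilder smoothing for RSI (the text only says RSI(14)), starts in CASH, and the function names are hypothetical.

```python
def _rsi_from(avg_gain, avg_loss):
    """Convert smoothed average gain/loss into an RSI value."""
    if avg_loss == 0:
        return 100.0  # no losses in the window: RSI is pinned at its maximum
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def wilder_rsi(closes, period=14):
    """RSI with Wilder smoothing; entries before index `period` are None."""
    if len(closes) <= period:
        return [None] * len(closes)
    changes = [b - a for a, b in zip(closes, closes[1:])]
    avg_gain = sum(max(c, 0.0) for c in changes[:period]) / period
    avg_loss = sum(max(-c, 0.0) for c in changes[:period]) / period
    out = [None] * period
    out.append(_rsi_from(avg_gain, avg_loss))
    for c in changes[period:]:
        avg_gain = (avg_gain * (period - 1) + max(c, 0.0)) / period
        avg_loss = (avg_loss * (period - 1) + max(-c, 0.0)) / period
        out.append(_rsi_from(avg_gain, avg_loss))
    return out


def contrarian_positions(rsi_values, upper=70.0, lower=30.0):
    """Go LONG when RSI crosses above `upper`, to CASH when it crosses below `lower`."""
    position = "CASH"  # assumed starting state
    prev = None
    out = []
    for cur in rsi_values:
        if prev is not None and cur is not None:
            if prev <= upper < cur:
                position = "LONG"
            elif prev >= lower > cur:
                position = "CASH"
        out.append(position)
        prev = cur
    return out
```

Swapping the two assignments gives the textbook rule, which makes it easy to run both variants through the same sweep.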

Root cause

The contrarian thesis is REAL: at every threshold pair tested, inverted RSI beats textbook RSI by hundreds of percentage points. But on BTC over 11.6 years, no specific calibration of the inverted strategy reliably beats HODL across rolling windows. Best-case: (P=5, 75/25) at 38% beat-rate, avg-excess −50pp. The strategy CONCEPT works (textbook RSI loses on BTC), but as a standalone bot it can't be calibrated to consistently outperform buy-and-hold. The +711% over 8 years was a window-cherry-picked artifact (replicated cleanly: +637% on 2018-2026); on the full available history (2014-2026), the same deployed config trails HODL by 14,757pp.

Lesson

Methodological lesson added to BOT_VALIDATION_STANDARDS: when a bot fails Test 1 (Walk-Forward), don't just report the failure; run a SECOND Walk-Forward over the parameter sweep itself. This separates 'calibration is wrong' (a fixable problem) from 'concept doesn't generalize' (a structural problem). The Contrarian was the latter. Re-calibrating to the in-sample winner would have been pure hindsight bias: the in-sample best (P=7, 75/25) only marginally outperformed (35% vs 30% beat-rate, both FAIL).
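One way to picture a walk-forward over the parameter sweep: pick the best config on each window, then score that pick on the next window only. The sketch below assumes the per-window excess returns versus HODL have already been computed for each config. The function names and data layout are hypothetical, and the real suite in research/contrarian_validation/ may work differently.

```python
def beat_rate(excess_returns):
    """Share of windows in which the bot beat HODL (excess return > 0)."""
    return sum(1 for x in excess_returns if x > 0) / len(excess_returns)


def walk_forward_on_params(excess_by_config):
    """Choose the best config on window i, then record its result on window i+1.

    excess_by_config maps a config label to its per-window excess return
    versus HODL, with the same window order for every config.
    Returns the out-of-sample excess returns of the chosen configs.
    """
    n_windows = len(next(iter(excess_by_config.values())))
    oos = []
    for i in range(n_windows - 1):
        best = max(excess_by_config, key=lambda cfg: excess_by_config[cfg][i])
        oos.append(excess_by_config[best][i + 1])
    return oos


def passes_target(excess_returns, target=0.60):
    """Apply the beat-rate target quoted in the text (at least 60%)."""
    return beat_rate(excess_returns) >= target
```

If even the per-window in-sample winner fails `passes_target` out of sample, recalibration cannot be the fix, which is the 'concept doesn't generalize' case.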

Current status

REMOVED from /bots Hero grid 2026-04-27. Database row deleted from Supabase live_bots (Scout precedent). Validation code preserved at research/contrarian_validation/ for reproducibility. Original /blog/rsi-luege article (the article that introduced the bot) remains and accurately states what was tested.

The original RSI-Lies article

Failed Walk-Forward 2026-04-17

Macro-Regime Overlay (DXY + SPX): Killed Tactician When Used as Filter

Tested in HF Alpha Hunt v5 on 2026-04-17

The claim

Hypothesis: BTC outperforms when US Dollar Index (DXY) is weak AND S&P 500 (SPX) is strong. Tested as standalone signal and as filter overlay on existing momentum bots. Standalone: only LB=90 passed 2/3 walk-forward, isolated peak (LB=20/30/60 all fail). As overlay on Tactician M-30: dropped Tactician's return from +2,010% to +97-575% โ€” strictly worse on every macro lookback tested.

Root cause

DXY+SPX filter kills too much BTC exposure during bull-market phases. Many of BTC's biggest up-moves happen in 'wrong' macro regimes (e.g., BTC rallied in 2020 while DXY also strengthened initially). The macro signal is too coarse and slow vs BTC's own cycle dynamics. Single-condition variants (DXY weak only, SPX strong only) also fail; the combined filter is the problem.

Lesson

Equity-style macro overlays (DXY/SPX) don't generalize cleanly to crypto. BTC has its own cycle dynamics that dominate macro signals on multi-year horizons. Macro overlays may still work for risk management (reduce position size, not gate trades), but as binary trade-or-not filters they destroy more alpha than they save.

Current status

Not deployed. Existing bots run without macro overlays.

Failed Walk-Forward 2026-04-17

GDX Standalone Momentum: Gold Miners Don't Trend Like Crypto Does

Tested in HF Alpha Hunt v5 on 2026-04-17

The claim

Hypothesis: Trend-following on Gold Miners ETF (GDX) generates alpha vs GDX HODL. Tested 4 lookbacks (M-20, M-30, M-60, M-90). All variants underperformed GDX HODL on full period. Walk-forward 0-1/3 windows.

Root cause

Gold and gold miners don't have the kind of persistent retail-driven momentum cycles that crypto has. Gold price is dominated by central-bank purchases, currency hedging flows, and macro-inflation expectations: slower-moving institutional drivers. Trend-following requires predictable persistence; gold doesn't deliver it on the daily-momentum timeframe.

Lesson

Strategy classes that work brilliantly in one asset (BTC momentum) don't necessarily port to other assets (GDX momentum) even if the assets have similar volatility profiles. The market structure matters more than surface-level volatility.

Current status

GDX standalone NOT deployed. NOTE: GDX as PART of a rotation (BTC/GDX rotation = The Hedge Hopper) DID pass walk-forward โ€” see HF-ALPHA-HUNT-V5.md. The asset isn't dead; the standalone strategy is.

Retired 2026-04-17

The Scout: Removed from Public Bot Lineup

Was kept as 'warning example' since 2026-04-15. Multi-metric review concluded it adds clutter, not signal.

The claim

The Scout (Hash Rate Momentum) was downgraded from PAPER to BACKTEST tier on 2026-04-15 when full-period analysis showed it loses to BTC HODL by ~1,572 percentage points over 6 years. We kept it visible on /bots with a BACKTEST badge as an 'educational warning example.' On reflection: the educational value is fully captured by its individual post-mortem article. Keeping the bot on the live grid implied 'active project' which it isn't.

Root cause

Multi-metric review: Scout fails on every single metric (negative Calmar, negative Sharpe, beats HODL in <50% of rolling windows, 1/3 walk-forward). The 'warning example' positioning was defensible but the ongoing Hero-grid presence created cognitive overhead for readers without adding insight beyond what the existing post-mortem article already delivers.

Lesson

Failed strategies belong in the post-mortem ledger, not in the active lineup with caveats. A clean Hero grid showing only currently-tracked bots reads more honestly than a mixed grid with status badges trying to explain why one entry is dead.

Current status

REMOVED from /bots Hero grid. Database row deleted. Original /blog/bot-der-scout article remains. Bot code preserved internally for historical reference.

The original Scout post-mortem

Failed Walk-Forward 2026-04-17

MVRV Regime Switching: When the Cycle Indicator Triggers Too Late

Killed in HF Alpha Hunt v4 on 2026-04-17

The claim

MVRV (Market Value / Realized Value) is a well-known crypto cycle indicator. Tested 6 threshold combinations: SELL when MVRV > {2.5, 3.0, 3.5, 4.0}, BUY when MVRV < {1.0, 1.2, 1.5}. Returns ranged from +177% to +1,162% over 6 years vs HODL +901%. Maximum drawdowns -35% to -67%.
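The threshold rule amounts to a two-level switch: leave the market above the upper MVRV level, re-enter below the lower one, and hold the current state in between. A minimal sketch, assuming a daily MVRV series and a CASH starting state (both assumptions, as is the function name):

```python
def mvrv_positions(mvrv, sell_above, buy_below, start="CASH"):
    """Return LONG/CASH per observation for one (sell, buy) threshold pair."""
    position = start
    out = []
    for value in mvrv:
        if value > sell_above:
            position = "CASH"   # SELL: market value far above realized value
        elif value < buy_below:
            position = "LONG"   # BUY: market value near or below realized value
        out.append(position)    # between the thresholds the state is unchanged
    return out
```

Because the position only changes when one of two widely spaced levels is crossed, the trade count stays very low, which matches the roughly 0.01 trades/week noted in the root cause.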

Root cause

MVRV thresholds trigger so rarely (~0.01 trades/week, basically once every 18-24 months) that by the time the signal fires, the cycle has often already turned. Walk-forward 1/3 windows max: no robust regime detection. The single 'good' configuration (3.5/1.2 → +1,162%) is isolated; neighbors don't agree.

Lesson

MVRV may be useful as a regime overlay for OTHER strategies (e.g., reduce position size when MVRV > 3, regardless of momentum signal), but as a standalone strategy it's too coarse. The data is good; the threshold-based execution is wrong.

Current status

Not deployed. Filed under 'regime indicators useful in combination, not isolation.' MVRV data continues to flow through Andromeda's CoinMetrics sync for potential overlay use later.

Failed Walk-Forward 2026-04-17

Pairs Trading BTC/ETH Ratio Mean-Reversion: Wrong Strategy Class

Killed in HF Alpha Hunt v4 on 2026-04-17

The claim

Tested z-score mean-reversion of the BTC/ETH price ratio across 9 parameter combinations (threshold 1-2σ, lookback 30-90 days). Logic: when BTC/ETH ratio is z-score extreme positive (BTC overpriced vs ETH), buy ETH; reverse for opposite. Classic equity-pairs setup applied to crypto.
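The signal itself is a rolling z-score of the price ratio compared against a symmetric threshold. The sketch below is one plausible reading of that setup. It assumes the rolling window includes the current day and uses the sample standard deviation; the function names and signal labels are hypothetical.

```python
from statistics import mean, stdev


def ratio_zscores(btc, eth, lookback):
    """Rolling z-score of the BTC/ETH ratio; None until `lookback` points exist.

    lookback must be at least 2 so a sample standard deviation is defined.
    """
    ratio = [b / e for b, e in zip(btc, eth)]
    out = [None] * len(ratio)
    for i in range(lookback - 1, len(ratio)):
        window = ratio[i - lookback + 1 : i + 1]
        sd = stdev(window)
        out[i] = (ratio[i] - mean(window)) / sd if sd > 0 else 0.0
    return out


def pair_signal(z, threshold):
    """Fade the extreme: buy the leg that looks cheap relative to the other."""
    if z is None:
        return "FLAT"
    if z > threshold:
        return "BUY_ETH"   # ratio stretched upward: BTC rich versus ETH
    if z < -threshold:
        return "BUY_BTC"   # ratio stretched downward: ETH rich versus BTC
    return "FLAT"
```

The rule only makes money if the ratio returns toward its rolling mean after an extreme. That is exactly the assumption the walk-forward results contradict.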

Root cause

ALL 9 combinations failed walk-forward (0-1 of 3 windows). Drawdowns -88% to -92%. Returns -33% to +572% โ€” most below HODL by huge margins. The BTC/ETH ratio doesn't mean-revert in crypto's regime-shifting markets. ETH outperformance from DeFi summer doesn't snap back; BTC outperformance during bear flight-to-quality doesn't snap back. Pairs trading requires stable fundamental relationships; BTC and ETH have shifting narratives that defy mean-reversion.

Lesson

This is the OPPOSITE of what the Rotator does, and the Rotator works because momentum persists. Trying to fade the same signal creates an anti-edge. Strategy classes that work in equities (mean-reversion of correlated pairs) don't necessarily port to crypto.

Current status

Not deployed. Confirms that the Rotator's momentum-based approach is the right side of this trade.

The momentum-based BTC/ETH approach that DOES work

Failed Walk-Forward 2026-04-17

Hash Rate Momentum + Regime Filter: Same Failure Mode as Original Scout

Killed in HF Alpha Hunt v4 on 2026-04-17

The claim

Attempted to revive the failed Hash Rate Momentum signal (original Scout's strategy) by adding a regime filter: only trade when BTC > N-day SMA. Tested 9 combinations (HR lookback × SMA lookback). One combo passed walk-forward 2/3, but it was an isolated peak with neighbors at 0/3. Even that 'winner' returned +388% vs HODL +544%, losing to HODL.

Root cause

Same Sharpshooter signature: only HR-60d × SMA-150 passed 2/3 WF, while neighbors (30d/150, 90d/150) scored 0/3. Single isolated configuration, not a robust plateau. And it still loses to HODL on the full period. Hash Rate appears to be a confounded signal: it moves in concert with BTC price (miners scale to match revenue), so it offers minimal independent information for timing.

Lesson

A failed strategy with a regime overlay can occasionally LOOK like it works in 1 of 3 windows, but the 4-test stack (continuous + WF + robustness + benchmark) catches it. This confirms: Hash Rate is not an actionable trading signal, even with regime gating.

Current status

Not deployed. The original Scout's BACKTEST tier badge stays on /bots as the visible warning example for this strategy class.

The original Scout post-mortem

Failed Walk-Forward 2026-04-17

Sharpshooter (Momentum-14d): The +2,053% Lottery Ticket

Passed 3/3 walk-forward but failed parameter robustness 2026-04-17

The claim

On 8 years of BTC daily data, Momentum-14d returned +2,053% (vs HODL +654%), beat HODL in all 3 walk-forward windows, and traded at 1.0 trades/week โ€” looked like a clean upgrade to The Tactician (Momentum-30d). Almost deployed.

Root cause

Parameter sweep showed M-14 was an isolated spike: M-13 returned +437% with 1/3 WF, M-15 returned +768% with 1/3 WF. The 14-day result was 3-5x better than its immediate neighbors, with no plateau supporting it. Classic signature of noise-overfitting: one specific parameter value happened to align with random patterns in the historical data, with no underlying market structure that 14 days uniquely captures.
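The neighbor comparison is simple enough to automate. The sketch below takes total returns keyed by lookback and flags a value that towers over both adjacent settings. The returns are the figures quoted above; the function names and the 2x cutoff are illustrative assumptions, not the suite's actual rule.

```python
def neighbor_ratios(results, param):
    """Ratio of one parameter's return to each adjacent parameter's return.

    Assumes all returns are positive, as they are in the figures quoted above.
    """
    center = results[param]
    return {
        p: center / results[p]
        for p in (param - 1, param + 1)
        if p in results
    }


def is_isolated_spike(results, param, max_ratio=2.0):
    """True if the result exceeds every available neighbor by more than max_ratio."""
    ratios = neighbor_ratios(results, param)
    return bool(ratios) and all(r > max_ratio for r in ratios.values())


# Total returns in percent by momentum lookback, from the text above.
sweep = {13: 437.0, 14: 2053.0, 15: 768.0}
print(neighbor_ratios(sweep, 14))    # roughly 4.7x versus M-13, 2.7x versus M-15
print(is_isolated_spike(sweep, 14))  # True
```

A robust parameter would sit on a plateau, with ratios near 1 against its neighbors.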

Lesson

Walk-forward and parameter robustness are complementary tests. A strategy can pass 3/3 walk-forward by accidentally aligning with random patterns in each sub-period. The sweep test catches this. After running 17 lookback variants, finding one that outperforms all the others is statistically expected, not proof of edge.
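A back-of-the-envelope calculation shows why one lucky pass among 17 variants is unsurprising. The model below is a deliberately crude assumption: each variant beats HODL in each window with probability 0.5, independently of everything else. Real lookback variants are correlated, so treat the number as an illustration of the multiple-testing effect rather than an estimate for this sweep.

```python
def prob_at_least_one_lucky_pass(n_variants, n_windows=3, p_window=0.5):
    """Chance that at least one variant passes every window by luck alone."""
    p_pass = p_window ** n_windows           # one variant passing all windows
    return 1.0 - (1.0 - p_pass) ** n_variants


print(round(prob_at_least_one_lucky_pass(17), 3))  # about 0.897
```

Under these assumptions a single variant passes 3/3 only one time in eight, yet across 17 tries a lucky pass shows up roughly nine times out of ten.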

Current status

NOT deployed. The Tactician (M-30) stays as the BTC momentum paper bot. M-14 retired before it ever saw paper capital.

The full Sharpshooter post-mortem

Failed Walk-Forward 2026-04-17

Options B and C: Watchdog + Tactician Combinations

Both combos failed walk-forward on 2026-04-17

The claim

We tested two combinations of the Watchdog (cycle filter) and Tactician (30-day momentum) bots. Option B: 50/50 independent capital split. Option C: Watchdog-gated (Tactician trades only when Watchdog=LONG). Continuous-run results looked excellent โ€” Option C returned +218% vs Watchdog-solo's +202% with lower drawdown. Looked like a strict upgrade.

Root cause

Split the data into 3 non-overlapping windows. Neither combo reliably beats Watchdog-solo. Option C: wins 1/3 windows on return (loses W2 by 7pp, W3 by 17pp). Option B: wins 1/3 windows on return AND 1/3 on drawdown. The +218% edge was W1-dominated: a bear-market artifact that doesn't generalize to bull regimes.
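The 3-window check can be sketched as follows. It assumes two aligned equity curves, candidate and baseline, and compares total return inside each contiguous window. The function names are hypothetical, and the equity curves in any test of this sketch are made up, not backtest output.

```python
def split_windows(values, n=3):
    """Split a series into n contiguous, non-overlapping windows.

    The last window absorbs any remainder when len(values) is not divisible by n.
    """
    size = len(values) // n
    return [
        values[i * size : (i + 1) * size] if i < n - 1 else values[i * size :]
        for i in range(n)
    ]


def total_return(equity):
    """Simple return from the first to the last point of an equity curve."""
    return equity[-1] / equity[0] - 1.0


def windows_won(candidate_equity, baseline_equity, n=3):
    """Count the windows in which the candidate out-returns the baseline."""
    pairs = zip(split_windows(candidate_equity, n), split_windows(baseline_equity, n))
    return sum(1 for c, b in pairs if total_return(c) > total_return(b))
```

A candidate can lead over the continuous run and still score 1/3 here when a single window supplies most of its edge, which is the W1-dominated pattern described above.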

Lesson

Continuous backtests can look great and still fail walk-forward. The 3-window check is the truth serum. Both combos provide marginal drawdown protection at meaningful return cost in most regimes. Neither is a clean upgrade over Watchdog-solo. Signal correlation (~70%) means the combos don't diversify enough to overcome the fees and tax friction they add.

Current status

NOT deployed. Real capital stays in Watchdog-solo. Tactician continues standalone on paper tier.

Why walk-forward is the only honest test

Downgraded 2026-04-15

The Scout: Downgraded from LIVE to BACKTEST-only

Loses to HODL over the full 6-year period

The claim

Scout (Hash Rate Momentum) was originally promoted as a live paper-trading bot. Honest backtest over 6 years: +575% bot vs +2,147% HODL. Scout loses to buy-and-hold by a ~1,572 percentage-point margin over the full window.

Root cause

Scout performs well in specific regime slices (2022-2024) but terribly in others (2020-2022). Averaging hides the fact that it systematically underperforms during strong trending markets, which make up most of the BTC 6-year record.

Lesson

A strategy can look great in a cherry-picked window and still fail the honest benchmark. BACKTEST-only label keeps the bot visible on the /bots page as an educational artifact, not a recommendation.

Current status

On the /bots page with a BACKTEST tier badge. Kept public as a warning example.

The full Scout post-mortem

Failed Walk-Forward 2026-04-14

Scout v2: Donchian Breakout Replacement

Beats HODL in 1 of 3 walk-forward windows

The claim

An attempt to rebuild the Scout around a different signal: Donchian 20-day breakout with 50-day SMA regime filter, 15% TP / 5% SL, 10-day time stop, positive-funding requirement. Continuous backtest 2020-2026: +156% over 71 trades. Looked clean.
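The entry and exit rules listed above can be sketched as two small functions. This is an illustrative reading, not the preserved bot code: it builds the Donchian channel from closes rather than highs, checks exits on closing prices only, and all names are hypothetical.

```python
def entry_signal(closes, funding_rate, channel=20, sma_len=50):
    """True when the latest close breaks the prior 20-day high in an uptrend.

    Conditions: close above the highest close of the previous `channel` days,
    close above the `sma_len`-day simple moving average, and positive funding.
    """
    if len(closes) < max(channel + 1, sma_len):
        return False
    last = closes[-1]
    breakout = last > max(closes[-channel - 1 : -1])
    uptrend = last > sum(closes[-sma_len:]) / sma_len
    return breakout and uptrend and funding_rate > 0


def exit_reason(entry_price, price, days_held, tp=0.15, sl=0.05, max_days=10):
    """Return why a position should close, or None to keep holding."""
    change = price / entry_price - 1.0
    if change >= tp:
        return "TAKE_PROFIT"
    if change <= -sl:
        return "STOP_LOSS"
    if days_held >= max_days:
        return "TIME_STOP"
    return None
```

The fixed 15% take-profit caps every trade's upside, which is the mechanism the root cause blames for missing BTC's asymmetric moves.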

Root cause

Walk-forward split the 6 years into 3 equal windows. Scout v2 lost to HODL by 346pp in W1 (biggest BTC bull in history), lost by 32pp in W2 (mid-cycle recovery), won by 10pp only in W3 (quiet consolidation where HODL itself was weak). Fixed TP/SL ratios too tight for BTC volatility: the bot missed the asymmetric upside that makes HODL hard to beat.

Lesson

A continuous backtest can look presentable and still fail walk-forward. This is the third strategy in a week (after Options B and C of the Watchdog+Tactician combo) where the 3-window check killed a result that seemed promising on the full period. The Scout family (mid-frequency trend-following on BTC daily) appears structurally disadvantaged vs HODL in asymmetric markets.

Current status

Not promoted to /bots. Code preserved internally for research reference.

The full Scout v2 post-mortem

Retired 2026-04-17

Alpha Hunter Light: Retired the Same Day It Launched

Brand clarity over return-chasing

The claim

Inverse-vol-weighted Momentum Top 20 US stocks. Backtest +412% vs SPY +212%, -41% MaxDD (vs Alpha Hunter's -53%). Ran one paper rebalance with 20 positions. Looked promising.
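Inverse-volatility weighting gives each selected stock a weight proportional to 1/volatility, normalized to sum to one. A minimal sketch, assuming the momentum ranking has already picked the names and a volatility estimate is available for each; the function name and the tickers in any example are hypothetical.

```python
def inverse_vol_weights(vols):
    """Map ticker -> portfolio weight proportional to 1 / volatility.

    vols maps each ticker to a positive volatility estimate (for example the
    annualized standard deviation of daily returns).
    """
    inverse = {ticker: 1.0 / vol for ticker, vol in vols.items()}
    total = sum(inverse.values())
    return {ticker: value / total for ticker, value in inverse.items()}
```

A stock with half the volatility of another gets twice its weight, which is where the smaller drawdown relative to Alpha Hunter would come from.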

Root cause

Same asset class (US large-cap stocks) and same signal family (momentum) as the existing Alpha Hunter. It would have diluted the bot lineup with a near-duplicate strategy that readers couldn't distinguish at a glance. The 'same alpha, less pain' pitch was real but marginal, and it violated the unique-angle filter we apply to new bots.

Lesson

Not every backtest-positive strategy deserves a production slot. Quality bar: unique asset class OR unique signal family. Alpha Hunter Light failed both tests.

Current status

Code, avatar and docs preserved at /111-Stockbot-Julia/livebot/. Reactivatable as an A/B variant.

Confirmed Failure 2025-12-01

RSI Strategy: A Documented Failure Kept in the Repo

Why we keep a bot that provably doesn't work

The claim

Classic RSI(14) oversold-overbought bot. Tested on 6 years of BTC daily data. 58 trades. 28% win rate. Lost money in every parameter configuration tested.

Root cause

RSI is a mean-reversion indicator. Bitcoin does not mean-revert on daily scale. It trends. Trend-persistence dominates mean reversion in the base rate of BTC price action. RSI buy-signals (oversold bounces) are wiped out by continuation.

Lesson

Popular doesn't equal working. An indicator can be ubiquitous in financial media and still fail a proper test. RSI survives because it looks right on cherry-picked charts. On the full history, it's negative-expectation.

Current status

Stays in the BotLab as a reference failure. Code is public.

The full RSI post-mortem

Why we publish failures

Survivorship bias ends here.

Every trading site on the internet shows you their winners. The bot that 10x'd. The strategy that "crushed the market." The backtest with the hockey-stick equity curve.

What you never see is the other 195 strategies they tested that went nowhere. The combinations that looked amazing in one window and collapsed in the next. The paper bots that quietly got deleted when the numbers turned ugly. That hidden graveyard IS survivorship bias, and it's why the strategies they sell you probably won't work for you.

We do it the other way around. Failures get a page. Retirements get a date. Walk-forward disasters get a post-mortem. Because the signal value of a site you can trust comes from seeing the losses, not just the wins.

Our live bots, all paper-tracked since the all-paper policy of 2026-04-28, are what's left after running this filter. They're not guaranteed winners. They're the strategies that survived honest testing. And the day any of them starts to fail, they'll show up on this page too.