
Data-Snooping: 3.2 Million Parameter Combinations. Your "Winner" Is Noise.

Test every combo. Find the best one. Call it a discovery. Statistics says 80 percent of the time, you're lying to yourself.

Dominic Tschan
April 16, 2026 · 6 min read

Let me do some math with you.

Your strategy has 5 parameters:

  • RSI period
  • Entry threshold
  • Exit threshold
  • Stop-loss distance
  • Max holding days

You test each parameter at 20 possible values.

Total combinations: 20^5 = 3.2 million.

Somewhere in those 3.2 million combinations, there is a version that looks absolutely brilliant on your test data. Not because the strategy works. Because with 3.2 million random draws, at least one is going to fit any noise you feed it.

This is data-snooping. The most sophisticated form of overfitting.

What It Is

Data-snooping: exhaustively searching parameter space until you find the combination that makes the backtest look great, then presenting that combination as if it were discovered, not constructed.

The statistician's name for it is "multiple testing." The trader's name for it is "optimization." They're the same thing.

The Arithmetic That Should Scare You

If you test 1 random strategy, the probability it looks significant by pure chance is about 5 percent (the standard p < 0.05 bar).

Test 20: you expect one false positive (20 × 5 percent).

Test 3.2 million: you expect on the order of 160,000 "significant" random hits (5 percent of 3.2 million).

The hit you select out of those 3.2 million will look perfect. Max return, min drawdown, beautiful equity curve. And it will be noise.

Real strategies don't look perfect. They have messy periods. Real backtests show variance. Optimized backtests are too clean.
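The arithmetic above can be made concrete with a quick simulation: generate strategies that are pure noise, keep the best one, and watch it look like an edge. A minimal sketch in Python (all numbers illustrative):

```python
import random
import statistics

random.seed(42)  # illustrative; any seed shows the same effect

def annualized_sharpe(daily_returns):
    mu = statistics.mean(daily_returns)
    sd = statistics.stdev(daily_returns)
    return (mu / sd) * (252 ** 0.5)

# 2,000 "strategies" that are pure coin flips: zero-mean daily returns.
# None of them has any edge by construction.
sharpes = [
    annualized_sharpe([random.gauss(0.0, 0.01) for _ in range(252)])
    for _ in range(2_000)
]

# The one you would have "discovered" by picking the best:
print(f"best Sharpe among 2,000 random strategies: {max(sharpes):.2f}")
print(f"median Sharpe: {statistics.median(sharpes):.2f}")
```

With no edge anywhere, the best of 2,000 typically shows an annualized Sharpe above 3 — exactly the kind of number that gets a strategy sold. The median, the honest summary, sits near zero.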

The Clean-Chart Warning Sign

When you see a backtest equity curve that goes smoothly up and to the right, barely any dips, few bad months — be suspicious.

Real strategies have:

  • 10-30 percent drawdowns occasionally
  • Multi-month losing streaks
  • Entire years underperforming benchmarks
  • Messy trade-level variance

If the chart looks like a perfect line, it's been data-snooped. Every bump was tuned out.

My Close Call

Building the Momentum strategy for Sandra, I had multiple knobs:

  • Lookback: 3, 6, 9, 12 months
  • Skip: 0, 21, 42 days
  • Top N: 3, 5, 10, 15, 20, 30
  • Market cap floor: $1B, $5B, $10B, $25B, $50B
  • Trend filter: yes/no
  • Rebalance frequency: weekly, biweekly, monthly, quarterly

That's 4 × 3 × 6 × 5 × 2 × 4 = 2,880 combinations.

If I had tested all 2,880 and picked the best, I'd have been p-hacking / data-snooping. At least one would have shown +1000 percent.

What I did instead:

  • Fixed lookback at 12 months (from academic literature)
  • Fixed skip at 21 days (from academic literature)
  • Fixed rebalance monthly (from academic literature)
  • Tested only Top N (5, 10, 20) and market-cap floor ($5B, $10B), plus a handful of lookback, cost, and trend-filter variants as deliberate failure checks

11 variants total. All tested. All published. Not cherry-picked.

The winner: Top 10, $10B floor, 12M lookback, 21 skip, monthly rebalance. That's not an accident. It's theory-driven, with 3 degrees of freedom left for practical optimization.

The Honesty Test

Ask a trader: "How did you choose your parameters?"

Honest answer: "Academic literature" / "Theory" / "Intuition tested on small sample."

Snooping answer: "Tried different values, these worked best."

The snooping answer means: they optimized. Which means: their backtest is shinier than the reality.

The Paper That Should Change Your Mind

Campbell Harvey, a Duke finance professor, co-wrote (with Yan Liu and Heqing Zhu) a paper called "...and the Cross-Section of Expected Returns" in 2016. They analyzed 316 "factors" published in academic finance.

Their conclusion: a large share of those factors are likely false. Using a stricter multiple-testing-corrected t-statistic (around 3.0 instead of the usual 2.0), most of the 316 candidate factors fail to clear the bar. They survived publication because thousands of researchers were testing thousands of ideas, and at the conventional 5% threshold, noise slips through.
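A quick way to see how much stricter the bar gets: the crudest multiple-testing correction, Bonferroni, simply divides the significance level by the number of tests. Python's standard library can compute the corresponding normal-distribution thresholds (Harvey's paper uses more refined corrections that land near 3.0; Bonferroni is deliberately conservative):

```python
from statistics import NormalDist

n_tests = 316   # factors examined in the 2016 paper
alpha = 0.05

# Conventional single-test threshold (two-sided 5%):
naive_t = NormalDist().inv_cdf(1 - alpha / 2)                   # ~1.96

# Bonferroni: divide alpha by the number of tests performed.
bonferroni_t = NormalDist().inv_cdf(1 - alpha / (2 * n_tests))  # roughly 3.8

print(f"single test: {naive_t:.2f}   316 tests: {bonferroni_t:.2f}")
```

A factor that clears t = 2.0 but not t = 3.0+ was, under this lens, never a discovery — just one of the expected false positives.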

Think about that. Published, peer-reviewed, citation-heavy academic finance. Majority noise.

Retail trading strategies, without peer review, are probably 99 percent noise.

The Right Way to Use Parameters

1. Set parameters from theory first. Why 12 months for momentum? Because Jegadeesh-Titman showed 3-12 months is the continuation window. Not because you tested 5, 10, 15 and 12 looked best.

2. Limit your knobs. Every parameter is a degree of freedom that invites data-snooping. Under 5 parameters is acceptable. Over 10 is almost always snooped.

3. Sensitivity analysis. Does the strategy work across a RANGE of similar parameters? If +/- 10% on each parameter keeps it working, it's robust. If only one exact combo works, it's snooped.

4. Out-of-sample verification. Design on 70 percent of data. Test on the 30 percent you haven't touched. If it breaks, it was snooped.

5. Publish the grid. If you tested 11 variants, publish all 11 performances. Not just the winner. This way readers can see if the winner was a single spike or a broad plateau.
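Points 3 and 4 can be sketched in a few lines. The backtest here is a hypothetical stand-in (a smooth response surface with a plateau), purely to show the mechanics:

```python
import itertools

# --- 3. Sensitivity analysis -------------------------------------
# Stand-in for a real backtest: a HYPOTHETICAL response surface
# with a broad plateau around lookback=252 days, top_n=10.
def backtest_sharpe(lookback, top_n):
    return 1.2 - 0.002 * abs(lookback - 252) - 0.03 * abs(top_n - 10)

# Perturb each parameter by +/-10% and check every neighbor.
neighborhood = [
    backtest_sharpe(round(252 * a), round(10 * b))
    for a, b in itertools.product([0.9, 1.0, 1.1], repeat=2)
]
# Robust: every point near the chosen one still works.
print(f"min Sharpe in +/-10% neighborhood: {min(neighborhood):.2f}")

# --- 4. Out-of-sample verification -------------------------------
def chronological_split(series, train_frac=0.70):
    """Never shuffle a time series: the holdout must come strictly
    later in time than the design data."""
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]

daily_returns = list(range(1000))   # placeholder for 1,000 daily bars
design, holdout = chronological_split(daily_returns)
print(len(design), len(holdout))    # 700 300
```

If the minimum over the neighborhood collapses while the chosen point shines, only one exact combo works — the signature of snooping.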

The Momentum Grid I Published

From my Alpha Hunt article:

Strategy             Return   CAGR   MaxDD
Top 5, 12M, $10B     +830%    +36%   -66%
Top 10, 12M, $10B    +742%    +34%   -53%
Top 20, 12M, $5B     +573%    +30%   -52%
Top 5, 12M, $5B      +292%    +21%   -61%
Top 10, 12M, $5B     +420%    +25%   -63%
Top 10, 6M, $5B      +300%    +21%   -72%
Top 10, 3M, $5B       -39%     -7%   -90%
(plus variants with costs, trend filters, etc.)

Notice: 10 of 11 beat SPY. That's a broad plateau, not a single spike. If I had published ONLY Top 5, 12M, $10B (+830%), I'd be snooping. Publishing the grid shows the strategy is robust to parameter choice. That makes it trustworthy.

The Top 10, 3M, 5B variant lost 39 percent. That's a failure mode — 3-month lookback is too short, catches noise. Publishing that shows the failure case too.

The Mental Shift

Data-snooping feels like "optimization." Make the strategy better.

It's actually "fitting to noise." Make the strategy more fragile.

The optimized version is the fragile version. More parameters = more fit to the past = less fit to the future.

Simpler strategies with fewer knobs generalize better. This is a deep truth in machine learning (Occam's razor, regularization, early stopping) and applies 1:1 to trading.

What to Do

When you see a strategy claim:

  • Check parameter count. Under 5 is OK.
  • Ask how parameters were chosen. "Theory" is good. "Tested a range" is suspect.
  • Demand the full grid, not just the winner.
  • Verify out-of-sample performance.

When you build a strategy:

  • Set parameters from theory first
  • Test a small range around theory values
  • Publish the full grid
  • Look for robustness, not peak performance

The variant you deploy should sit slightly below the single best point of your grid. The peak is partly random noise. The median of the plateau is closer to the signal.
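A tiny simulation makes the point: eleven noisy measurements of the same true edge, and the best one overstates it. All numbers are assumed, for illustration only:

```python
import random
import statistics

random.seed(7)  # illustrative

# 11 grid variants of ONE strategy whose TRUE Sharpe is 0.8,
# each measured with backtest noise (std 0.4). Assumed numbers.
measured = [random.gauss(0.8, 0.4) for _ in range(11)]

print(f"best point:   {max(measured):.2f}")    # optimistically biased
print(f"median point: {statistics.median(measured):.2f}")  # typically nearer the truth
```

Selecting the maximum of noisy measurements is itself a biased estimator — the same mechanism that makes the snooped backtest too good.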

That's the mindset that separates real edges from data-snooped illusions.




Your Dominic, who tests 11 variants of ONE theory instead of 3.2 million random ideas.



Disclaimer: This is not financial advice. All backtests are based on historical data and do not guarantee future results. Only invest what you can afford to lose.

Dominic Tschan

MSc Physics, ETH Zurich · Physics teacher · Crypto investor · Bot builder

ETH physicist who tested 200+ trading strategies on 6 years of real market data. Runs 5 tier-labeled bots — 1 on real capital, 3 paper, 1 backtest-only. Here I share everything: results, mistakes, and lessons.
