Dealing with Survivorship Bias

As for how to partition your data set, 75%/25% or 50%/50%, I cannot see that there can be any hard and fast rules. If market conditions change, then no partitioning rule can save you. Probably better to test over bull market periods, bear market periods, high volatility, low volatility, and through market shocks such as 9/11. Better to try to stress the system than to rely on data set partitioning to gain confidence.

Hello DCraig,

Totally agree with your comments! With my particular system, my backtesting has been completed on data from 1998 to 2004 (inclusive). The reasoning for this was that during this period the FTSE experienced a ranging market, a bear market and a bull market, each lasting around the same amount of time (i.e. about 2 years in each phase).

Consequently, my reasoning is that if my system performs equally well over the entire testing period (which captures all these different phases), then the overall strategy should be robust enough to deal with future market corrections.
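
One way to check the "performs equally well" part is to break the backtest results down by phase rather than looking only at the total. The sketch below is only an illustration: the phase boundaries, the trade list and the names `PHASES`, `phase_of` and `summarise_by_phase` are all made up for the example and are not the real FTSE dates or anyone's actual results.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical phase boundaries: placeholders only, NOT the real FTSE dates.
PHASES = [
    ("ranging", date(1998, 1, 1), date(2000, 3, 31)),
    ("bear", date(2000, 4, 1), date(2002, 8, 31)),
    ("bull", date(2002, 9, 1), date(2004, 12, 31)),
]


def phase_of(day):
    """Return the name of the phase containing `day`, or None if outside all phases."""
    for name, start, end in PHASES:
        if start <= day <= end:
            return name
    return None


def summarise_by_phase(trades):
    """trades: iterable of (exit_date, pct_return).

    Returns {phase: (number_of_trades, mean_return, win_rate)}.
    """
    buckets = defaultdict(list)
    for exit_date, pct in trades:
        buckets[phase_of(exit_date)].append(pct)
    return {
        name: (len(rets), mean(rets), sum(r > 0 for r in rets) / len(rets))
        for name, rets in buckets.items()
    }


# Made-up trades, purely to show the shape of the output.
sample_trades = [
    (date(1998, 6, 1), 2.0), (date(1999, 2, 1), -1.0),
    (date(2001, 5, 1), 3.0), (date(2001, 9, 20), -4.0),
    (date(2003, 7, 1), 5.0), (date(2004, 3, 1), 1.0),
]
summary = summarise_by_phase(sample_trades)
for name, (n, avg, win_rate) in summary.items():
    print(f"{name}: trades={n} mean={avg:.2f}% win_rate={win_rate:.0%}")
```

If one phase carries most of the profit, a healthy total over the whole period says less about robustness than it appears to.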

What would be your view on this? Does this sound like a reasonable approach, or am I missing something important?

Thanks in advance,

Chorlton
 
Sounds reasonable to me.

One other thing that might (or might not) be useful is to look at the trades (i.e. the stocks) and see what industries they were in, what was driving them at the time, and so on. It might give a bit more insight.
 
Can you elaborate on this by maybe offering a real-life example as to how you would carry this out?

For example, would you use more recent "historical" data for the walk-forward testing and if so, what data period would you deem sufficient to provide a statistically acceptable level of confidence in a particular strategy idea?
I wouldn’t use ANY historical data. Let’s take a totally hypothetical scenario involving indicators (shudder). I have an idea that when the MACD is above the zero line and a gap appears between the histogram and the MA line (clear air turbulence) AND the Stochastic is above the 80 level and turning down, it’s a good time to take a Short with a tight stop. Reverse all of the above for a Long.

I pull up a chart in whatever TF I think this jiggery pokery is going to work in and I watch it, over a random sample of the instrument set I imagine it might work for. If I get more positive occurrences than negative, I carry on watching. I keep a tally of +ve to -ve occurrences. If I’m of the view it’s a goer after N occurrences, I’ll demo trade it for a while. If that seems to pull a profit, I’ll trade it.

The ‘for a while’ and ‘N occurrences’ will depend on you and the TF involved. Anything on a 5 min TF will give you more opportunities for data collection than a Daily over any given parcel of absolute time.
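
For a rough feel of how large N needs to be before a +ve/-ve tally means much, one option is a simple binomial check: how likely is a tally at least this good if the setup were really a coin flip? This is only a sketch under loud assumptions: occurrences are treated as independent, each one counts as a plain win or loss regardless of size, and the function name `binomial_tail` is just made up for the example.

```python
from math import comb


def binomial_tail(wins, n, p=0.5):
    """Probability of at least `wins` successes in `n` independent trials,
    each succeeding with probability `p` (0.5 = coin flip, i.e. no edge)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(wins, n + 1))


# Same 60%-ish hit rate, very different strength of evidence.
p_small = binomial_tail(7, 10)    # 7 +ve out of 10 occurrences, about 0.17
p_large = binomial_tail(60, 100)  # 60 +ve out of 100 occurrences
print(f"7 of 10:   {p_small:.3f}")
print(f"60 of 100: {p_large:.3f}")
```

A tally of 7 out of 10 turns up by luck fairly often, which is why a handful of occurrences on a long TF says very little either way.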

Which brings us to where you are, I believe, Chorlton. I get the sense you’re working with a relatively long TF and possibly even on Fundamental rather than Technical aspects. If I were in that position, my approach would be just the same as above; it would just take longer. And the Survivorship Bias would be adequately handled, because you absorb losers and winners in precisely the proportions in which they occur in reality.

In addition, what kind of test/s would you perform during this walk-forward testing period? Paper trading the actual strategy, using Monte Carlo analysis, etc.?
None. Just trade execution characteristics and profit potential.
 
Hello Bramble,

Appreciate your comments.

I'm working with a Weekly TF so, although I can understand your approach, I don't believe I would be comfortable with it, given the length of time for which I would have to monitor the trades in order to arrive at a "statistically" viable amount of data to confirm whether the strategy had an "edge".

Instead, I would require some form of confidence in the strategy beforehand so as to limit the amount of "real-time" monitoring I would have to do. Which I guess leads me back to backtesting & Forwardtesting using Historical data. :confused:

Btw: My System does not use indicators and is 100% mechanical with no reference to Fundamentals.....

Regards,

Chorlton
 
Chorlton,

If you look at the 350 only, you're still looking at the top 350 companies and, as such, the bias will still be relevant. The top 100 would probably bias this further rather than improve things; after all, there's only one way into this list.

A crude method could be to take the shares in issue and the change in share price over the periods in question, calculate the market cap at the earlier time, and then create your top 350 list in order of market cap. I'm sure this plan isn't bullet-proof but if you test that list and get a marked difference, alarm bells should ring. I've sent you some data to assist in this.
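
A minimal sketch of that crude method, to save doing each one by hand. It assumes the number of shares in issue has not changed over the period (part of what makes it crude), and it can only rank companies you still have data for, so anything delisted in the meantime is still missing. The function name `estimate_past_top`, the tickers and all the figures are invented for illustration.

```python
def estimate_past_top(stocks, top_n=350):
    """Rank tickers by estimated market cap at the earlier date.

    stocks: dict mapping ticker -> (shares_in_issue, price_now, pct_change),
    where pct_change is the percentage price change from the earlier date
    to now (must be greater than -100).
    ASSUMPTION: shares in issue were the same at the earlier date.
    """
    caps_then = {}
    for ticker, (shares, price_now, pct_change) in stocks.items():
        price_then = price_now / (1 + pct_change / 100)  # undo the price change
        caps_then[ticker] = shares * price_then
    ranked = sorted(caps_then, key=caps_then.get, reverse=True)
    return ranked[:top_n]


# Made-up figures: BBB is small today but was the largest at the earlier date.
stocks = {
    "AAA": (100, 2.0, 100.0),  # doubled in price since then
    "BBB": (50, 3.0, -50.0),   # halved in price since then
    "CCC": (200, 1.0, 0.0),    # unchanged
}
top = estimate_past_top(stocks, top_n=2)
print(top)
```

Testing the system on a list built this way and on the current list, then comparing the two sets of results, is the "alarm bells" check described above.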

Cheers,
UTB


Thanks Blades,

This is an approach I haven't yet considered.....

If I could find a way of finding a list which contains this information (rather than having to calculate each one by hand) that would be v. helpful. ;)
 
I completely fail to see any distinction between this sort of "forward test" and using historical data. By the time you have drawn some empirically supported conclusion, the data has become historical data anyway.
 
..................... With my particular system, my backtesting has been completed on data from 1998 to 2004 (inc). The reasoning for this was during this period, the FTSE experienced a Ranging Mrkt, A Bear Mrkt and a Bull Mrkt, all of around the same amount of time (ie. 2 years in each phase)......................

Chorlton,

I'm a bit late entering into this debate, but I'm not quite clear what you're up to :confused:

From what you've said in various posts I take it that you're trading constituent stocks of the FTSE250? If that's right then, to avoid survivorship bias, your 1998 - 2004 backtesting would need to have taken account of every constituent change during that period (by revising the constituent list to ensure that your entry set-up criteria threw up all stocks that were in the 250 at the time). If you've just taken the current (or back end of 2004) constituent list, then survivorship bias would be an issue, and a more pronounced one if your strategy is associated with downtrend reversal, since those that didn't reverse may have just gone down and out of the 250.
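
To make "taking account of every constituent change" concrete, here is a rough sketch that rebuilds the constituent list as it stood on any given date from a starting list plus a log of additions and removals. The tickers, dates and the function name `constituents_on` are hypothetical; a real change log would have to come from the index provider or a data vendor.

```python
from datetime import date


def constituents_on(as_of, initial, changes):
    """Return the set of index members on `as_of`.

    initial: tickers in the index at the start of the test period.
    changes: list of (effective_date, added_tickers, removed_tickers).
    """
    members = set(initial)
    for when, added, removed in sorted(changes, key=lambda c: c[0]):
        if when > as_of:
            break  # later changes had not happened yet
        members -= set(removed)
        members |= set(added)
    return members


# Hypothetical index of three stocks with two reshuffles.
initial = {"AAA", "BBB", "CCC"}
changes = [
    (date(2000, 6, 19), ["DDD"], ["BBB"]),
    (date(2002, 9, 23), ["EEE"], ["CCC"]),
]
mid_period = constituents_on(date(2001, 1, 1), initial, changes)
end_period = constituents_on(date(2004, 12, 31), initial, changes)
print(sorted(mid_period))
print(sorted(end_period))
```

A backtest that uses only the end-of-period list never sees BBB or CCC, which in this made-up example are exactly the stocks that dropped out.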

So far as the backtest and forwardtest debate is concerned, I think both have their place. The biggest difficulty is the impact of "market shocks". One could say that any news driven movement is a "market shock" to a greater or lesser degree, but I'm thinking of the severe ones (each can define "severe" for themselves ;)). No-one would doubt the severe market shock of 9/11, for example, where inclusion would give a rosy glow to any backtest whether long or short (longs would have been unlikely to get out at the assumed stoploss level). I might have winked, but defining "severe" and what to include or exclude in the overall test results is not an easy matter.

good trading

jon
 
There is a tendency to curve fit historical data. To see relationships that don’t exist. The brain has a bias to filter out those data which do not confirm one’s hypothesis and to focus on those that do, often giving an impression of greater reliability than is physically inherent within the reality.

When you work forwards with data and your hypothesis, always on the right hand edge, there is less of a tendency to discount the data which are contraindications of your hypothesis.

Yes, of course, after N time periods from starting your forward test you have effectively got a bunch of historical data, but it has been reviewed in RT rather than retrospectively and can in principle be safely discarded.

The information delivery is quite different in these two processes.
 
At the risk of going around in circles... you're describing a procedure that I think we'd all follow. This bit isn't up for debate, I think.

You open up by saying (to paraphrase) "I have an idea that X might work"... now your intuition/skill/experience (call it what you like) might be powerful enough and accurate enough to avoid lots of blind alleys. The backtesting comes in (for others) because if this were a viable method, surely it will have worked at least recently in the past? Certainly if it hadn't, I wouldn't have the inclination to forward test it.

UTB
 
Quite agree about curve fitting, and the tendency to just "not see" that which contradicts one's hypothesis. Which is why so much stuff where a few charts are shown to support some claim to be able to forecast future behavior is just so much garbage.

Backtesting requires real discipline and market knowledge to do properly. An example of the worst of the worst was the Motley Fool's variation on the Dogs of the Dow, which rejected the second highest yielding stock because that gave better historical results and for no other reason.

It is worth observing that what value there is in undirected data mining is likely to shrink rather rapidly given the computing power now available to cover vast search spaces in a very reasonable period of time. If I could think of a good use for it, I could put together a Linux compute cluster in my garage without breaking the bank. If I can, then "anybody" can.
 
I completely agree with this bit. I have zero doubt that I could come up with 10 "profitable" looking strategies by the end of today by back-testing alone. I have next to zero confidence that they'd work out as tested. That said, I could do further work such as re-optimising, walk-forward tests on historical data and Monte Carlo analysis to increase that confidence. Again, I agree on the limitations of backtesting alone.
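
As one example of what Monte Carlo analysis can mean here, a common simple version is to shuffle the order of the backtested trades many times and look at the spread of drawdowns that results. This is only a sketch of that one variant, not a description of any particular package; it assumes trades are independent of one another, sums returns rather than compounding them, and uses made-up trade figures and function names.

```python
import random


def max_drawdown(returns):
    """Largest peak-to-trough fall in the running sum of returns."""
    equity = peak = worst = 0.0
    for r in returns:
        equity += r
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def monte_carlo_drawdowns(returns, runs=1000, seed=0):
    """Shuffle the trade order `runs` times; return the sorted max drawdowns."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    results = []
    for _ in range(runs):
        shuffled = list(returns)
        rng.shuffle(shuffled)
        results.append(max_drawdown(shuffled))
    return sorted(results)


# Made-up per-trade returns in percent.
trade_returns = [2.0, -1.0, 3.0, -4.0, 5.0, 1.0, -2.0, 1.5]
drawdowns = monte_carlo_drawdowns(trade_returns)
as_tested = max_drawdown(trade_returns)
pct95 = drawdowns[int(0.95 * len(drawdowns))]
print(f"as tested: {as_tested:.1f}%  95th percentile over shuffles: {pct95:.1f}%")
```

If the as-tested drawdown sits well below most of the shuffled ones, the original trade sequence was probably a kind one.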

UTB
 
if this were a viable method, surely it will have worked at least recently in the past?
Why would you choose to assume that?

When you get an intuition/instinct whatever for a possible edge, there is absolutely no requirement for it to have proved itself in the past, recent or distant, for it to be worth considering going forward.

You may be right about going round in circles and I suspect we're muddying Chorlton's greater purpose here, so I'm happy to continue the debate if there's mileage in it for anyone (after all, it's just my view and the way I work, which may not work for anyone else), but perhaps on a new thread?
 
Whatever formal tests one may apply to either in-sample or out-of-sample results, the common sense test is pretty important: is there a reason that can be ascribed to the purported market characteristic that is to be exploited? "Causality" is important.
 
Yes, we should continue elsewhere. Final point (here): I must be missing the point, but I can't comprehend that I'd have a brainwave of an idea (in the way you suggest) that would just happen to start working at that point. :(

I'll leave it here for the reasons you suggest.

dcraig1 - I can't add to your rep but your points about the common sense test and causality are spot on.

UTB
 
Gents,

Just a quick "Thank You" to all those who contributed to this thread. Although we began to move away from the initial question near the end, the overall discussion raised some interesting points worth considering in the future!

Cheers & Happy Trading....

Chorlton
 
Hello Guys,

If I may I'd just like to ask one further question.

As already discussed, I have carried out my system testing / development using a predefined list of stocks as my universe which just happens to be the current constituents of the FTSE250. Whether this was a good or bad approach from a Survivorship Bias point of view has already been discussed.

However, my question is: when I do turn the system on and begin "live" trading, which of the following approaches should I adopt?

1. Continue to only use these particular stocks to trade with. After all, they are the exact ones on which the testing / development of the system was based. If they disappear from the current FTSE250 I will still continue to trade them, regardless of whether they move up into the FTSE100 or downwards.

2. Trade only the current FTSE250 list AND keep updating the list as old ones drop out and new stocks appear.


Just interested in others' views....

Thanks in advance,

Chorlton
 