Dealing with Survivorship Bias

Chorlton

Hello All,

I have a question about Survivorship Bias.

Currently, I am developing a system that will trade the FTSE250 but am concerned about the effect of "Survivorship Bias" that may occur during the Backtesting phase.

Consequently, I would be interested in ideas that could reduce (or ideally eliminate) this concern.

To get the "ball rolling", one idea I have is to open my universe up from the FTSE250 to the whole LSE market and, in addition, to add a liquidity filter to help me be more selective and as such reduce the number of potential trades. The current problem I have with this approach is knowing which companies I should include in this bigger universe. If I simply use all the securities as supplied by my Data Provider then I get a large number of securities with very strange ticker codes, which I am personally not familiar with :confused:

Therefore, although I need to increase my universe, I would still prefer to use a predefined list from which I could create a separate watch list. Consequently, can anyone suggest where I may be able to get hold of a list of suitable securities? One obvious idea would maybe be to use the FTSE350, but I would prefer to also add other LSE-related exchanges as well (e.g. AIM, etc.).
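For what it's worth, the kind of liquidity filter I have in mind would look something like the sketch below (Python/pandas purely for illustration; the column layout, the 60-day lookback and the £500k threshold are my own assumptions, not anything my Data Provider dictates):

import pandas as pd

def liquid_universe(prices: pd.DataFrame, volumes: pd.DataFrame,
                    min_median_value: float = 500_000, lookback: int = 60):
    """Keep tickers whose median daily traded value (price * volume)
    over the last `lookback` sessions exceeds min_median_value (in GBP)."""
    traded_value = (prices * volumes).tail(lookback)   # one column per ticker
    median_value = traded_value.median()
    return sorted(median_value[median_value >= min_median_value].index)

# Usage: prices and volumes are date-indexed DataFrames, one column per ticker.
# watch_list = liquid_universe(prices, volumes)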

All help/advice welcome.

In addition, if anyone has any thoughts on other ideas on how to reduce Survivorship Bias in this example, then I'd be most interested in those as well.

Kind Regards,

Chorlton
 
IMHO, depending on the nature of the system, survivorship bias can be a serious issue. I really don't know what you can do about it other than gradually accumulate your own database over the years. It highlights the crappy state of readily available financial data.

Re the FT250, I recall that the historical changes to the index were at one stage available on the FTSE web site for free. I can't remember the exact details or for how many years. Of course this is only a bit of the story as you still need price and/or fundamental data for tickers that have mysteriously vanished.

I was thinking about this issue today in the context of an era where individuals are expected to provide for their own retirement rather than exist on a woefully inadequate state pension. As a large proportion of such funds are presumed to come from stock market investment, the least the state could do is ensure a high-quality source of financial data is universally and freely available. It ain't so hard - set up a non-profit organization to do the job, heavily regulate it and finance it with a tiny levy on publicly listed companies. [/rant]
 

Hi DCraig,

Appreciate your comments. As you say, part of the problem is a lack of readily available, good-quality historical data, and accumulating one's own database would definitely help in the long term.

However, in the immediate/short/medium-term, I need to find some other ways to work with what I have to help reduce this risk.

Hence, I felt my idea of opening up the universe of available securities would help although I need to find some way of identifying the list in advance, so that I can create the appropriate watch-list within my charting software.
 
Opening up the universe of securities might help. One problem is that all stocks are not equal and the ways that they are not equal differs over time.

A simple example is the way small caps outperform large caps over certain periods and the reverse is true over other periods. The significance of measures of fundamentals may vary from sector to sector or industry to industry. It's really not simple at all, and I don't subscribe to the one-size-fits-all school of trading systems design - that a system should work on all securities.

I'm writing (or rather enhancing) my own stock screener and it certainly focuses the mind on some of these issues. In particular I have some cracking screens for both long and short that have been exceptionally effective over the last 12 months or so, and it is very frustrating not to be able to look at periods further back with much confidence because of the scratchy data. This is very true of short screens, because the real laggards may simply have been snuffed out of existence, but quite possibly also for many of the real growth stocks that may have been taken over, disappeared in mergers, got new tickers, been promoted to large-cap indices or whatever.
 

Chorlton,

If you look at the 350 only, you're still looking at the top 350 companies and, as such, the bias will still be relevant. The top 100 would probably bias this further, rather than improve things - after all, there's only one way into this list.

A crude method could be to take the shares in issue and the change in share price over the period in question, and calculate the market cap at the earlier time - then create your top-350 list in order of market cap. I'm sure this plan isn't bullet-proof, but if you test that list and get a marked difference, alarm bells should ring. I've sent you some data to assist in this.
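In code terms it's roughly this (a sketch only, in Python/pandas; it deliberately ignores share issues, buybacks and splits over the period, which is exactly why it isn't bullet-proof):

import pandas as pd

def top_n_by_estimated_past_cap(shares_in_issue: pd.Series,
                                price_then: pd.Series, n: int = 350):
    """Rank by estimated market cap at the earlier date: today's shares in
    issue times the share price back then (equivalent to scaling today's cap
    back by the price change), then keep the biggest n tickers."""
    est_cap_then = shares_in_issue * price_then      # both indexed by ticker
    return est_cap_then.sort_values(ascending=False).head(n).index.tolist()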

Cheers,
UTB
 
Sure. Use only forward testing.

Hello Bramble,

I do intend to forward test my strategy before committing real money to it but this would be the final step in the development process.

Therefore, any optimisation or tweaking of the system would (ideally) be performed on the in-sample data. By the end of this testing, I would require confidence in the system before exposing it to the out-of-sample data (through forward-testing).

However, at the moment, the issue of Survivorship Bias would affect my confidence in moving forward.
 
You don't have to have any exposure to forward testing. Same as back-testing, when you're convinced, in you go. Takes longer to get to lift-off, but it's certainly more effective than back-testing.
 

Bramble

I don't think anyone can argue forward testing isn't more effective than backtesting. Nonetheless, given about 20 billion trading ideas, you'd rather spend 6 months forward testing something that's worked in the past than something that's failed miserably, no?

Cheers,
UTB
 
I don't think anyone can argue forward testing isn't more effective than backtesting. Nonetheless, given about 20 billion trading ideas, you'd rather spend 6 months forward testing something that's worked in the past than something that's failed miserably, no?
Er, yes. If they were the only options.

However, my personal position is that back-testing provides greater confidence than is warranted. Testing a trading hypothesis forward (or even 20 billion trading hypotheses) provides far higher quality feedback than back-testing. And if that’s agreed, doesn’t higher quality presuppose a higher probability of success in finding those systems that do currently work? I guess that’s my point. Why spend time on an endeavour that is known to have a lower quality outcome?

I was addressing the ‘reducing Survivorship Bias’ query from Chorlton, not the process of data selection which is going to be an appropriate issue for discussion for either/both methods of testing.

The ONLY way to take Survivorship Bias into account is to forward test. You can’t with back-testing.
 

ok, I'm lost. What method do you use to limit the 20 billion trading ideas? Unless I'm missing the point (quite possibly), it will take weeks / months / years to forward test an idea. Multiply a few (or 20 billion) ideas and that's an awful lot of work for potentially no reward (that's not a lazy option, because I'd agree that forward testing of some, if not 20 billion ideas, IS required)

It's probably a sign of my weakness, but I couldn't spend the time forward testing a few ideas unless I had some confidence that such ideas had worked (at some point) in the past.

Anyone who relies purely on backtesting and isn't aware of its limitations is heading for disaster. But that doesn't mean it doesn't have its place.

Let's say I scrap my reliance on backtesting - where do I start?

Cheers,
UTB
 
I was addressing the ‘reducing Survivorship Bias’ query from Chorlton, not the process of data selection which is going to be an appropriate issue for discussion for either/both methods of testing.

The ONLY way to take Survivorship Bias into account is to forward test. You can’t with back-testing.


Well yes, you were - by seemingly ruling out backtesting as a valid concept. If we are to follow your example, surely we must first debate, then accept, that backtesting isn't worth it?

UTB
 
It should be observed that there is backtesting and then there is backtesting. If one's notion of backtesting is to grab a time series and find out if one should buy when the super duper ultra-smooth MA crosses the slower-period, even more super duper MA, then failure is almost guaranteed.

If one wanted to backtest to see if for example the stock selection criteria advocated by somebody like Jim Slater in "Beyond the Zulu Principle" provided an edge, then IMHO that could well be a worthwhile exercise.

Backtesting can be used to increase market understanding and improve one's capacity to respond better to different market conditions. If you want to forward test this sort of thing it will take a lifetime.
 
Well yes, you were - by seemingly ruling out backtesting as a valid concept. If we are to follow your example, surely we must first debate, then accept, that backtesting isn't worth it?
I have ruled out back-testing as a worthwhile endeavour for me, but that doesn't mean others have to. Nor that we need to debate it. That's already been done and of course, there will never be consensus. Thankfully.
 
ok, I'm lost. What method do you use to limit the 20 billion trading ideas? Unless I'm missing the point (quite possibly), it will take weeks / months / years to forward test an idea. Multiply a few (or 20 billion) ideas and that's an awful lot of work for potentially no reward (that's not a lazy option, because I'd agree that forward testing of some, if not 20 billion ideas, IS required)
I'm not surprised you're lost. Where did the 20 billion ideas come from? Chorlton has ONE idea and wants to identify sensible data sets to test it. Personally, I wouldn't want to limit 20 billion ideas - wouldn't want to have them in the first place either...

It's probably a sign of my weakness, but I couldn't spend the time forward testing a few ideas unless I had some confidence that such ideas had worked (at some point) in the past.
Well, it's horses for courses. Haven't you ever had an instinct and felt it worth following further rather than testing to see if it worked in the past? It may NOT have worked in the past but will now. You're only ever really interested in stuff that works now, not what did work then.

Anyone who relies purely on backtesting and isn't aware of its limitations is heading for disaster. But that doesn't mean it doesn't have its place.

Let's say I scrap my reliance on backtesting - where do I start?
With any idea that presents itself to you that you would previously have set about back-testing. Don't. Simply forward test it. You'll find ideas drop away A LOT more quickly on a forward test than on a back test and you're not really going to get snowed under anywhere near as much as you imagine.
 
I'm not surprised you're lost. Where did the 20 billion ideas come from? Chorlton has ONE idea and wants to identify sensible data sets to test it. Personally, I wouldn't want to limit 20 billion ideas - wouldn't want to have them in the first place either....

I bet most people's ideas come from some sort of backtest - a statement like "stocks that have broken new highs outperform those hitting new lows" is a backtest, isn't it? 20 billion ideas come from surfing the net (in my case :eek:), a few ideas come from weeding them down to ones that make sense, because in the past (where our intuition was developed) that type of thing had merit - backtesting, I'd suggest.


Well, it's horses for courses. Haven't you ever had an instinct and felt it worth following further rather than testing to see if it worked in the past? It may NOT have worked in the past but will now. You're only ever really interested in stuff that works now, not what did work then.

Yes, I'd put far more store by recent and forward-testing performance, but again I'm still left with too many ideas to forward test.

With any idea that presents itself to you that you would previously have set about back-testing. Don't. Simply forward test it. You'll find ideas drop away A LOT more quickly on a forward test than on a back test and you're not really going to get snowed under anywhere near as much as you imagine.

You make a good point here. I'm always brought back to the same point - too many ideas and knowing where to start. Meanwhile I'm forward testing my current strategies all the time. But you're undoubtedly more advanced in your trading career and/or have a stronger intuition for what works. Either way, I can't lose the bug....... just yet :)

Cheers,
Rich
 
Hello again,

Firstly, Thanks to all who have contributed to date.

After reading through the posts and the ongoing discussion, a thought has occurred to me. Consequently, I'd like to play devil's advocate for a moment and ask the following question regarding my current situation:

HOW IMPORTANT IS THE ISSUE OF SURVIVORSHIP BIAS WHILE BACKTESTING?

Let me explain why I'm now questioning this:

Firstly:

I have used the current FTSE250 list of stocks as my universe for developing my strategy. Based on these particular securities, I have a strategy which I am potentially happy with (i.e. the R:R, expectancy, etc. are within acceptable limits as defined by myself).

The definition of Survivorship Bias (found through Google) is: The tendency for failed companies to be excluded from performance studies due to the fact that they no longer exist. Survivorship bias causes the results of some studies to skew higher because only companies which were successful enough to survive until the end of the period are included.

Now in my case, whether I use the current FTSE250 or the FTSE250 from the start of my backtesting period (in my case 1998), those stocks would have had to perform well enough to get into the FTSE250 in the first place, and therefore at some point each of the stocks making up the FTSE250 would have been a well-performing stock.

Now the issue of Survivorship Bias arises when a stock suddenly becomes a weak/poor performer, resulting in it being removed or delisted, which could potentially affect the performance of my System.

However, if this were to happen then it would not happen overnight. Instead, the stock would exhibit specific characteristics first, such as the SP entering a downward trend.

Now, as I have a “Weekly” System, I would be confident that any under-performing stocks entering such a downward trend (which had been previously identified by my system as a Buy) would be kicked out relatively quickly, as a result of my Exit conditions.

As a result, the weaker stocks would be naturally removed and the stronger performers would remain.
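Just to make that concrete, the sort of downtrend exit I mean would look something like this (an illustrative sketch only, in Python/pandas; the 30-week moving average is an arbitrary example rather than a statement of my actual rule):

import pandas as pd

def downtrend_exit(weekly_close: pd.Series, ma_weeks: int = 30) -> bool:
    """Flag an open position for exit once the latest weekly close drops
    below its ma_weeks-week simple moving average."""
    ma = weekly_close.rolling(ma_weeks).mean()
    return bool(weekly_close.iloc[-1] < ma.iloc[-1])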

So surely the "key" question is: what is the difference between a stock which was originally a Buy, then exited the system and was subsequently delisted, compared to one which simply met my Exit condition but remained within the FTSE250?

Well, if it was delisted it would be removed from the FTSE250 and could no longer be traded. However, it would be replaced by another stock entering the FTSE250, which could potentially offer the same trading opportunities as the previous one did prior to being delisted. In contrast, a stock which simply met my Exit condition could be traded again (if the Buy condition were met).

So what's the difference??? In my opinion, nothing.

Therefore, the only way to really test for Survivorship Bias would be through Forward-Testing, which Bramble and others have already suggested. If my strategy is sufficiently "robust" then stocks which do become weak-performers in the future should be quickly removed while the better-performers are left to grow.

This in turn would suggest that the length of time over which the Walk-Forward testing is carried out is important. From my limited experience with System Development, I would suggest that a lot of people use around 75% of their available "historical" data to backtest an idea and the other 25% for forward-testing purposes (these percentages are only rough guides). However, it would seem that this ratio should be the other way around, or at least 50/50?
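To put the split itself in concrete terms, it is just a chronological cut of the data, something like the sketch below (Python/pandas for illustration; weekly_bars is only a placeholder name, and the in-sample fraction is the very parameter under discussion):

import pandas as pd

def split_in_out_of_sample(data: pd.DataFrame, in_sample_frac: float = 0.75):
    """Split a date-ordered DataFrame into an in-sample portion for
    development and an out-of-sample portion held back for testing."""
    cut = int(len(data) * in_sample_frac)
    return data.iloc[:cut], data.iloc[cut:]

# e.g. in_sample, out_of_sample = split_in_out_of_sample(weekly_bars, 0.5)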

I'd be interested in others' views on this. Apologies for the long post, but I wanted to capture my current thought-process on this.

regards,

Chorlton
 
Take an index and add its constituents to your portfolio on the date they were added to the index. Remove them from your portfolio when they were removed from the index (or not). This removes survivorship bias, but you still need to find data for the ones that have been delisted. There are databases for purchase that do include delisted stocks. They are very expensive.
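In outline it's something like this (a sketch in Python/pandas; it assumes you already hold a table of constituent change dates, and you still need price history for the delisted names):

import pandas as pd

def point_in_time_members(changes: pd.DataFrame, date) -> set:
    """Index members on a given date, from a table of constituent changes
    with columns: ticker, added, removed (removed is NaT while still in)."""
    added = changes["added"] <= date
    still_in = changes["removed"].isna() | (changes["removed"] > date)
    return set(changes.loc[added & still_in, "ticker"])

# On each rebalance date, trade only point_in_time_members(changes, date).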

jj
 
With any idea that presents itself to you that you would previously have set about back-testing. Don't. Simply forward test it. You'll find ideas drop away A LOT more quickly on a forward test than on a back test and you're not really going to get snowed under anywhere near as much as you imagine.

Hello Bramble,

Can you elaborate on this by maybe offering a real-life example as to how you would carry this out?

For example, would you use more recent "historical" data for the walk-forward testing and, if so, what data period would you deem sufficient to provide a statistically acceptable level of confidence in a particular strategy idea?

In addition, what kind of test/s would you perform during this walk-forward testing period? Paper trading the actual strategy, using Monte Carlo analysis, etc. etc.?


Regards,

Chorlton
 
Hello again,

So surely the "key" question is: what is the difference between a stock which was originally a Buy, then exited the system and was subsequently delisted, compared to one which simply met my Exit condition but remained within the FTSE250?

The difference is that in the past there would have been more chance of coming across "weak" stocks, because those weak stocks have since disappeared from your universe without a trace. Your testing universe is the universe of survivors.

I think you are on the right track, though, in looking at the specifics of the system to get a handle on the significance of survivorship bias. It will surely affect different types of strategies to a greater or lesser extent.

As for how to partition your data set, 75/25 or 50/50, I cannot see that there can be any hard and fast rules. If market conditions change, then no partitioning rule can save you. Probably better to test over bull market periods, bear market periods, high volatility, low volatility, through market shocks such as 9/11, etc. Better to try and stress the system than to rely on data-set partitioning to gain confidence.
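As a sketch of what I mean (the dates and labels are only examples), it is just a matter of summarising the same backtest over hand-picked regimes rather than relying on one arbitrary split:

import pandas as pd

def results_by_regime(trade_returns: pd.Series, regimes: dict) -> pd.DataFrame:
    """Summarise per-trade returns over labelled date ranges, e.g.
    {'dot-com bear': ('2000-03-01', '2003-03-31'), 'low vol': (...), ...}."""
    rows = {}
    for name, (start, end) in regimes.items():
        r = trade_returns.loc[start:end]          # returns indexed by trade date
        rows[name] = {"trades": len(r),
                      "win_rate": (r > 0).mean(),
                      "avg_return": r.mean()}
    return pd.DataFrame(rows).T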
 