I was hoping that someone could shed some light on significant discrepancies in historical EOD (end-of-day) data obtained from various sources for FTSE100 and FTSE250 shares. The data sources were Sharescope, ADVFN, Yahoo, and Paritech.
The discrepancies are significant, e.g. a 16p difference in the High on a particular day for a 576p share, or a 15p difference in the Open for a 620p share. They do not seem to stem solely from different methods of assessing Open and Close prices, as High and Low discrepancies occur on days where neither is equal to the Open or Close.
I first noticed the anomalies when modelling a trading strategy using two different data sources, and have now done some random sampling to assess the problem. Of the four sources, only two, Sharescope and ADVFN, are close to each other. Even then, they show some variation.
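For anyone wanting to repeat the comparison, here is a minimal sketch of the kind of check I mean. It assumes each source's EOD bars have already been loaded into a dict keyed by date. The function name, the 1p threshold, the dates, and the sample prices are all invented for illustration and are not actual quotes from any of the four sources.

```python
from typing import Dict, List, Tuple

# One EOD bar: prices in pence, keyed by field name.
Bar = Dict[str, float]
FIELDS = ("open", "high", "low", "close")


def find_discrepancies(
    a: Dict[str, Bar], b: Dict[str, Bar], threshold: float = 1.0
) -> List[Tuple[str, str, float, float]]:
    """Return (date, field, price_a, price_b) wherever the two sources
    differ by at least `threshold` pence. Only dates present in both
    sources are compared."""
    out = []
    for date in sorted(a.keys() & b.keys()):
        for field in FIELDS:
            if abs(a[date][field] - b[date][field]) >= threshold:
                out.append((date, field, a[date][field], b[date][field]))
    return out


# Hypothetical sample data (made up, not real quotes).
source_a = {
    "2005-03-01": {"open": 570.0, "high": 576.0, "low": 565.0, "close": 572.0},
    "2005-03-02": {"open": 572.0, "high": 580.0, "low": 568.0, "close": 575.0},
}
source_b = {
    "2005-03-01": {"open": 570.5, "high": 592.0, "low": 565.0, "close": 572.0},
    "2005-03-02": {"open": 572.0, "high": 580.0, "low": 568.0, "close": 575.0},
    "2005-03-03": {"open": 575.0, "high": 579.0, "low": 571.0, "close": 574.0},
}

for row in find_discrepancies(source_a, source_b):
    print(row)
```

With the made-up figures above, only the 16p gap in the High on the first date is flagged. The 0.5p difference in the Open falls under the threshold, and the date that appears in only one source is skipped.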
I had presumed, obviously wrongly, that all the data was ultimately sourced from the LSE, where there would be standardised definitions of Open, High, Low, and Close. As such, there would be no scope for subjective decisions on whether, for instance, the High was the highest mid-point price or the highest Ask price during the day. Judging by the variability in the data, this is obviously not the case.
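To show how much the definition alone could matter, here is a small sketch that takes the same intraday quotes and computes the day's High on three different bases. I do not know which basis, if any, each vendor actually uses; the function name and the quote values are invented purely for illustration.

```python
from typing import List, Tuple

# Each quote is (bid, ask) in pence. Values are made up for illustration.
Quote = Tuple[float, float]


def daily_high(quotes: List[Quote], basis: str) -> float:
    """Highest price of the day under an assumed definition of 'price':
    'bid', 'ask', or 'mid' (the average of bid and ask)."""
    if basis == "bid":
        return max(bid for bid, _ in quotes)
    if basis == "ask":
        return max(ask for _, ask in quotes)
    if basis == "mid":
        return max((bid + ask) / 2 for bid, ask in quotes)
    raise ValueError(f"unknown basis: {basis!r}")


quotes = [(570.0, 572.0), (574.0, 578.0), (571.0, 573.0)]

for basis in ("bid", "mid", "ask"):
    print(basis, daily_high(quotes, basis))
```

On these invented quotes the High comes out as 574p, 576p, or 578p depending on the basis, so a spread of a few pence is enough to produce differences between sources even when they start from identical quotes. A High based on actual traded prices would be a fourth possibility, not covered here.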
I would be interested in any views on this, particularly on which data source most reliably reflects what would have happened in the real market if, for example, CFDs were being traded.
Thanks