Will the real volume please stand up!

Stevoswing

I hope someone might be able to help me.

I noticed quite a long time ago that volume data for stock indices quoted by different online providers were often not the same. In a week in June I did a sample audit for FTSE 100, Nasdaq 100, Dow Jones Industrial Average and S&P 500. I tracked the volume data from a selection of available sources (all free except Updata):

FTSE 100: Updata/Comstock (UKX), Yahoo (^FTSE), FT.com (FTSE:FSI), Bloomberg (UKX:IND)
Nasdaq 100: Updata/Comstock (IXND), Yahoo (^NDX), Bloomberg (NDX:IND)
DJIA: Updata/Comstock (INDU), Bigcharts (INDU or DJIA), Yahoo (^DJI), FT.com (DJI:DJI), Google (INDEXDJX:DJI), Bloomberg (INDU:IND)
S&P 500: Updata/Comstock (S500), Bigcharts (SP500), Yahoo (^GSPC), FT.com (SPX:WCB), Google (INDEXSP:INX), Bloomberg (SPX:IND)

I found the following:

FTSE 100
Yahoo & FT identical, Bloomberg 4% lower, Comstock 7% lower

Nasdaq 100
Comstock data is intermittent (e.g. none since 20th June), so there is nothing to compare Yahoo with

DJIA
Bigcharts and FT identical, Bloomberg virtually the same, Google 0.2% lower, Comstock 0.8% lower (and, when available, the same value as Comstock quotes for the S&P 500), but Yahoo 28 times higher, with the same value as it quotes for the S&P 500!

S&P 500
FT and Google within 4% of each other, Bloomberg 75% lower and Yahoo 55% higher; Comstock data is intermittent (e.g. none since 20th June).

In summary, FTSE 100 volume data are reasonably consistent, Nasdaq 100 volume data are sparse, DJIA has a cluster of consistent data but also some completely inconsistent data, and S&P 500 volume data are not at all consistent.
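For anyone wanting to repeat this kind of audit, here is a minimal sketch of how the percentage differences could be tabulated. The volume figures are made up purely for illustration (they are not the numbers I recorded); they are chosen only to mirror the FTSE 100 pattern above, and the function name `pct_diff_vs_reference` is hypothetical.

```python
def pct_diff_vs_reference(volumes, reference):
    """Return each source's volume as a % difference from the reference source."""
    base = volumes[reference]
    return {src: round(100.0 * (vol - base) / base, 1) for src, vol in volumes.items()}


# Hypothetical end-of-day volumes for one index on one day (illustrative only).
ftse_volumes = {
    "Yahoo": 1_000_000,
    "FT": 1_000_000,
    "Bloomberg": 960_000,
    "Comstock": 930_000,
}

diffs = pct_diff_vs_reference(ftse_volumes, "Yahoo")
for source, diff in diffs.items():
    print(f"{source:10s} {diff:+.1f}%")
```

Picking one provider as the reference is arbitrary; it only makes the relative gaps easy to read.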

None of this inspires confidence in using volume analysis to support price analysis. Unfortunately, the one source I use for analysis (and pay for) is Updata, whose data provider is Comstock, yet their volume data is intermittent, and when it does arrive for the Dow and S&P it often quotes the same value for both!

Which leads to a couple of questions someone may be able to help me with.

1) Why do different sources quote different volume figures (even though they might be within a few percent)?

2) Where is the true volume data?

Thanks in advance for any assistance.

Steve
 
1. Why do different sources quote different volume figures (even though they might be within a few percent)?

In the case of US equity index volumes, US equities are traded over multiple venues (e.g. NYSE, AMEX, NASDAQ, BATS, PSE, BX, etc.). Partly as a result of the complexity of this fragmentation, not all trades get reported in real time as they occur. Furthermore, there are some trades (e.g. large block trades between block dealers and their customers) where late reporting is permitted in any case. The full picture of trading activity gets built up over time, after the fact.

So, depending on which “snapshot” each of your data sources has relied on, the data will be different. Those that report a snapshot of the real-time feed will provide the best approximation of what a trader (or automated trading system) would have experienced in real time. Those that report a snapshot from a later time (after all the late trades have been consolidated) will provide the best picture of overall activity (although this is not a picture that would have been available in real time). Be aware, though, that both kinds of source are susceptible to errors arising from trades reported out of sequence (“bad ticks”).
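To make the “snapshot” point concrete, here is a minimal sketch with invented trades. The `Trade` class, the `volume_as_of` function and the times and sizes are all assumptions for illustration; real feeds carry far more detail.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    executed: str  # execution time as zero-padded "HH:MM" (hypothetical format)
    reported: str  # time the trade appeared on the feed
    size: int      # shares traded


def volume_as_of(trades, snapshot):
    """Total volume visible to a provider that takes its snapshot at `snapshot`."""
    # Zero-padded "HH:MM" strings compare correctly as plain strings.
    return sum(t.size for t in trades if t.reported <= snapshot)


trades = [
    Trade("15:58", "15:58", 500),
    Trade("15:59", "15:59", 300),
    Trade("15:30", "16:45", 10_000),  # block trade reported late
]

at_close = volume_as_of(trades, "16:00")  # what the real-time feed showed at the close
later = volume_as_of(trades, "18:00")     # after late reports were consolidated
print(at_close, later)
```

Two providers reading the same trades would quote 800 and 10,800 here, and neither would be “wrong”; they simply took their snapshots at different times.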


2. Where is the true volume data?

Do you mean the volume data that would have been seen in real time, or the full picture that was available later? See above for an explanation of where these differences arise.

In either case, if your interest is TA or designing/backtesting systematic strategies (where data is the main input for your analysis, and where the garbage-in-garbage-out maxim applies!), the safest place to start is often tick data, which is then “cleaned” (and possibly “compressed”, too) before reconstructing price bars, etc.
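As a rough sketch of what “clean, then reconstruct” could look like, the code below applies a very simple filter (drop non-positive sizes and prices far from the median of recent accepted prices) and then sums volume into bars. This is not the method from the paper linked below, just an illustrative filter; the function names, the 5% threshold, the window length and the tick values are all assumptions.

```python
from collections import defaultdict
from statistics import median


def clean_ticks(ticks, max_dev=0.05, window=5):
    """Drop ticks with non-positive size/price, or a price more than `max_dev`
    (as a fraction) away from the median of the last `window` accepted prices.
    Note: the first valid tick is accepted unconditionally."""
    accepted = []
    for ts, price, size in ticks:
        if size <= 0 or price <= 0:
            continue
        recent = [p for _, p, _ in accepted[-window:]]
        if recent:
            ref = median(recent)
            if abs(price - ref) / ref > max_dev:
                continue  # treat as a bad tick
        accepted.append((ts, price, size))
    return accepted


def volume_bars(ticks, bar_len=5):
    """Sum tick sizes per bar; `ts` is minutes since the open (hypothetical convention)."""
    bars = defaultdict(int)
    for ts, _, size in ticks:
        bars[ts // bar_len * bar_len] += size
    return dict(bars)


# Invented ticks: (minutes since open, price, size)
raw = [(0, 100.0, 200), (1, 100.2, 100), (2, 10.02, 5000), (6, 100.1, 300), (7, 100.3, 0)]
clean = clean_ticks(raw)
bars = volume_bars(clean)
print(bars)
```

In this toy data a single mispriced tick would have inflated the first bar's volume from 300 to 5,300, which is the sort of distortion cleaning is meant to catch before bars are built.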

Here’s a paper on tick data cleaning:
More on Bad Ticks - Financial Tick Cleaning, Compression and Migration
 
Tickmaster

Thank you very much for taking the time to answer my questions.

My questions related to end of day data.

I can now see why various data providers might provide slightly different volume figures (within a few percent), but struggle to see how the differences can be as great as the 50%+ variances that I observed, unless certain providers don't collect data from every single one of the trading platforms/venues.

I thought this issue might be the reason, and hence my second question about the best source of "true (end of day) volume data" to support price-volume analysis. If you had to go to one single source for such purposes, which would you go to?

Thanks in advance for any further assistance.
Steve
PS. The paper link didn't work.
 
Tickmaster
... If you had to go to one single source for such purposes, which would you go to?...

I would go to volume data that had been reconstructed (bad ticks removed, between clearly specified times, etc) from tick data in a way that I understood.

Tickmaster
... PS. The paper link didn't work...

Sorry the link didn't work. Here it is again with http:// omitted from the start. I tried the link a few moments ago, and it seemed fine then...
tickmaster.webs.com/apps/documents/