Since I did not follow this in a previous thread, could you please elucidate?

Sure.

I am refining an automated pairs trading system which scans the prices of every potential pair from a universe of stocks (S&P 500 in this case).

It then reduces these down to those which have a correlation above x% over y days and above a% over b days (where b < y).
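As a rough illustration of the scan and the two-window filter, here is a minimal Python sketch. It is not the VBA/Java code described in this thread. The function names, the parameter names (`y_days`, `x_min`, `b_days`, `a_min`) and the choice to correlate raw price levels rather than returns are all assumptions made for the example. Note that 500 stocks give 500 × 499 / 2 = 124,750 candidate pairs, which is why the scan is heavy.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def correlated_pairs(prices, y_days, x_min, b_days, a_min):
    """Keep pairs correlated above x_min over y_days AND above a_min over b_days.

    prices: dict of ticker -> list of daily closes, oldest first, equal lengths.
    Correlating price levels is an assumption; returns could be used instead.
    """
    if b_days >= y_days:
        raise ValueError("the short window b must be shorter than the long window y")
    kept = []
    for t1, t2 in combinations(sorted(prices), 2):
        p1, p2 = prices[t1], prices[t2]
        r_long = pearson(p1[-y_days:], p2[-y_days:])
        r_short = pearson(p1[-b_days:], p2[-b_days:])
        if r_long > x_min and r_short > a_min:
            kept.append((t1, t2, r_long, r_short))
    return kept


# Toy data, invented for the example
toy = {
    "A": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "B": [2, 4, 6, 8, 10, 12, 14, 16, 18, 21],
    "C": [10, 9, 8, 7, 6, 5, 4, 3, 2, 1],
}
survivors = correlated_pairs(toy, y_days=10, x_min=0.9, b_days=5, a_min=0.9)
```

On the toy data only the A/B pair survives, since C moves against both of the others.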

Then it filters these pairs down to those whose spread has a line of best fit with a gradient between m and n (kept low, as trending spreads are perceived to be riskier).
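The gradient test can be sketched as an ordinary least-squares fit of the spread against the day index. This is only an illustration: how the spread itself is built (price ratio, price difference, or something else) is not stated in the thread, so the function simply takes a spread series as input. The names `best_fit_line` and `flat_enough` are made up for the example.

```python
def best_fit_line(series):
    """Least-squares slope and intercept of series against day index 0..n-1."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_t = (n - 1) / 2
    mean_s = sum(series) / n
    stt = sum((t - mean_t) ** 2 for t in range(n))
    sts = sum((t - mean_t) * (s - mean_s) for t, s in enumerate(series))
    slope = sts / stt
    intercept = mean_s - slope * mean_t
    return slope, intercept


def flat_enough(spread, m, n):
    """True if the fitted gradient of the spread lies between m and n inclusive."""
    slope, _ = best_fit_line(spread)
    return m <= slope <= n


# A spread drifting by 2 per day is rejected; a nearly flat one passes.
trending = [1.0, 3.0, 5.0, 7.0]
sideways = [1.0, 1.1, 0.9, 1.0]
```

For instance, `flat_enough(sideways, -0.05, 0.05)` is true while `flat_enough(trending, -0.05, 0.05)` is false. The slope is in spread units per day, so m and n have to be chosen on that scale.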

Finally it assesses the previous speed of mean reversion and attributes probabilities to mean reversion of diverged spreads and ranks these.

The probabilities are based on the historic distribution of that spread and a few technical indicators.
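One simple way to turn a spread's history into a reversion probability is to count how often past divergences of a given size came back to the mean within a fixed number of days. The sketch below does exactly that and nothing more: the technical indicators mentioned above are left out, `threshold` and `horizon` are hypothetical parameters, and the mean is taken over the whole sample, which flatters a backtest (a rolling mean would be more honest).

```python
def reversion_probability(spread, threshold, horizon):
    """Empirical chance that a divergence of at least `threshold` from the mean
    touched or crossed the mean within `horizon` days.

    Returns None if the history contains no divergence that large.
    """
    mean = sum(spread) / len(spread)
    trials = hits = 0
    for i in range(len(spread) - horizon):
        dev = spread[i] - mean
        if abs(dev) < threshold:
            continue  # not diverged enough on this day
        trials += 1
        future = spread[i + 1 : i + 1 + horizon]
        # Reverted if any later value sits on the mean or on the other side of it
        if any((f - mean) * dev <= 0 for f in future):
            hits += 1
    return hits / trials if trials else None


# Invented histories: one snaps back every time, one lingers
choppy = [2, 0, -2, 0, 2, 0, -2, 0]
sticky = [3, 3, 3, -3, -3, -3]
```

Candidate spreads could then be ranked by this figure, highest first.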

Once this is done the program outputs (or will; it is not quite there yet) which spreads to buy and sell and where to place the stop.

As the system is based on the supposition that historic mean reversion / correlation will persist, it uses tight stops (so the loss when the relationship does not hold is half the gain when it does).

This way gains are always twice the size of losses and occur more frequently.
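The arithmetic behind a 2:1 gain-to-loss ratio is easy to check. The sketch below ignores commissions, slippage and gaps through the stop, and the function names are made up for the example.

```python
def expectancy(win_prob, gain, loss):
    """Average profit per trade for a given win rate, gain size and loss size."""
    return win_prob * gain - (1 - win_prob) * loss


def breakeven_win_prob(gain, loss):
    """Win rate at which the expectancy is exactly zero."""
    return loss / (gain + loss)


# Gains twice the size of losses, measured in units of one loss
edge_at_half = expectancy(0.5, gain=2.0, loss=1.0)
needed = breakeven_win_prob(gain=2.0, loss=1.0)
```

With gains twice the size of losses the break-even win rate is one in three, so anything that wins more often than it loses has a comfortable margin on paper.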

It has taken time to get to this but I am delighted with the results! It remains computationally intensive and needs refinement but I am close!

gbr128 said: This is exactly the kind of system I had been hoping to develop, but haven't had the opportunity yet as I've been working full time and also part time so no time left!

Are there any more details you're willing to post?

I am happy to divulge a certain amount, but not 100% as it works so well!

What would you like to know?

So I guess my question is how you calculate your line of best fit? And do you then use that as the trigger point or do you use the mean of the spread?

What product do you use to scan?

How do you populate that scanner with stocks in the first place (i.e. on what basis/criteria do YOU decide to enter a stock for future processing, or is that automated too)?

And how do you determine what is a potential pair on a dynamic basis?



For a comprehensive discussion of how to calculate the line of best fit see http://www.statsoftinc.com/textbook/stmulreg.html

The slope of the line of best fit is an entry requirement: a spread is only included in my universe of candidates if its slope qualifies. Trigger points are then generated from deviations from the line that meet a few basic technical criteria.
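To make "deviations from the line" concrete, here is one possible reading in Python: fit the line, measure how far the latest spread value sits from it in units of the residual standard deviation, and flag it beyond a cutoff. The cutoff `k` and the function name are assumptions for illustration, and the technical criteria mentioned above are not modelled.

```python
from math import sqrt


def line_deviation_signal(spread, k=2.0):
    """Return 'sell spread', 'buy spread' or 'none' for the latest observation.

    The spread is regressed on the day index; the last residual is divided by
    the residual standard deviation and compared with the cutoff k (assumed).
    """
    n = len(spread)
    if n < 3:
        raise ValueError("need at least three points")
    mean_t = (n - 1) / 2
    mean_s = sum(spread) / n
    stt = sum((t - mean_t) ** 2 for t in range(n))
    sts = sum((t - mean_t) * (s - mean_s) for t, s in enumerate(spread))
    slope = sts / stt
    intercept = mean_s - slope * mean_t
    residuals = [s - (intercept + slope * t) for t, s in enumerate(spread)]
    sd = sqrt(sum(r * r for r in residuals) / (n - 2))
    if sd == 0:
        return "none"  # perfectly straight history, nothing has diverged
    z = residuals[-1] / sd
    if z > k:
        return "sell spread"  # spread is far above its line
    if z < -k:
        return "buy spread"  # spread is far below its line
    return "none"


# Invented series: flat for nine days, then a jump on the last day
spike_up = [0, 0, 0, 0, 0, 0, 0, 0, 0, 10]
spike_down = [0, 0, 0, 0, 0, 0, 0, 0, 0, -10]
```

Because the latest point is included in the fit it drags the line towards itself, so in practice one might fit on the history excluding the latest few days.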

scans the prices of every potential pair from a universe of stocks

I am building my own product in Java. It is not finished yet; at the moment it is just a VBA application.

What I did was fill two workbooks with the last 5 years of daily price data for the S&P 500 (a pain in the arse; I tried doing it manually from Datastream at first before finding that Bloomberg does it really quickly).

The stocks entered for processing are just those in the S&P 500.

Doing it dynamically is currently very clumsy as I have to re-enter the last 5 years of data every day direct from Bloomberg. Once my Java application is up and running I will use a live feed into a SQL database, and the data in there will be continuously scanned on a dedicated PC.
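The point of the database is that only new closes need adding each day, instead of reloading five years of history. A minimal sketch of that idea follows, using Python's built-in SQLite purely as a stand-in for whatever SQL database ends up being used. The table layout and function names are hypothetical, and the feed itself is not shown.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a price store with one row per ticker per day."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prices ("
        " ticker TEXT NOT NULL,"
        " day TEXT NOT NULL,"  # ISO date string, e.g. '2005-03-01'
        " close REAL NOT NULL,"
        " PRIMARY KEY (ticker, day))"
    )
    return conn


def add_closes(conn, rows):
    """Insert (ticker, day, close) rows; a repeated ticker/day overwrites the old value."""
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO prices (ticker, day, close) VALUES (?, ?, ?)",
            rows,
        )


def load_series(conn, ticker, since_day):
    """Closes for one ticker from since_day onwards, oldest first."""
    cur = conn.execute(
        "SELECT close FROM prices WHERE ticker = ? AND day >= ? ORDER BY day",
        (ticker, since_day),
    )
    return [row[0] for row in cur]


# Invented ticker and prices
store = open_store()
add_closes(store, [("XYZ", "2005-03-01", 10.0), ("XYZ", "2005-03-02", 10.5)])
add_closes(store, [("XYZ", "2005-03-02", 10.6), ("XYZ", "2005-03-03", 10.4)])
recent = load_series(store, "XYZ", "2005-03-02")
```

The scanner can then pull whatever window it needs with `load_series` rather than holding everything in workbooks.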

I am still exploring the relative merits of sector neutrality. Being sector neutral offers fewer triggers but they are right more often. We will see what the backtest shows up when it is complete...

