In fact, for the ExplosivePip strategy there are only 3 conditions and 2 outputs, so we would need only 16 bins, and 1000 samples (about 62 per bin on average) are probably enough.

However, in the pipMaximizer strategy there are 9 conditions and 2 outputs, so we would need 256 bins. With 1000 samples we would have only about 4 counts per bin on average, so the estimate would probably be poor if the pdf is at all spread out.
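As a rough illustration of the counting approach, here is a minimal Python sketch. It assumes every condition is binary and the output takes two values, which is what gives 2^3 x 2 = 16 bins for three conditions. The function names and the randomly generated samples are made up for the example and are not taken from either strategy's code.

```python
import random
from collections import Counter

def histogram_pdf(samples):
    """Estimate a joint pdf by counting.

    Each sample is a (conditions, output) pair, where conditions is a tuple
    of 0/1 flags. Bins that never occur are simply absent from the result.
    """
    counts = Counter(samples)
    total = len(samples)
    return {key: c / total for key, c in counts.items()}

def average_count_per_bin(n_samples, n_bins):
    """Mean number of samples falling in each bin."""
    return n_samples / n_bins

# Hypothetical data: 1000 samples of 3 binary conditions and 1 binary output.
rng = random.Random(0)
samples = [
    (tuple(rng.randint(0, 1) for _ in range(3)), rng.randint(0, 1))
    for _ in range(1000)
]
pdf = histogram_pdf(samples)
print(len(pdf), "occupied bins out of", 2 ** 3 * 2)

print(average_count_per_bin(1000, 16))   # 62.5
print(average_count_per_bin(1000, 256))  # 3.90625
```

With about 4 counts per bin, the relative error of a single bin's count is of order 1/sqrt(4), or roughly 50%, which is why the 256-bin estimate looks doubtful.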

It would be interesting to compare the performance of an SVM with that of a histogram, at least for the ExplosivePip strategy.

We might simply use more samples and return to counting things, but then we might run up against the inherent non-stationarity in FX series. By the time we got to 25600 samples (about 100 per bin for 256 bins) we might have lost the local pdf, and find only a pdf averaged over too much time. So maybe the SVM is a good idea after all. It will be a particularly good idea if we ever get to continuous inputs and outputs.
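The trade-off between window length and non-stationarity can be sketched with a toy series. The Python below is only an illustration: the function name `windowed_frequency` and the artificial series, whose behaviour flips halfway through, are assumptions for the example and not real FX data.

```python
def windowed_frequency(outcomes, window):
    """Fraction of 1s among the most recent `window` outcomes.

    If `window` exceeds the series length, the whole series is used.
    """
    recent = outcomes[-window:]
    return sum(recent) / len(recent)

# Toy non-stationary series: the outcome is always 0, then always 1.
outcomes = [0] * 100 + [1] * 100

print(windowed_frequency(outcomes, 50))   # 1.0  (short window tracks the current regime)
print(windowed_frequency(outcomes, 200))  # 0.5  (long window averages over both regimes)
```

The short window recovers the local behaviour but has few samples per bin. The long window has plenty of samples but describes a mixture that never actually occurred.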

Incidentally, I have been trying to understand the hasline.m code. It seems to me that it counts the number of bars that cross a line defined by the mid of a bar. This is not the same as the article referenced in the code comments, where the idea is to find price lines with the minimum number of crossings. Furthermore, the count is taken over the period from 1 to t-1, which grows with t. So as t increases the count will tend to increase, and so will the likelihood that it exceeds M, until after a couple of hundred samples the indicator seems very likely to be 1 and to carry no information. Can someone check me on this please?
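To make the concern easier to check, here is a minimal Python sketch of the behaviour as described above. It is not a port of hasline.m. The function names, the threshold M = 20, and the synthetic random-walk bars are all assumptions for the example, and "crossing" is taken to mean that an earlier bar's low-to-high range contains the mid of bar t.

```python
import random

def crossing_counts(highs, lows):
    """For each bar t, count the earlier bars (0 .. t-1) whose range
    [low, high] contains the mid price of bar t."""
    counts = []
    for t in range(len(highs)):
        mid = (highs[t] + lows[t]) / 2.0
        n = sum(1 for k in range(t) if lows[k] <= mid <= highs[k])
        counts.append(n)
    return counts

def indicator(counts, M):
    """1 where the crossing count exceeds M, else 0."""
    return [1 if c > M else 0 for c in counts]

# Hypothetical bars: a Gaussian random walk with a fixed bar height of 1.
rng = random.Random(1)
price, highs, lows = 100.0, [], []
for _ in range(500):
    price += rng.gauss(0.0, 1.0)
    highs.append(price + 0.5)
    lows.append(price - 0.5)

flags = indicator(crossing_counts(highs, lows), M=20)
print("fraction of 1s, first 100 bars:", sum(flags[:100]) / 100)
print("fraction of 1s, last 100 bars: ", sum(flags[-100:]) / 100)
```

Because the lookback runs from the first bar to t-1, the count can only draw on more bars as t grows. Comparing the two printed fractions on real data, or restricting the inner loop to a fixed-length lookback, would show whether the indicator really does saturate.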