rolling your own

temptrader

I don't know if this has been covered but I have a silly problem:

1) I only use price and volume
2) I trade the DOW at the moment

I really, really don't care about all the fancy cr*p that some charting packages have. I'm running Ensign at the moment (and at $40 a month it's OK) with a live feed. Point is, all I want is Price and Volume, so why should I be paying for the rest (and on subscription too)? So I'm thinking about writing my own charting package for my needs (I can program quite well; I used to work as a programmer but left to trade full time). I am considering this for the following reasons:

1) Sometimes the updates for Ensign have bugs (the cheek of it; could you imagine releasing updates for mission-critical software with bugs?), and I've lost on trades because of this. Not something I want to have to go through.
2) I want to have more control over how my charts look - in fact, total control.
3) Most of the charting packages are bloated to death, adding to their complexity, which in turn means more potential bugs.
4) Why should I be paying for add-ons, indicators and fancy lights that I never have any intention of using?

So, how much grief would it be for me to write my own? What APIs and other resources will I need to read up on?
 
If you use IB, consider Sierra. It's only $10/mo, and the datafeed -- if IB -- is free.

Db
 
You could even use Excel with IB, so that would cost you nothing as long as you do $30 in commissions with IB per month, which is not much.


Paul
 
Or QuoteTracker at $7 per month. Or free if you can cope with the ads.
 
It would be a lot of work to do your own. More than you anticipate. As always the devil is in the detail and as always with such things the detail only starts to become apparent when you get your hands dirty.

I've developed my own real-time charting and IMHO it is pretty good, very capable and quite high-performance, but it was a lot of work. The advantage is that the code base becomes reusable for things other than charting, such as market scanners, real-time alerts, auto trading and so on. That is when the payback really starts to kick in. You can also achieve almost complete vendor independence and fairly easily adapt to different brokers, feeds and markets.
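
A minimal sketch of the vendor-independence idea (the names here are invented for illustration - this is not my actual code, nor any vendor's API):

// Hypothetical sketch: charts, scanners and alerts all code against
// these interfaces; only the adapter classes know about IB TWS,
// OpenTick or whatever feed sits behind them.
interface Tick {
    String symbol();
    long timeMillis();
    double price();
    long size();
}

interface TickListener {
    void onTick(Tick tick);
}

interface QuoteFeed {
    void connect() throws java.io.IOException;
    void subscribe(String symbol, TickListener listener);
    void disconnect();
}

Swapping vendors is then just a matter of writing another QuoteFeed implementation; nothing above it changes.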

I was originally motivated by disgust with eSignal and the crappy scripting language. I started with just a little prototype and played around with it for a while to get a feel for the magnitude of the problem.

If all you want is some charts of the DOW, I would categorically say, get QuoteTracker or similar. It is not worth the work to roll your own.

If you do want to do it, my recommendation is to do it in Java. Several reasons:

1. Easier and faster to develop in than C++. Probably more robust - at least when coded by programmers of average ability. If you use C++, get and use a "memory abuse checker" such as Purify, Insure++, BoundsChecker, or the free Valgrind or Electric Fence.

2. Sufficient performance. I can easily run 50 or more real time charts concurrently with my Java code. The amount of rubbish talked about Java performance is unbelievable.

3. Stable, robust and mature environment.

4. Cross platform. I run on Linux - free, robust, and secure. It's much easier to maintain a reliable *nix machine or machines than Windows boxes: no reinstall-the-operating-system crap every 6 months, no need to run virus checkers. Just stick it behind a hardware firewall, turn off unnecessary services (the default on modern distros) and you are pretty safe.

5. Lotsa free Java library code available. Java is the No. 1 language on SourceForge.
In particular, for charting there is the JFreeChart library from http://www.jfree.org. It's pretty good and very flexible - you can get your charts to look like whatever you want. JFreeChart is extensively used in many Java applications. It's not really designed for real time, but it's possible to bend it to the task with a little work.
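
A bare-bones live chart with JFreeChart looks something like this - just a sketch, assuming Java 8+ and a recent JFreeChart (with older releases you may need new TimeSeries("Last", Second.class)):

import javax.swing.JFrame;
import javax.swing.Timer;
import org.jfree.chart.ChartFactory;
import org.jfree.chart.ChartPanel;
import org.jfree.chart.JFreeChart;
import org.jfree.data.time.Second;
import org.jfree.data.time.TimeSeries;
import org.jfree.data.time.TimeSeriesCollection;

public class LiveChart {
    public static void main(String[] args) {
        TimeSeries prices = new TimeSeries("Last");
        JFreeChart chart = ChartFactory.createTimeSeriesChart(
                "DOW", "Time", "Price",
                new TimeSeriesCollection(prices), false, false, false);

        JFrame frame = new JFrame("LiveChart");
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.setContentPane(new ChartPanel(chart));
        frame.pack();
        frame.setVisible(true);

        // Fake a tick stream with a random walk. In real use the feed
        // adapter's callback would call addOrUpdate() instead. A
        // javax.swing.Timer fires on the event dispatch thread, which
        // keeps Swing happy.
        final double[] last = {13000.0};
        new Timer(250, e -> {
            last[0] += (Math.random() - 0.5) * 5;
            prices.addOrUpdate(new Second(), last[0]);
        }).start();
    }
}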


There are also a number of free open-source trading applications written in Java that you might be able to use or modify for your purposes. Do a search over on ET. Somebody posted a comprehensive list fairly recently.
 
Dcraig,

That's really impressive. You're still dependent on a price feed, though - how does this translate to reality? Don't you still have to go with the eSignals, etc, of this world?

Out of curiosity, which file (a DLL?) is the conduit, so to speak, through which prices are sent to your desktop apps (from an internet-accessed quote vendor)?

In 1997 I could run double on a standard PC under NT (simultaneously Reuters, Excel, two charting programmes, around 30 charts) compared to the bloated software/systems of today.

Grant.
 

The only feeds I'm using at the moment are IB TWS and OpenTick. I did write a bit of code to use the QuoteTracker HTTP interface, but I thought it was pretty trashy, so I abandoned it.

Yes, you are stuck with the quote vendors. To integrate with Windows-only stuff I would have to write some sort of bridge and run the bridge on a Windows box - something I hope I won't need to do. DTN have some sort of Java API, but I think it's a wrapper around COM, and I'd rather not, thank you very much. eSignal have two APIs but only one available to retail traders - the desktop API - and some very unkind things have been said about it.

If you want to pony up the $, Comstock have a proper native Java API, but it costs. Maybe I will look into this sometime down the track.

For the cost ($4 per month for all US stocks and options, real-time L1 and unlimited history), OpenTick is unbelievable value. I like its approach: API source code and protocol spec freely available. I recently converted from the OpenTick Java API to one developed by one of their users. A fine bit of work it is too - better designed and very robust. This is the strength of open source: if the vendor is not fixing it, somebody else will.

OpenTick real time seems to lag a little behind IB, but it's not bad at all. There is also RTD support for Excel available, so you can get your hands dirty for next to no cost. It also has historical book data, and I have not heard of that being available anywhere else.

No DLLs involved. (On *nix systems the equivalent is shared objects - *.so files.) Just library code that talks to either the quote provider's servers (as in OpenTick) or a local gateway (TWS).
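
The shape of that library code is much the same everywhere: a reader thread owns the connection and fires ticks at listeners. A rough sketch - the line-based wire format here is invented for illustration; the real TWS and OpenTick protocols are considerably more involved:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.Socket;

// Sketch of the "library code talks to a local gateway" pattern.
class GatewayClient implements Runnable {
    interface TickHandler {
        void onTick(String symbol, double price, long size);
    }

    private final String host;
    private final int port;
    private final TickHandler handler;

    GatewayClient(String host, int port, TickHandler handler) {
        this.host = host;
        this.port = port;
        this.handler = handler;
    }

    public void run() {
        try (Socket sock = new Socket(host, port);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(sock.getInputStream()))) {
            String line;
            // Pretend each line is "symbol,price,size"
            while ((line = in.readLine()) != null) {
                String[] f = line.split(",");
                handler.onTick(f[0], Double.parseDouble(f[1]),
                               Long.parseLong(f[2]));
            }
        } catch (IOException e) {
            e.printStackTrace(); // real code would reconnect with backoff
        }
    }
}

You would start it on its own thread, eg new Thread(new GatewayClient("127.0.0.1", 7496, handler)).start() - 7496 being the port TWS listens on by default.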

When it comes to bloat, I'm the king. It makes no sense these days to design for small memory use; I always trade off more memory for faster execution or faster development whenever possible. An 8 gig desktop is very reasonably priced, and if you want more you can get a server-type motherboard and suitable memory. Having said that, I have just 2 GB in each of my 2 machines. I'm looking to put 6 GB in one, but there seems to be some problem with the 4 GB of Corsair I just bought. Incidentally, you need a 64-bit OS for 4+ GB. I use 64-bit Ubuntu.

Overall, I'm well pleased with what I have developed, but the amount of work has been enormous. It's coming together now, with proper integration of the EOD scanner, RT scanner, real-time sorting of quote sheets (Radar Screen type stuff, but a lot faster) and so on. I'm gradually reworking the user interface, which is a bit horrible due to my inexperience on this side of things (my background was almost solely server-side development).
 
Dcraig,

I thought memory/motherboard configurations only allowed matching pairs of memory - 2x1, 2x2, 2x4 (if available). I didn't think 6 GB was possible.

I'm using XP Pro 64-bit, with an AMD 64 4800 processor and 2 GB of memory. Will I see any improvement (in Excel) with 4 GB? I've also got a spare identical hard drive. Should I fit that as an array to improve performance? It'll be interesting to see how the new AMD quad core shapes up against Intel.

I remember hearing quite recently of a UK university using hundreds of old 486 chips (?) to build a massive PC utilising parallel processing. Apparently it's like a supercomputer.

Grant.
 
My intended config was 2 x 1 GB + 2 x 2 GB. I think it should work, but I try not to have to know too much about PC hardware, so I could be wrong. In any case I want to bring the memory up to 4 x 2 GB and use the spare 2 gig elsewhere. The box is unstable with just the 2 x 2 GB of Corsair, so it's probably defective memory. I'll flash the BIOS to the latest rev before complaining, just in case.

A RAID 0 array will improve disk IO, but you increase your risk of hardware failure: if either disk fails, you are gone. Probably not a good idea for the disk where your OS lives.

For reliability you can use RAID 0+1 - striping plus mirroring - but you need 4 disks. Or RAID 5 for redundancy and possible performance improvements, but you need 3 disks. I have RAID 1 (mirroring) for my database server, using Linux software RAID. The latter probably outperforms most hardware RAID, and especially the crappy so-called RAID controllers on a lot of motherboards.

Impossible to say whether more memory will improve Excel performance. If the hard disk activity light is on a lot when spreadsheets are 'active', eg recalculating, you need more memory. Otherwise it may not help much.

Yes, there are quite a few of these massively parallel systems around, though I doubt the wisdom of using a bunch of old 486s when you factor in the cost of power, cooling and space. One Intel quad-core Q6600 will do the work of quite a few 486s. Typically these systems need specialised programming - you can't just dump an ordinary application on them and expect it to perform any differently than on a single box. A lot of these types of systems run Linux. Microsoft has tried to muscle in, but Linux (and some other *nixes) completely dominate.
 
dbphoenix:

cost is not really an issue; it's just that I don't like being tied to a piece of software whose features I hardly use. The analogy of using a sledgehammer to crack a peanut is not too far off. I note what dcraig1 says about bloat, but I HATE bloat (probably because I'm an amateur mathematician and I don't like unnecessary things that should not be there). I remember one upgrade of Ensign that resulted in one chart freezing (it was the S+P; I actually use only 4 charts, 2 for the S+P and 2 for the DOW), and this caused me to lose some money on trades before I realised what was going on. That is totally, and utterly, unacceptable. I called them up and they gave me a fix, but it shouldn't happen in the first place, because we are dealing with real money here (I'm not paper trading). How would you feel if some software company rolled out updates to their life-support machine software before testing?

Paul:

I don't even want to go anywhere near Excel. The most amazingly bloated piece of crap there is. Did you know it's got a 2^16 (= 65,536) row limit? This in the age of what we have with computers now.

I do more than $100 in commissions with IB (I've only been with them for 6 weeks now), so getting their data is not an issue. I still choose to use DTN IQFeed simply because I want a second opinion. You ask why? Well, let's say:

1) there have been times when IQFeed has stalled (it was a problem with their servers)
2) very rarely IB's prices have stalled too - it only happens for maybe 10-20 seconds or so, but it has happened

so a second opinion is always a good thing in my book.

dcraig1:

thanks for the reply. I can program in Java - but I'd rather not if I can help it. I hate all this object-oriented crap (call me a dinosaur); it's just a good excuse to get more people into programming and speed up productivity, but quality goes down the drain. I take it you use AWT or Swing? I'm thinking of using Mac OS X and developing under Xcode in plain C. This might not be feasible, since most of the datafeed APIs are Java-based. I would like to know where I can get some example prototype programs to get started.

I can deal with the fact that it's going to be a lot of work, since I've done stuff that was a lot of work in the past.
 
dbphoenix:

cost is not really an issue; it's just that I don't like being tied to a piece of software whose features I hardly use.

That's why I suggested the standard version of Sierra. But if even that's too much, it would not suit your needs.

Db
 
On programming languages:

If you have some "C" experience take a look at "D": http://www.digitalmars.com/d/index.html

It's a very little-known language, but easier to use than C (and especially easier than C++) with similar performance. It can call any C (but not C++) function located in a DLL. Much nicer than Java, and with no virtual-machine overhead.

It has classes etc., but you don't have to do OO programming.
 
gc1, I really don't care about the language; it's just a way of getting the job done. Since more and more data vendors are moving over to Java-based APIs, it looks like I have no choice.

Java's a nice language - and I'm not here for a language discussion, since that would just be incredibly boring and tedious - and if that's what most of the brokers/datafeed providers are using, I probably have no choice but to submit.
 
Dcraig,

My knowledge is basic but I’m 95% certain there has to be consistency re memory:

2x512MB, 2x1GB, 2x4GB (or 4x multiples). I think 4 memory slots is the max on current motherboards, but I stand to be corrected.

Temptrader,

The retail end of the quote-system market is not Reuters or Bloomberg; I reckon you get what you pay for (I pay £58 per month for mine).

Hasn't Excel 2007 greatly increased the number of columns and rows? Excel is bloated; I'm surprised no one has produced an alternative "lite" version. I'm sure it wouldn't be difficult for a suitably qualified programmer.

Grant.
 

32-bit computers have a 4 GB memory limit because of 32-bit memory addressing (2^30 bytes is 1 GB (1024 x 1024 x 1024), hence 2^32 bytes is 4 GB). However, some operating system/hardware combinations can get around this with wider physical addressing - Intel's PAE, for example, extends physical addresses to 36 bits (the company I used to work for ran 32-bit Linux on a large machine with 32 GB of memory!!).

For personal computers 4 GB is plenty. If you need to go for more, run a 64-bit version of the operating system. XP has a 64-bit version, and most newer computers running Core 2 Duo support 64-bit.

I'm not so well versed in all this since I left work a few months ago; if in doubt, ask around places that make custom PCs.
 
temptrader,

Thank you for the technical clarification re memory.

Perhaps you could explain – briefly and simply if possible – the difficulties of producing greater memory addressing?

Re your previous company, what were they doing? Why the enormous memory but only 32-bit?

To reiterate, I run an AMD 64 dual core on XP 64 with 2 GB of memory. With the exception of DDEs, there is no difference between what I was doing when I first used spreadsheets (1990, no hard disk, 64k of memory, can't remember the processor) and now, except that I need (relatively) massive computing power to do the same thing.

And what is the priority for spreadsheets - processor, memory, motherboard? Is it the case that, while calculating, the processor moves information in and out of memory? If so, while the processor may be adequate, the memory may not be; similarly, if there are too many calculations, the processor slows and the large memory is left waiting. So what we are seeing are, in effect, bottlenecks.

Dcraig,

Re the 486s, this was a purely academic - as opposed to practical - exercise. Still fascinating. I remember seeing a TV programme about a home-based trader in New York. He did the same thing, which resulted in a PC whose power far exceeded anything available at the non-corporate level.

Grant.
 
32-bit memory addressing gives you a 4 GB limit. 64-bit memory addressing gives you a 2^34 GB limit (2^64 bytes / 2^30 bytes per GB = 2^34 GB, ie 16 exabytes), which, to all intents and purposes, is more than enough for anyone - not even all the supercomputers on this earth combined have that amount of memory.

The row limit on an Excel spreadsheet is not a hardware issue, it's a software issue.

You say there is no difference? What else are you running? Since you are running true 64-bit, you should be able to use more memory. Ask the people you bought your motherboard from how much more memory you can take; I'm fairly sure you can get over the 4 GB limit. I'd like to know what kind of computations you are doing, on how many rows, and how complex the columns are.

There's also another issue with memory: the operating system can elect to write to virtual memory (a partition or file on the hard drive allocated solely for this), slowing access down even though there is sufficient physical memory available, which is why you see plenty of disc activity when the machine gets memory-intensive. I'm sure there is some way to turn this off now, but it was never in the control of the programmers; it was an OS thing.

The effect you describe, of there not being much difference from the old days, is exactly why I am thinking of writing my own charting package, because that's what you get when the OS and other software are bloated to death. It's an economic thing really: if MS Excel has more features it sells to more people, and third-party companies come along introducing new stuff that they want, etc. It's like the spec book for Blu-ray: the original spec manual agreed amongst manufacturers for DVD was probably about an inch thick; for Blu-ray it's probably about 6"!! - because all the players want to put forward their suggestions, which may or may not benefit the consumer.

Your last point about processor slowdown is basically the wrong end of the stick. We are moving to more cores, not faster clock speeds. This means that you can do MORE at the same speed, but it does not necessarily mean that things get faster. A normal Pentium 4 running Excel will perform about the same as a dual-core Pentium running Excel, but you can get away with running extra programs on the latter without affecting performance much.

To get a major increase in speed in the age of multicores you have to change the way the software is written, or have specialised optimising compilers, and that's another topic altogether.
 
Temptrader,

Here's an example of the calcs/formulas I'm using on an Excel sheet. This is from a table with numerous cells/columns covering the last 45 minutes (45 being the approximate number of rows visible on my screen).

cell T3: =SUMIF(B$3:B$20000,S3,I$3:I$20000)

The ranges refer to tick data – time, bid, ask, etc. Data in column B is Time (of trade), in column I it is Buy volume. S3 refers to a specific time, eg 9:45.

The formula returns the volume of trades at (eg) 9:45. The next cell (T4) will return the same for 9:46, and so on.

My interest here is not the actual volume but whether it is increasing or decreasing, so in U4 I have T4-T3 to give the net change.

Maybe you could give an opinion on this with regard to the above. I am using two cells to calculate a figure. Is it more processor/memory-efficient to use just one cell to achieve the same result? Eg the net change in one cell would be given by:

=SUMIF(B$3:B$20000,S4,I$3:I$20000)-SUMIF(B$3:B$20000,S3,I$3:I$20000)

Grant.
 
sorry for the late reply - very, very busy.

the problem with Excel is that each cell takes up memory and comes (and I'm just assuming here, from a programming point of view) with a lot of "baggage" (since you can reference cells from other cells and assign functions and styles to them), all of which means a complex data structure for each cell. So even if a cell holds just an integer (say a 64-bit integer), it uses more than the 64 bits themselves (because extra is needed for all the other crap like location, style, any embedded script/function/formula, etc), and this for each and EVERY cell. So when you do your calculation it has to make multiple calls/retrievals etc., slowing things down considerably.

Yes, using fewer cells speeds things up. Also, you are summing, and summing is cheap (computationally, that is). If you really want to speed up what you are doing, save the data in its own dedicated format and write a program to do it; it will be many, many times faster.

How big are your sums? Are we summing 10,000 different cells? 1,000,000? If you are dealing with that order, it's best to go the programming route. Most charting packages save data in their own file formats (not Excel, although they let you export to it) for efficiency of calculation where it's required. Excel is great for presentation and for small datasets, but it's best to write your own if you need to process lots of data as quickly as possible.
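
For instance, the SUMIF table above collapses to one linear pass once the ticks are out of Excel. A sketch in Java (the timestamps and volumes are made up; assume the ticks have already been parsed into time/volume pairs):

import java.util.Map;
import java.util.TreeMap;

public class VolumeBins {
    public static void main(String[] args) {
        // (time in millis since midnight, buy volume) pairs -
        // stand-ins for real tick data
        long[][] ticks = {
            {585 * 60_000L + 5_000, 300},   // 09:45:05
            {585 * 60_000L + 40_000, 200},  // 09:45:40
            {586 * 60_000L + 10_000, 700},  // 09:46:10
        };

        // Sum buy volume into one bucket per minute (column T's job)
        Map<Long, Long> perMinute = new TreeMap<>();
        for (long[] t : ticks) {
            perMinute.merge(t[0] / 60_000, t[1], Long::sum);
        }

        // Minute-on-minute net change (column U's job)
        Long prev = null;
        for (Map.Entry<Long, Long> e : perMinute.entrySet()) {
            long net = (prev == null) ? 0 : e.getValue() - prev;
            System.out.printf("%02d:%02d vol=%d net=%+d%n",
                    e.getKey() / 60, e.getKey() % 60,
                    e.getValue(), net);
            prev = e.getValue();
        }
    }
}

A pass over a few thousand ticks like this takes next to no time at all, whereas Excel recalculating 20,000-row SUMIFs for every visible minute does vastly more work for the same answer.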
 