Prop Trading: Learning to backtest

Since back testing has become an increasingly important task for trading assistants at some of the more directional prop firms, how does an Economics major (econometrics focus + statistics minor) such as myself learn how to back test futures?

I've got three more years of school and I'm intent on making those years productive.

1- What software do traders at prop firms usually use to back test (mainly interested in futures and equities)?

2- If I were to learn one or two coding languages from scratch in the pursuit of learning to back test, what should I focus on? I was thinking of C++ and EasyLanguage (TradeStation).

Thanks in advance for the help; it's much appreciated.
 
Testing the backtest

Hello everyone. I'm an algo writer, and I'm in the backtesting period of my first finished algo. I'm getting about 4% a month on penny stocks (below $8 per share, high-volume stocks, commission = 0). I wrote a Python backtesting routine (it's a spaghetti mess, but I want to devote my time to the algos, not the implementation). I think the routine is decently strict: on each one-minute bar it fills at most half of the volume traded, so an order fills completely only when twice its volume has traded, and only if the order was placed at least one minute earlier. Here's the backtesting program; your flames/advice are welcome.

I wanted to ask you a favor. I think I might be blind to the algo's faults, so I'd like to invite you to give me a period of time and a stock (hi-vol pennies work best; once we get to $15 per share, the algo does no better than random), and I'll return a report with transactions and returns.
Well, thanks for any response,
J
Note about the code: this is only the backtesting routine. The actual decision engine is referred to as 'de'.

Code:
import re
import csv
import os
import sys
import time
import de

def start_operation(operation,ticktime):#operation=[symbol,'long',goal,stoploss,thislatestindex,timeout, best_tunnel,living_tunnels]
	global open_orders,ordNum, cash_held_up_by_orders,testedparam
	symbol=operation[0]
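	if operation[1]=='long':
		side=1#1 = buy to open
	elif operation[1]=='short':
		side=5#5 = sell short
	else:
		raise ValueError("unrecognized operation side: %r" % (operation[1],))#otherwise 'side' would be undefined below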
	goal=operation[2]
	stoploss=operation[3]
	timeout_in_minutes=operation[5]
	short_equity=0
	for i in positions:
		if positions[i][0]<0:
			short_equity=short_equity-positions[i][0]*positions[i][1]*2
	afforded=min(cash-short_equity,1000000)/float(goal)#float, not int: int() truncates the price (and is zero below $1)
	ordQty=afforded-afforded%100
	if ordQty>0:
		ordID='MO('+str(testedparam)+')'+ticktime+'_'+str(goal)+'_'+str(stoploss)
		thisorder={'ordID':ordID,'ordType':'market','ordSide':side,'ordSymbol':symbol,'ordQty':ordQty,'ordTime':ticktime,'ordTimeout':time.mktime(time.strptime(ticktime.split('.')[0],'%Y%m%d-%H:%M:%S'))+timeout_in_minutes*60}
		send_order(ordID)
		open_orders[thisorder['ordID']]=thisorder
def send_order(ordID):
	pass
def cancel_order(ordID):
	pass
def simulate(symbol, cell):#an order entry length of 5 means the order has not been processed. Higher length means it's alive
	global open_orders,positions,cash,days,account_value
	symbol_matches=False
	to_delete=[]
	to_cl=[]
	ticktime=cell[0].replace('-','')+'-'+cell[1]+':00.000'
	will_place_limitOrd=[]
	if symbol in positions:
		positions[symbol][1]=float(cell[5])
	for i in open_orders:
		if i[:2]== 'MO' and open_orders[i]['ordSymbol'] == symbol:#market order to open position
			volume_traded=int(cell[6])/2#I divide by 2 to harden the simulation
			t=open_orders[i]
			ordQty=int(t['ordQty'])
			ordSymbol=t['ordSymbol']
			if t['ordSide']==1:
				direction=1
			elif t['ordSide']==5:
				direction=-1
			else:
				print "unrecognized order side",t
				raw_input()
			price=float(cell[5])
			if ordSymbol not in positions:
				positions[ordSymbol]=[0,price]
			if 'filled' not in t:
				open_orders[i]['filled']=0
			transacted=min(volume_traded,ordQty-open_orders[i]['filled'])
			positions[ordSymbol][0]=positions[ordSymbol][0]+transacted*direction
			cash=cash-transacted*direction*price
			open_orders[i]['filled']=open_orders[i]['filled']+transacted
			if open_orders[i]['filled']==t['ordQty']:
				to_delete.append(i)
				to_cl.append([])
				will_place_limitOrd.append([open_orders[i],price])
		if i[:2]== 'CL' and open_orders[i]['ordSymbol'] == symbol:#limit order to close position
			if (float(cell[5]) >= open_orders[i]['ordPrice']  and open_orders[i]['ordSide'] == 2) \
			or (float(cell[5]) <= open_orders[i]['ordPrice']  and open_orders[i]['ordSide'] == 1) \
			or (float(cell[5]) <= open_orders[i]['ordStop'] and open_orders[i]['ordSide'] == 2) \
			or (float(cell[5]) >= open_orders[i]['ordStop'] and open_orders[i]['ordSide'] == 1) \
			or time.mktime(time.strptime(ticktime.split('.')[0],'%Y%m%d-%H:%M:%S'))>open_orders[i]['ordTimeout']:
				horizon=(time.mktime(time.strptime(ticktime.split('.')[0],'%Y%m%d-%H:%M:%S'))-time.mktime(time.strptime(open_orders[i]['ordTime'].split('.')[0],'%Y%m%d-%H:%M:%S')))/60
				openprice=float(open_orders[i]['ordID'].split('_')[3])
				volume_traded=int(cell[6])/2#I divide by 2 to harden the simulation
				t=open_orders[i]
				ordQty=int(t['ordQty'])
				ordSymbol=t['ordSymbol']
				if t['ordSide']==1:
					direction=1
				elif t['ordSide']==2:
					direction=-1
				else:
					print "unrecognized order side",t
					raw_input()
				price=float(cell[5])
				#determine whether this is a win or a loss:
				global daylongwins,daylonglosses,dayshortwins,dayshortlosses,accumlongprofit,accumlonglosses,accumshortprofit,accumshortlosses
				openprice=float(t['ordID'].split('_')[3])
				invested=ordQty*openprice+commission
				if open_orders[i]['ordSide'] == 2:
					earned=(price-openprice)*ordQty
					if earned>0:
						accumlongprofit=accumlongprofit+earned
					elif earned<0:
						accumlonglosses=accumlonglosses+earned
					if price > openprice+commission/ordQty:
						result='longwin'
						daylongwins=daylongwins+1
					else:
						result = 'longloss'
						daylonglosses=daylonglosses+1
				elif open_orders[i]['ordSide'] == 1:
					earned=(openprice-price)*ordQty
					if earned>0:
						accumshortprofit=accumshortprofit+earned
					elif earned<0:
						accumshortlosses=accumshortlosses+earned
					if price < openprice-commission/ordQty:
						result='shortwin'
						dayshortwins=dayshortwins+1
					else:
						result = 'shortloss'
						dayshortlosses=dayshortlosses+1
				transaction_return=earned/invested
				if ordSymbol not in positions:
					print "attempting to close a nonexistent position"
				if 'filled' not in t:
					open_orders[i]['filled']=0

				transacted=min(volume_traded,ordQty-open_orders[i]['filled'])
				positions[ordSymbol][0]=positions[ordSymbol][0]+transacted*direction
				positions_value=0
				for j in positions:
					positions_value=positions_value+positions[j][0]*positions[j][1]
				cash=cash-transacted*direction*price-commission
				account_value=positions_value+cash
				transactions_file.writerow([t['ordID'],ticktime,result,ordQty,transacted,openprice,price,account_value,horizon,invested,transaction_return])	
				transfile.flush()				
				open_orders[i]['filled']=open_orders[i]['filled']+transacted
				if open_orders[i]['filled']==t['ordQty']:
					cash=cash-commission*2
					print days,positions,cash,"{:0.2%}".format((account_value/startcash)-1)
					to_delete.append(i)
				
	for i in to_delete:
		del open_orders[i]
	for i in will_place_limitOrd:
		place_closing_limit(i[0],ticktime,i[1])

def place_closing_limit(mkt_open_dict,ticktime,openprice):
	opening_side=mkt_open_dict['ordSide']
	if opening_side==1:
		closing_side=2
	elif opening_side==5:
		closing_side=1
	ordID=mkt_open_dict['ordID'].replace('MO','CL')+'_'+str(openprice)
	price=float(ordID.split('_')[1])
	stop_price=float(ordID.split('_')[2])
	symbol=mkt_open_dict['ordSymbol']
	ordQty=mkt_open_dict['filled']
	ordTimeout=mkt_open_dict['ordTimeout']
	thisorder={'ordID':ordID,'ordType':'limit','ordSide':closing_side,'ordPrice':price,'ordStop':stop_price,'ordSymbol':symbol,'ordQty':ordQty,'ordTime':ticktime,'ordTimeout':ordTimeout}
	send_order(ordID)
	open_orders[thisorder['ordID']]=thisorder	


#files=os.listdir("/home/alt/Projects/stock_data/2/")
thisfile='UMC'
symbollist=re.findall('[A-Z]+',thisfile)

a= csv.reader(open("/home/alt/Projects/stock_data/2/"+thisfile+".csv","r"), delimiter=",")
transfile=open("/home/alt/Projects/algo/concerto/logs/"+time.strftime('%y%m%d%H%M',time.gmtime(time.time()))+symbollist[0]+'trnsc.csv','w')
transactions_file= csv.writer(transfile, delimiter="\t")
resfile=open("/home/alt/Projects/algo/concerto/logs/"+time.strftime('%y%m%d%H%M',time.gmtime(time.time()))+symbollist[0]+'results.csv','w')
results_file=csv.writer(resfile, delimiter="\t")
transactions_file.writerow(['OrdID','Ticktime','Side','ordQuan','Quantity','Open Price', 'Close Price','Account Value','Horizon','invested in trnsc','Return'])
results_file.writerow(['Ticktime','Long Wins','Long Losses','Short Wins','Short Losses','Account Value','Day Return','Accum Return','longprofit','longlosses','shortprofit','shortlosses','longnet','shortnet'])
print "opened",a
lines = [line for line in a]
lines=lines[::-1]

de.initialize(symbollist,False)
counter=0
ordNum=0
operations={}
startcash=30000
cash=startcash
previous_day_account_value=cash
account_value=cash
commission=0#4.95
cash_held_up_by_orders=0
open_orders={}
triggers=[]
positions={}
last_day=32
daylongwins=0
daylonglosses=0
dayshortwins=0
dayshortlosses=0
accumlongprofit=0
accumlonglosses=0
accumshortprofit=0
accumshortlosses=0
days=0
for i in lines:
	counter=counter+1
	ticktime=i[0].replace('-','')+'-'+i[1]+':00.000'
	#if counter%1000==0: print counter
	operation,testedparam = de.process_tick(symbollist[0], i[5], float(i[6]))
	simulate(symbollist[0],i)
	timea=time.strptime(ticktime.split('.')[0],'%Y%m%d-%H:%M:%S')
	if operation and len(operation)>0 and timea.tm_hour <14:
		start_operation(operation,ticktime)
	#day calculations
	day=timea.tm_mday
	if day!=last_day:
		days=days+1
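		ret=float(account_value)/previous_day_account_value-1#float() first: in Python 2, int/int truncates before the *1.0 could help
		acc_ret=float(account_value)/startcash#ratio to starting cash (1.0 = break-even)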
		results_file.writerow([ticktime,daylongwins,daylonglosses,dayshortwins,dayshortlosses,account_value,ret,acc_ret,accumlongprofit,accumlonglosses,accumshortprofit,accumshortlosses,accumlongprofit+accumlonglosses,accumshortprofit+accumshortlosses])
		resfile.flush()
		daylongwins=0
		daylonglosses=0
		dayshortwins=0
		dayshortlosses=0
		previous_day_account_value=account_value
		last_day=day
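
For anyone skimming the routine above, the fill rule (only half of each bar's volume is treated as available to the order) can be pulled out into a small standalone sketch. This is not part of the original program: the names simulate_fill, bars_to_fill and haircut are made up for illustration, the bar volumes are invented, and the sketch is written for Python 3 whereas the routine above is Python 2.

```python
def simulate_fill(order_qty, already_filled, bar_volume, haircut=2):
    """Return how many shares of an order fill on a single bar.

    Only bar_volume // haircut shares are assumed to be available, so an
    order fills completely on one bar only if at least `haircut` times
    its remaining size traded during that bar.
    """
    available = bar_volume // haircut
    remaining = order_qty - already_filled
    return max(0, min(available, remaining))


def bars_to_fill(order_qty, bar_volumes, haircut=2):
    """Walk the bars in order; return (bars_used, shares_filled)."""
    filled = 0
    for n, volume in enumerate(bar_volumes, start=1):
        filled += simulate_fill(order_qty, filled, volume, haircut)
        if filled >= order_qty:
            return n, filled
    # Ran out of bars before the order was complete.
    return len(bar_volumes), filled


if __name__ == "__main__":
    # Hypothetical 1000-share order against three invented bar volumes.
    print(bars_to_fill(1000, [600, 900, 5000]))
```

With a haircut of 2, the 1000-share order above takes three bars to complete: 300 shares on the first bar, 450 on the second and the last 250 on the third. Raising the haircut is one way to make the simulation stricter without touching the decision engine.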
 
I do all my backtesting going forward .....

N
 
... but backtesting takes 1 minute, forward testing takes 1 year.
 
Adamus,
I agree with you on the importance of backtesting and the speed at which it can be carried out. I used to dedicate much more time to back testing until I read Mark Douglas's book Trading in the Zone. My view now is that back testing is purely hypothetical, and only forward testing can validate and verify the results obtained in back testing.

Just my humble opinion.
 
Haha! I totally agree. After 3 messages with only one short line each, I wasn't going to launch into an encyclopedic defence of backtesting. I just wanted to point out that backtesting is not a useless activity.
 
If you back test properly, then forward testing is not required. Just trade the system.
 
That's assuming you've got perfect data, assuming you've got the perfect trading software, assuming you've seen all the mistakes your broker can make on your behalf, assuming you never ever make mistakes yourself.....
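
One rough way to probe assumptions like these without waiting a year is to rerun the same trade list with progressively worse per-trade costs and see how quickly the edge disappears. A minimal sketch, assuming the backtest already produces a list of per-trade fractional returns; the function names and the sample numbers are invented for illustration and do not come from any system discussed in this thread.

```python
def net_returns(gross_returns, cost_per_trade):
    """Subtract a flat fractional cost (slippage plus commission) from each trade."""
    return [r - cost_per_trade for r in gross_returns]


def compound(returns):
    """Compound a sequence of fractional returns into one total return."""
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    return total - 1.0


def stress_test(gross_returns, costs):
    """Map each assumed cost level to the compounded net return."""
    return {c: compound(net_returns(gross_returns, c)) for c in costs}


if __name__ == "__main__":
    # Invented per-trade returns, for illustration only.
    trades = [0.010, -0.005, 0.020, 0.000, 0.008]
    for cost, total in sorted(stress_test(trades, [0.0, 0.002, 0.005]).items()):
        print("cost per trade {:.3f} -> total return {:.2%}".format(cost, total))
```

If the result flips sign at a cost level that a real broker and a real order book could easily produce, the backtest alone probably is not enough to justify trading the system.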
 