GBM source code in C++ for generating time-series

botpro


Here is some source code I wrote a while ago for generating time series (e.g. stock prices) using Geometric Brownian Motion (GBM).
GBM output is not just random data; it is simulated market data (some call it "synthetic market data").
Where else can you get, say, 5-second bar data for hundreds of stocks covering, say, 1000 years?
With GBM you can test thousands of market situations quickly and at no cost.
It is very good for market modelling and forward testing in simulations.
And you don't even need to store the mass of data in databases; just store the "seed number", which is a single integer, typically 32 or 64 bits wide depending on the random engine and standard library used.
E.g. in the code below, one would do something like this at program start in main():
Code:
   randgen.seed(12345);
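This means you can replay the exact sequence it generates, which is important during system design and development (e.g. for debugging the system). Note that the code below seeds randgen from the system clock, so every run differs unless you reseed it like this. The same seed is only guaranteed to give the same sequence with the same compiler and standard library, because default_random_engine and the distributions are implementation-defined.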

Code:
/*
GBM.cpp - Geometric Brownian Motion (GBM)
2016-04-21-Th: v1.00a: chgd author address
2016-01-26-Tu: v1.00:  now t-distribution can be activated via the last param in ctor
2016-01-23-Sa: v0.99:  initial version
Author: U.M. in Germany (user botpro at www.trade2win.com, formerly user botpro at www.elitetrader.com)

What-it-does:
   Creates time-series data (e.g. stock prices) using Geometric Brownian Motion (GBM).
   Can generate bars of any duration. By default it generates 30-sec bars (i.e. 780 bars/day @ 23400 trading seconds/day).
   You can easily modify it to create OHLC data (intraday and EOD) for use in trading platforms like AmiBroker etc.

Compile using a C++11 conformant compiler like GNU g++:
   g++ -Wall -O2 -std=c++11 GBM.cpp -o GBM.exe
   
Run:
   Linux/Unix: ./GBM.exe >data.csv
   Windows:    GBM.exe >data.csv
   
Analyse:
   Import data.csv into Excel or LibreOffice-Calc and do some analysis (calcs, charts etc).

   Remember: the stock returns (i.e. the logarithmic changes) are normally distributed, while the resulting
   prices are log-normally distributed: each price is the exponential of a normally distributed sum,
   so it can never become negative.

   By default it uses the normal distribution. The t-distribution can be used optionally (see last param of the ctor).
   Strictly speaking, only the default normal distribution gives a true GBM; the t-distribution variant adds fat tails.
   For the difference see [2], sections "Normally Distributed Model of Asset Returns" and "Leptokurtic Model of Asset Returns".

   The quality of the generated data can be verified with the following formula:
     ObservedAnnualVolaPct = BarVolaPct * sqrt(252 * nBarsPerDay)
   where BarVolaPct is the standard deviation of the per-bar log returns, in percent.
   E.g. in Excel/LibreOffice-Calc, for the sample data the program creates (21 days x 780 bars = 16380 rows), do this:
     =STDEV(D2:D16381)*100*SQRT(252*780)
   It should give approximately the same VolaPct as was specified as the input volatility (i.e. here 30).
   
Misc:
   - It works with trading days instead of calendar days, and a year is defined as 252 trading days (can be changed in the ctor)
   - This code is a stripped down standalone usable version of my TCIntradaySpotGenerator
   
See also / References:
   [1] https://en.wikipedia.org/wiki/Geometric_Brownian_motion
   [2] https://mhittesdorf.wordpress.com/2013/12/29/introducing-quantlib-modeling-asset-prices-with-geometric-brownian-motion/
   [3] https://people.sc.fsu.edu/~jburkardt/cpp_src/brownian_motion_simulation/brownian_motion_simulation.html
   [4] http://www.javaquant.net/books/MCBook-1.2.pdf
   [5] http://investexcel.net/geometric-brownian-motion-excel
   [6] https://en.wikipedia.org/wiki/Volatility_(finance)
*/

#include <cstdio>
#include <cstdlib>
#include <cmath>
#include <random>
#include <chrono>

using namespace std;


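// Seeded from the system clock, so every run differs; call randgen.seed(n) for reproducible runs.
default_random_engine randgen(chrono::system_clock::now().time_since_epoch().count());
normal_distribution<double>    n_dist(0.0, 1.0);  // mean = 0, std. deviation = 1
student_t_distribution<double> t_dist(5);         // 5 degrees of freedom (the 3/5 scaling in the ctor depends on this)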

   
class GBM
  {
    public:
      const size_t uDays, uDailyBars;
      const double dbSpot0, dbAnnDriftPct, dbAnnDividPct, dbAnnVolaPct, dbTradeDaysInYear;
      const bool   fUseTdistribution;

    private:
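      double r, q, u;       // annual drift, dividend yield, net drift (r - q), all as fractions
      double t, dt;         // total horizon and length of one bar, both in years
      double sigma, SD, R;  // annual volatility, per-bar std. deviation, per-bar log drift
      double S_t, S_tPrev;  // current and previous spot
      size_t cGen;          // number of generate() calls so far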

    public:   
      GBM(const double AdbSpot0 = 100.0, const double AdbAnnVolaPct = 30.0,
          const size_t AuDailyBars = 780, const size_t AuDays = 252,
          const double AdbTradeDaysInYear = 252.0,
          const double AdbAnnDriftPct = 0.0, const double AdbAnnDividPct = 0.0,
          const bool AfUseTdistribution = false)
       : uDays(AuDays), uDailyBars(AuDailyBars), dbSpot0(AdbSpot0),
         dbAnnDriftPct(AdbAnnDriftPct), dbAnnDividPct(AdbAnnDividPct),
         dbAnnVolaPct(AdbAnnVolaPct), dbTradeDaysInYear(AdbTradeDaysInYear),
         fUseTdistribution(AfUseTdistribution)
        {
          r     = dbAnnDriftPct / 100.0;
          q     = dbAnnDividPct / 100.0;   // dividend yield
          u     = r - q;
          t     = double(uDays) / dbTradeDaysInYear;
          dt    = t / double(uDays * uDailyBars);
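          sigma = dbAnnVolaPct / 100.0;
          if (fUseTdistribution)
            // Student t with v=5 has variance v/(v-2) = 5/3; scale sigma by sqrt(3/5) so the shocks keep the target variance
            sigma = sqrt(sigma * sigma * 3 / 5);
          SD    = sigma * sqrt(dt);                 // std. deviation of one bar's log return
          R     = (u - 0.5 * sigma * sigma) * dt;   // per-bar log drift incl. the Ito correction (-sigma^2/2)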
          S_t   = S_tPrev = dbSpot0;
          cGen  = 0;
        }
       
      double generate()
        { // convention: the very first spot is the initial spot
          S_tPrev = S_t;
          if (!cGen++) return S_t;
          if (!fUseTdistribution)
            S_t *= exp(R + SD * n_dist(randgen));   // normal distribution (Gauss)
          else
            S_t *= exp(R + SD * t_dist(randgen));   // t-distribution ("fat tails")
          return S_t;
        }

      double get_cur()  const { return S_t;     }

      double get_prev() const { return S_tPrev; }
  };


int main()
  {
    // define the input params to use in GBM:
    const double dbSpot0     = 100;  // start with this stock price
    const double dbVolaPct   = 30;   // annual volatility in percent
    const size_t nBarsPerDay = 780;  // i.e. 30-sec bars @ 23400 trading seconds per day

    GBM G(dbSpot0, dbVolaPct, nBarsPerDay, 252, 252.0, 0.0, 0.0, false);
   
// fprintf(stderr, "Using: %s\n", G.fUseTdistribution ? "t-distribution" : "normal-distribution");
   
    // create bar data for 21 days (= 1 trading month) and print as CSV:
    printf("Day,Bar,Spot,lnOfChg\n");
    for (size_t d = 1; d <= 21; ++d)
      for (size_t b = 1; b <= nBarsPerDay; ++b)
        {
          const double dbCur  = G.generate();
          const double dbPrev = G.get_prev();
          printf("%zu,%zu,%.5f,%.10f\n", d, b, dbCur, log(dbCur / dbPrev));
        }

    return 0; 
  }
 
FYI:

- The parameter AnnDriftPct (AdbAnnDriftPct in the ctor) means AnnualEarningsYieldPct, i.e. the annual earnings yield in percent.

- The code lacks the gap generation seen in real markets (overnight gaps etc.).
That feature, called jump-diffusion (with it, the process is no longer continuous), will be added in the next version.

