ChatGPT and New Backtesting Algorithm

AriaS

Over the last 3 days, ChatGPT helped me build a new backtesting algorithm that is much better than the one I was using before. The key improvement is this: instead of hunting for perfect setups that have no drawdowns, the algorithm accepts drawdowns and explicitly evaluates how the setup recovers from them. In addition, the total Out-of-Sample (OOS) period grew from 3 months to 27 months, which gives a much more realistic picture of behavior across different market regimes.
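To make the recovery idea concrete, here is a minimal Python sketch, not my actual code. It assumes the common definition of Recovery Factor (net profit divided by maximum drawdown in absolute terms) and an equity curve given as a plain list of balances. The function names and the sample numbers are made up for illustration.

```python
from typing import List, Tuple


def recovery_factor(equity: List[float]) -> float:
    """Net profit divided by the maximum peak-to-trough drawdown (assumed definition)."""
    peak = equity[0]
    max_dd = 0.0
    for value in equity:
        peak = max(peak, value)
        max_dd = max(max_dd, peak - value)
    net_profit = equity[-1] - equity[0]
    if max_dd == 0.0:
        # No drawdown at all: RF is undefined, so report infinity for a profitable curve.
        return float("inf") if net_profit > 0 else 0.0
    return net_profit / max_dd


def drawdown_episodes(equity: List[float]) -> List[Tuple[int, int, float]]:
    """Return (start_index, duration_in_bars, depth) for every drawdown episode.

    An episode starts on the first bar below the running peak and ends on the
    bar where equity regains that peak. An episode still open at the end of
    the data is reported with its duration so far.
    """
    episodes = []
    peak = equity[0]
    start = None
    depth = 0.0
    for i, value in enumerate(equity):
        if value >= peak:
            if start is not None:
                episodes.append((start, i - start, depth))
                start, depth = None, 0.0
            peak = value
        else:
            if start is None:
                start = i
            depth = max(depth, peak - value)
    if start is not None:
        episodes.append((start, len(equity) - start, depth))
    return episodes


# Hypothetical equity curve, for illustration only.
sample_equity = [100.0, 110.0, 105.0, 108.0, 112.0, 109.0, 120.0]
sample_rf = recovery_factor(sample_equity)
sample_episodes = drawdown_episodes(sample_equity)
```

The number of episodes gives the drawdown frequency and the second field of each tuple gives the recovery duration, which are the two things I look at besides RF itself.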

The process now looks like this:

  1. Optimization (3 months, optimizing for Recovery Factor, RF).
  2. Recent Out-of-Sample (9 months). This is the most important phase. Here I strictly filter setups by RF grade and by recovery behavior (frequency and duration); the latter is analyzed by GPT. RF grades: >=2.0: excellent; 1.5-2.0: good; 1.2-1.5: weak; <=1.2: reject.
  3. Long OOS (the full year before the recent OOS). This phase is used to assess robustness and regime sensitivity, again graded by RF: >=1.3: robust; 1.0-1.3: regime-sensitive; <=1.0: fragile. A weak result here does not automatically reject the setup, but it signals higher risk and affects position sizing.
  4. Stress tests (Covid, Ukraine invasion). The purpose here is survival only. The setup is rejected only if the recovery logic breaks and the drawdown (DD) becomes unbounded.
  5. Repeat steps 1-4 every 1-2 months.
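The grading in steps 2-4 can be written down as a small decision function. This is only a sketch under stated assumptions: the thresholds are the ones listed above, a value of exactly 1.5 is treated as "good" (the list does not say which grade the boundary belongs to), and the GPT-based review of recovery behavior is left out. The function names and the stress_dd_bounded flag are hypothetical.

```python
def grade_recent_oos(rf: float) -> str:
    """Grade the 9-month recent OOS result by Recovery Factor."""
    if rf >= 2.0:
        return "excellent"
    if rf >= 1.5:  # assumption: exactly 1.5 counts as "good"
        return "good"
    if rf > 1.2:
        return "weak"
    return "reject"  # RF <= 1.2


def grade_long_oos(rf: float) -> str:
    """Grade the 12-month long OOS result by Recovery Factor."""
    if rf >= 1.3:
        return "robust"
    if rf > 1.0:
        return "regime-sensitive"
    return "fragile"  # RF <= 1.0


def evaluate_setup(recent_rf: float, long_rf: float, stress_dd_bounded: bool) -> dict:
    """Combine the three phases into one verdict.

    stress_dd_bounded is a hypothetical flag: True if drawdown stayed bounded
    and recovery logic held during the stress periods.
    """
    recent = grade_recent_oos(recent_rf)
    long_grade = grade_long_oos(long_rf)
    return {
        "recent_grade": recent,
        "long_grade": long_grade,
        # Only the recent OOS grade and the stress test can reject a setup.
        "accepted": recent != "reject" and stress_dd_bounded,
        # A non-robust long OOS does not reject, it only flags higher risk
        # (to be reflected in position sizing).
        "higher_risk": long_grade != "robust",
    }


# Hypothetical numbers, for illustration only.
verdict = evaluate_setup(recent_rf=1.6, long_rf=1.1, stress_dd_bounded=True)
```

With these made-up inputs the setup is accepted but flagged as higher risk, so it would trade with a smaller position size.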
Now it looks like a professional quant workflow, and hopefully it will improve my current result, which I am not so happy about: RF = 1.6 over 8.5 months.
 
Last edited:
Given the context (using ChatGPT to improve a backtest), I thought I would use
ChatGPT to debate the question. It seemed reasonable to me. I am copying the text
of our initial "debate" below. Here is ChatGPT's response to my
inquiries:
----------------------------

ChatGPT responds

How We Can Proceed (Your Choice)

We can take this discussion in any of these directions:
  1. AI vs VWAP / Value / Auction-based frameworks
    (Why rule-based market structure often outperforms ML at the decision layer)
  2. Where AI actually does work in trading
    (Execution, liquidity prediction, volatility surfaces, options)
  3. Failure case studies
    (Why “great backtests” implode live)
  4. Designing a hybrid institutional framework
    (Human regime + rule-based structure + AI as a subordinate module)
  5. Formal debate format
    (Pro-AI vs Anti-AI arguments, steel-manned on both sides)
Just tell me how you’d like to further structure the debate. (I stopped at this point)

-----------------------------------------

I can move this to another thread, but before doing so, I wanted the original poster to see what
they apparently have not considered.

Good Luck
 