AriaS
Over the last 3 days, ChatGPT helped me build a new backtesting algorithm that is much better than the one I was using before. The key improvement: instead of hunting for perfect setups that have no drawdowns, the algorithm accepts drawdowns and explicitly evaluates how the setup recovers from them. The total Out-of-Sample (OOS) period also grew from 3 months to 27 months, which gives a much more realistic picture of behavior across different market regimes.
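To make the "evaluate the recovery" idea concrete, here is a minimal sketch of how drawdown episodes and the Recovery Factor could be computed from an equity curve. It assumes the common definition of Recovery Factor (net profit divided by maximum drawdown) and an equity curve given as a plain list of account values per bar. The function names `drawdown_episodes` and `recovery_factor` are made up for illustration and are not taken from the actual algorithm.

```python
def drawdown_episodes(equity):
    """Split an equity curve into drawdown episodes.

    Each episode records the index of the peak it started from, its maximum
    depth (in account currency), and how many bars it took to get back to
    that peak (None if the curve ends before recovery).
    """
    episodes = []
    peak, peak_idx = equity[0], 0
    depth = 0.0
    for i, value in enumerate(equity):
        if value >= peak:
            # Back at (or above) the previous peak: close any open episode.
            if depth > 0:
                episodes.append(
                    {"peak_idx": peak_idx, "depth": depth, "bars_to_recover": i - peak_idx}
                )
            peak, peak_idx, depth = value, i, 0.0
        else:
            # Still under water: track the deepest point of this episode.
            depth = max(depth, peak - value)
    if depth > 0:
        # The curve ended while still in drawdown.
        episodes.append({"peak_idx": peak_idx, "depth": depth, "bars_to_recover": None})
    return episodes


def recovery_factor(equity):
    """Net profit divided by maximum drawdown (assumed definition)."""
    net_profit = equity[-1] - equity[0]
    max_dd = max((e["depth"] for e in drawdown_episodes(equity)), default=0.0)
    if max_dd == 0:
        # No drawdown at all: RF is undefined, so report infinity for a profit.
        return float("inf") if net_profit > 0 else 0.0
    return net_profit / max_dd


equity = [100, 110, 105, 108, 112, 120, 115, 125]
episodes = drawdown_episodes(equity)
print(len(episodes), recovery_factor(equity))
```

The number of episodes gives the drawdown frequency and `bars_to_recover` gives the duration, which are the two recovery properties that get reviewed in the recent OOS phase.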
The process now looks like this:
1. Optimization (3 months, optimizing for Recovery Factor, RF).
2. Recent Out-of-Sample (9 months). The most important phase. Here I strictly filter setups by RF grade and by recovery behavior (frequency and duration of drawdowns). The recovery behavior is analyzed by GPT. RF grades: >=2.0: excellent; 1.5-2.0: good; 1.2-1.5: weak; <=1.2: reject.
3. Long OOS (the full year before the recent OOS). This phase is used to understand robustness and regime sensitivity. RF grades: >=1.3: robust; 1.0-1.3: regime-sensitive; <=1.0: fragile. A weak result here does not automatically reject the setup, but it signals higher risk and affects position sizing.
4. Stress tests (Covid, Ukraine invasion). The purpose here is survival only. The setup is rejected only if the recovery logic breaks and the drawdown (DD) becomes unbounded.
5. Repeat steps 1-4 every 1-2 months.
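The grading rules above could be sketched roughly like this. It is only an illustration under a few assumptions: the thresholds are read literally from the list (so an RF of exactly 1.2 is rejected and exactly 1.0 is fragile), the stress-test outcome is passed in as a simple boolean, and the GPT review of recovery frequency and duration is left out entirely. All function and field names are hypothetical.

```python
def grade_recent_oos(rf):
    """Grade the recent 9-month OOS phase by Recovery Factor."""
    if rf >= 2.0:
        return "excellent"
    if rf >= 1.5:
        return "good"
    if rf > 1.2:
        return "weak"
    return "reject"  # RF <= 1.2


def grade_long_oos(rf):
    """Grade the long (1-year) OOS phase by Recovery Factor."""
    if rf >= 1.3:
        return "robust"
    if rf > 1.0:
        return "regime-sensitive"
    return "fragile"  # RF <= 1.0


def evaluate_setup(recent_rf, long_rf, stress_dd_bounded):
    """Combine the phases into one verdict.

    stress_dd_bounded is an assumed boolean input: True if the setup survived
    the stress periods with recovery logic intact and a bounded drawdown.
    """
    recent = grade_recent_oos(recent_rf)
    long_grade = grade_long_oos(long_rf)
    return {
        "recent_oos": recent,
        "long_oos": long_grade,
        # Only the recent OOS grade and the stress tests can reject a setup.
        "rejected": recent == "reject" or not stress_dd_bounded,
        # A non-robust long OOS is a risk flag for position sizing, not a rejection.
        "higher_risk": long_grade != "robust",
    }


print(evaluate_setup(recent_rf=1.8, long_rf=0.9, stress_dd_bounded=True))
```

In this sketch a setup with a good recent OOS but a fragile long OOS is kept and only flagged as higher risk. How much to cut the position size for that flag is not specified here, so it is left to the caller.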