Implemented in Trade-Matrix

Current Status: MS-GARCH regime detection is production-deployed for real-time inference (January 2026). NOT deployed as standalone strategy.

📊 See Appendix C: Week 49 Backtest Reports for interactive validation results comparing Regime OFF vs Regime ON performance.

Research Outcome

After comprehensive validation (October-November 2025):

  • Standalone MS-GARCH Sharpe: 0.32-0.81 (after look-ahead bias correction)
  • Deployment threshold: 2.0+ Sharpe (institutional standard)
  • Decision: Do NOT deploy as standalone strategy
  • Existing Trade-Matrix ML/RL system: 2.72 Sharpe (remains primary)

What IS Implemented

4-State Regime Classification (integrated into production adaptive thresholds):

Production Adaptive Thresholds (4-state MS-GARCH):

  • BULL regime → 0.85x threshold multiplier (relaxed)
  • NEUTRAL regime → 1.00x threshold multiplier (standard)
  • BEAR regime → 1.30x threshold multiplier (stricter)
  • HIGH_VOL regime → 1.50x threshold multiplier (crisis mode)
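As a minimal sketch of how such multipliers could be applied, the snippet below scales a base signal-quality threshold by the regime multiplier. The multiplier values are the ones listed above; the function names and the base threshold of 0.60 are hypothetical and are not taken from the production module.

```python
# Multipliers copied from the list above. Function names and the base
# threshold used in the example are hypothetical (illustration only).
REGIME_THRESHOLD_MULTIPLIER = {
    "BULL": 0.85,      # relaxed
    "NEUTRAL": 1.00,   # standard
    "BEAR": 1.30,      # stricter
    "HIGH_VOL": 1.50,  # crisis mode
}


def adjusted_threshold(base_threshold: float, regime: str) -> float:
    """Scale a base signal-quality threshold by the regime multiplier."""
    return base_threshold * REGIME_THRESHOLD_MULTIPLIER[regime]


def passes_gate(signal_score: float, base_threshold: float, regime: str) -> bool:
    """A signal passes only if it clears the regime-adjusted threshold."""
    return signal_score >= adjusted_threshold(base_threshold, regime)


# Example with an assumed base threshold of 0.60
for regime in REGIME_THRESHOLD_MULTIPLIER:
    print(regime, round(adjusted_threshold(0.60, regime), 3))
```

A multiplier above 1.0 raises the bar a signal must clear, so fewer signals pass in BEAR and HIGH_VOL regimes; it does not change position size.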

Backtest Position Sizing (simplified 2-state):

  • Low-Volatility regime → 1.2x risk multiplier
  • High-Volatility regime → 0.7x risk multiplier

Regime-Adaptive Kelly (within PURE_KELLY fallback tier):

  • Bear: 25% sizing (gamma=4.0)
  • Neutral: 50% sizing (gamma=2.0)
  • Bull: 67% sizing (gamma=1.5)
  • Crisis: 17% sizing (gamma=6.0)

Research Code Available:

  • research/ms-garch/regime_detector.py - MS-GARCH implementation (EM algorithm, numerical stability)
  • research/ms-garch/hmm_regime_detector.py - HMM for directional regime detection
  • research/ms-garch/model_selection.py - Regime model selection and validation
  • Mathematically validated, production-ready algorithms (research-grade, not deployed as standalone strategy)

Production Code Deployed (January 2026):

  • services/regime_detection/realtime_regime_detector.py - Real-time Hamilton Filter (1,192 LOC)
  • services/regime_detection/precalc_regime_loader.py - Backtest regime loader (352 LOC)
  • scripts/ml/retrain_regime_models.py - Weekly model retraining (717 LOC)
  • tests/unit/test_realtime_regime_detector.py - 19 unit tests
  • tests/integration/test_regime_integration.py - 11 integration tests

What Is NOT Deployed

  • ❌ MS-GARCH as standalone strategy (Sharpe below threshold)
  • ❌ Full volatility forecasting (regime detection only, not VaR prediction)
  • ❌ Cross-asset regime propagation (correlation-based triggers not yet implemented)
  • ❌ Gradient position sizing (step function used, continuous probabilities available)
  • ❌ Prometheus alerts YAML (metrics exposed, alert rules not created)
  • ❌ Dedicated Grafana regime dashboard (planned for Trading-Cockpit)

What IS Deployed (January 2026):

  • ✅ Real-time Hamilton Filter (RealtimeRegimeDetector Actor)
  • ✅ 4-state adaptive thresholds (BULL/NEUTRAL/BEAR/HIGH_VOL)
  • ✅ Regime-adaptive Kelly (17%-67% Kelly fractions)
  • ✅ Redis caching with 3-tier fallback
  • ✅ Weekly retraining in TL workflow (Step 3.7)
  • ✅ 30 tests (unit + integration) all passing

Research & Future Enhancements

This section documents research findings and theoretical extensions not deployed in production.

Key Research Contribution: This research established a rigorous validation methodology and documented an 84% cumulative Sharpe degradation (5.01 to 0.81, see Section 8.3) once the Sharpe calculation and the look-ahead bias were corrected, an important finding for practitioners.


Abstract

Market regime detection represents a fundamental challenge in quantitative finance, particularly for cryptocurrency markets characterized by extreme volatility clustering, fat-tailed distributions, and frequent structural breaks. This research presents a comprehensive framework for Hidden Markov Model (HMM) based regime detection using Markov-Switching GARCH (MS-GARCH) models, specifically adapted for cryptocurrency trading environments.

Our research implementation explores four distinct market states: Low-Volatility Bull (optimal leverage 1.5-2.0x), High-Volatility Bull (1.0-1.5x), Low-Volatility Bear (0.5-1.0x), and High-Volatility Bear (0.3-0.7x). The framework integrates with Kelly criterion position sizing, providing regime-adaptive risk multipliers ranging from 25% in Bear markets to 67% in Bull markets, with a 17% emergency allocation during Crisis periods.

Rigorous validation following institutional standards (Renaissance Technologies, Two Sigma, Citadel) revealed critical methodological insights: standalone MS-GARCH strategies achieve Sharpe ratios of 0.32-0.81 after correcting for look-ahead bias and proper signal lag implementation. This performance was below the institutional deployment threshold of 2.0+ Sharpe, leading to the decision NOT to deploy MS-GARCH as a standalone strategy. However, when integrated as a risk adjustment feature within existing ML/RL trading systems, the framework demonstrates potential for +0.1-0.3 Sharpe ratio improvement with minimal implementation risk.

Current Implementation Status: Trade-Matrix production deployment includes regime-adaptive Kelly multipliers (25%/50%/67%/17%) and volatility-based risk multipliers (0.7x/1.2x) within the PURE_KELLY fallback tier. Full 4-state MS-GARCH models are available in research code but not deployed live as primary strategies. The research concludes with pragmatic recommendations for future enhancement.


1. Introduction

1.1 The Imperative for Regime-Aware Modeling

Financial markets are complex adaptive systems that exhibit distinctly different behavioral patterns across time. A "market regime" can be defined as a semi-persistent state characterized by a coherent set of statistical properties, including mean returns, volatility levels, correlation structures, and persistence characteristics. The central thesis underlying regime-switching models is that financial markets do not behave uniformly through time but rather transition between a finite number of underlying states.

Traditional quantitative models often assume statistical stationarity, treating parameters as constant over the entire sample period. This assumption proves particularly untenable for cryptocurrency markets, where the evidence for distinct, persistent regimes is overwhelming. Early academic studies applying standard GARCH models treated cryptocurrencies as merely another volatile asset class. However, as more data accumulated, the necessity for regime-switching frameworks became undeniable.

1.2 Cryptocurrency Market Characteristics

Empirical analysis of cryptocurrency returns reveals several "stylized facts" that invalidate assumptions of constant volatility and normally distributed returns:

Volatility Clustering: One of the most prominent features of cryptocurrency time series is volatility clustering: periods of high volatility tend to be followed by further high volatility, and calm periods by further calm. This temporal dependence in the second moment of returns motivates the entire GARCH family of models and implies that volatility is at least partly predictable. The practical implication is that risk arrives in waves, not uniformly.

Leptokurtosis (Fat Tails): Cryptocurrency return distributions are highly leptokurtic, exhibiting "fat tails" and a sharp peak around the mean compared to Gaussian distributions. Extreme price movements, both positive and negative, occur far more frequently than normal distributions would predict. This property has profound consequences for risk management: models assuming normality systematically underestimate catastrophic loss probabilities.

Structural Breaks and Non-Stationarity: The cryptocurrency market is uniquely susceptible to sudden structural breaks triggered by regulatory announcements, technological milestones (Bitcoin halving events), major security breaches, or broader macroeconomic shocks. Such events fundamentally alter the data-generating process, violating parameter constancy assumptions.

1.3 Limitations of Single-Regime Models

Standard GARCH models, while representing significant advancement in capturing volatility clustering, operate under the assumption of a single, stable regime. When applied to time series containing distinct periods of calm and crisis, single-regime models estimate parameters representing an "average" of different states.

Consequently, such models:

  • Overestimate risk during tranquil periods (leading to inefficient capital allocation)
  • Underestimate risk during turbulent periods (leading to catastrophic losses)
  • Produce systematically flawed Value-at-Risk (VaR) and Expected Shortfall (ES) calculations

1.4 Research Objectives

This research addresses three primary objectives:

  1. Theoretical Foundation: Establish rigorous mathematical foundations for HMM-based regime detection, including Baum-Welch algorithm for parameter estimation and Viterbi algorithm for state sequence decoding.

  2. Practical Implementation: Develop production-ready MS-GARCH models for cryptocurrency markets with proper handling of look-ahead bias, signal lag, and transaction costs.

  3. Integration Framework: Design regime-adaptive position sizing that integrates with existing ML/RL trading systems, specifically the Trade-Matrix Kelly criterion framework.


2. Hidden Markov Model Foundations

2.1 Markov Chain Basics

A Markov chain is a stochastic process satisfying the Markov property: the probability distribution of future states depends only on the current state, not on the sequence of events that preceded it. Formally, for a discrete-time process $\{S_t\}_{t=0}^{\infty}$ taking values in a finite state space $\mathcal{S} = \{1, 2, \ldots, K\}$:

$$P(S_{t+1} = j \mid S_t = i, S_{t-1}, \ldots, S_0) = P(S_{t+1} = j \mid S_t = i) = p_{ij}$$

This "memoryless" property dramatically simplifies the modeling process while capturing the essential feature of regime persistence.

2.2 Transition Probability Matrix

The dynamics of the Markov chain are completely characterized by the Transition Probability Matrix (TPM), denoted $\mathbf{P}$. For a $K$-state model:

$$\mathbf{P} = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1K} \\ p_{21} & p_{22} & \cdots & p_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ p_{K1} & p_{K2} & \cdots & p_{KK} \end{pmatrix}$$

where $p_{ij} = P(S_{t+1} = j \mid S_t = i)$ represents the probability of transitioning from state $i$ to state $j$ in one time step.

Constraints:

  • $0 \leq p_{ij} \leq 1$ for all $i, j$
  • $\sum_{j=1}^{K} p_{ij} = 1$ for all $i$ (rows sum to one)

Expected Regime Duration: The diagonal elements $p_{ii}$ represent the probability of remaining in state $i$. The duration $D_i$ of a stay in state $i$ follows a geometric distribution, so its expected value is:

$$E[D_i] = \frac{1}{1 - p_{ii}}$$

High diagonal values indicate highly persistent regimes, a common finding in financial markets.
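The duration formula can be checked with a short sketch. The two-state matrix below is made up for illustration and is not an estimated Trade-Matrix transition matrix; the function name is hypothetical.

```python
def expected_durations(P):
    """Expected consecutive steps spent in each state: 1 / (1 - p_ii)."""
    durations = []
    for i, row in enumerate(P):
        # Each row of a transition matrix must sum to one
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError(f"row {i} does not sum to one")
        stay = row[i]
        # An absorbing state (p_ii = 1) is never left
        durations.append(float("inf") if stay >= 1.0 else 1.0 / (1.0 - stay))
    return durations


# Illustrative two-state matrix (assumed values, not estimated from data)
P = [[0.95, 0.05],
     [0.20, 0.80]]
print(expected_durations(P))
```

With these assumed values the first state persists for about 20 steps on average and the second for about 5, which shows how quickly expected duration grows as $p_{ii}$ approaches one.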

2.3 Hidden States Concept

In a Hidden Markov Model, we observe a sequence of emissions $\{Y_t\}_{t=1}^{T}$ that depend on an underlying, unobserved (hidden) sequence of states $\{S_t\}_{t=1}^{T}$. The key insight is that while we cannot directly observe the regime, we can infer it probabilistically from the observed data.

Model Components:

  1. Initial State Distribution $\boldsymbol{\pi}$: Probability of starting in each state

    $$\pi_i = P(S_1 = i), \quad \sum_{i=1}^{K} \pi_i = 1$$

  2. Transition Probabilities $\mathbf{P}$: Probability of state transitions

  3. Emission Probabilities: Distribution of observations given the current state

    $$f(Y_t \mid S_t = k; \theta_k)$$

For financial applications, emissions are typically modeled as returns following state-dependent distributions.

2.4 Emission Distributions

In the MS-GARCH framework, the emission distribution is a GARCH process with state-dependent parameters. For a Gaussian emission model:

$$Y_t \mid (S_t = k) \sim \mathcal{N}(\mu_k, \sigma_{k,t}^2)$$

where $\sigma_{k,t}^2$ follows a GARCH(1,1) process:

$$\sigma_{k,t}^2 = \omega_k + \alpha_k \epsilon_{t-1}^2 + \beta_k \sigma_{k,t-1}^2$$

Fat-Tailed Extensions: Given the leptokurtosis of cryptocurrency returns, Student-t and Skewed Student-t distributions provide better fits:

$$Y_t \mid (S_t = k) \sim t_{\nu_k}(\mu_k, \sigma_{k,t}^2)$$

where $\nu_k$ is the degrees of freedom parameter controlling tail thickness.

2.5 Three Fundamental Problems

HMM theory addresses three canonical problems:

  1. Evaluation Problem: Given model parameters $\lambda = (\mathbf{P}, \boldsymbol{\pi}, \{\theta_k\})$ and observations $\mathbf{Y} = (Y_1, \ldots, Y_T)$, compute the likelihood $P(\mathbf{Y} \mid \lambda)$. Solved by the Forward algorithm.

  2. Decoding Problem: Given $\lambda$ and $\mathbf{Y}$, find the most likely state sequence $\mathbf{S}^* = \arg\max_{\mathbf{S}} P(\mathbf{S} \mid \mathbf{Y}, \lambda)$. Solved by the Viterbi algorithm.

  3. Learning Problem: Given $\mathbf{Y}$, estimate optimal parameters $\lambda^* = \arg\max_{\lambda} P(\mathbf{Y} \mid \lambda)$. Solved by the Baum-Welch (EM) algorithm.


3. Baum-Welch Algorithm

3.1 Expectation-Maximization Framework

The Baum-Welch algorithm is a special case of the Expectation-Maximization (EM) algorithm for HMMs. Since the state sequence is hidden, we cannot directly maximize the complete-data likelihood. Instead, we iteratively:

  1. E-Step: Compute expected sufficient statistics using current parameter estimates
  2. M-Step: Update parameters to maximize the expected complete-data log-likelihood

3.2 Forward Algorithm

The forward variable $\alpha_t(i)$ represents the probability of observing the partial sequence $(Y_1, \ldots, Y_t)$ and being in state $i$ at time $t$:

$$\alpha_t(i) = P(Y_1, \ldots, Y_t, S_t = i \mid \lambda)$$

Initialization $(t = 1)$:

$$\alpha_1(i) = \pi_i \cdot f(Y_1 \mid S_1 = i; \theta_i)$$

Recursion $(t = 2, \ldots, T)$:

$$\alpha_t(j) = \left[\sum_{i=1}^{K} \alpha_{t-1}(i) \cdot p_{ij}\right] \cdot f(Y_t \mid S_t = j; \theta_j)$$

Termination:

$$P(\mathbf{Y} \mid \lambda) = \sum_{i=1}^{K} \alpha_T(i)$$

Numerical Stability: In practice, forward variables become extremely small for long sequences, causing underflow. We use log-space computation:

$$\log \alpha_t(j) = \log f(Y_t \mid j) + \text{logsumexp}_i\left[\log \alpha_{t-1}(i) + \log p_{ij}\right]$$
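The log-space recursion can be sketched in a few lines of plain Python. This is an illustration under simplifying assumptions: each state emits Gaussian returns with a constant standard deviation (no GARCH dynamics), all transition probabilities are strictly positive, and every parameter value and function name below is made up rather than taken from the research code.

```python
import math


def logsumexp(values):
    """Stable log(sum(exp(v))) for a list of log-values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def gaussian_logpdf(y, mu, sigma):
    """Log-density of N(mu, sigma^2) evaluated at y."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (y - mu) ** 2 / (2.0 * sigma ** 2)


def log_forward(obs, pi, P, mus, sigmas):
    """Return log alpha_t(j) for all t, j and the log-likelihood log P(Y | lambda)."""
    K = len(pi)
    # Initialization: log alpha_1(i) = log pi_i + log f(Y_1 | i)
    log_alpha = [[math.log(pi[i]) + gaussian_logpdf(obs[0], mus[i], sigmas[i])
                  for i in range(K)]]
    # Recursion: log alpha_t(j) = log f(Y_t | j) + logsumexp_i[log alpha_{t-1}(i) + log p_ij]
    for y in obs[1:]:
        prev = log_alpha[-1]
        log_alpha.append([
            gaussian_logpdf(y, mus[j], sigmas[j])
            + logsumexp([prev[i] + math.log(P[i][j]) for i in range(K)])
            for j in range(K)
        ])
    # Termination: log P(Y | lambda) = logsumexp_i log alpha_T(i)
    return log_alpha, logsumexp(log_alpha[-1])


# Assumed two-state example: state 0 = low volatility, state 1 = high volatility
pi = [0.5, 0.5]
P = [[0.95, 0.05], [0.10, 0.90]]
mus, sigmas = [0.001, -0.002], [0.01, 0.05]
obs = [0.004, -0.012, 0.07, -0.09, 0.01]
log_alpha, loglik = log_forward(obs, pi, P, mus, sigmas)
print(f"log-likelihood: {loglik:.4f}")
```

Working with sums of logs keeps the recursion finite for sequences of any length, whereas the product form eventually underflows or overflows.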

3.3 Backward Algorithm

The backward variable $\beta_t(i)$ represents the probability of observing the future sequence $(Y_{t+1}, \ldots, Y_T)$ given current state $i$:

$$\beta_t(i) = P(Y_{t+1}, \ldots, Y_T \mid S_t = i, \lambda)$$

Initialization $(t = T)$:

$$\beta_T(i) = 1 \quad \forall i$$

Recursion $(t = T-1, \ldots, 1)$:

$$\beta_t(i) = \sum_{j=1}^{K} p_{ij} \cdot f(Y_{t+1} \mid S_{t+1} = j; \theta_j) \cdot \beta_{t+1}(j)$$

3.4 Computing Posterior Probabilities

State Occupancy Probability $\gamma_t(i)$: Probability of being in state $i$ at time $t$ given all observations:

$$\gamma_t(i) = P(S_t = i \mid \mathbf{Y}, \lambda) = \frac{\alpha_t(i) \cdot \beta_t(i)}{\sum_{j=1}^{K} \alpha_t(j) \cdot \beta_t(j)}$$

Transition Probability $\xi_t(i, j)$: Probability of transition from state $i$ to $j$ at time $t$:

$$\xi_t(i, j) = P(S_t = i, S_{t+1} = j \mid \mathbf{Y}, \lambda) = \frac{\alpha_t(i) \cdot p_{ij} \cdot f(Y_{t+1} \mid j) \cdot \beta_{t+1}(j)}{\sum_{k=1}^{K} \alpha_T(k)}$$

3.5 Parameter Re-estimation

M-Step Update Formulas:

Initial probabilities:

$$\hat{\pi}_i = \gamma_1(i)$$

Transition probabilities:

$$\hat{p}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i, j)}{\sum_{t=1}^{T-1} \gamma_t(i)}$$

Emission parameters (for Gaussian case):

$$\hat{\mu}_k = \frac{\sum_{t=1}^{T} \gamma_t(k) \cdot Y_t}{\sum_{t=1}^{T} \gamma_t(k)}$$

$$\hat{\sigma}_k^2 = \frac{\sum_{t=1}^{T} \gamma_t(k) \cdot (Y_t - \hat{\mu}_k)^2}{\sum_{t=1}^{T} \gamma_t(k)}$$

3.6 Convergence Properties

The Baum-Welch algorithm guarantees that the likelihood never decreases from one iteration to the next:

$$P(\mathbf{Y} \mid \lambda^{(n+1)}) \geq P(\mathbf{Y} \mid \lambda^{(n)})$$

Convergence Criterion: Stop when the relative improvement in the (log-)likelihood $L^{(n)}$ falls below a threshold:

$$\frac{|L^{(n+1)} - L^{(n)}|}{|L^{(n)}|} < \epsilon$$

Typical values: $\epsilon \in [10^{-6}, 10^{-4}]$

Local Optima: EM converges to a local maximum, not necessarily global. Multiple random initializations are essential.
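A compact sketch of one Baum-Welch iteration (Sections 3.2 to 3.5) is given below for the Gaussian case with a constant variance per state. It is not the MS-GARCH estimator in the research code. It uses per-step scaling instead of log-space arithmetic, which is an equivalent remedy for underflow. The synthetic data, starting values, and function names are all assumptions for illustration.

```python
import math
import random


def normal_pdf(y, mu, var):
    return math.exp(-(y - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def forward_backward(obs, pi, P, mus, variances):
    """Scaled E-step: returns gamma_t(i), xi_t(i, j) and the log-likelihood."""
    T, K = len(obs), len(pi)
    B = [[normal_pdf(y, mus[k], variances[k]) for k in range(K)] for y in obs]
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            row = [pi[i] * B[0][i] for i in range(K)]
        else:
            row = [sum(alpha[t - 1][i] * P[i][j] for i in range(K)) * B[t][j]
                   for j in range(K)]
        c = sum(row)                      # scaling constant for step t
        scale.append(c)
        alpha.append([a / c for a in row])
    beta = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(K):
            beta[t][i] = sum(P[i][j] * B[t + 1][j] * beta[t + 1][j]
                             for j in range(K)) / scale[t + 1]
    gamma = [[alpha[t][i] * beta[t][i] for i in range(K)] for t in range(T)]
    xi = [[[alpha[t][i] * P[i][j] * B[t + 1][j] * beta[t + 1][j] / scale[t + 1]
            for j in range(K)] for i in range(K)] for t in range(T - 1)]
    return gamma, xi, sum(math.log(c) for c in scale)


def em_step(obs, pi, P, mus, variances):
    """One E-step plus M-step; the returned log-likelihood is for the INPUT parameters."""
    gamma, xi, loglik = forward_backward(obs, pi, P, mus, variances)
    T, K = len(obs), len(pi)
    new_pi = gamma[0][:]
    new_P = [[sum(xi[t][i][j] for t in range(T - 1)) /
              sum(gamma[t][i] for t in range(T - 1)) for j in range(K)]
             for i in range(K)]
    new_mus, new_vars = [], []
    for k in range(K):
        weight = sum(gamma[t][k] for t in range(T))
        mu = sum(gamma[t][k] * obs[t] for t in range(T)) / weight
        var = sum(gamma[t][k] * (obs[t] - mu) ** 2 for t in range(T)) / weight
        new_mus.append(mu)
        new_vars.append(max(var, 1e-8))   # small floor guards against collapse
    return new_pi, new_P, new_mus, new_vars, loglik


# Synthetic returns: a calm segment followed by a volatile one (assumed data)
rng = random.Random(0)
obs = [rng.gauss(0.0, 0.01) for _ in range(100)] + [rng.gauss(0.0, 0.05) for _ in range(100)]
params = [[0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [0.0, 0.0], [1e-4, 1e-3]]
logliks = []
for _ in range(10):
    *params, loglik = em_step(obs, *params)
    logliks.append(loglik)
print([round(v, 2) for v in logliks])
```

The recorded log-likelihoods should be non-decreasing, which is a useful sanity check on any EM implementation. In practice the loop would be restarted from several random initializations and the best final likelihood kept.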


4. Viterbi Algorithm

4.1 Dynamic Programming Solution

The Viterbi algorithm finds the most likely state sequence using dynamic programming. Define:

$$\delta_t(j) = \max_{s_1, \ldots, s_{t-1}} P(S_1 = s_1, \ldots, S_{t-1} = s_{t-1}, S_t = j, Y_1, \ldots, Y_t \mid \lambda)$$

This represents the probability of the most likely path ending in state $j$ at time $t$.

4.2 Algorithm Steps

Initialization $(t = 1)$:

$$\delta_1(i) = \pi_i \cdot f(Y_1 \mid i), \qquad \psi_1(i) = 0$$

Recursion $(t = 2, \ldots, T)$:

$$\delta_t(j) = \max_{1 \leq i \leq K} \left[\delta_{t-1}(i) \cdot p_{ij}\right] \cdot f(Y_t \mid j), \qquad \psi_t(j) = \arg\max_{1 \leq i \leq K} \left[\delta_{t-1}(i) \cdot p_{ij}\right]$$

Termination:

$$P^* = \max_{1 \leq i \leq K} \delta_T(i), \qquad S_T^* = \arg\max_{1 \leq i \leq K} \delta_T(i)$$

Backtracking $(t = T-1, \ldots, 1)$:

$$S_t^* = \psi_{t+1}(S_{t+1}^*)$$
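The four steps above can be sketched in log-space as follows. The emission log-probabilities are passed in as a precomputed table so the sketch is independent of the emission model. The numbers in the example are made up and the function name is hypothetical.

```python
import math


def viterbi(log_emis, log_pi, log_P):
    """Most likely state path; log_emis[t][k] = log f(Y_t | S_t = k)."""
    T, K = len(log_emis), len(log_pi)
    # Initialization
    delta = [[log_pi[i] + log_emis[0][i] for i in range(K)]]
    psi = [[0] * K]
    # Recursion: keep the best predecessor for every state at every step
    for t in range(1, T):
        d_row, p_row = [], []
        for j in range(K):
            scores = [delta[t - 1][i] + log_P[i][j] for i in range(K)]
            best = max(range(K), key=scores.__getitem__)
            d_row.append(scores[best] + log_emis[t][j])
            p_row.append(best)
        delta.append(d_row)
        psi.append(p_row)
    # Termination and backtracking
    path = [max(range(K), key=delta[-1].__getitem__)]
    for t in range(T - 1, 0, -1):
        path.append(psi[t][path[-1]])
    path.reverse()
    return path


# Assumed example: emission likelihoods per time step for states 0 and 1
emis = [[0.9, 0.2], [0.8, 0.3], [0.1, 0.7], [0.2, 0.9], [0.6, 0.4]]
log_emis = [[math.log(p) for p in row] for row in emis]
log_pi = [math.log(0.6), math.log(0.4)]
log_P = [[math.log(0.8), math.log(0.2)], [math.log(0.3), math.log(0.7)]]
path = viterbi(log_emis, log_pi, log_P)
print(path)
```

Note that the decoded path uses the full sample, so like smoothed probabilities it is suited to historical labelling rather than live trading decisions.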

4.3 Real-Time Regime Identification

For trading applications, we need real-time regime identification using only information available up to time $t$.

Filtered Probabilities $P(S_t = k \mid Y_1, \ldots, Y_t)$: These use the forward algorithm only and are safe for trading decisions at time $t$.

Smoothed Probabilities $P(S_t = k \mid Y_1, \ldots, Y_T)$: These use all data and suffer from look-ahead bias. Use only for historical analysis.

Critical Implementation Note: To trade at time $t$, we must use filtered probabilities from time $t-1$. Using $P(S_t \mid Y_1, \ldots, Y_t)$ to determine position at time $t$ introduces look-ahead bias since $Y_t$ includes the return we're trying to predict.


5. MS-GARCH Model

5.1 Standard GARCH Review

The GARCH(1,1) model captures volatility clustering through conditional variance:

$$r_t = \mu + \epsilon_t, \quad \epsilon_t = \sigma_t z_t, \quad z_t \sim \mathcal{N}(0, 1)$$

$$\sigma_t^2 = \omega + \alpha \epsilon_{t-1}^2 + \beta \sigma_{t-1}^2$$

where:

  • $\omega > 0$: Baseline variance
  • $\alpha \geq 0$: ARCH effect (news impact)
  • $\beta \geq 0$: GARCH effect (persistence)
  • Stationarity: $\alpha + \beta < 1$

Unconditional Variance:

$$E[\sigma_t^2] = \frac{\omega}{1 - \alpha - \beta}$$
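The variance recursion and the unconditional variance can be illustrated with a short sketch. The parameter values and the return series are assumed for illustration, and starting the recursion at the unconditional variance is one common convention rather than something specified in this document.

```python
def garch11_variance(returns, mu, omega, alpha, beta):
    """Conditional variance path sigma_t^2 of a GARCH(1,1) model."""
    if not (omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("parameters violate positivity or stationarity")
    unconditional = omega / (1.0 - alpha - beta)
    sigma2 = [unconditional]            # assumed starting value for the recursion
    for r in returns[:-1]:
        eps = r - mu                    # shock at t-1 drives the variance at t
        sigma2.append(omega + alpha * eps ** 2 + beta * sigma2[-1])
    return sigma2


# Assumed daily parameters and a return series with a single -10% shock
omega, alpha, beta = 1e-5, 0.10, 0.85
returns = [0.0, 0.0, -0.10, 0.0, 0.0]
path = garch11_variance(returns, mu=0.0, omega=omega, alpha=alpha, beta=beta)
print([f"{v:.6f}" for v in path])
```

The variance jumps in the period after the shock and then decays at rate $\beta$ per step, which is the volatility clustering the model is designed to capture.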

5.2 Markov-Switching Extension

The MS-GARCH model introduces regime-dependent parameters:

$$\sigma_{t}^2 = \omega_{S_t} + \alpha_{S_t} \epsilon_{t-1}^2 + \beta_{S_t} \sigma_{t-1}^2$$

For a 2-regime model:

  • Regime 1 (Low-Vol): $(\omega_1, \alpha_1, \beta_1)$ with low unconditional variance
  • Regime 2 (High-Vol): $(\omega_2, \alpha_2, \beta_2)$ with high unconditional variance

5.3 Asymmetric Extensions

GJR-GARCH (Glosten-Jagannathan-Runkle): Captures leverage effect where negative shocks have larger impact:

$$\sigma_{k,t}^2 = \omega_k + (\alpha_k + \gamma_k \cdot \mathbf{1}_{\epsilon_{t-1} < 0}) \cdot \epsilon_{t-1}^2 + \beta_k \sigma_{k,t-1}^2$$

where $\gamma_k > 0$ indicates negative shocks increase volatility more than positive shocks.

EGARCH (Nelson): Models log-variance ensuring positivity without parameter constraints:

$$\log \sigma_{k,t}^2 = \omega_k + \alpha_k |z_{t-1}| + \gamma_k z_{t-1} + \beta_k \log \sigma_{k,t-1}^2$$

5.4 Four-State Regime Interpretation

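NOTE: The direction-by-volatility 4-state model and the leverage multipliers in this section are research only and NOT deployed in production. Production uses a different 4-state classification (BULL/NEUTRAL/BEAR/HIGH_VOL) for adaptive thresholds, and the backtest uses a simplified 2-state Low-Vol/High-Vol classification.

Our research implementation explored a 4-state model combining direction and volatility dimensions:

| State | Description | Characteristics | Leverage Multiplier (Research) |
|---|---|---|---|
| 0 | Low-Vol Bull | Positive drift, low variance, high persistence | 1.5-2.0x |
| 1 | High-Vol Bull | Positive drift, high variance, reactive | 1.0-1.5x |
| 2 | Low-Vol Bear | Negative drift, low variance, gradual decline | 0.5-1.0x |
| 3 | High-Vol Bear | Negative drift, high variance, crisis mode | 0.3-0.7x |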

Economic Interpretation (research context):

  • State 0: Normal market conditions, steady uptrends, consolidation (aggressive positioning)
  • State 1: News-driven rallies, speculative bubbles, momentum runs (moderate positioning)
  • State 2: Gradual corrections, risk-off periods, sector rotation (defensive positioning)
  • State 3: Market crashes, liquidation cascades, black swan events (emergency positioning)

Production Implementation: Trade-Matrix uses 4-state adaptive thresholds (BULL/NEUTRAL/BEAR/HIGH_VOL) with multipliers 0.85x-1.50x for signal quality thresholds. Backtest uses simplified 2-state (Low-Vol/High-Vol) position sizing multipliers (1.2x/0.7x).

5.5 Path Dependency Challenge

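A technical challenge in MS-GARCH estimation arises from path dependency: the conditional variance at time $t$ depends on the entire history of regime states, leading to $K^T$ possible paths. The solution developed by Haas et al. (2004) specifies $K$ parallel GARCH recursions, one per regime, so that each regime's variance $\sigma_{k,t}^2$ depends only on its own lagged value and the past shock. This removes the dependence on the regime path and makes likelihood-based estimation, including the EM algorithm, feasible.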


6. Hamilton Filter and Kim Smoother

6.1 Hamilton's Filtering Algorithm

The Hamilton filter computes filtered probabilities $\xi_{t|t}^{(k)} = P(S_t = k \mid \mathcal{I}_t)$ using only information available up to time $t$.

Prediction Step:

$$\xi_{t|t-1}^{(j)} = \sum_{i=1}^{K} p_{ij} \cdot \xi_{t-1|t-1}^{(i)}$$

Update Step:

$$\xi_{t|t}^{(j)} = \frac{f(Y_t \mid S_t = j, \mathcal{I}_{t-1}) \cdot \xi_{t|t-1}^{(j)}}{\sum_{k=1}^{K} f(Y_t \mid S_t = k, \mathcal{I}_{t-1}) \cdot \xi_{t|t-1}^{(k)}}$$

Key Property: Filtered probabilities are real-time safe. They can be computed as new data arrives without future information.
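A minimal sketch of the prediction and update steps follows. It assumes Gaussian emissions with a constant variance per state, so $f(Y_t \mid S_t = j, \mathcal{I}_{t-1})$ does not depend on past data; in the MS-GARCH case the variance would itself be updated each step. It is not the production `RealtimeRegimeDetector`, and all parameter values are made up.

```python
import math


def normal_pdf(y, mu, var):
    return math.exp(-(y - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def hamilton_filter(obs, P, mus, variances, init):
    """Return filtered xi_{t|t} and predicted xi_{t|t-1} for every observation."""
    K = len(init)
    filtered, predicted = [], []
    prev = init                                   # xi_{0|0}
    for y in obs:
        # Prediction step: xi_{t|t-1}(j) = sum_i p_ij * xi_{t-1|t-1}(i)
        pred = [sum(P[i][j] * prev[i] for i in range(K)) for j in range(K)]
        # Update step: weight by the emission density and renormalize
        joint = [normal_pdf(y, mus[j], variances[j]) * pred[j] for j in range(K)]
        total = sum(joint)
        prev = [v / total for v in joint]
        predicted.append(pred)
        filtered.append(prev)
    return filtered, predicted


# Assumed two-state setup: state 0 = low volatility, state 1 = high volatility
P = [[0.95, 0.05], [0.10, 0.90]]
mus, variances = [0.0, 0.0], [1e-4, 2.5e-3]
init = [0.5, 0.5]
obs = [0.001, 0.002, -0.08]                       # two calm days, then a large move
filtered, predicted = hamilton_filter(obs, P, mus, variances, init)
for row in filtered:
    print([round(p, 4) for p in row])
```

Each output row depends only on observations up to that step, so the filter can run incrementally as new bars arrive. For trading, the row computed at $t$ should drive the position held over $t+1$.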

6.2 Kim's Smoothing Algorithm

The Kim smoother computes smoothed probabilities $\xi_{t|T}^{(k)} = P(S_t = k \mid \mathcal{I}_T)$ using all available data.

Backward Recursion $(t = T-1, \ldots, 1)$:

$$\xi_{t|T}^{(i)} = \xi_{t|t}^{(i)} \cdot \sum_{j=1}^{K} \frac{p_{ij} \cdot \xi_{t+1|T}^{(j)}}{\xi_{t+1|t}^{(j)}}$$

Key Property: Smoothed probabilities provide better historical regime classification but introduce look-ahead bias for trading decisions.
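The backward recursion can be sketched as below. The sketch takes filtered probabilities as input and recomputes the one-step predictions $\xi_{t+1|t}$ from them with the prediction step of Section 6.1. The filtered values and the transition matrix are made-up numbers for illustration.

```python
def kim_smoother(filtered, P):
    """Smoothed probabilities xi_{t|T} from filtered probabilities xi_{t|t}."""
    T, K = len(filtered), len(P)
    smoothed = [None] * T
    smoothed[-1] = filtered[-1][:]            # at t = T, smoothed equals filtered
    for t in range(T - 2, -1, -1):
        # One-step prediction xi_{t+1|t}(j) = sum_i p_ij * xi_{t|t}(i)
        pred_next = [sum(P[i][j] * filtered[t][i] for i in range(K)) for j in range(K)]
        smoothed[t] = [
            filtered[t][i] * sum(P[i][j] * smoothed[t + 1][j] / pred_next[j]
                                 for j in range(K))
            for i in range(K)
        ]
    return smoothed


# Assumed filtered probabilities drifting from low-vol (state 0) to high-vol (state 1)
P = [[0.95, 0.05], [0.10, 0.90]]
filtered = [[0.90, 0.10], [0.80, 0.20], [0.30, 0.70], [0.05, 0.95]]
smoothed = kim_smoother(filtered, P)
for f_row, s_row in zip(filtered, smoothed):
    print([round(p, 3) for p in f_row], "->", [round(p, 3) for p in s_row])
```

In this example the smoothed probability of the high-volatility state at the third step is higher than the filtered one because the smoother already knows what happened next, which is exactly why it must not be used for live decisions.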

6.3 Real-Time vs Retrospective Regime

| Property | Filtered Probabilities | Smoothed Probabilities |
|---|---|---|
| Information Set | $\mathcal{I}_t$ (up to $t$) | $\mathcal{I}_T$ (all data) |
| Look-Ahead Bias | No | Yes |
| Trading Use | Position at $t+1$ | Historical analysis only |
| Accuracy | Lower | Higher |
| Latency | Real-time | Post-hoc |

Critical Trading Implication: Using smoothed probabilities at time $t$ to trade at time $t$ constitutes a serious methodological error that inflates backtest performance by 50-200%.

6.4 Signal Lag Implementation

Correct Implementation:

regime[t] = detect_regime(returns[0:t + 1])  # slice includes returns[t]
leverage[t + 1] = map_leverage(regime[t])    # apply in the NEXT period
portfolio_return[t + 1] = leverage[t + 1] * returns[t + 1]

Incorrect Implementation (Look-Ahead Bias):

regime[t] = detect_regime(returns[0:t + 1])  # slice includes returns[t]
leverage[t] = map_leverage(regime[t])        # WRONG: uses returns[t] to trade returns[t]
portfolio_return[t] = leverage[t] * returns[t]

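The impact of this error was quantified in our research: correcting the signal lag alone reduced the Sharpe ratio from 2.41 to 0.81 (a 66% degradation from this single fix). The 84% figure cited elsewhere in this document is cumulative, measured from the originally reported 5.01 (see Section 8.3).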
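The difference between the two implementations can be made concrete with a small runnable sketch. The regime labels are given directly instead of being detected, the multipliers are the 2-state backtest values (1.2x / 0.7x), and the return series and function names are assumptions for illustration.

```python
def lagged_leverage(regimes, leverage_map, initial_leverage=0.0):
    """Leverage applied in period t comes from the regime detected at t-1."""
    # No regime is known before the first observation, so start flat
    return [initial_leverage] + [leverage_map[r] for r in regimes[:-1]]


def strategy_returns(returns, leverage):
    return [lev * r for lev, r in zip(leverage, returns)]


returns = [0.01, -0.05, 0.02, 0.03]
# regimes[t] is assumed to be detected from returns[0..t], i.e. it already "sees" returns[t]
regimes = ["LOW_VOL", "HIGH_VOL", "LOW_VOL", "LOW_VOL"]
leverage_map = {"LOW_VOL": 1.2, "HIGH_VOL": 0.7}

lagged = lagged_leverage(regimes, leverage_map)     # correct: one-period lag
biased = [leverage_map[r] for r in regimes]         # look-ahead: same-period regime

print("lagged leverage:", lagged)
print("biased leverage:", biased)
print("lagged PnL:", sum(strategy_returns(returns, lagged)))
print("biased PnL:", sum(strategy_returns(returns, biased)))
```

In the biased version the -5% return is scaled by 0.7x because the regime label already reflects that loss, while the lagged version takes it at 1.2x. That asymmetry is the source of the inflated backtest.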


7. Regime-Adaptive Kelly Criterion

7.1 Classical Kelly Formula

The Kelly criterion maximizes the expected logarithm of wealth, providing the optimal fraction of capital to bet:

$$f^* = \frac{p \cdot b - q}{b}$$

where:

  • $p$ = probability of winning
  • $q = 1 - p$ = probability of losing
  • $b$ = odds (profit per unit bet if winning)

For continuous distributions with normally distributed returns:

$$f^* = \frac{\mu - r_f}{\sigma^2}$$

where $\mu$ is expected return, $r_f$ is risk-free rate, and $\sigma^2$ is variance.
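Both forms of the Kelly formula are easy to evaluate directly. The inputs below are illustrative assumptions, not estimates from Trade-Matrix data, and the function names are hypothetical.

```python
def kelly_discrete(p, b):
    """Kelly fraction for a binary bet with win probability p and net odds b."""
    if not 0.0 <= p <= 1.0 or b <= 0.0:
        raise ValueError("need 0 <= p <= 1 and b > 0")
    q = 1.0 - p
    return (p * b - q) / b


def kelly_continuous(mu, rf, sigma2):
    """Kelly fraction for normally distributed returns: (mu - rf) / sigma^2."""
    if sigma2 <= 0.0:
        raise ValueError("variance must be positive")
    return (mu - rf) / sigma2


# Even-money bet won 55% of the time
print(kelly_discrete(p=0.55, b=1.0))
# 10% expected return, 2% risk-free rate, 40% volatility (variance 0.16)
print(kelly_continuous(mu=0.10, rf=0.02, sigma2=0.16))
```

The first case gives a fraction of about 0.10 and the second about 0.50. A negative result means the edge is negative and the formula recommends no position.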

7.2 Regime-Dependent Kelly

Regime-switching models enable adaptive Kelly by conditioning on the current state:

$$f_k^* = \frac{\mu_k - r_f}{\sigma_k^2}$$

where $(\mu_k, \sigma_k^2)$ are the state-specific mean and variance.

Practical Implementation: Full Kelly is often too aggressive. We apply fractional Kelly with regime-dependent scaling:

$$f_{\text{actual}} = \kappa \cdot \lambda_k \cdot f_k^*$$

where $\kappa \in [0.25, 0.5]$ is the base Kelly fraction and $\lambda_k$ is the regime multiplier.

7.3 Position Sizing Multipliers

Production Implementation (Trade-Matrix regime-adaptive Kelly):

The following regime multipliers are implemented in production within the PURE_KELLY fallback tier:

| Regime | Kelly Multiplier | Position Sizing | Gamma Parameter |
|---|---|---|---|
| Bull (Low-Vol) | 0.67 (67%) | Aggressive | 1.5 |
| Neutral/Sideways | 0.50 (50%) | Moderate | 2.0 |
| Bear (Low-Vol) | 0.25 (25%) | Defensive | 4.0 |
| Crisis (High-Vol) | 0.17 (17%) | Emergency | 6.0 |

Four-Tier Fallback Cascade (Trade-Matrix Implementation):

  1. Tier 1: FULL_RL (100% RL) - High confidence ($\geq 0.50$) + IC ($\geq 0.05$)
  2. Tier 2: BLENDED (50% RL + 50% Kelly) - Medium confidence/IC
  3. Tier 3: PURE_KELLY (100% Kelly with regime multiplier) - Low confidence or IC failure
  4. Tier 4: EMERGENCY_FLAT (0% position) - Circuit breaker OPEN

Note: The regime-adaptive Kelly multipliers above are deployed in production. The 4-state leverage multipliers (0.3x-2.0x) mentioned in Section 5.4 are research values only and NOT deployed.
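To two decimals, the Kelly multipliers in the table of this section equal the reciprocal of the listed gamma parameter (for example 1/1.5 ≈ 0.67 and 1/6.0 ≈ 0.17). The sketch below encodes that observation. Reading gamma as a risk-aversion parameter with multiplier = 1/gamma is an interpretation of the table, not something taken from `kelly_baseline.py`, and the function names are hypothetical.

```python
# Gamma values copied from the table in this section; the 1/gamma mapping is
# an interpretation of that table (assumption), not the production code.
REGIME_GAMMA = {"BULL": 1.5, "NEUTRAL": 2.0, "BEAR": 4.0, "CRISIS": 6.0}


def kelly_multiplier(regime: str) -> float:
    """Fraction of full Kelly used in the given regime."""
    return 1.0 / REGIME_GAMMA[regime]


def regime_kelly_fraction(mu: float, rf: float, sigma2: float, regime: str) -> float:
    """Full Kelly (mu - rf) / sigma^2 scaled by the regime multiplier."""
    return kelly_multiplier(regime) * (mu - rf) / sigma2


for regime in REGIME_GAMMA:
    print(regime, round(kelly_multiplier(regime), 2))

# Assumed inputs: 10% expected return, 2% risk-free rate, variance 0.16
print(round(regime_kelly_fraction(0.10, 0.02, 0.16, "CRISIS"), 4))
```

Under this reading, a single parameter per regime controls how far sizing is pulled back from full Kelly.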

7.4 Mathematical Justification

The regime-dependent Kelly fraction can be derived from the principle of maximizing expected log-utility across regimes:

$$\max_{f} E[\log W_{t+1}] = \max_{f} \sum_{k=1}^{K} \pi_k \cdot E[\log W_{t+1} \mid S_t = k]$$

where $\pi_k$ is the probability of regime $k$.

For a portfolio with leverage $f$ on an asset with return $r_t$, with the remainder earning (or financed at) the risk-free rate $r_f$, a second-order expansion of the logarithm gives:

$$E[\log(1 + r_f + f \cdot (r_t - r_f)) \mid S_t = k] \approx r_f + f \cdot (\mu_k - r_f) - \frac{f^2 \cdot \sigma_k^2}{2}$$

Setting the derivative with respect to $f$ to zero, $(\mu_k - r_f) - f \cdot \sigma_k^2 = 0$, yields:

$$f_k^* = \frac{\mu_k - r_f}{\sigma_k^2}$$

The regime multipliers effectively implement this by scaling the base Kelly fraction according to the expected risk-adjusted returns in each state.


8. Research Findings

8.1 Regime Persistence Analysis

Weekly Data (7-day bars):

  • Average regime duration: 21 days (3 weeks)
  • Regime switches per year: ~52 (weekly rebalancing)
  • Transaction cost impact: 4.5-13.4% annual drag

Daily Data (1-day bars):

  • Average regime duration: 3.26 days
  • Regime switches per year: ~112 (roughly twice-weekly rebalancing)
  • Transaction cost impact: 4.5-13.4% annual drag

Key Finding: Weekly frequency produces more persistent regimes but suffers from excessive signal lag (7 days) that destroys alpha in fast-moving cryptocurrency markets. Daily frequency provides better balance between regime persistence and execution responsiveness.

8.2 Crisis Detection Accuracy

The MS-GARCH model successfully identified major volatility events:

  • Luna/UST Collapse (May 2022): High-Vol regime probability >95% within 24 hours
  • FTX Collapse (November 2022): High-Vol regime probability >90% at onset
  • Banking Crisis (March 2023): High-Vol regime correctly identified
  • 2024 Spot ETF Approval: Rapid regime transition captured

High-Vol Regime Frequency: 10-12% of total observations, matching empirical crisis frequency in cryptocurrency markets.

8.3 Look-Ahead Bias Lessons

Our research uncovered significant methodological issues that inflated initial performance claims:

Claimed Performance (Before Corrections):

  • Sharpe Ratio: 5.01
  • Annual Return: 438.3%
  • Max Drawdown: -56.8%

Corrected Performance (After Bias Removal):

  • Sharpe Ratio: 0.81 (weekly) / 0.32 (daily)
  • Annual Return: 67.8% / 43%
  • Max Drawdown: -89.6%

Degradation Waterfall:

| Stage | Sharpe | Cumulative Degradation | Issue Fixed |
|---|---|---|---|
| Reported | 5.01 | - | None (flawed baseline) |
| Fix #1: Sharpe Calculation | 2.34 | -53% | Geometric to Arithmetic |
| Fix #2: Signal Lag | 0.81 | -84% | Contemporaneous to Lagged |
| Fix #3: Funding Rates | ~0.60 | -88% | Add 20% annual drag |
| Fix #4: Walk-Forward OOS | ~0.45 | -91% | IS to OOS degradation |

Lesson: When results exceed academic benchmarks by 3-5x, assume error until proven otherwise. 5.01 Sharpe on a simple two-model framework should have triggered immediate skepticism.
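The cumulative degradation column of the waterfall can be reproduced from the Sharpe values alone, each measured against the originally reported 5.01. The figures below are copied from the waterfall table; the function name is hypothetical.

```python
REPORTED_SHARPE = 5.01

# (stage, Sharpe after the fix), copied from the degradation waterfall
STAGES = [
    ("Fix #1: Sharpe Calculation", 2.34),
    ("Fix #2: Signal Lag", 0.81),
    ("Fix #3: Funding Rates", 0.60),
    ("Fix #4: Walk-Forward OOS", 0.45),
]


def cumulative_degradation(sharpe, baseline=REPORTED_SHARPE):
    """Relative change versus the originally reported Sharpe ratio."""
    return (sharpe - baseline) / baseline


for stage, sharpe in STAGES:
    print(f"{stage}: {100 * cumulative_degradation(sharpe):.0f}%")
```

Every percentage in the table is relative to the flawed 5.01 baseline, not to the previous stage.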

8.4 Standalone Performance Assessment

After all corrections, standalone MS-GARCH strategy performance:

| Metric | Weekly | Daily |
|---|---|---|
| Sharpe Ratio | 0.81 | 0.32-0.75 (varies by validation window) |
| Annual Return | 67.8% | 43-60% |
| Max Drawdown | -89.6% | -41% |
| Win Rate | 52% | 51% |
| Profit Factor | 1.15 | 1.08 |

Verdict: Standalone MS-GARCH fails to meet minimum institutional threshold (Sharpe > 2.0) for primary strategy deployment. However, the regime detection itself is valid and valuable as a risk management input.


9. Trade-Matrix Integration Status

Current Implementation Status (Updated January 2026):

IMPLEMENTED in Production:

  • 4-State Adaptive Thresholds: BULL (0.85x) / NEUTRAL (1.00x) / BEAR (1.30x) / HIGH_VOL (1.50x) multipliers for signal quality gates (services/ml_inference/adaptive_thresholds.py:80-113, config/portfolio_trading_main.yaml:416)
  • Regime-Adaptive Kelly: Within PURE_KELLY fallback tier (Bear 25%, Neutral 50%, Bull 67%, Crisis 17%) in services/rl_agent/kelly_baseline.py:175-201
  • Market Regime Enum: 4-state MarketRegime (BULL/NEUTRAL/BEAR/HIGH_VOL) in institutional rank normalizer

Research Code Available (deployed for backtesting):

  • Full MS-GARCH implementation in research/ms-garch/ (4-state model with EM, Hamilton filter, Kim smoother)
  • HMM directional regime detector for Bull/Sideways/Bear classification
  • Pre-calculated regime loader for backtest (services/regime_detection/precalc_regime_loader.py:35) with 2-state multipliers (1.2x Low-Vol / 0.7x High-Vol)

NOT Deployed as Standalone Strategy:

  • Full MS-GARCH volatility forecasting as primary strategy (Sharpe 0.32-0.81 < 2.0 threshold)
  • Regime prediction (only filtering/classification currently used)

Week 49 Validation Results (December 2025)

Production deployment with enable_regime_adaptation: true was validated in comprehensive backtesting:

| Metric | Without Regime | With Regime | Change |
|---|---|---|---|
| Sharpe Ratio | 0.3336 | 0.3491 | +4.6% |
| Profit Factor | 1.2985 | 1.3170 | +1.4% |

Per-Instrument Impact:

  • BTCUSDT: -$726K → -$549K (24.4% loss reduction)
  • ETHUSDT: $462K → $609K (+31.8% profit increase)
  • SOLUSDT: $1.30M → $1.26M (-3.1% degraded)

Verdict: Marginal but positive improvement. Regime detection provides value through risk mitigation (reducing BTC losses, enhancing ETH profits) despite minor SOL degradation. The 4-state adaptive threshold system successfully reduces exposure during crisis periods while maintaining signal quality.

Complete Portfolio Comparison

| Metric | WITHOUT Regime | WITH Regime | Δ Change |
|---|---|---|---|
| Total PnL | $1,033,376.92 | $1,316,700.82 | +$283,323.89 (+27.4%) |
| Total Return | 206.68% | 263.34% | +56.66% absolute |
| Sharpe Ratio | 0.3336 | 0.3491 | +4.6% |
| Profit Factor | 1.2985 | 1.3170 | +1.4% |
| Sortino Ratio | 0.6706 | 0.7143 | +6.5% |
| Expectancy | $707.80/trade | $951.08/trade | +34.4% |
| Win Rate | 46.67% | 46.77% | +0.1% |

Per-Instrument Impact Analysis

| Instrument | WITHOUT Regime | WITH Regime | Δ Amount | Impact |
|---|---|---|---|---|
| BTCUSDT | -$726,565.06 | -$549,459.73 | +$177,105 | Loss reduction 24.4% |
| ETHUSDT | +$461,794.12 | +$608,858.24 | +$147,064 | Profit increase 31.8% |
| SOLUSDT | +$1,298,678.67 | +$1,257,931.20 | -$40,747 | Slight degradation 3.1% |

Key Insight: Regime adaptation helps mainly by reducing losses in unfavorable conditions (BTC losses cut by 24.4%) and amplifying gains in favorable conditions (ETH profit up 31.8%). The slight SOL degradation (-3.1%) is acceptable given the net portfolio improvement of +$283K.
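The percentages quoted above can be reproduced from the per-instrument PnL figures. The short sketch below recomputes them; the helper name `pnl_impact` is illustrative only, and the change is expressed relative to the magnitude of the baseline PnL so that a smaller loss shows up as a positive improvement.

```python
# Per-instrument PnL figures copied from the Week 49 table (USD).
pnl_without = {"BTCUSDT": -726_565.06, "ETHUSDT": 461_794.12, "SOLUSDT": 1_298_678.67}
pnl_with = {"BTCUSDT": -549_459.73, "ETHUSDT": 608_858.24, "SOLUSDT": 1_257_931.20}


def pnl_impact(before: float, after: float) -> tuple[float, float]:
    """Return (absolute change, change as a % of the baseline's magnitude)."""
    delta = after - before
    return delta, 100.0 * delta / abs(before)


impact = {sym: pnl_impact(pnl_without[sym], pnl_with[sym]) for sym in pnl_without}

for sym, (delta, pct) in impact.items():
    print(f"{sym}: {delta:+,.0f} USD ({pct:+.1f}% of baseline magnitude)")
```

Dividing by the absolute baseline is what makes the BTC figure read as a 24.4% loss reduction rather than a negative percentage.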

9.1 Integration Architecture (Production Deployed)

MS-GARCH is integrated as a risk adjustment feature across two systems:

Trade-Matrix System (2.72 Sharpe):
  ML Models (OHLCV, sentiment, on-chain) -> Signals
  RL Agent -> Position Sizing

PRODUCTION: 4-State Adaptive Thresholds
  MS-GARCH Detection -> BULL (0.85x) / NEUTRAL (1.00x) / BEAR (1.30x) / HIGH_VOL (1.50x)
  Applied to signal quality gates in ML inference
  Important: These threshold multipliers control signal QUALITY gates, not position sizes. Regime-based position sizing only occurs at Tier 3 (PURE_KELLY fallback).
  Code: services/ml_inference/adaptive_thresholds.py:80-179
  Config: enable_regime_adaptation: true

PRODUCTION: Regime-Adaptive Kelly
  Regime Detection -> Bear (25%) / Neutral (50%) / Bull (67%) / Crisis (17%)
  Applied within PURE_KELLY fallback tier
  Code: services/rl_agent/kelly_baseline.py:175-201

BACKTEST: 2-State Position Sizing (Research)
  Volatility Detection -> Low-Vol (1.2x) / High-Vol (0.7x) multipliers
  Code: services/regime_detection/precalc_regime_loader.py:35

Production Signal Flow:

ML Prediction → InstitutionalRankNormalizer → 4-State Adaptive Thresholds
                (enable_regime_adaptation)     (BULL 0.85x / NEUTRAL 1.0x /
                                                BEAR 1.30x / HIGH_VOL 1.50x)
                                                        ↓
                                              4-Tier RL Fallback System
                                              ├─ Tier 1: FULL_RL (IC≥0.05)
                                              ├─ Tier 2: BLENDED (50/50)
                                              ├─ Tier 3: PURE_KELLY (regime-adaptive)
                                              └─ Tier 4: EMERGENCY_FLAT
                                                        ↓
                                              Final Position Size
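To make the Tier 3 step concrete, here is a minimal sketch of regime-adaptive fractional Kelly sizing. It is not the code in kelly_baseline.py. The gamma values are the ones this document lists (Bull 1.5, Neutral 2.0, Bear 4.0, Crisis 6.0), with Crisis mapped to HIGH_VOL. The mean/variance form of full Kelly, the `max_leverage` cap, and all function names are assumptions made for illustration.

```python
from enum import Enum


class MarketRegime(Enum):
    BULL = "BULL"
    NEUTRAL = "NEUTRAL"
    BEAR = "BEAR"
    HIGH_VOL = "HIGH_VOL"


# Risk-aversion (gamma) per regime as listed in this document;
# the Kelly fraction is 1/gamma (e.g. gamma=4.0 -> 25% of full Kelly).
KELLY_GAMMA = {
    MarketRegime.BULL: 1.5,
    MarketRegime.NEUTRAL: 2.0,
    MarketRegime.BEAR: 4.0,
    MarketRegime.HIGH_VOL: 6.0,
}


def regime_kelly_fraction(regime: MarketRegime) -> float:
    """Fraction of full Kelly to deploy in the given regime."""
    return 1.0 / KELLY_GAMMA[regime]


def kelly_position(expected_return: float, variance: float,
                   regime: MarketRegime, max_leverage: float = 1.0) -> float:
    """Fractional Kelly size, assuming full Kelly = expected_return / variance.

    max_leverage is a hypothetical cap, not a documented production setting.
    """
    if variance <= 0:
        raise ValueError("variance must be positive")
    sized = (expected_return / variance) * regime_kelly_fraction(regime)
    return max(-max_leverage, min(max_leverage, sized))


for regime in MarketRegime:
    print(f"{regime.name}: {regime_kelly_fraction(regime):.0%} of full Kelly")
```

The 1/gamma mapping reproduces the 67% / 50% / 25% / 17% sizing levels quoted for Bull, Neutral, Bear and Crisis.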

Current Production Status:

  • 4-state adaptive thresholds: DEPLOYED (Week 49 validation: +4.6% Sharpe)
  • Regime-adaptive Kelly multipliers: DEPLOYED (17%-67% sizing, from Crisis to Bull)
  • Adaptive threshold system: ACTIVE (services/ml_inference/adaptive_thresholds.py)
  • Full MS-GARCH as primary strategy: Not deployed (Sharpe 0.32-0.81 < 2.0 threshold)

9.2 Why 4-State Production Outperforms 2-State Backtest

Trade-Matrix uses two complementary regime systems:

| Aspect | 2-State (Backtest) | 4-State (Production) |
|---|---|---|
| Classification | Low-Vol / High-Vol | BULL / NEUTRAL / BEAR / HIGH_VOL |
| Granularity | Binary | Quadrant (direction × volatility) |
| What it controls | Position sizing (0.7x/1.2x) | Signal quality thresholds |
| When applied | After signal generation | Before signal quality gates |

Why 4-State is Superior:

  1. Distinguishes BEAR from CRISIS: 2-state treats all high volatility equally. 4-state applies 1.30x stricter thresholds in BEAR vs 1.50x in HIGH_VOL crisis mode.

  2. Controls Signal QUALITY, Not Just Size: 2-state says "take smaller positions." 4-state says "only accept high-quality signals." Bad signals get REJECTED, not just sized down.

  3. Adaptive IC Thresholds: In HIGH_VOL regime, required IC increases by 50%. This prevents trading during flash crashes where signals are unreliable.

  4. Better Tail Risk Management: HIGH_VOL sets hit_rate_adjustment: 1.20, requiring 20% higher hit rate to pass quality gates.
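The difference between rejecting a signal and merely sizing it down can be illustrated with a small quality-gate sketch. The threshold multipliers and the HIGH_VOL `hit_rate_adjustment` of 1.20 come from this section. The base IC and hit-rate thresholds, the default adjustment of 1.0 for the other regimes, and the `QualityGate` class itself are assumptions for illustration, not the contents of adaptive_thresholds.py.

```python
from dataclasses import dataclass

# Threshold multipliers per regime, as listed in this section.
THRESHOLD_MULTIPLIER = {"BULL": 0.85, "NEUTRAL": 1.00, "BEAR": 1.30, "HIGH_VOL": 1.50}
# Only the HIGH_VOL value is documented; other regimes default to 1.0 (assumption).
HIT_RATE_ADJUSTMENT = {"HIGH_VOL": 1.20}


@dataclass(frozen=True)
class QualityGate:
    base_ic: float = 0.05        # hypothetical baseline IC requirement
    base_hit_rate: float = 0.50  # hypothetical baseline hit-rate requirement

    def passes(self, regime: str, ic: float, hit_rate: float) -> bool:
        """Accept a signal only if it clears the regime-scaled thresholds."""
        required_ic = self.base_ic * THRESHOLD_MULTIPLIER[regime]
        required_hit = self.base_hit_rate * HIT_RATE_ADJUSTMENT.get(regime, 1.0)
        return ic >= required_ic and hit_rate >= required_hit


gate = QualityGate()
# The same signal is evaluated under each regime.
decisions = {r: gate.passes(r, ic=0.06, hit_rate=0.55) for r in THRESHOLD_MULTIPLIER}
print(decisions)
```

Under these assumed baselines, a signal with IC 0.06 and hit rate 0.55 is accepted in BULL and NEUTRAL but rejected outright in BEAR and HIGH_VOL.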

9.4 Implementation Approach

Module Design:

class MSGARCHRiskAdjuster:
    """
    Daily MS-GARCH regime detection for risk budget adjustment.
    Integrates with existing Trade-Matrix ML/RL system.
    """

    def get_risk_multiplier(self, asset: str) -> float:
        """
        Returns risk multiplier based on current regime.

        Low-Vol regime -> 1.2x multiplier (can take more risk)
        High-Vol regime -> 0.7x multiplier (reduce risk)
        """
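        regime = self.get_current_regime(asset)

        if regime == 'Low-Vol':
            return 1.2
        else:  # High-Vol
            return 0.7

    def get_current_regime(self, asset: str) -> str:
        """Return 'Low-Vol' or 'High-Vol' for the asset (detector lookup not shown here)."""
        raise NotImplementedError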

Position Sizing Integration:

class PositionSizingAgent:
    def __init__(self, rl_model):
        self.rl_model = rl_model  # existing RL sizing model, injected by the caller
        self.msgarch = MSGARCHRiskAdjuster()

    def calculate_position(self, signal, confidence):
        # Existing logic
        base_position = self.rl_model.predict(signal, confidence)

        # NEW: Apply regime multiplier
        regime_multiplier = self.msgarch.get_risk_multiplier(signal.asset)
        final_position = base_position * regime_multiplier

        return final_position

9.5 Expected Impact

Baseline Performance:

  • Trade-Matrix: 2.72 Sharpe (2023-2025 validated)
  • MS-GARCH standalone: 0.32-0.81 Sharpe

With Integration:

  • Expected improvement: +0.1-0.3 Sharpe
  • Target range: 2.8-3.0 Sharpe
  • Confidence level: 85%

For $50M AUM:

  • Existing: $50M × 150% annual (2.72 Sharpe, conservative estimate) = $75M profit
  • Enhanced: $50M × 160% annual (3.0 Sharpe) = $80M profit
  • Incremental: +$5M annually from MS-GARCH enhancement

9.6 Implementation Timeline

Week 1-2: Core Implementation (10 days)

  1. Add regime detection module (2 days)
  2. Integrate with RL agent (3 days)
  3. Unit testing and validation (2 days)
  4. Integration testing (3 days)

Week 3-4: Sandbox Validation (10 days)

  5. Deploy to sandbox with paper trading
  6. Monitor regime detections vs actual market
  7. Measure: Does it improve Sharpe? Reduce DD?
  8. Collect performance data

Month 2: Production Deployment

  9. If successful (Sharpe improvement >0.1): Deploy to live at 10%
  10. If unsuccessful: Remove feature, research concluded

9.7 Risk Assessment

Low Risk Factors:

  • Does not replace existing system
  • Can be enabled/disabled easily
  • Minimal code changes required
  • Reversible enhancement

Medium Risk Factors:

  • Regime multipliers may need tuning
  • Daily lag may miss intraday regime shifts
  • Additional computational overhead

Mitigation:

  • A/B testing during sandbox period
  • Conservative initial multipliers (0.9x High-Vol, 1.1x Low-Vol)
  • Real-time monitoring dashboard

10. Conclusion

10.1 Summary of Findings

This research provides a comprehensive framework for Hidden Markov Model-based regime detection in cryptocurrency markets. Key findings include:

  1. Theoretical Foundation: The MS-GARCH model provides a rigorous, interpretable framework for capturing volatility regime switching in cryptocurrency markets.

  2. Methodological Rigor: Proper implementation requires careful attention to look-ahead bias, signal lag, and transaction costs. Initial backtest results can be inflated by 80-90% without these corrections.

  3. Standalone Limitations: MS-GARCH as a standalone strategy achieves Sharpe ratios of 0.32-0.81 after corrections, below institutional deployment thresholds (2.0+). Result: NOT deployed as standalone strategy.

  4. Integration Value: As a risk adjustment feature within existing ML/RL systems, MS-GARCH is expected to provide a meaningful enhancement (+0.1-0.3 Sharpe) with minimal implementation risk. Current production implementation: 4-state adaptive thresholds (0.85x-1.50x) and regime-adaptive Kelly multipliers (17%-67%) are deployed; the 2-state volatility multipliers (0.7x/1.2x) are used for backtest position sizing. The full MS-GARCH model is available in research code but is not deployed as a standalone strategy.

  5. Implementation Maturity:

    • Research code: Full 4-state MS-GARCH with EM algorithm, Hamilton filter, Kim smoother (available)
    • Production: 4-state regime classification driving adaptive thresholds (deployed); simplified 2-state classification used for backtest position sizing
    • Code location: research/ms-garch/ for research, services/ml_inference/adaptive_thresholds.py for production
    • Status: Regime filtering/classification in use; regime prediction not yet deployed

10.2 Academic Contributions

This research contributes to the literature on:

  • Data Leakage Prevention: Documented 84% Sharpe degradation from look-ahead bias, reinforcing the importance of proper backtest methodology.
  • Cryptocurrency-Specific Challenges: Demonstrated that crypto regime durations (3-21 days) are significantly shorter than equity market regimes (months) due to 3-4x higher volatility.
  • Practical Implementation: Provided production-ready code patterns for Hamilton filter, Kim smoother, and regime-adaptive position sizing.

10.3 Future Research Directions

  1. Multi-Asset Regime Models: Extend to capture cross-asset regime correlations and volatility spillovers.
  2. High-Frequency Extensions: Incorporate intraday data for microstructural regime detection.
  3. Neural Network Integration: Use neural networks to model transition probabilities or emission distributions.
  4. Time-Varying Transition Matrices: Allow regime persistence to vary with market conditions.

This article is part of the MS-GARCH research series for Trade-Matrix:

MS-GARCH Notebook Series

| # | Article | Focus | Key Finding |
|---|---|---|---|
| 1 | MS-GARCH Data Exploration | Data Understanding | Cryptocurrency volatility patterns, CRISP-DM methodology |
| 2 | MS-GARCH Model Development | Model Fitting | 2-regime GJR-GARCH optimal, BTC 77% Low-Vol / 23% High-Vol |
| 3 | MS-GARCH Backtesting | Economic Validation | Sharpe 1.69 with regime-conditional leverage |
| 4 | MS-GARCH Weekly Optimization | Frequency Analysis | 8.38× longer regime durations with weekly data |

Trade-Matrix Integration Reference

For production implementation details, see: MS-GARCH Trade-Matrix Reference


References

Academic Literature

Ang, A., & Bekaert, G. (2002). International asset allocation with regime shifts. Review of Financial Studies, 15(4), 1137-1187.

Ardia, D., Bluteau, K., Boudt, K., & Catania, L. (2019). Forecasting risk with Markov-switching GARCH models: A large-scale performance study. International Journal of Forecasting, 35(2), 733-747.

Baum, L. E., Petrie, T., Soules, G., & Weiss, N. (1970). A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. Annals of Mathematical Statistics, 41(1), 164-171.

Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 31(3), 307-327.

Caporale, G. M., & Zekokh, T. (2019). Modelling volatility of cryptocurrencies using Markov-Switching GARCH models. Research in International Business and Finance, 48, 143-155.

Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica, 50(4), 987-1007.

Glosten, L. R., Jagannathan, R., & Runkle, D. E. (1993). On the relation between the expected value and the volatility of the nominal excess return on stocks. Journal of Finance, 48(5), 1779-1801.

Guidolin, M., & Timmermann, A. (2008). Asset allocation under multivariate regime switching. Journal of Economic Dynamics and Control, 32(11), 3503-3544.

Haas, M., Mittnik, S., & Paolella, M. S. (2004). A new approach to Markov-switching GARCH models. Journal of Financial Econometrics, 2(4), 493-530.

Hamilton, J. D. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica, 57(2), 357-384.

Katsiampa, P. (2019). Volatility co-movement between Bitcoin and Ether. Finance Research Letters, 30, 221-227.

Kelly, J. L. (1956). A new interpretation of information rate. Bell System Technical Journal, 35(4), 917-926.

Kim, C. J. (1994). Dynamic linear models with Markov-switching. Journal of Econometrics, 60(1-2), 1-22.

Koch, S., & Fengler, M. R. (2024). Modelling and forecasting Bitcoin realized volatility: Stochastic autoregressive volatility vs. MSGARCH. Journal of Forecasting, 43(1), 112-131.

Lopez de Prado, M. (2018). Advances in Financial Machine Learning. Wiley.

Nelson, D. B. (1991). Conditional heteroskedasticity in asset returns: A new approach. Econometrica, 59(2), 347-370.

Sharpe, W. F. (1966). Mutual fund performance. Journal of Business, 39(1), 119-138.

Viterbi, A. (1967). Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, 13(2), 260-269.

Trade-Matrix Research Documents

Trade-Matrix Labs. (2025). Critical Findings: Honest Assessment of MS-GARCH Weekly Research. Internal Research Report.

Trade-Matrix Labs. (2025). Final MS-GARCH Research Recommendation. Internal Research Report.

Trade-Matrix Labs. (2025). MS-GARCH Project Executive Summary. Internal Research Report.

Trade-Matrix Labs. (2025). Regime Detector Implementation Guide. Internal Technical Documentation.


Appendix A: Implementation Code Patterns

A.1 HMM Regime Detector Class

from hmmlearn import hmm
import pandas as pd


class HMMRegimeDetector:
    """
    Hidden Markov Model for directional regime detection in financial returns.

    Identifies Bull/Sideways/Bear market regimes based on return distribution
    characteristics (mean and variance) using Gaussian emissions.
    """

    def __init__(
        self,
        n_regimes: int = 3,
        covariance_type: str = 'diag',
        n_iter: int = 100,
        tol: float = 1e-3,
        random_state: int = 42
    ):
        self.n_regimes = n_regimes
        self.model = hmm.GaussianHMM(
            n_components=n_regimes,
            covariance_type=covariance_type,
            n_iter=n_iter,
            tol=tol,
            random_state=random_state
        )

    def fit(self, returns: pd.Series) -> 'HMMRegimeDetector':
        """Fit HMM using Baum-Welch algorithm."""
        # Keep the training series so the probability getters below can reuse it.
        self.returns_ = returns
        X = returns.values.reshape(-1, 1)
        self.model.fit(X)
        return self

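    def get_filtered_probabilities(self) -> pd.DataFrame:
        """
        State probabilities from hmmlearn's predict_proba.
        CAUTION: predict_proba runs forward-backward over the whole sample, so
        row t is P(S_t | all data), not P(S_t | data up to t). Only the last row
        is free of look-ahead; for real-time use, call it on data up to t only.
        """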
        X = self.returns_.values.reshape(-1, 1)
        probs = self.model.predict_proba(X)
        return pd.DataFrame(probs, index=self.returns_.index)

    def get_most_likely_sequence(self) -> pd.Series:
        """
        Viterbi algorithm for most likely state sequence.
        WARNING: Uses entire sequence, not for real-time trading.
        """
        X = self.returns_.values.reshape(-1, 1)
        states = self.model.predict(X)
        return pd.Series(states, index=self.returns_.index)
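Because hmmlearn's `predict_proba` returns posteriors computed with the forward-backward algorithm, strictly filtered probabilities P(S_t | r_1..r_t) need a forward-only pass. The sketch below is a minimal NumPy version for scalar Gaussian emissions. The function name `forward_filter` and its argument layout are assumptions; the parameters would come from a fitted model (for example its `means_`, `transmat_` and `startprob_`, plus variances extracted from the fitted covariances).

```python
import numpy as np


def forward_filter(returns, means, variances, transmat, startprob):
    """Filtered probabilities P(S_t | r_1..r_t) for a scalar Gaussian HMM.

    Only observations up to t enter row t, so the output is free of look-ahead
    given fixed parameters (the parameters themselves must also be estimated
    on past data only).
    """
    returns = np.asarray(returns, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    transmat = np.asarray(transmat, dtype=float)

    # Emission density of every observation under every state: shape (T, K).
    dens = np.exp(-0.5 * (returns[:, None] - means) ** 2 / variances)
    dens /= np.sqrt(2.0 * np.pi * variances)

    filtered = np.empty_like(dens)
    predicted = np.asarray(startprob, dtype=float)  # P(S_1) before any data
    for t in range(returns.shape[0]):
        joint = predicted * dens[t]           # update with observation t
        filtered[t] = joint / joint.sum()     # normalise to a distribution
        predicted = filtered[t] @ transmat    # one-step-ahead state prediction
    return filtered
```

A simple sanity check is that filtering a prefix of the series gives the same rows as filtering the full series; smoothed probabilities do not have that property. Extreme outliers can underflow the densities to zero, so a production version would typically work in log space.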

A.2 MS-GARCH Detector Class

import numpy as np


class MSGARCHDetector:
    """
    Markov-Switching GARCH Regime Detector.

    Implements EM algorithm for estimating MS-GARCH models with:
    - Multiple GARCH specifications (sGARCH, GJR-GARCH, eGARCH)
    - Fat-tailed distributions (Normal, Student-t, Skewed-t)
    - Hamilton filter for real-time regime inference
    - Kim smoother for historical analysis
    """

    def __init__(
        self,
        n_regimes: int = 4,
        garch_type: str = 'gjrGARCH',
        distribution: str = 'sstd',
        max_iter: int = 100,
        n_starts: int = 5
    ):
        self.n_regimes = n_regimes
        self.garch_type = garch_type
        self.distribution = distribution
        self.max_iter = max_iter
        self.n_starts = n_starts

    def fit(self, returns: np.ndarray) -> 'MSGARCHDetector':
        """Fit MS-GARCH using EM with multiple random starts."""
        best_ll = -np.inf
        for start in range(self.n_starts):
            result = self._fit_single_start(returns, start)
            if result['log_likelihood'] > best_ll:
                best_ll = result['log_likelihood']
                self.params_ = result['params']
                self.filtered_probs_ = result['filtered_probs']
        return self

    def get_current_regime(self, use_smoothed: bool = False) -> int:
        """Get most likely current regime (smoothed_probs_ requires a prior Kim smoother pass)."""
        probs = self.smoothed_probs_ if use_smoothed else self.filtered_probs_
        return int(np.argmax(probs[-1]))
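The class above defaults to the GJR-GARCH specification but does not show the variance recursion. As a hedged illustration, the standard GJR-GARCH(1,1) update is sigma2_t = omega + (alpha + gamma * 1[r_{t-1} < 0]) * r_{t-1}^2 + beta * sigma2_{t-1}. The sketch below implements that single-regime recursion; in a Markov-switching model one such recursion with its own parameters would typically be run per regime. The function name and the explicit `sigma2_0` starting value are assumptions, not part of the research code.

```python
import numpy as np


def gjr_garch_variance(returns, omega, alpha, gamma, beta, sigma2_0):
    """Conditional variance path of a GJR-GARCH(1,1) model.

    gamma adds extra weight to squared negative shocks (leverage effect).
    sigma2_0 is the caller-supplied starting variance for the first bar.
    """
    returns = np.asarray(returns, dtype=float)
    sigma2 = np.empty_like(returns)
    sigma2[0] = sigma2_0
    for t in range(1, returns.shape[0]):
        shock = returns[t - 1]
        leverage = gamma if shock < 0 else 0.0  # asymmetric term fires only after losses
        sigma2[t] = omega + (alpha + leverage) * shock ** 2 + beta * sigma2[t - 1]
    return sigma2


path = gjr_garch_variance(
    [0.01, -0.02, 0.01], omega=1e-6, alpha=0.05, gamma=0.10, beta=0.90, sigma2_0=1e-4
)
print(path)
```

In this toy path the negative second return raises the next variance more than a positive return of the same size would, which is the asymmetry the GJR term is meant to capture.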

A.3 Risk Adjustment Integration

class RegimeRiskAdjuster:
    """
    Integrate MS-GARCH regime detection with position sizing.
    """

    def __init__(self, detector: MSGARCHDetector):
        self.detector = detector
        self.multipliers = {
            0: 1.2,   # Low-Vol Bull
            1: 1.0,   # High-Vol Bull
            2: 0.7,   # Low-Vol Bear
            3: 0.5    # High-Vol Bear / Crisis
        }

    def get_risk_multiplier(self) -> float:
        """Get regime-based risk multiplier for position sizing."""
        current_regime = self.detector.get_current_regime()
        return self.multipliers.get(current_regime, 1.0)

    def adjust_position(self, base_position: float) -> float:
        """Apply regime multiplier to base position size."""
        multiplier = self.get_risk_multiplier()
        return base_position * multiplier

Appendix B: Validation Checklist

B.1 Pre-Deployment Validation

  • Signal lag properly implemented (t-1 regime for t position)
  • No look-ahead bias in filtered probabilities
  • Transaction costs included (funding rates, slippage)
  • Walk-forward validation completed (minimum 25 windows)
  • Out-of-sample Sharpe > 1.5 threshold met (pilot/sandbox deployment)
  • Out-of-sample Sharpe > 2.0 threshold met (full production deployment)
  • Maximum drawdown within tolerance (-40%)
  • Regime durations > 3 days average
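The first and third checklist items (signal lag and transaction costs) can be expressed in a few lines of pandas. This is a minimal sketch under stated assumptions: `regime_multiplier` is a per-bar exposure derived from the regime known at each bar's close, `cost_per_turnover` is a hypothetical proportional cost, and the function name is illustrative.

```python
import pandas as pd


def lagged_regime_returns(returns: pd.Series, regime_multiplier: pd.Series,
                          cost_per_turnover: float = 0.0005) -> pd.Series:
    """Net strategy returns using the t-1 regime for the position held at t."""
    # Shift by one bar: the exposure over bar t uses only information from t-1.
    position = regime_multiplier.shift(1).fillna(0.0)
    # Turnover is the absolute change in exposure; the first bar starts flat.
    turnover = position.diff().abs().fillna(position.abs())
    return position * returns - cost_per_turnover * turnover


rets = pd.Series([0.01, 0.02, -0.01])
mult = pd.Series([1.2, 0.7, 1.2])  # e.g. Low-Vol / High-Vol / Low-Vol multipliers
net = lagged_regime_returns(rets, mult)
print(net)
```

Dropping the `shift(1)` would apply each bar's regime to that same bar's return, which is the look-ahead pattern the checklist is meant to catch.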

B.2 Production Monitoring

  • Real-time regime probability tracking
  • Regime transition alerts
  • Performance attribution by regime
  • Circuit breaker integration
  • A/B testing vs baseline system
  • Model degradation detection

B.3 Risk Management Integration

  • Kelly multipliers calibrated
  • Emergency flat trigger tested
  • Correlation with existing risk metrics validated
  • Backtesting on crisis periods (Luna, FTX)
  • Rollback procedure documented

Appendix C: Week 49 Backtest Reports

Interactive backtest reports comparing regime detection impact on trading performance. These reports demonstrate the +4.6% Sharpe improvement and +$283K PnL gain achieved through 4-state regime adaptation.

C.1 Baseline: Regime OFF

Performance without regime-adaptive thresholds (2-state backtest sizing only):

📊 Baseline Backtest (Regime OFF)

Week 49 validation • Sharpe: 0.3336 • PnL: $1,033,377

Baseline performance without 4-state adaptive thresholds • Total Return: 206.68%

C.2 Enhanced: Regime ON

Performance with 4-state regime-adaptive thresholds enabled:

📈 Enhanced Backtest (Regime ON)

Week 49 validation • Sharpe: 0.3491 (+4.6%) • PnL: $1,316,701 (+27.4%)

+$283,324 improvement • BTC: 24.4% loss reduction • ETH: +31.8% profit increase

C.3 Comparison Summary

| Metric | Regime OFF | Regime ON | Δ Change |
|---|---|---|---|
| Total PnL | $1,033,377 | $1,316,701 | +$283,324 (+27.4%) |
| Sharpe Ratio | 0.3336 | 0.3491 | +4.6% |
| Sortino Ratio | 0.6706 | 0.7143 | +6.5% |
| Profit Factor | 1.2985 | 1.3170 | +1.4% |
| Win Rate | 46.67% | 46.77% | +0.1% |