Quant Researcher Lab — HFThot Research Lab

Research Pipeline

From raw market data to a deployed strategy — one Docker stack

1 — Data Ingestion (Polarway)

Yahoo Finance / CCXT feeds → Polars DataFrame → Parquet lake. Point-in-time joins via DuckDB prevent lookahead bias.
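A point-in-time join attaches to each bar only the latest record that had already been published at that bar's timestamp. DuckDB provides an ASOF JOIN for this; the sketch below shows the same idea with only the Python standard library, so the mechanism is visible. The function name `asof_join` and the toy earnings data are illustrative assumptions, not part of Polarway.

```python
from bisect import bisect_right

def asof_join(bar_times, records):
    """For each bar time, return the latest value published at or before it.

    records: list of (publish_time, value) pairs sorted by publish_time.
    Bars that precede every publication get None instead of a future value.
    """
    publish_times = [t for t, _ in records]
    joined = []
    for t in bar_times:
        idx = bisect_right(publish_times, t)  # number of records with publish_time <= t
        joined.append(records[idx - 1][1] if idx > 0 else None)
    return joined

# Toy data (hypothetical): ISO date strings sort chronologically.
bars = ["2024-01-02", "2024-01-15", "2024-02-01", "2024-02-20"]
earnings = [("2024-01-10", 1.10), ("2024-02-05", 1.25)]  # (publish date, EPS)

pit_eps = asof_join(bars, earnings)
print(pit_eps)
```

The first bar gets no value because nothing had been published yet; a plain "nearest date" join would have leaked the 10 January figure into 2 January.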

2 — Regime Detection (HMM · Rust Baum-Welch)

Gaussian HMM with K=3 states fitted to daily log-returns. Rust Baum-Welch converges in <2 ms on 500 observations; the Viterbi state sequence is decoded in <0.5 ms.

3 — Parameter Estimation (MCMC · Rust)

Metropolis-Hastings posterior over regime-conditional means and covariances. 10,000 MCMC samples in <800ms via mcmc_sample Rust binding.
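As a rough illustration of the sampling step, here is a minimal random-walk Metropolis-Hastings sketch for the posterior of a single regime mean. It assumes a known volatility, a flat prior, and synthetic data; it is pure Python and is not the `mcmc_sample` binding, whose signature is not shown in this page.

```python
import math
import random

def log_posterior(mu, data, sigma):
    """Unnormalised log posterior of mu: Gaussian likelihood, flat prior."""
    return -sum((x - mu) ** 2 for x in data) / (2.0 * sigma ** 2)

def metropolis_hastings(data, sigma, n_samples, step, seed=0):
    """Random-walk Metropolis-Hastings; returns (samples, acceptance rate)."""
    rng = random.Random(seed)
    mu = 0.0
    current_lp = log_posterior(mu, data, sigma)
    samples, accepted = [], 0
    for _ in range(n_samples):
        proposal = mu + rng.gauss(0.0, step)       # symmetric proposal
        proposal_lp = log_posterior(proposal, data, sigma)
        # Accept with probability min(1, posterior ratio), compared in log space.
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            mu, current_lp = proposal, proposal_lp
            accepted += 1
        samples.append(mu)
    return samples, accepted / n_samples

# Synthetic daily log-returns for one regime (assumed values).
sigma = 0.01
data_rng = random.Random(42)
data = [data_rng.gauss(0.0005, sigma) for _ in range(100)]

sample_mean = sum(data) / len(data)
post_sd = sigma / math.sqrt(len(data))             # exact posterior sd under a flat prior

samples, accept_rate = metropolis_hastings(data, sigma, n_samples=10_000, step=2.4 * post_sd)
kept = samples[1_000:]                             # discard burn-in
post_mean = sum(kept) / len(kept)
post_std_est = math.sqrt(sum((s - post_mean) ** 2 for s in kept) / len(kept))

print(f"sample mean      {sample_mean:.5f}")
print(f"posterior mean   {post_mean:.5f}")
print(f"posterior sd     {post_std_est:.5f} (exact {post_sd:.5f})")
print(f"acceptance rate  {accept_rate:.2f}")
```

Because the exact posterior is known in this toy case (a Gaussian centred on the sample mean), the output gives a quick sanity check on the sampler before moving to regime-conditional means and covariances.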

4 — Portfolio Optimisation (CARA · Optimiz-R DE)

CARA utility maximisation per regime. For non-convex parameter spaces, Differential Evolution (Optimiz-R) searches for the global optimum without gradient information.

5 — Backtesting (Rust backtest engine)

522-day full backtest with transaction costs (10 bps), slippage model, and rebalancing schedule — all in a single backtest_with_costs_rust call.
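To make the cost accounting concrete, the following is a small pure-Python sketch of a backtest loop that charges proportional transaction costs on turnover at each rebalance. The function `backtest_with_costs` and its parameters are hypothetical and are not the signature of `backtest_with_costs_rust`; the slippage model is omitted.

```python
def backtest_with_costs(returns, target_weights, cost_bps=10.0, rebalance_every=1):
    """Equity curve of a fixed-target portfolio with proportional costs.

    returns: list of per-period lists of simple asset returns.
    target_weights: weights restored at every rebalance (starting from cash).
    cost_bps: cost charged per unit of turnover, in basis points.
    """
    weights = [0.0] * len(target_weights)   # start fully in cash
    equity, curve = 1.0, []
    for t, period in enumerate(returns):
        if t % rebalance_every == 0:
            turnover = sum(abs(tw - w) for tw, w in zip(target_weights, weights))
            equity *= 1.0 - turnover * cost_bps / 1e4
            weights = list(target_weights)
        port_ret = sum(w * r for w, r in zip(weights, period))
        equity *= 1.0 + port_ret
        # Weights drift with realised returns until the next rebalance.
        weights = [w * (1.0 + r) / (1.0 + port_ret) for w, r in zip(weights, period)]
        curve.append(equity)
    return curve

# Two assets, two periods (toy numbers).
toy_returns = [[0.01, -0.02], [0.00, 0.01]]
gross = backtest_with_costs(toy_returns, [0.5, 0.5], cost_bps=0.0)
net = backtest_with_costs(toy_returns, [0.5, 0.5], cost_bps=10.0)
print("gross:", gross)
print("net:  ", net)
```

The gap between the two curves is the cost drag: a full unit of turnover on the first day, then only the small drift correction on the second.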

6 — Reporting (DuckDB + Streamlit)

Results written to DuckDB leaderboard with time-travel support. Streamlit dashboard renders interactive P&L, regime attribution, and weight charts.

Hidden Markov Model — Regime Detection

Identify Bull / Bear / Sideways regimes in real time with Rust Baum-Welch

The model has hidden states $Q = \{q_1,\ldots,q_K\}$ (market regimes) emitting observable log-returns $O = \{r_1,\ldots,r_T\}$ drawn from regime-conditional Gaussians:

Emission probability

$$P(r_t \mid q_t = k) = \mathcal{N}\!\left(r_t;\;\mu_k,\;\sigma_k^2\right)$$

Forward variable (E-step)

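$$\alpha_t(k) = P(r_1,\ldots,r_t,\,q_t=k\mid\lambda)$$

$$\alpha_t(k) = \left[\sum_j \alpha_{t-1}(j)\,a_{jk}\right] b_k(r_t)$$

Here $\lambda$ denotes the model parameters, $a_{jk} = P(q_t = k \mid q_{t-1} = j)$ is the transition probability, and $b_k(r_t)$ is the Gaussian emission probability of state $k$ given above.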
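The forward recursion can be sketched in a few lines of pure Python. This version rescales the forward variable at every step to avoid numerical underflow, which also yields the filtered regime probabilities. The two-regime parameters and the return series are assumed toy values, and `forward_filter` is an illustrative name, not a function of the Rust crate.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x: the emission probability b_k(r_t)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def forward_filter(returns, pi, A, mus, sigmas):
    """Scaled forward pass.

    Returns the log-likelihood log P(r_1..r_T | lambda) and, for each t,
    the filtered probabilities P(q_t = k | r_1..r_t).
    """
    K = len(pi)
    loglik, filtered, prev = 0.0, [], None
    for r in returns:
        if prev is None:
            prior = pi                     # first step uses the initial distribution
        else:
            prior = [sum(prev[j] * A[j][k] for j in range(K)) for k in range(K)]
        alpha = [prior[k] * gaussian_pdf(r, mus[k], sigmas[k]) for k in range(K)]
        scale = sum(alpha)                 # P(r_t | r_1..r_{t-1})
        loglik += math.log(scale)
        prev = [a / scale for a in alpha]  # rescale to avoid underflow
        filtered.append(prev)
    return loglik, filtered

# Assumed toy parameters: state 0 = calm "Bull", state 1 = volatile "Bear".
pi = [0.5, 0.5]
A = [[0.95, 0.05], [0.05, 0.95]]
mus, sigmas = [0.001, -0.002], [0.008, 0.02]
returns = [0.002, 0.001, 0.003, 0.0, 0.002, -0.03, -0.04, -0.035]

loglik, filtered = forward_filter(returns, pi, A, mus, sigmas)
for t, probs in enumerate(filtered):
    print(t, [round(p, 3) for p in probs])
```

The filtered probability of the volatile state jumps as soon as the first large negative return arrives, which is the behaviour a real-time regime signal relies on.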

The Viterbi algorithm recovers the most likely hidden path:

Viterbi recursion

$$\delta_t(k) = \max_{q_{1:t-1}}\,P(q_1,\ldots,q_{t-1},q_t=k,r_{1:t}\mid\lambda)$$

$$\delta_t(k) = \left[\max_j \delta_{t-1}(j)\,a_{jk}\right] b_k(r_t)$$
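A minimal log-space Viterbi decoder, written in pure Python, shows how the most likely path is recovered with back-pointers. The parameters and return series are assumed toy values (state 0 = calm, state 1 = volatile), and this sketch is independent of the Rust `viterbi_decode` implementation.

```python
import math

def log_gaussian(x, mu, sigma):
    """Log density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))

def viterbi(returns, pi, A, mus, sigmas):
    """Most likely state path under a Gaussian HMM, computed in log space."""
    K = len(pi)
    delta = [math.log(pi[k]) + log_gaussian(returns[0], mus[k], sigmas[k]) for k in range(K)]
    backptr = []                                   # backptr[t-1][k] = best predecessor of k at time t
    for r in returns[1:]:
        step_ptr, new_delta = [], []
        for k in range(K):
            best_j = max(range(K), key=lambda j: delta[j] + math.log(A[j][k]))
            step_ptr.append(best_j)
            new_delta.append(delta[best_j] + math.log(A[best_j][k]) + log_gaussian(r, mus[k], sigmas[k]))
        delta = new_delta
        backptr.append(step_ptr)
    # Backtrack from the best final state.
    state = max(range(K), key=lambda k: delta[k])
    path = [state]
    for step_ptr in reversed(backptr):
        state = step_ptr[state]
        path.append(state)
    path.reverse()
    return path

pi = [0.5, 0.5]
A = [[0.95, 0.05], [0.05, 0.95]]
mus, sigmas = [0.001, -0.002], [0.008, 0.02]
returns = [0.002, 0.001, 0.003, 0.0, 0.002, -0.03, -0.04, -0.035]

path = viterbi(returns, pi, A, mus, sigmas)
print(path)  # [0, 0, 0, 0, 0, 1, 1, 1]
```

With these numbers the five calm days are assigned to state 0 and the three large losses to state 1: the likelihood gain on the calm days outweighs the cost of one regime switch.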

Live fit on SPY/QQQ/IWM/EEM/EFA/GLD/TLT/VNQ (Jan 2024 – Mar 2026):

Detected Regime Sequence — 8-ETF Basket

Regime-conditional daily returns (annualised): Bull +29% · Sideways +8% · Bear −12%

CARA Optimisation + Differential Evolution

Global optimizer for non-convex utility landscapes — Rust-native DE from Optimiz-R

The CARA (Constant Absolute Risk Aversion) utility yields closed-form weights for a Gaussian regime:

CARA optimal weights (unconstrained)

$$w^* = \frac{1}{\gamma}\,\Sigma_k^{-1}\,\mu_k$$
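For readers who want the intermediate steps, the closed form can be derived as follows. This is a sketch under standard assumptions not spelled out on this page: utility $U(W) = -e^{-\gamma W}$, portfolio return $W = w^\top r$, and $r \sim \mathcal{N}(\mu_k, \Sigma_k)$ within regime $k$, with returns measured in excess of the risk-free rate.

Since $W$ is Gaussian with mean $w^\top\mu_k$ and variance $w^\top\Sigma_k w$, the expected utility is

$$\mathbb{E}\left[U(W)\right] = -\exp\!\left(-\gamma\,w^\top\mu_k + \tfrac{\gamma^2}{2}\,w^\top\Sigma_k\,w\right)$$

Maximising it is equivalent to maximising the certainty equivalent

$$\mathrm{CE}(w) = w^\top\mu_k - \tfrac{\gamma}{2}\,w^\top\Sigma_k\,w$$

Setting the gradient to zero gives

$$\mu_k - \gamma\,\Sigma_k\,w = 0 \quad\Longrightarrow\quad w^* = \frac{1}{\gamma}\,\Sigma_k^{-1}\,\mu_k$$

Doubling $\gamma$ halves every position, which is why a higher risk aversion in the Bear regime scales the whole portfolio down without changing its composition.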

For constrained problems — long-only, leverage cap, or non-Gaussian regimes — HFThot uses Differential Evolution from the optimiz-r crate. The DE mutation operator:

DE mutation

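$$\mathbf{v}_i = \mathbf{x}_{r_1} + F\,(\mathbf{x}_{r_2} - \mathbf{x}_{r_3}), \quad F\in[0,2]$$

Here $r_1, r_2, r_3$ are distinct population indices, all different from $i$, and $F$ is the differential weight.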

DE crossover + selection

$$u_{i,j} = \begin{cases} v_{i,j} & \text{if } U[0,1] < CR \\ x_{i,j} & \text{otherwise} \end{cases}$$

$$\mathbf{x}_i^{t+1} = \begin{cases} \mathbf{u}_i & \text{if } f(\mathbf{u}_i) \leq f(\mathbf{x}_i^t) \\ \mathbf{x}_i^t & \text{otherwise} \end{cases}$$
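The mutation, crossover, and selection steps fit in a short pure-Python loop. The sketch below is a generic DE/rand/1/bin implementation on a toy quadratic objective, not the optimiz-r API. Two details go beyond the formulas above and are assumptions of this sketch: mutants are clipped to the bounds, and one randomly chosen coordinate (`j_rand`) is always taken from the mutant, as in Storn and Price's original scheme.

```python
import random

def differential_evolution(f, bounds, pop_size=30, F=0.8, CR=0.9, max_iter=300, seed=0):
    """Minimise f over a box with DE/rand/1/bin; returns (best_x, best_f)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fitness = [f(x) for x in pop]
    for _ in range(max_iter):
        new_pop, new_fit = list(pop), list(fitness)
        for i in range(pop_size):
            # Mutation: three distinct donors, none equal to i.
            r1, r2, r3 = rng.sample([j for j in range(pop_size) if j != i], 3)
            mutant = [pop[r1][d] + F * (pop[r2][d] - pop[r3][d]) for d in range(dim)]
            mutant = [min(max(v, lo), hi) for v, (lo, hi) in zip(mutant, bounds)]
            # Binomial crossover; j_rand guarantees at least one mutant coordinate.
            j_rand = rng.randrange(dim)
            trial = [mutant[d] if (rng.random() < CR or d == j_rand) else pop[i][d]
                     for d in range(dim)]
            # Greedy selection: keep the trial only if it is no worse.
            f_trial = f(trial)
            if f_trial <= fitness[i]:
                new_pop[i], new_fit[i] = trial, f_trial
        pop, fitness = new_pop, new_fit
    best = min(range(pop_size), key=lambda i: fitness[i])
    return pop[best], fitness[best]

def shifted_sphere(x):
    """Toy objective with its minimum at (1, 1, 1)."""
    return sum((v - 1.0) ** 2 for v in x)

best_x, best_f = differential_evolution(shifted_sphere, bounds=[(-5.0, 5.0)] * 3)
print(best_x, best_f)
```

For a portfolio problem, `f` would be the negative utility or negative Sharpe ratio and `bounds` the per-asset weight limits; the budget constraint then needs a penalty term or a normalisation step, since the loop above only enforces box bounds.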

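# advanced_portfolio_optimization.ipynb: HMM + CARA
# Assumes `returns` is a pandas DataFrame of daily log-returns (one column per ETF)
# and `regime_names` maps each state index k to "Bull", "Sideways" or "Bear".
import numpy as np
import hft_lab_core as hft
from optimizr import DifferentialEvolution

RISK_AVERSION = {"Bear": 5.0, "Sideways": 2.0, "Bull": 1.0}

def sharpe(mu, cov, w):
    """Sharpe ratio of weights w (assumed helper; risk-free rate taken as zero)."""
    w = np.asarray(w)
    return (w @ mu) / np.sqrt(w @ cov @ w)

# ── 1. Fit HMM (Rust Baum-Welch, 3 regimes) ──
hmm_params = hft.fit_hmm(returns.values, n_states=3)
state_seq  = np.asarray(hft.viterbi_decode(returns.values, hmm_params))

# ── 2. Regime-conditional μ / Σ ──────────────
regime_portfolios = {}
for k in range(3):
    mask  = (state_seq == k)
    mu_k  = returns[mask].mean().values * 252   # annualised
    cov_k = returns[mask].cov().values * 252

    # ── CARA closed-form (unconstrained benchmark) ──
    gamma = RISK_AVERSION[regime_names[k]]
    w_cara = hft.cara_optimal_weights_rust(mu_k, cov_k, gamma=gamma)

    # ── DE for constrained optimisation ───────
    de = DifferentialEvolution(pop_size=60, F=0.8, CR=0.9, max_iter=200)
    w_constrained = de.optimize(
        objective=lambda w: -sharpe(mu_k, cov_k, w),
        bounds=[(-0.3, 1.0)] * len(mu_k),       # ASCII minus; "−0.3" is a syntax error
        # Exact float equality almost never holds, so test with a tolerance.
        constraint=lambda w: abs(np.sum(w) - 1.0) < 1e-6,
    )
    regime_portfolios[regime_names[k]] = w_constrained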

Strategy Leaderboard

522-day backtest · 8 ETFs · 10 bps transaction cost · Rust engine

Multi-Strategy CARA SR: 3.34
Momentum SR: 2.54
Adaptive HMM SR: 1.40
Adaptive Max DD: −4.7%
Rust backtest (522d): <2ms

Leaderboard — Cumulative Returns by Strategy (Jan 2024 – Mar 2026)

⚡ Rust Performance (vs pure Python)

HMM Baum-Welch: 150×
CARA weight solve: 200×
Full backtest loop: 80×

Research Stack

Every component is open-source or in the HFThot toolkit

🦀

hfthot-lab-core

Rust crate — HMM, CARA, backtest, MCMC, cointegration, signature features

⚙️

optimiz-r

Rust Differential Evolution with PyO3 bindings — global portfolio optimisation

🐻‍❄️

Polarway

Polars-native ETL pipeline — Parquet lake, DuckDB queries, no Pandas overhead

📓

Jupyter Notebooks

Reproducible research — kernel: rhftlab (Python 3.11 + all deps)

🗄️

DuckDB Lakehouse

Strategy leaderboard with time-travel — no lookahead in point-in-time analytics

🐳

Docker Stack

Single docker-compose up — Jupyter + Streamlit + DuckDB + Rust ready


ArXiv → Live Backtest in Minutes
