System Architecture

High-performance quantitative trading infrastructure powered by Rust & DuckDB

Performance Metrics

~15ms LRU Cache Hit
~30ms Parquet Read
~45ms DuckDB Query
85% Cache Hit Rate
-90% Cost vs TSDB
>1M Rows/sec

Benchmarked on 100GB+ financial time-series data

🏗️ System Overview

Client Layer
Streamlit UI · Interactive dashboards
Jupyter Notebooks · Research & analysis
Python CLI · Scripting & automation
REST API · External clients
gRPC / HTTP/2
Service Layer (Rust + gRPC)
HFT Engine · port 50052
Data Loader · port 50053
Auth Service · JWT / RBAC
Data Engine (Polarway + DuckDB)
3-Tier Hybrid Storage
Tier 1: LRU Cache (Hot)
~15ms
Tier 2: Parquet (Warm)
~30ms
Tier 3: DuckDB (Analytics)
~45ms
  • Lazy evaluation (Polars-powered)
  • Query optimizer with predicate pushdown
  • Streaming joins (100M+ rows)
  • Arrow-native zero-copy transfers
  • Full SQL support via DuckDB
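The read path behind these tiers can be sketched in a few lines; `load_parquet` and `query_duckdb` are hypothetical callables standing in for the real Rust engine:

```python
from collections import OrderedDict

class TieredReader:
    """Illustrative 3-tier read path: LRU cache -> Parquet -> DuckDB."""

    def __init__(self, capacity=1024):
        self.cache = OrderedDict()   # Tier 1: hot, in-memory LRU
        self.capacity = capacity

    def get(self, key, load_parquet, query_duckdb):
        # Tier 1: cache hit (~15ms in the real system)
        if key in self.cache:
            self.cache.move_to_end(key)
            return self.cache[key]
        # Tier 2: warm Parquet read (~30ms); fall back to DuckDB (~45ms)
        value = load_parquet(key)
        if value is None:
            value = query_duckdb(key)
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least-recently-used
        return value
```

A repeated `get` for the same key never touches Tier 2 again until the entry is evicted, which is where the 85% hit rate pays off.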
Compute Engine (PyO3 + Rust)
Regime Detection · HMM
Chiarella · Agent Model
Mean Reversion · CARA + Sparse

🔧 Core Components

Built on: 🏔️ Polarway Lakehouse · ⚙️ optimiz-rs (Rust + PyO3) · 🔥 HFThot Core · Open Source
🏔️

Polarway

Hybrid Storage Engine & Lakehouse
  • ✓ 3-tier architecture (Cache/Parquet/DuckDB)
  • ✓ Railway-oriented error handling
  • ✓ Lazy evaluation via Polars
  • ✓ Delta Lake + ACID transactions
  • ✓ 85% cache hit rate · <30ms
Docs GitHub
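Railway-oriented error handling, named in the list above, keeps failures on a separate track so each step only runs on success. A minimal Python sketch of the pattern (Polarway's actual implementation lives in Rust, where `Result` and `?` provide this natively):

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar, Union

T = TypeVar("T")

@dataclass
class Ok(Generic[T]):
    value: T

@dataclass
class Err:
    error: str

Result = Union[Ok[T], Err]

def and_then(result: Result, fn: Callable) -> Result:
    """Run fn only on the success track; pass errors through untouched."""
    return fn(result.value) if isinstance(result, Ok) else result

def parse_symbol(raw: str) -> Result:
    return Ok(raw.strip().upper()) if raw.strip() else Err("empty symbol")

def validate(sym: str) -> Result:
    return Ok(sym) if sym.isalpha() else Err(f"bad symbol: {sym}")

# Success and failure both flow through the same pipeline:
ok = and_then(parse_symbol("  btc "), validate)   # stays on the Ok track
bad = and_then(parse_symbol("b7c"), validate)     # switches to the Err track
```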
⚙️

optimiz-rs

Rust Optimization Library · PyO3
  • ✓ Differential Evolution / SHADE solver
  • ✓ Mean Field Game HJB-FP equations
  • ✓ PyO3 Python bindings (zero-copy)
  • ✓ WASM compilation for browser
  • ✓ 10× faster than scipy.optimize
GitHub API Ref
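The Differential Evolution scheme named above (DE/rand/1 with binomial crossover) is compact enough to sketch in pure Python. This is an illustrative reimplementation minimizing the sphere function, not optimiz-rs itself, and it omits SHADE's adaptive F/CR machinery:

```python
import random

def differential_evolution(f, bounds, pop_size=30, F=0.8, CR=0.9, gens=200, seed=0):
    """Minimal DE/rand/1/bin: scaled difference-vector mutation + binomial crossover."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [f(x) for x in pop]
    for _ in range(gens):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            jrand = rng.randrange(dim)  # guarantee at least one mutated component
            trial = [
                pop[a][d] + F * (pop[b][d] - pop[c][d])
                if (rng.random() < CR or d == jrand) else pop[i][d]
                for d in range(dim)
            ]
            trial = [min(max(v, lo), hi) for v, (lo, hi) in zip(trial, bounds)]
            ft = f(trial)
            if ft <= fit[i]:          # greedy one-to-one selection
                pop[i], fit[i] = trial, ft
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]

sphere = lambda x: sum(v * v for v in x)
x, fx = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```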

DuckDB

Analytics Database

  • Embedded columnar database
  • Full SQL support
  • Direct Parquet queries
  • Low-latency analytics (~45ms)
  • Zero operational overhead

Streamlit

Interactive Dashboards

  • Real-time visualizations
  • Python-native components
  • No JavaScript required
  • Rapid prototyping
  • WebSocket updates

🔄 Data Flow Pipeline

Exchange APIs · Binance · Coinbase · Kraken
Market Data · yfinance · pandas-ta · ccxt
WebSocket Stream · Real-time price feeds
Python REST API · Historical data
gRPC Service · port 50053
LRU Cache · Hot · ~15ms
Parquet · Warm · ~30ms
DuckDB · Analytics · ~45ms
  • Arrow batches for zero-copy transfers
  • Polars lazy evaluation (query optimization)
  • Automatic tier routing based on data age
  • Streaming aggregations for large datasets

🏗️ HFThot at Scale

End-to-end data flow — from market sources through stream processing to client delivery

HFThot Research
HFThot Technology Stack All components below are orchestrated inside the HFThot platform
Market data sources
Binance · Spot, futures and order-book feeds
Alpaca · Broker APIs and execution endpoints
Finnhub · Market signals, news and reference data
Polymarket · Prediction-market events and microstructure
REST · WebSocket · Streaming
Core ingestion and lakehouse engine
Polarway

Central ingestion and hybrid storage layer for HFThot: it normalizes feeds, keeps hot market state in memory and exposes analytics-ready data through Arrow, Polars and Delta Lake semantics.

Arrow IPC · Polars lazy · Delta Lake · Zero-copy
Storage · Compute
DuckDB

Columnar analytics and time-series querying layer fed by Polarway for fast local research, snapshots and lakehouse-style SQL exploration.

OLAP · SQL · Timeseries
Optimiz-rs

Rust optimization engine connected to Polarway for portfolio optimization, stochastic control and mean-field computations through low-overhead data exchange.

Rust · PyO3 · Optimization

🗄️ Polarway Lakehouse

Next-generation authentication and data-management infrastructure, built on Delta Lake for ACID compliance, time travel, and full traceability.

Time-Travel

Access any past version of your user, session, or API-key data. Every modification is versioned and recoverable.

  • read_version(table, version_id)
  • ✓ Complete automatic audit trail
  • ✓ Instant rollback on error
  • ✓ Regulatory compliance (FINMA, MiFID II)
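The time-travel semantics above amount to every commit producing a new, immutable, addressable snapshot. A toy in-memory illustration of the `read_version` idea (Delta Lake implements this with a `_delta_log/` commit log, not like this):

```python
class VersionedTable:
    """Toy time-travel table: every write appends an immutable snapshot."""

    def __init__(self):
        self._versions = []   # list of dict snapshots; index = version id

    def write(self, rows: dict) -> int:
        base = self._versions[-1] if self._versions else {}
        self._versions.append({**base, **rows})
        return len(self._versions) - 1   # new version id

    def read_version(self, version: int) -> dict:
        return dict(self._versions[version])   # any past version stays readable

t = VersionedTable()
v0 = t.write({"alice": "reader"})
v1 = t.write({"alice": "admin"})   # overwrite; the old value is still recoverable
t.read_version(v0)                 # instant rollback / audit trail
```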

GDPR Compliant

Full GDPR compliance, with export, deletion, and portability of user data guaranteed by Delta Lake.

  • ✓ JSON/CSV export of all your data
  • ✓ Permanent deletion with tombstones
  • ✓ Right to be forgotten (VACUUM)
  • ✓ Access traceability (audit_log/)
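The export and tombstone mechanics above can be sketched with plain dictionaries; `export_user` and `delete_user` are illustrative helpers, not the Lakehouse API:

```python
import json

def export_user(table: dict, user_id: str) -> str:
    """GDPR data portability: all rows for one user, serialized as JSON."""
    return json.dumps({user_id: table.get(user_id)}, indent=2)

def delete_user(table: dict, user_id: str) -> None:
    """GDPR erasure: replace the record with a tombstone; a later
    compaction pass (Delta Lake's VACUUM) physically drops the old data."""
    table[user_id] = {"__tombstone__": True}

users = {"u1": {"email": "a@example.com"}}
payload = export_user(users, "u1")   # hand this to the user, then...
delete_user(users, "u1")             # ...honor the right to be forgotten
```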

Security & Integrity

ACID-compliant architecture guaranteeing data integrity even in the event of a crash or network failure.

  • ✓ Atomic transactions (Delta Lake)
  • ✓ Argon2 password hashing
  • ✓ JWT with configurable expiration
  • ✓ PyArrow zero-copy for performance

🐍 LakehouseAuthBackend (Python)

  • register(username, email, password)
  • login(username, password) → JWT token
  • verify_persistent_token(token) → User
  • save_api_keys(user_id, provider_keys, consent)
  • get_api_keys(user_id) → dict[provider: queries_remaining]
  • update_query_count(user_id, provider, increment)
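The method list above can be exercised end to end with a toy stand-in; everything here (in-memory storage, salted SHA-256, HMAC-signed token) is a simplification of the real backend, which persists to Delta Lake, hashes with Argon2, and issues real JWTs:

```python
import base64
import hashlib
import hmac
import json
import os
import time

class AuthBackend:
    """In-memory stand-in for LakehouseAuthBackend (illustrative only)."""

    def __init__(self, secret: bytes = b"dev-secret"):
        self._secret, self._users = secret, {}

    def register(self, username: str, email: str, password: str) -> None:
        salt = os.urandom(16)
        digest = hashlib.sha256(salt + password.encode()).digest()  # simplified
        self._users[username] = {"email": email, "salt": salt, "hash": digest}

    def login(self, username: str, password: str) -> str:
        u = self._users[username]
        digest = hashlib.sha256(u["salt"] + password.encode()).digest()
        if not hmac.compare_digest(digest, u["hash"]):
            raise PermissionError("bad credentials")
        payload = base64.urlsafe_b64encode(
            json.dumps({"sub": username, "iat": int(time.time())}).encode()
        ).decode()
        sig = hmac.new(self._secret, payload.encode(), hashlib.sha256).hexdigest()
        return f"{payload}.{sig}"  # JWT-like: signed, self-describing token

    def verify_persistent_token(self, token: str) -> str:
        payload, sig = token.rsplit(".", 1)
        expected = hmac.new(self._secret, payload.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(sig, expected):
            raise PermissionError("invalid token")
        return json.loads(base64.urlsafe_b64decode(payload))["sub"]
```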

⚙️ LakehouseClient (Rust bindings via PyO3)

  • Zero-copy Arrow transfers
  • PyArrow compute for filtering
  • Lazy evaluation (scan → filter → collect)

📁 Delta Lake Tables (Parquet + _delta_log/)

📁 users/ · User records (Snappy Parquet) + full version history
📁 sessions/ · JWT tokens (Snappy Parquet) · 7-day TTL
📁 audit_log/ · Login/register events · partitioned by date · 90-day retention
📁 api_keys/ ⚡ · Provider keys + usage · full history
  • ACID transactions (write_deltalake atomic commits)
  • Time-travel queries (restore any version)
  • Snappy compression (10:1 ratio)
  • S3/GCS compatible (future cloud deployment)
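The atomic-commit bullet rests on write-then-rename: a commit file becomes visible in one filesystem operation or not at all. A minimal sketch (Delta Lake's actual protocol writes numbered JSON action files into `_delta_log/`):

```python
import json
import os
import tempfile

def atomic_commit(log_dir: str, version: int, actions: list) -> str:
    """Write a commit file atomically: temp file + os.replace (all-or-nothing)."""
    os.makedirs(log_dir, exist_ok=True)
    final = os.path.join(log_dir, f"{version:020d}.json")
    fd, tmp = tempfile.mkstemp(dir=log_dir)
    with os.fdopen(fd, "w") as f:
        json.dump(actions, f)
    os.replace(tmp, final)  # atomic on POSIX: readers see old state or new, never half
    return final

log = os.path.join(tempfile.mkdtemp(), "_delta_log")
path = atomic_commit(log, 0, [{"add": "part-0001.parquet"}])
```

A crash before `os.replace` leaves only an invisible temp file, so a reader never observes a partially written commit.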
95 MB · Storage
CHF 0.00 · Cost
4 Tables · users · sessions · audit · keys
100% · Query transparency

📊 Transparence API

Real-time tracking of your queries_done and queries_remaining for each provider (Finnhub, Alpha Vantage), with automatic alerts before you hit a limit.

🔄 Data Sharing

Community program: share your (anonymized) market data and get FREE access to premium APIs. 100% GDPR compliant, with instant opt-out.

⏱️ Audit Logs

Every user action is logged in audit_log/ with a timestamp, IP address, and action type. MiFID II/FINMA compliance guaranteed.

📚 Documentation Technique

The Polarway Lakehouse module is documented on ReadTheDocs, with usage examples, an API reference, and best practices.

📖 Polarway Lakehouse Documentation 💳 Pricing & API Key Plans
Usage example:
from python.lakehouse.client import LakehouseClient

# Initialize client with Delta Lake storage
client = LakehouseClient("/app/data/lakehouse")

# API Key Management
keys = client.get_api_keys(user_id="...")
print(f"Finnhub: {keys['provider_keys']['finnhub']['queries_remaining']}/3600")

# Time-Travel: Access historical data
users_v1 = client.read_version("users", version=1)

# Usage/billing summary over a date range
audit_trail = client.billing_summary(user_id, "2026-01-01", "2026-12-31")

💡 Architecture Benefits

Performance

  • Zero-copy data transfers via Arrow
  • SIMD-accelerated operations
  • Parallel execution (multi-core)
  • Lazy evaluation & query optimization
  • 85% cache hit rate reduces I/O

Cost Efficiency

  • 90% cost reduction vs traditional TSDB
  • No database licensing fees
  • Embedded DuckDB (no separate server)
  • Parquet compression (10:1 ratio)
  • Single server deployment

Reliability

  • Rust memory safety (no segfaults, no data races)
  • Railway-oriented error handling
  • Type-safe gRPC contracts
  • Immutable Parquet files
  • Transactional DuckDB writes

📚 Resources

Documentation

Complete reference docs, API guide, and lakehouse technical specs

Polarway Docs (ReadTheDocs) → Developer API Reference → Lakehouse Technical Spec →

Benchmarks

Performance comparisons and metrics

View Benchmarks →

Source Code

Open-source repositories on GitHub

hfthot-lab-core → polarway → optimiz-rs →