Architectural pillars
The system is organized around four core principles:
- Ultra-realistic market simulation with microstructure awareness
- ML-ready interfaces for seamless integration with Python ML frameworks
- Built-in statistical robustness checks and validation
- Performance-first design using Rust for the critical path
Core components
GlowBack’s architecture consists of three primary layers: engine, API, and storage.
Engine layer (Rust)
The heart of GlowBack is written in Rust for maximum performance and memory safety:
- gb-engine: Event-driven backtesting engine with realistic execution simulation
- gb-types: Core domain types (orders, positions, portfolios, strategies)
- gb-data: Data ingestion, storage (Parquet/Arrow), and caching
API layer (Python)
A FastAPI-based REST API provides async operations and OpenAPI documentation:
- gb-python: PyO3 Python bindings with zero-copy Arrow interop
- Strategy development SDK (qb.Api) with event-driven API
- JupyterLab integration for research workflows
Storage layer
Columnar storage optimized for time-series data:
- Parquet files for compressed columnar storage
- Apache Arrow for in-memory zero-copy operations
- DuckDB for local analytics and metadata
- Redis Cluster for hot data caching (millisecond latency)
System architecture diagram
Technology stack
| Layer | Technology | Purpose |
|---|---|---|
| Core Engine | Rust (Arrow + Parquet) | Speed, memory safety |
| API Layer | FastAPI (Python) | Async, OpenAPI docs |
| Worker Orchestration | Ray on Kubernetes | Horizontal scalability |
| Metadata DB | DuckDB / PostgreSQL | Local analytics / shared environments |
| Object Storage | Parquet files | Durable, columnar compression |
| Cache | Redis Cluster | Sub-millisecond read path |
| Frontend | Streamlit (PoC) / React (production) | Local validation / cloud dashboard |
| Python Tooling | uv | Fast, deterministic dependency management |
Data flow
The typical backtest execution follows this sequence:
Crate organization
The source code is organized into focused Rust crates:
Design principles
Event-driven simulation
GlowBack uses a chronological event-driven architecture rather than vectorized backtesting. This ensures:
- Realistic order timing and execution
- No look-ahead bias
- Accurate simulation of market microstructure
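To make the no-look-ahead property concrete, here is a minimal sketch of a chronological event loop in Python. It is not GlowBack's engine (which is written in Rust); the names `Bar`, `run_backtest`, and `buy_on_up_bar` are hypothetical, and the fill rule (an order decided on one bar executes at the next bar's open) is an assumption chosen for illustration.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Bar:
    ts_ns: int                            # UTC timestamp in nanoseconds (sort key)
    symbol: str = field(compare=False)
    open: float = field(compare=False)
    close: float = field(compare=False)


def run_backtest(bars, strategy):
    """Replay bars in timestamp order; fill each order at the NEXT bar's open."""
    queue = list(bars)
    heapq.heapify(queue)                  # min-heap ordered by ts_ns
    pending = 0                           # quantity decided on the previous bar
    position = 0
    fills = []
    while queue:
        bar = heapq.heappop(queue)
        # 1. Execute the order decided earlier, using a price the strategy
        #    could not have seen when it made the decision.
        if pending:
            position += pending
            fills.append((bar.ts_ns, pending, bar.open))
            pending = 0
        # 2. Only now reveal the bar to the strategy.
        pending = strategy(bar, position)
    return fills                          # an order pending at the end stays unfilled


def buy_on_up_bar(bar, position):
    """Hypothetical strategy: buy one unit after an up bar while flat."""
    return 1 if position == 0 and bar.close > bar.open else 0


# Bars arrive out of order; the heap restores chronological order.
bars = [
    Bar(3, "X", 102.0, 101.0),
    Bar(1, "X", 100.0, 101.0),
    Bar(2, "X", 101.5, 103.0),
]
fills = run_backtest(bars, buy_on_up_bar)
print(fills)
```

The strategy sees the up bar at timestamp 1 but is filled at the open of the bar at timestamp 2, not at the close it reacted to. A vectorized backtest that applies signals to the same row they were computed from would silently skip that delay.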
Zero-copy data sharing
Apache Arrow enables zero-copy data sharing between:
- Rust engine and Python strategies
- Parquet files and in-memory processing
- CPU and GPU (for ML model inference)
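The idea behind zero-copy sharing can be illustrated without Arrow at all. The sketch below uses only the Python standard library (`array` and `memoryview`) as a stand-in: it shows the difference between a view onto an existing buffer and a copy of it, which is the same distinction Arrow exploits across the Rust/Python boundary. It makes no claim about how GlowBack's bindings are implemented.

```python
from array import array

# One contiguous buffer of float64 prices.
prices = array("d", [100.0, 101.5, 99.75, 102.25])

view = memoryview(prices)   # no copy: the view points at the same memory
window = view[1:3]          # slicing a memoryview is also copy-free

copied = prices[1:3]        # slicing the array itself allocates a new buffer

prices[1] = 111.0           # mutate the original buffer in place

shared_value = window[0]    # the view observes the mutation
copied_value = copied[0]    # the copy still holds the old value

print(shared_value, copied_value)
```

A consumer holding `window` pays nothing for the hand-off regardless of buffer size, whereas `copied` costs memory and time proportional to the slice length.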
Deterministic execution
All backtests are fully reproducible:
- Reproducible random seeds
- Results hash stored in metadata
- UTC nanosecond timestamps (no timezone ambiguity)
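The three ingredients above can be sketched together in a few lines of Python. This is an illustrative sketch, not GlowBack's implementation: `utc_ns`, `run_simulation`, and `results_hash` are hypothetical names, the random walk is a placeholder for a real backtest, and SHA-256 over canonical JSON is one assumed way to produce a results hash.

```python
import hashlib
import json
import random
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def utc_ns(dt):
    """Convert a timezone-aware datetime to integer nanoseconds since the Unix epoch."""
    if dt.tzinfo is None:
        raise ValueError("naive datetimes are ambiguous; attach a timezone first")
    delta = dt - EPOCH  # aware datetimes subtract correctly across timezones
    return (delta.days * 86_400 + delta.seconds) * 1_000_000_000 + delta.microseconds * 1_000


def run_simulation(seed, start_ns, steps=100):
    """Placeholder backtest: a seeded random walk in integer price ticks."""
    rng = random.Random(seed)  # private generator, no hidden global state
    price_ticks = 10_000
    path = []
    for i in range(steps):
        price_ticks += rng.choice((-1, 0, 1))
        path.append([start_ns + i * 1_000_000_000, price_ticks])
    return {"seed": seed, "path": path}


def results_hash(result):
    """Hash a canonical JSON encoding so equal results give equal digests."""
    payload = json.dumps(result, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


start = utc_ns(datetime(2024, 1, 2, tzinfo=timezone.utc))
first = results_hash(run_simulation(42, start))
second = results_hash(run_simulation(42, start))
print(first == second)
```

Integer ticks and integer nanoseconds keep floating-point formatting out of the hashed payload, so the digest depends only on the seed and the inputs.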
Scalability
Horizontal scaling
- Stateless API pods for REST/WebSocket endpoints
- Ray workers for distributed parameter optimization
- KEDA autoscaling based on queue depth
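Queue-depth autoscaling boils down to a simple rule: divide the backlog by the number of jobs each worker should handle, round up, and clamp to configured bounds. The sketch below expresses that rule in Python; it roughly mirrors the ceiling-of-ratio calculation that the Kubernetes Horizontal Pod Autoscaler (which KEDA drives) applies to average-value targets. The function name, parameters, and default bounds are hypothetical and are not taken from GlowBack's configuration.

```python
import math


def desired_workers(queue_depth, jobs_per_worker, min_workers=0, max_workers=50):
    """Return the worker count needed to drain the queue at the target load."""
    if jobs_per_worker <= 0:
        raise ValueError("jobs_per_worker must be positive")
    if queue_depth < 0:
        raise ValueError("queue_depth cannot be negative")
    raw = math.ceil(queue_depth / jobs_per_worker)   # round up: a partial batch still needs a worker
    return max(min_workers, min(max_workers, raw))   # clamp to the configured bounds


for depth in (0, 1, 95, 10_000):
    print(depth, desired_workers(depth, jobs_per_worker=10))
```

With a lower bound of zero, an empty queue scales the workers away entirely, which suits bursty parameter-optimization jobs.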
Vertical optimization
- SIMD operations via Arrow columnar processing
- Rayon for multi-threaded single-run speed
- Memory-mapped Parquet files for large datasets
Storage footprint
- ≤ 1 TB for 10 years of tick data across 1,000 symbols
- ZSTD compression for Parquet files
- Delta Lake / Iceberg for time-travel and versioning
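A quick back-of-the-envelope calculation shows what the 1 TB target implies per symbol and per day. The sketch assumes decimal terabytes and 252 trading days per year; neither figure comes from the source. The 8 bytes per compressed tick used in the example is purely hypothetical, since real compression ratios depend on the schema and the data.

```python
BUDGET_BYTES = 1_000_000_000_000   # 1 TB, decimal (assumption)
SYMBOLS = 1_000
YEARS = 10
TRADING_DAYS_PER_YEAR = 252        # assumption: typical equity-market calendar

symbol_days = SYMBOLS * YEARS * TRADING_DAYS_PER_YEAR
bytes_per_symbol_day = BUDGET_BYTES // symbol_days


def max_ticks_per_symbol_day(compressed_bytes_per_tick):
    """Ticks per symbol per day that fit the budget at a given compressed size."""
    return bytes_per_symbol_day // compressed_bytes_per_tick


print(symbol_days)                   # 2,520,000 symbol-days
print(bytes_per_symbol_day)          # about 397 KB per symbol per day
print(max_ticks_per_symbol_day(8))   # hypothetical 8 bytes per compressed tick
```

Under these assumptions the budget allows roughly 400 KB per symbol per trading day, so the target is tight for very liquid symbols and relies on ZSTD compression doing substantial work.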
Deployment models
Local development
Run entirely on your laptop:
Cloud deployment
Production deployment uses:
- Kubernetes (AKS) for container orchestration
- Azure Blob Storage for data lake
- PostgreSQL for shared metadata
- GitHub Actions for CI/CD
- Argo CD for GitOps deployment
Next steps
- Event-driven simulation: Learn how the backtesting engine processes events chronologically
- Market data: Understand data ingestion, storage, and caching
- Portfolio management: Explore position tracking and P&L calculation
- Getting started: Run your first backtest