## Prerequisites

Before installing OminiX-MLX, ensure your system meets these requirements:

- **Operating system:** macOS 14.0 (Sonoma) or newer
- **Hardware:** Apple Silicon (M1, M2, M3, or M4 chip)
- **Rust:** Rust 1.82.0 or newer
- **Development tools:** Xcode Command Line Tools
OminiX-MLX requires Apple Silicon to take advantage of Metal GPU acceleration and unified memory. Intel Macs are not supported.
## Step 1: Install Xcode Command Line Tools
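A typical installation, using Apple's standard installer command (a sketch; skip it if the tools are already present):

```shell
# Install the Xcode Command Line Tools (a GUI prompt appears;
# this is a no-op if they are already installed)
xcode-select --install

# Confirm the tools are found and print their path
xcode-select -p
```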
The Command Line Tools provide essential build utilities and frameworks.

## Step 2: Install Rust
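A typical rustup installation, using the standard installer from rustup.rs:

```shell
# Install rustup, the Rust toolchain manager
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Verify the toolchain meets the 1.82.0 minimum
rustc --version
```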
If you don’t have Rust installed, use rustup.

## Step 3: Install HuggingFace CLI (optional)
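One common way to install it, assuming a working Python environment (the exact package extras may vary across huggingface_hub versions):

```shell
# Install the Hugging Face CLI via pip
pip install -U "huggingface_hub[cli]"

# Log in (only needed for gated models)
huggingface-cli login
```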
The HuggingFace CLI makes it easy to download pre-trained models. While optional, it’s highly recommended for downloading models; alternatively, you can download models manually from huggingface.co.
## Step 4: Clone the repository
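A sketch of the clone step. The repository URL below is a placeholder (substitute the actual one); `--recursive` pulls in the mlx-c submodule that the build needs (see Troubleshooting):

```shell
# Clone with submodules; <org> is a placeholder, not the real organization name
git clone --recursive https://github.com/<org>/OminiX-MLX.git
cd OminiX-MLX
```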
Clone the OminiX-MLX repository.

## Step 5: Build OminiX-MLX
You can build all crates, or just the specific ones you need:

- Build all crates
- Build a specific crate
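With standard Cargo commands, the two options look like this (qwen3-mlx is used as an example crate name from the tables below):

```shell
# Build every crate in the workspace in release mode
cargo build --release

# Or build a single crate with Cargo's -p (package) flag
cargo build --release -p qwen3-mlx
```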
## Step 6: Verify installation
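For example, list the release artifacts produced by the build (binary names vary by crate, so this is just a spot check):

```shell
# Release binaries and libraries land in target/release/
ls target/release/
```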
Verify your installation by checking the built binaries.

## Available crates
OminiX-MLX is organized into multiple crates. Here are the main ones:

### Core libraries

| Crate | Description |
|---|---|
| mlx-rs | Safe Rust bindings to Apple’s MLX framework |
| mlx-rs-core | Shared inference infrastructure (KV cache, RoPE, attention) |
### Language models

| Crate | Models | Sizes |
|---|---|---|
| qwen3-mlx | Qwen2, Qwen3, Qwen3-MoE | 0.5B - 235B |
| glm4-mlx | GLM-4 | 9B |
| glm4-moe-mlx | GLM-4-MoE | 9B (45 experts) |
| mixtral-mlx | Mixtral | 8x7B, 8x22B |
| mistral-mlx | Mistral | 7B |
| minicpm-sala-mlx | MiniCPM-SALA | 9B (1M context) |
### Vision-language models

| Crate | Models | Features |
|---|---|---|
| moxin-vlm-mlx | Moxin-7B VLM | DINOv2 + SigLIP + Mistral-7B |
### Speech recognition

| Crate | Models | Languages |
|---|---|---|
| qwen3-asr-mlx | Qwen3-ASR | 30+ languages |
| funasr-mlx | Paraformer | Chinese, English |
| funasr-nano-mlx | FunASR-Nano | Chinese, English |
### Text-to-speech

| Crate | Models | Features |
|---|---|---|
| gpt-sovits-mlx | GPT-SoVITS | Few-shot voice cloning |
### Image generation

| Crate | Models | Notes |
|---|---|---|
| flux-klein-mlx | FLUX.2-klein | Fast 4-step generation |
| zimage-mlx | Z-Image | Lightweight generation |
| qwen-image-mlx | Qwen Image | Qwen-based generation |
### API server

| Crate | Description |
|---|---|
| ominix-api | Unified OpenAI-compatible API server for all models |
## System requirements by model

Different models have different memory requirements:

| Model Size | Recommended RAM | Minimum RAM |
|---|---|---|
| 0.5B - 2B | 8GB | 8GB |
| 4B - 7B | 16GB | 12GB |
| 9B | 24GB | 16GB |
| 30B+ | 64GB+ | 32GB |
| VLM (7B) | 24GB | 16GB |
| ASR | 8GB | 8GB |
| Image Gen | 32GB | 16GB |
## Troubleshooting
### Build fails with 'mlx-c not found'
The MLX C bindings are included as a submodule. Make sure you cloned with submodules:
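If the submodule is missing from an existing checkout, the standard git fix is:

```shell
# Fetch and initialize all submodules (including mlx-c) in place
git submodule update --init --recursive
```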
### Linker errors about Metal framework
Ensure you have the Xcode Command Line Tools installed:
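The usual check-and-install commands are:

```shell
# Print the active developer directory; an error here means the tools are missing
xcode-select -p

# Install the Command Line Tools if needed
xcode-select --install
```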
### Out of memory during build
Build specific crates instead of all at once:
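For example, using Cargo's `-p` flag to build one crate at a time (crate names taken from the tables above), optionally capping parallelism with `-j`:

```shell
# Build crates individually to reduce peak memory use
cargo build --release -p mlx-rs
cargo build --release -p qwen3-mlx

# Optionally limit the number of parallel build jobs as well
cargo build --release -p qwen3-mlx -j 2
```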
### Slow inference performance
Make sure you’re using `--release` builds. Debug builds are 10-100x slower.

## Next steps
- **Quick start:** Run your first model in under 5 minutes
- **LLM guide:** Learn how to use language models