Thank you for your interest in contributing to Voxtype! This guide will help you get started.
Ways to Contribute
- Report bugs - Open an issue describing the bug
- Request features - Open an issue describing the feature
- Submit code - Fork, make changes, and submit a pull request
- Improve documentation - Help make the docs clearer
- Help others - Answer questions in discussions
Development Setup
Prerequisites
- Rust (stable, 1.70+)
- Linux with Wayland or X11
- Build dependencies:
# Arch Linux:
sudo pacman -S base-devel clang alsa-lib
# Ubuntu/Debian:
sudo apt install build-essential libclang-dev libasound2-dev
# Fedora:
sudo dnf install @development-tools clang-devel alsa-lib-devel
Building from Source
# Clone the repository
git clone https://github.com/peteonrails/voxtype
cd voxtype
# Build debug version (faster compilation)
cargo build
# Build release version
cargo build --release
# Run tests
cargo test
# Run with verbose output
cargo run -- -vv
GPU Acceleration (Optional)
For faster transcription, build with GPU support:
# Install Vulkan runtime
# Arch:
sudo pacman -S vulkan-icd-loader
# Ubuntu:
sudo apt install libvulkan1
# Fedora:
sudo dnf install vulkan-loader
# Build with Vulkan support
cargo build --release --features gpu-vulkan
# Install CUDA toolkit first, then:
cargo build --release --features gpu-cuda
# Install ROCm, then:
cargo build --release --features gpu-hipblas
Code Structure
Voxtype follows a modular, trait-based architecture:
Core Traits
HotkeyListener - Hotkey detection (evdev, compositor bindings)
AudioCapture - Audio input (cpal, PipeWire, PulseAudio)
Transcriber - Speech-to-text (Whisper, ONNX engines, remote API)
TextOutput - Text delivery (wtype, dotool, ydotool, clipboard)
Each trait has multiple implementations, making the system easily extensible.
Key Components
| Component | Purpose |
|---|---|
| src/daemon.rs | Main event loop with async task coordination |
| src/state.rs | State machine: Idle → Recording → Transcribing → Outputting |
| src/config.rs | Configuration parsing and validation |
| src/hotkey/ | Keyboard input detection |
| src/audio/ | Audio capture and playback |
| src/transcribe/ | Transcription backends |
| src/output/ | Text output drivers |
| src/text/ | Text processing (replacements, punctuation) |
See the Architecture page for details.
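To make the state flow in the table concrete, here is a minimal sketch of how the four states could be modelled. It is not the actual contents of src/state.rs: the Event enum and the next method are hypothetical names chosen for illustration, and the real daemon may use different triggers (for example, toggle-style hotkeys or cancellation).

```rust
/// Mirrors the four states listed for src/state.rs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Idle,
    Recording,
    Transcribing,
    Outputting,
}

/// Hypothetical events that drive the transitions (an assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    HotkeyPressed,
    HotkeyReleased,
    TranscriptionDone,
    OutputDone,
}

impl State {
    /// Returns the next state, or `None` if the event is not valid here.
    pub fn next(self, event: Event) -> Option<State> {
        match (self, event) {
            (State::Idle, Event::HotkeyPressed) => Some(State::Recording),
            (State::Recording, Event::HotkeyReleased) => Some(State::Transcribing),
            (State::Transcribing, Event::TranscriptionDone) => Some(State::Outputting),
            (State::Outputting, Event::OutputDone) => Some(State::Idle),
            _ => None,
        }
    }
}
```

Returning None for unexpected events keeps invalid transitions explicit, so the caller decides whether to ignore or log them.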
Development Workflow
Running Tests
# Run all tests
cargo test
# Run specific test
cargo test test_name
# Run with output visible
cargo test -- --nocapture
# Test a specific module
cargo test text::
Manual Testing
# Run with verbose logging
cargo run -- -vv
# Test transcription on audio file
cargo run -- transcribe test.wav
# Watch state changes
cargo run -- status --follow
# Test model download
cargo run -- setup --download
Code Quality
Before submitting a PR:
# Format code
cargo fmt
# Run linter
cargo clippy
# Fix clippy warnings
cargo clippy --fix
Submitting Changes
For Bug Fixes
- Create an issue describing the bug
- Fork the repository
- Create a branch: git checkout -b fix/description
- Make your fix
- Test thoroughly
- Submit a pull request referencing the issue
For Features
- Open an issue to discuss the feature first
- Wait for feedback before investing significant time
- Fork and create a branch: git checkout -b feature/description
- Implement the feature
- Add tests and documentation
- Submit a pull request
Commit Messages
Use clear, descriptive commit messages:
type: short description
Longer description if needed. Explain what and why,
not how (the code shows how).
Fixes #123
Types: fix, feat, docs, style, refactor, test, chore
Pull Request Guidelines
- Keep PRs focused - One feature or fix per PR
- Write tests - Add tests for new functionality
- Update docs - Update relevant documentation
- Follow code style - Run cargo fmt and cargo clippy
- Test on your system - Verify it works before submitting
- Provide context - Explain why the change is needed
Documentation
When adding user-facing features, update:
- User Manual (docs/USER_MANUAL.md) - How to use the feature
- Configuration Guide (docs/CONFIGURATION.md) - Config file options
- Troubleshooting (docs/TROUBLESHOOTING.md) - Common issues and solutions
- CLI help - Update src/cli.rs with clear --help text
- Website docs - Update corresponding pages in website/
Code Style
Rust Conventions
- Use snake_case for functions and variables
- Use PascalCase for types and structs
- Use SCREAMING_SNAKE_CASE for constants
- Keep lines under 100 characters
- Use descriptive variable names
Error Handling
Use thiserror with user-friendly messages:
#[error("Cannot open input device '{0}'. Is the user in the 'input' group?\n Run: sudo usermod -aG input $USER")]
DeviceAccess(String),
Include:
- What went wrong
- Why it happened
- How to fix it
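For readers who have not used thiserror, the sketch below shows a message covering all three points with only the standard library. The hand-written Display impl is roughly what the #[error("...")] attribute generates for you. The InputError name is a hypothetical stand-in, not the project's actual error type.

```rust
use std::fmt;

/// Hypothetical error type used only for this illustration.
#[derive(Debug)]
pub enum InputError {
    /// Carries the path of the device that could not be opened.
    DeviceAccess(String),
}

impl fmt::Display for InputError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // What went wrong, the likely cause, and the command that fixes it.
            InputError::DeviceAccess(device) => write!(
                f,
                "Cannot open input device '{device}'. Is the user in the 'input' group?\n \
                 Run: sudo usermod -aG input $USER"
            ),
        }
    }
}

impl std::error::Error for InputError {}
```

In the project itself, prefer the thiserror attribute form shown above; this version only illustrates what the message should contain.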
Logging
Use tracing for structured logging:
use tracing::{info, debug, warn, error};
info!("Starting daemon");
debug!(device = %device_name, "Opening audio device");
warn!("Model not found, downloading...");
error!(?err, "Transcription failed");
Adding New Features
Adding a Transcription Backend
- Create src/transcribe/your_backend.rs
- Implement the Transcriber trait:
#[async_trait]
pub trait Transcriber: Send + Sync {
    async fn prepare(&mut self) -> Result<()>;
    async fn transcribe(&mut self, audio: &[i16]) -> Result<String>;
}
- Add to factory in src/transcribe/mod.rs
- Add config fields to src/config.rs with defaults
- Add CLI flags in src/cli.rs
- Document in docs/CONFIGURATION.md
- Add tests
- Add tests
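The implement-then-register steps above can be sketched as follows. To keep the example compilable with only the standard library, it uses a synchronous stand-in for the Transcriber trait and String errors. The real trait is async, and the actual factory in src/transcribe/mod.rs may look quite different. SampleCounter, create_transcriber, and the "sample-counter" backend name are all hypothetical.

```rust
/// Synchronous stand-in for the async `Transcriber` trait (an assumption
/// made so this sketch needs no external crates).
pub trait Transcriber: Send + Sync {
    fn prepare(&mut self) -> Result<(), String>;
    fn transcribe(&mut self, audio: &[i16]) -> Result<String, String>;
}

/// Hypothetical backend that only reports how many samples it received.
pub struct SampleCounter {
    ready: bool,
}

impl Transcriber for SampleCounter {
    fn prepare(&mut self) -> Result<(), String> {
        // A real backend would load its model here.
        self.ready = true;
        Ok(())
    }

    fn transcribe(&mut self, audio: &[i16]) -> Result<String, String> {
        if !self.ready {
            return Err("prepare() must be called before transcribe()".to_string());
        }
        Ok(format!("{} samples", audio.len()))
    }
}

/// Hypothetical factory: maps a backend name from config to an implementation.
pub fn create_transcriber(backend: &str) -> Result<Box<dyn Transcriber>, String> {
    match backend {
        "sample-counter" => Ok(Box::new(SampleCounter { ready: false })),
        other => Err(format!("Unknown transcription backend '{other}'")),
    }
}
```

Returning a boxed trait object lets the rest of the daemon stay unaware of which backend was selected.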
Adding an Output Method
- Create src/output/your_method.rs
- Implement the TextOutput trait:
#[async_trait]
pub trait TextOutput: Send + Sync {
    async fn output(&mut self, text: &str) -> Result<()>;
    fn is_available(&self) -> bool;
}
- Add to fallback chain in src/output/mod.rs
- Consider whether it should be automatic fallback or explicit selection
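As a rough illustration of what a fallback chain means here, the sketch below tries each driver in order and stops at the first one that is available and succeeds. It uses a synchronous stand-in for the TextOutput trait so it compiles with only the standard library. MissingTool, MemoryOutput, and output_with_fallback are hypothetical names, and the real logic in src/output/mod.rs may differ.

```rust
/// Synchronous stand-in for the async `TextOutput` trait (an assumption).
pub trait TextOutput: Send + Sync {
    fn output(&mut self, text: &str) -> Result<(), String>;
    fn is_available(&self) -> bool;
}

/// Hypothetical driver whose underlying tool is not installed.
pub struct MissingTool;

impl TextOutput for MissingTool {
    fn output(&mut self, _text: &str) -> Result<(), String> {
        Err("tool not installed".to_string())
    }

    fn is_available(&self) -> bool {
        false
    }
}

/// Hypothetical driver that records text in memory instead of typing it.
#[derive(Default)]
pub struct MemoryOutput {
    pub delivered: Vec<String>,
}

impl TextOutput for MemoryOutput {
    fn output(&mut self, text: &str) -> Result<(), String> {
        self.delivered.push(text.to_string());
        Ok(())
    }

    fn is_available(&self) -> bool {
        true
    }
}

/// Tries each available driver in order and returns the index of the
/// first one that delivers the text.
pub fn output_with_fallback(
    chain: &mut [Box<dyn TextOutput>],
    text: &str,
) -> Result<usize, String> {
    for (index, driver) in chain.iter_mut().enumerate() {
        if !driver.is_available() {
            continue; // Skip drivers whose tool is missing.
        }
        if driver.output(text).is_ok() {
            return Ok(index);
        }
    }
    Err("No output method succeeded".to_string())
}
```

A driver meant for explicit selection only would simply be left out of the chain passed to this function.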
Modifying Configuration
Backwards compatibility is critical. Never break existing installations.
- Add new fields with sensible defaults
- Removed fields should be silently ignored, not cause errors
- Test upgrades by running the new version with an old config file
- Document changes in release notes
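The two compatibility rules above (defaults for new fields, silence for removed ones) can be sketched with a tiny key = value parser. This is only an illustration using the standard library. It is not how src/config.rs parses configuration, and the Settings struct, its field names, and the default values are hypothetical.

```rust
/// Hypothetical settings; field names and defaults are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct Settings {
    pub model: String,
    pub max_seconds: u32,
}

impl Default for Settings {
    fn default() -> Self {
        Settings {
            model: "base".to_string(),
            max_seconds: 60,
        }
    }
}

/// Parses `key = value` lines. Missing keys keep their defaults, and
/// unknown keys (for example, options removed in a later release) are
/// skipped without error.
pub fn parse_settings(input: &str) -> Result<Settings, String> {
    let mut settings = Settings::default();
    for line in input.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        let Some((key, value)) = line.split_once('=') else {
            return Err(format!("Expected 'key = value', got '{line}'"));
        };
        let value = value.trim().trim_matches('"');
        match key.trim() {
            "model" => settings.model = value.to_string(),
            "max_seconds" => {
                settings.max_seconds = value
                    .parse::<u32>()
                    .map_err(|_| format!("max_seconds must be a whole number, got '{value}'"))?;
            }
            // Unknown or removed key: ignore it for backwards compatibility.
            _ => {}
        }
    }
    Ok(settings)
}
```

An old config file that sets only model and still contains a removed option parses cleanly, and max_seconds falls back to its default.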
Testing Guidelines
Unit Tests
Place tests at the bottom of each file:
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_something() {
        // Test implementation
    }
}
Integration Tests
Place integration tests in the tests/ directory:
use voxtype::*;

#[tokio::test]
async fn test_end_to_end() {
    // Test implementation
}
Best Practices
- Prefer self-documenting code over comments
- Avoid allocations in hot paths (hotkey detection, audio streaming)
- Use spawn_blocking for CPU-intensive work (transcription)
- Prefer streaming over buffering where possible
- Only validate at system boundaries (user input, external APIs)
- Don’t over-engineer - wait for multiple use cases before abstracting
- Desktop: Prioritize fast transcription
- Laptop: Balance speed with battery efficiency
- Memory: Use on-demand loading for large models
- GPU: Provide isolation option for memory management
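As one way to follow the advice about avoiding allocations in hot paths, the sketch below converts audio samples into a caller-owned buffer that is reused across calls. The pcm_to_f32 function is hypothetical and not part of the Voxtype codebase. It only assumes 16-bit PCM input, matching the &[i16] audio type in the Transcriber trait.

```rust
/// Converts 16-bit PCM samples to f32 in the range [-1.0, 1.0), writing
/// into `out` instead of returning a new Vec.
///
/// `Vec::clear` keeps the existing capacity, so once the buffer has grown
/// to the chunk size, repeated calls do not reallocate.
pub fn pcm_to_f32(samples: &[i16], out: &mut Vec<f32>) {
    out.clear();
    out.extend(samples.iter().map(|&s| f32::from(s) / 32768.0));
}
```

The caller creates the buffer once, outside the audio loop, and passes it in for every chunk.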
Getting Help
Maintainer Response Time
I typically respond to issues and PRs within 48-72 hours. For urgent bugs affecting core functionality, mention @peteonrails in your issue.
Recognition
If you submit code that is incorporated into Voxtype, you will be:
- Credited in commit messages with Co-authored-by
- Added to the Contributors section in README
- Listed on the website
Bonus points if you include these updates in your pull request!
Code of Conduct
Please read our Code of Conduct. We are committed to providing a welcoming and positive experience for everyone.
License
By contributing, you agree that your contributions will be licensed under the MIT License.
Additional Resources