BinaryDB is an open-source project under active development. Contributions are welcome and appreciated!
This project is currently in development and not yet complete. Many features are planned but not yet implemented.

Project Status

BinaryDB is being developed by Raúl Novo as a lightweight, embeddable key-value database. The project is functional for basic use cases but has several pending features.

Current State

The project is written in Python and available under the GPL-3.0 license. You can use, modify, and redistribute the code freely, but any redistributions must maintain the same license and credit the original author.

Repository Structure

The BinaryDB repository is organized as follows:
binarydb/
├── binarydb/
│   ├── __init__.py           # Package initialization
│   ├── database.py           # ✅ Core database implementation
│   ├── errors.py             # ✅ Custom exception classes
│   ├── cache.py              # 🚧 Minimal definitions (in progress)
│   ├── index.py              # ⏳ Pending - indexing structures
│   ├── lock.py               # ⏳ Pending - concurrency control
│   ├── transaction.py        # ⏳ Pending - transaction helpers
│   ├── utils.py              # ⏳ Pending - utility functions
│   └── wal.py                # ⏳ Pending - write-ahead logging
└── README.md                 # Project documentation

Completed Modules

database.py

Core Database class implementing in-memory key-value storage with pickle-based persistence. Includes basic transaction support (begin/rollback/end), atomic file writes, and complete CRUD operations.
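
For contributors new to the codebase, the ideas named above (an in-memory dictionary, pickle persistence, snapshot-style transactions, and atomic file writes) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code in database.py: the class name `TinyStore` is hypothetical, and the method names are borrowed from the documented begin/rollback/end and set/get/commit/load calls only for familiarity.

```python
import copy
import os
import pickle
import tempfile
from pathlib import Path
from typing import Any, Dict, Optional


class TinyStore:
    """Illustrative stand-in, NOT BinaryDB's Database class."""

    def __init__(self, path: Path) -> None:
        self.path = Path(path)
        self._data: Dict[str, Any] = {}
        self._snapshot: Optional[Dict[str, Any]] = None

    def load(self) -> None:
        # Read the whole store back from disk if a file exists.
        if self.path.exists():
            with self.path.open("rb") as fh:
                self._data = pickle.load(fh)

    def set(self, key: str, value: Any) -> None:
        self._data[key] = value

    def get(self, key: str) -> Any:
        return self._data[key]

    def begin(self) -> None:
        # Keep a deep copy so rollback can restore the previous state.
        self._snapshot = copy.deepcopy(self._data)

    def rollback(self) -> None:
        if self._snapshot is None:
            raise RuntimeError("no active transaction")
        self._data, self._snapshot = self._snapshot, None

    def end(self) -> None:
        # Discard the snapshot; changes made since begin() are kept.
        self._snapshot = None

    def commit(self) -> None:
        # Atomic write: dump to a temp file in the same directory,
        # force it to disk, then rename it over the target in one step.
        self.path.parent.mkdir(parents=True, exist_ok=True)
        fd, tmp_name = tempfile.mkstemp(dir=self.path.parent)
        try:
            with os.fdopen(fd, "wb") as fh:
                pickle.dump(self._data, fh)
                fh.flush()
                os.fsync(fh.fileno())
            os.replace(tmp_name, self.path)
        except BaseException:
            os.unlink(tmp_name)
            raise
```

Writing to a temporary file and renaming it means a crash mid-write leaves the previous file untouched rather than half-overwritten. Note that unpickling is only safe for files you trust, which is worth keeping in mind when working on persistence code.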

errors.py

Custom exception hierarchy including DatabaseError, DatabaseIOError, DatabaseCorruptedError, KeyValidationError, RecordTypeError, TransactionError, and ConcurrencyError.
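
The class names above come from errors.py, but how they relate to each other is not spelled out here. The sketch below assumes the simplest layout, with every error deriving directly from `DatabaseError`; check errors.py for the actual parent classes. The `safe_call` helper is hypothetical and only shows why a shared base class is convenient.

```python
class DatabaseError(Exception):
    """Assumed root of the hierarchy."""


# Assumption: each specific error subclasses DatabaseError directly.
class DatabaseIOError(DatabaseError): ...
class DatabaseCorruptedError(DatabaseError): ...
class KeyValidationError(DatabaseError): ...
class RecordTypeError(DatabaseError): ...
class TransactionError(DatabaseError): ...
class ConcurrencyError(DatabaseError): ...


def safe_call(func):
    """Hypothetical helper: run func and report any database error as text."""
    try:
        return func()
    except DatabaseError as exc:
        # One except clause covers every subclass defined above.
        return f"{type(exc).__name__}: {exc}"
```

With a common base class, callers can catch `DatabaseError` to handle everything, or a specific subclass when they can recover from one kind of failure.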

Modules in Development

cache.py

Contains minimal error definitions for caching functionality. Purpose and full implementation need to be completed.

index.py

Status: Empty / Pending
Planned Features:
  • Secondary indexes for faster lookups
  • Index types (hash, B-tree, etc.)
  • Query optimization using indexes
  • Index persistence and recovery

lock.py

Status: Empty / Pending
Planned Features:
  • File-based locking for multi-process access
  • Read/write lock support
  • Deadlock detection and prevention
  • Lock timeout handling

transaction.py

Status: Empty / Pending
Planned Features:
  • Enhanced transaction helpers and utilities
  • Nested transaction support
  • ACID compliance improvements
  • Transaction isolation levels
Note: Basic transaction support (begin/rollback/end) is already implemented in Database class (database.py:201-241).

utils.py

Status: Empty / Pending
Planned Features:
  • Common utility functions
  • Data validation helpers
  • Serialization utilities
  • Performance monitoring tools

wal.py

Status: Empty / Pending
Planned Features:
  • Write-ahead logging for crash recovery
  • Transaction durability guarantees
  • Point-in-time recovery
  • Log compaction and management
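
As a starting point for anyone picking up the write-ahead logging work, here is a rough sketch of the core idea: append each change to a log and sync it to disk before applying it, then replay the log on startup. Everything in it is an assumption for illustration. The `WriteAheadLog` class, its method names, and the JSON-lines record format do not exist in BinaryDB, and a real implementation would need to agree with the pickle-based storage in database.py.

```python
import json
import os
from pathlib import Path
from typing import Any, Dict


class WriteAheadLog:
    """Hypothetical append-only log; not part of BinaryDB today."""

    def __init__(self, path: Path) -> None:
        self.path = Path(path)

    def append(self, op: str, key: str, value: Any = None) -> None:
        # Record the change durably BEFORE it is applied in memory.
        record = json.dumps({"op": op, "key": key, "value": value})
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(record + "\n")
            fh.flush()
            os.fsync(fh.fileno())

    def replay(self) -> Dict[str, Any]:
        # Rebuild state by re-applying every logged record in order.
        state: Dict[str, Any] = {}
        if not self.path.exists():
            return state
        with self.path.open("r", encoding="utf-8") as fh:
            for line in fh:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    break  # torn final write after a crash: stop here
                if record["op"] == "set":
                    state[record["key"]] = record["value"]
                elif record["op"] == "delete":
                    state.pop(record["key"], None)
        return state

    def truncate(self) -> None:
        # Simplest form of compaction: clear the log once a full
        # snapshot has been committed successfully.
        self.path.write_text("", encoding="utf-8")
```

JSON keeps the sketch readable but limits values to JSON-serializable types; that trade-off is one of the design questions worth raising in an issue before implementing wal.py.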

How to Contribute

Getting Started

  1. Fork the repository on GitHub
  2. Clone your fork locally:
    git clone https://github.com/YOUR_USERNAME/binarydb.git
    cd binarydb
    
  3. Create a feature branch:
    git checkout -b feature/my-contribution
    
  4. Make your changes and commit them:
    git add .
    git commit -m "Add: description of your changes"
    
  5. Push to your fork:
    git push origin feature/my-contribution
    
  6. Open a Pull Request on the main repository

Areas for Contribution

Implement Pending Features

Complete the pending modules: WAL, indexing, locking, transaction helpers, caching, and utilities. The planned features for each are listed under Modules in Development.

Add Tests

Create comprehensive unit tests and integration tests. Test coverage is essential for reliability.

Improve Documentation

Add docstrings, examples, and guides. Help users understand and use BinaryDB effectively.

Fix Bugs

Report and fix bugs. Even small fixes are valuable contributions.

Performance Optimization

Optimize critical paths, reduce memory usage, improve commit performance.

Add Examples

Create example applications and use cases to help users get started.

Contribution Guidelines

Code Style

Follow Python best practices and maintain consistency with existing code:
from pathlib import Path
from typing import Any

# Use type hints
def my_function(key: str, value: Any) -> bool:
    """Clear docstring explaining the function."""
    pass

# Use descriptive variable names
db_path = Path("./data/mydb")
user_record = {"name": "Alice", "age": 30}

# Follow PEP 8 style guidelines
# Use 4 spaces for indentation
# Maximum line length: 88 characters (Black formatter)

Testing

All new features should include tests:
import shutil
import tempfile
import unittest
from pathlib import Path

from binarydb.database import Database


class TestNewFeature(unittest.TestCase):
    def setUp(self):
        self.temp_dir = tempfile.mkdtemp()
        self.db_path = Path(self.temp_dir) / "test_db"

    def tearDown(self):
        # Remove the temporary directory and any files created in it
        shutil.rmtree(self.temp_dir, ignore_errors=True)

    def test_feature(self):
        db = Database(self.db_path)
        try:
            # Replace this placeholder with assertions for your feature
            self.assertTrue(True)
        finally:
            # Close the database even if an assertion fails
            db.close()

Documentation

Include docstrings for all public functions and classes:
def new_feature(self, key: str, value: Any) -> None:
    """
    Brief description of what this function does.
    
    Args:
        key: Description of the key parameter.
        value: Description of the value parameter.
    
    Raises:
        KeyValidationError: When the key is invalid.
        DatabaseError: When the database is closed.
    
    Example:
        >>> db = Database("./data/mydb")
        >>> db.new_feature("key", "value")
    """
    pass

Commit Messages

Write clear, descriptive commit messages:
# Good commit messages
git commit -m "Add: WAL implementation with log compaction"
git commit -m "Fix: Handle edge case in transaction rollback"
git commit -m "Update: Improve error messages for key validation"
git commit -m "Test: Add comprehensive tests for indexing module"

# Bad commit messages
git commit -m "fixed stuff"
git commit -m "updates"
git commit -m "wip"
Use prefixes:
  • Add: for new features
  • Fix: for bug fixes
  • Update: for improvements to existing features
  • Remove: for removing code
  • Test: for adding or updating tests
  • Docs: for documentation changes
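
If you want to check the prefix convention automatically before pushing, a small script along these lines would do it. This is a hypothetical helper, not something shipped in the repository; the function name `has_valid_prefix` and the exact rule (prefix, colon, space, then text on the first line) are assumptions based on the examples above.

```python
import re

# Prefixes taken from the list above.
ALLOWED_PREFIXES = ("Add", "Fix", "Update", "Remove", "Test", "Docs")
_PATTERN = re.compile(r"^(%s): \S" % "|".join(ALLOWED_PREFIXES))


def has_valid_prefix(message: str) -> bool:
    """Return True if the first line looks like '<Prefix>: <text>'."""
    lines = message.splitlines()
    return bool(lines) and _PATTERN.match(lines[0]) is not None
```

A check like this could be called from a local Git `commit-msg` hook, which receives the path to the message file as its first argument.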

Opening Issues

Before starting work on a feature, open an issue to discuss it. Doing so helps avoid duplicate effort and ensures your contribution aligns with the project's direction.
Good issue template:
## Feature Request: Add B-tree indexing

**Description:**
Implement B-tree-based indexing to support efficient range queries.

**Motivation:**
Current implementation only supports exact key lookups. Range queries
require scanning all keys, which is inefficient.

**Proposed Implementation:**
- Add BTreeIndex class in index.py
- Support range(), greater_than(), less_than() queries
- Persist indexes alongside database file

**Questions:**
- Should indexes be rebuilt automatically on load()?
- What should be the default node size?
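
To make the motivation in the sample issue concrete, the sketch below shows how keeping keys in sorted order turns a range query into two binary searches instead of a full scan. It deliberately uses a sorted list with Python's `bisect` module rather than a B-tree, and the `SortedKeyIndex` class is hypothetical; only the method names `range()`, `greater_than()`, and `less_than()` are taken from the proposal above.

```python
import bisect
from typing import List


class SortedKeyIndex:
    """Hypothetical helper: a sorted list of keys, not a real B-tree."""

    def __init__(self) -> None:
        self._keys: List[str] = []

    def add(self, key: str) -> None:
        # Insert in sorted position, skipping duplicates.
        pos = bisect.bisect_left(self._keys, key)
        if pos == len(self._keys) or self._keys[pos] != key:
            self._keys.insert(pos, key)

    def range(self, low: str, high: str) -> List[str]:
        # Keys with low <= key < high, located by two binary searches.
        start = bisect.bisect_left(self._keys, low)
        stop = bisect.bisect_left(self._keys, high)
        return self._keys[start:stop]

    def greater_than(self, key: str) -> List[str]:
        return self._keys[bisect.bisect_right(self._keys, key):]

    def less_than(self, key: str) -> List[str]:
        return self._keys[:bisect.bisect_left(self._keys, key)]
```

Lookups here are logarithmic, but each insert shifts list elements and is linear in the number of keys. Avoiding that cost on large datasets is the usual reason to prefer a B-tree, which is what the sample issue proposes.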

Pull Request Process

  1. Ensure tests pass - All existing tests should still pass
  2. Add new tests - Cover your changes with tests
  3. Update documentation - Add docstrings and update guides
  4. Describe your changes - Write a clear PR description
  5. Link related issues - Reference issues your PR addresses
  6. Be responsive - Respond to review comments promptly
Good PR description:
## Implement Write-Ahead Logging

Closes #123

**Changes:**
- Implemented WAL class in wal.py
- Added log writing before each commit
- Added recovery mechanism in load()
- Added log compaction to prevent unbounded growth

**Testing:**
- Added 15 new tests covering WAL operations
- Tested crash recovery scenarios
- Verified performance impact is minimal

**Documentation:**
- Added docstrings to all public methods
- Updated README with WAL information

Development Setup

Local Development

# Clone the repository
git clone https://github.com/raulnovo/binarydb.git
cd binarydb

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in development mode
pip install -e .

# Install development dependencies (when added)
pip install -e ".[dev]"

# Run tests (when test suite is added)
python -m pytest tests/

Running the Examples

# Try the basic example
from binarydb.database import Database

db = Database("./data/mydb")
db.load()
db.set("user:1", {"name": "Ana", "age": 30})
print(db.get("user:1"))
db.commit()
db.close()

License

BinaryDB is licensed under GPL-3.0. All contributions must be compatible with this license.
By contributing to BinaryDB, you agree that your contributions will be licensed under the GNU General Public License v3.0. Key points:
  • You can freely use, modify, and redistribute the code
  • Redistributions must use the same GPL-3.0 license
  • You must preserve attribution to the original author (Raúl Novo)
  • No warranty is provided
For full license text, see: GNU GPL v3

Community

Getting Help

  • Open an issue for bugs or feature requests
  • Check existing issues before opening new ones
  • Be respectful and constructive in all interactions

Recognition

Contributors will be recognized in:
  • Project README
  • Release notes
  • Documentation credits

Questions?

If you have questions about contributing:
  1. Check existing issues and discussions
  2. Open a new issue with your question
  3. Be patient and respectful when waiting for responses
This is an open-source project maintained by volunteers. Response times may vary.
