BinaryDB uses Python’s pickle module for serialization. Never load database files from untrusted sources. Pickle can execute arbitrary code during deserialization.

Understanding Pickle Security Risks

The Fundamental Problem

BinaryDB uses Python’s pickle module to serialize data to disk. Pickle is convenient and supports most Python objects, but it has a critical security flaw: arbitrary code execution. A pickle stream is not just data; it can instruct the unpickler to import and call arbitrary callables while an object is being reconstructed. A malicious actor can therefore craft a pickle file that runs arbitrary commands on your system as soon as it is loaded.
from binarydb.database import Database

# DANGEROUS: Loading a database from an untrusted source
db = Database("/tmp/untrusted_database")
db.load()  # This could execute malicious code!
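
The mechanism can be shown with a harmless sketch. The `Payload` class below is hypothetical and not part of BinaryDB; it uses pickle's standard `__reduce__` hook to make the unpickler call `eval` on a fixed arithmetic expression. An attacker would substitute a callable such as `os.system` instead.

```python
import pickle


class Payload:
    """Hypothetical class, used only to illustrate the mechanism."""

    def __reduce__(self):
        # Tell pickle: "to rebuild this object, call eval('6 * 7')".
        # The unpickler performs the call without asking for confirmation.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Unpickling runs the expression; what comes back is the call's result,
# not a Payload instance.
result = pickle.loads(blob)
print(result)  # 42
```

Nothing in the calling code opts in to this behavior. The call happens inside `pickle.loads`, which is why inspecting the loaded object afterwards is too late.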

Security Notice from Database

The Database class includes this warning in its docstring:
"""
Persistent key-value database.

This database stores all records in memory and serializes them to disk
using pickle. It is designed for small to medium datasets where
simplicity, reliability and embeddability are preferred over performance.

⚠️ Security notice:
    Never load database files from untrusted sources. Pickle is unsafe.
"""
This is not a theoretical risk: the vulnerability is well known and easy to exploit.

Unsafe Scenarios

Never use BinaryDB in these scenarios:

Loading User-Uploaded Files

# NEVER DO THIS
from binarydb.database import Database

# User uploads a .pkl file (`request` stands for a web framework's request object, e.g. Flask)
uploaded_file = request.files['database']
uploaded_file.save('/tmp/user_db.pkl')

db = Database('/tmp/user_db')
db.load()  # DANGEROUS: User can execute arbitrary code

Accepting Databases from Network Sources

# NEVER DO THIS
import requests
from binarydb.database import Database

# Download database from remote source
response = requests.get('https://example.com/database.pkl')
with open('/tmp/remote_db.pkl', 'wb') as f:
    f.write(response.content)

db = Database('/tmp/remote_db')
db.load()  # DANGEROUS: Remote attacker can execute code

Processing Databases from Untrusted Users

# NEVER DO THIS
from binarydb.database import Database

# Load database from shared directory
db = Database('/shared/untrusted/user_database')
db.load()  # DANGEROUS: Any user with write access can exploit this

Safe Usage Patterns

Trusted, Controlled Environments

BinaryDB is safe in environments where every database file it loads was written by your own application or by a party you fully trust:
from binarydb.database import Database

# SAFE: Application-controlled database
db = Database('./data/app_database')
db.load()  # Safe - you control this file

db.set('config:version', '1.0.0')
db.commit()
db.close()

Single-User Applications

# SAFE: Desktop application storing user preferences
from binarydb.database import Database
from pathlib import Path

# Store in user's home directory
config_dir = Path.home() / '.myapp'
config_dir.mkdir(exist_ok=True)

db = Database(config_dir / 'settings')
db.load()  # Safe - user controls their own files

db.set('theme', 'dark')
db.set('language', 'en')
db.commit()
db.close()

Internal Tools and Scripts

# SAFE: Internal automation script
from binarydb.database import Database

# Load database from secure, internal location
db = Database('/opt/internal/cache_db')
db.load()  # Safe - controlled internal infrastructure

db.set('last_run', '2026-03-04')
db.commit()
db.close()

File System Security

File Permissions

Even in trusted environments, protect your database files with appropriate permissions:
# Set restrictive permissions on database files
chmod 600 data/mydb.pkl  # Owner read/write only
chmod 700 data/          # Owner access only to directory
Equivalent restrictions can be applied from Python when the database is created:
from binarydb.database import Database
from pathlib import Path
import os

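# Create the parent directory with restricted permissions.
# mkdir() ignores `mode` if the directory already exists, so chmod it explicitly.
db_path = Path('./data/sensitive')
db_path.parent.mkdir(mode=0o700, exist_ok=True)
os.chmod(db_path.parent, 0o700)

db = Database(db_path)
db.set('api_key', 'secret-key-12345')
db.commit()

# Restrict the file itself (the database file is written to <path>.pkl)
os.chmod(f"{db_path}.pkl", 0o600)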

db.close()
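
Calling `os.chmod` after `commit()` leaves a short window in which the file exists with default permissions. One possible way to close that window, sketched here under the assumption that BinaryDB creates its file with ordinary file-creation calls and does not change the mode afterwards, is to tighten the process umask around the write. `restricted_umask` is a hypothetical helper, and the `write_bytes` call is a stand-in for `db.commit()`.

```python
import os
import stat
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def restricted_umask(mask: int = 0o077):
    """Temporarily set the process umask (hypothetical helper, POSIX semantics)."""
    previous = os.umask(mask)
    try:
        yield
    finally:
        # Always restore the previous umask, even if the body raises.
        os.umask(previous)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "sensitive.pkl"
    with restricted_umask():
        # Stand-in for db.commit(): files created here get no
        # group/other permission bits.
        target.write_bytes(b"placeholder")
    mode = stat.S_IMODE(target.stat().st_mode)
    print(oct(mode))
```

The umask is process-wide, so this approach is not suitable for code where other threads create files at the same time.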

Secure Storage Locations

Store database files in secure locations:

Good Locations

  • Application data directories with restricted access
  • User home directories (for single-user apps)
  • System directories with proper permissions (/opt, /var/lib)

Bad Locations

  • Shared directories (/tmp, /shared)
  • Web-accessible directories (/var/www)
  • World-writable directories
  • Network-mounted filesystems with untrusted users
from binarydb.database import Database
from pathlib import Path

# GOOD: Secure locations
good_paths = [
    Path.home() / '.myapp' / 'data',           # User home
    Path('/var/lib/myapp/data'),                # System data
    Path('/opt/myapp/data'),                    # Application directory
]

# BAD: Insecure locations
bad_paths = [
    Path('/tmp/data'),                          # Shared, writable
    Path('/var/www/html/data'),                 # Web-accessible
    Path('/shared/data'),                       # Untrusted users
]

Validating Data Sources

Check File Ownership and Permissions

Before loading a database, verify the file is owned and controlled by your application:
from binarydb.database import Database
from pathlib import Path
import os
import stat

def safe_load_database(path: str) -> Database:
    """Load a database only if its file passes basic ownership checks (POSIX only)."""
    db_path = Path(path).with_suffix('.pkl')
    
    if db_path.exists():
        # Check file ownership
        file_stat = db_path.stat()
        if file_stat.st_uid != os.getuid():
            raise PermissionError("Database file owned by different user")
        
        # Check permissions (should not be world-writable)
        if file_stat.st_mode & stat.S_IWOTH:
            raise PermissionError("Database file is world-writable")
    
    # The file could still be swapped between the checks above and load(),
    # so treat these checks as a safeguard, not a guarantee.
    db = Database(path)
    db.load()
    return db

# Usage
db = safe_load_database('./data/mydb')
db.close()
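
Checking the file alone is not enough: anyone who can write to the containing directory can delete or replace the file regardless of the file's own mode. The following sketch extends the same idea to the parent directory and also rejects group-writable entries. The helper names are hypothetical, and the code relies on `os.getuid`, so it applies to POSIX systems only.

```python
import os
import stat
import tempfile
from pathlib import Path


def check_private(path: Path) -> None:
    """Raise PermissionError unless path is ours and not group/world-writable."""
    info = path.stat()
    if info.st_uid != os.getuid():
        raise PermissionError(f"{path} is owned by a different user")
    if info.st_mode & (stat.S_IWGRP | stat.S_IWOTH):
        raise PermissionError(f"{path} is writable by group or others")


def check_database_location(db_file: Path) -> None:
    """Check the containing directory, then the file if it exists."""
    check_private(db_file.parent)
    if db_file.exists():
        check_private(db_file)


# Demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "data"
    data_dir.mkdir(mode=0o700)
    db_file = data_dir / "mydb.pkl"
    db_file.write_bytes(b"")
    os.chmod(db_file, 0o600)
    check_database_location(db_file)  # passes silently

    # Loosen the directory: the same file is now rejected.
    os.chmod(data_dir, 0o777)
    try:
        check_database_location(db_file)
    except PermissionError as exc:
        rejected = str(exc)
        print(rejected)
```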

Verify File Integrity

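For additional security, implement checksum verification. A checksum only helps if the expected value is stored somewhere an attacker who can modify the database file cannot also modify: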
from binarydb.database import Database
import hashlib
from pathlib import Path

def calculate_checksum(file_path: Path) -> str:
    sha256_hash = hashlib.sha256()
    with file_path.open("rb") as f:
        for byte_block in iter(lambda: f.read(4096), b""):
            sha256_hash.update(byte_block)
    return sha256_hash.hexdigest()

def load_with_checksum(path: str, expected_checksum: str) -> Database:
    db_path = Path(path).with_suffix('.pkl')
    
    if db_path.exists():
        actual_checksum = calculate_checksum(db_path)
        if actual_checksum != expected_checksum:
            raise ValueError("Database checksum mismatch - possible tampering")
    
    db = Database(path)
    db.load()
    return db
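
A plain SHA-256 checksum can be recomputed by anyone who tampers with the file and can also reach the stored checksum. A keyed variant avoids that, as long as the key is kept out of the attacker's reach. The sketch below uses the standard-library `hmac` module; the function names and the demo key are illustrative assumptions, not part of BinaryDB.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def file_hmac(file_path: Path, key: bytes) -> str:
    """Return the hex HMAC-SHA256 of a file, reading it in chunks."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    with file_path.open("rb") as f:
        for chunk in iter(lambda: f.read(4096), b""):
            mac.update(chunk)
    return mac.hexdigest()


def verify_file(file_path: Path, key: bytes, expected_tag: str) -> bool:
    """Compare the stored tag with the file's current tag in constant time."""
    return hmac.compare_digest(file_hmac(file_path, key), expected_tag)


# Demonstration with a throwaway file and a demo key
with tempfile.TemporaryDirectory() as tmp:
    db_file = Path(tmp) / "mydb.pkl"
    db_file.write_bytes(b"original contents")
    key = b"demo-key-keep-the-real-one-outside-the-data-directory"
    tag = file_hmac(db_file, key)

    untouched_ok = verify_file(db_file, key, tag)
    db_file.write_bytes(b"tampered contents")
    tampered_ok = verify_file(db_file, key, tag)
    print(untouched_ok, tampered_ok)  # True False
```

Verification should happen immediately before loading, and only a file that passes should be handed to `db.load()`. As with the ownership checks, a file replaced between verification and loading would go undetected.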

Alternatives for Untrusted Data

When dealing with untrusted data sources, use safer serialization formats:

JSON for Untrusted Data

import json
from pathlib import Path

class SafeDatabase:
    """Alternative to BinaryDB that uses JSON instead of pickle."""
    
    def __init__(self, path: str):
        self.path = Path(path).with_suffix('.json')
        self.data = {}
    
    def load(self):
        if self.path.exists():
            with self.path.open('r') as f:
                self.data = json.load(f)  # Safe - no code execution
    
    def save(self):
        with self.path.open('w') as f:
            json.dump(self.data, f, indent=2)
    
    def set(self, key: str, value):
        self.data[key] = value
    
    def get(self, key: str, default=None):
        return self.data.get(key, default)

# SAFE: Can handle untrusted data
db = SafeDatabase('/tmp/untrusted_data')
db.load()  # Safe - JSON cannot execute code
db.set('key', 'value')
db.save()
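
JSON rules out code execution, but not malformed input. If an untrusted file contains a JSON array instead of an object, `SafeDatabase.load()` would assign a list to `self.data` and later `get()` calls would fail. A small validation step after parsing addresses this. The schema below (a flat object with scalar values) is an assumption chosen for illustration, and `validate_flat_mapping` is a hypothetical helper.

```python
import json

# Value types accepted by this example schema
ALLOWED_SCALARS = (str, int, float, bool, type(None))


def validate_flat_mapping(data: object) -> dict:
    """Accept only a JSON object whose values are scalars."""
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    for key, value in data.items():
        if not isinstance(value, ALLOWED_SCALARS):
            raise ValueError(f"unsupported value type for key {key!r}")
    return data


good = validate_flat_mapping(json.loads('{"theme": "dark", "retries": 3}'))

try:
    validate_flat_mapping(json.loads("[1, 2, 3]"))
except ValueError as exc:
    error = str(exc)
    print(error)
```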

SQLite for Complex Data

import sqlite3

# SAFE: untrusted values are stored through parameterized queries and never unpickled
conn = sqlite3.connect('/tmp/untrusted.db')
cursor = conn.cursor()

cursor.execute('''
    CREATE TABLE IF NOT EXISTS data (
        key TEXT PRIMARY KEY,
        value TEXT
    )
''')

cursor.execute('INSERT OR REPLACE INTO data VALUES (?, ?)', ('key', 'value'))
conn.commit()
conn.close()
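
For code that currently relies on a `set`/`get` style interface, the same table can be wrapped in a small class. This is a minimal sketch, not a drop-in replacement for BinaryDB: `SQLiteStore` is a hypothetical name, and values are assumed to be JSON-serializable so that they can be stored as text.

```python
import json
import sqlite3


class SQLiteStore:
    """Hypothetical key-value wrapper; values are stored as JSON text."""

    def __init__(self, path: str):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS data (key TEXT PRIMARY KEY, value TEXT)"
        )

    def set(self, key: str, value) -> None:
        # Placeholders keep keys and values out of the SQL text itself.
        self.conn.execute(
            "INSERT OR REPLACE INTO data VALUES (?, ?)", (key, json.dumps(value))
        )
        self.conn.commit()

    def get(self, key: str, default=None):
        row = self.conn.execute(
            "SELECT value FROM data WHERE key = ?", (key,)
        ).fetchone()
        return default if row is None else json.loads(row[0])

    def close(self) -> None:
        self.conn.close()


store = SQLiteStore(":memory:")  # in-memory database for the demo
store.set("config:version", "1.0.0")
store.set("limits", {"max_items": 100})
version = store.get("config:version")
limits = store.get("limits")
missing = store.get("absent", default="n/a")
store.close()
```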

Security Checklist

  • Database files are stored in secure, application-controlled locations
  • File permissions are set to restrict access (600 for files, 700 for directories)
  • Database files are never loaded from user input or uploads
  • Database files are never loaded from network sources
  • All database paths are validated and sanitized
  • File permissions are audited regularly
  • Access controls for database files are documented
  • Alternative storage is considered for any untrusted data
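
For the path validation item, one common approach is to resolve every requested database name against a fixed base directory and reject anything that ends up outside it. The sketch below assumes such a base directory exists in your application; `resolve_db_path` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def resolve_db_path(base_dir: Path, name: str) -> Path:
    """Resolve name under base_dir, rejecting anything that escapes it."""
    base = base_dir.resolve()
    candidate = (base / name).resolve()
    try:
        # relative_to raises ValueError if candidate is not inside base
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"database path escapes {base}: {name!r}") from None
    if candidate == base:
        raise ValueError("database name must not be empty")
    return candidate


# Demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    inside = resolve_db_path(base, "app_database")
    inside_ok = inside.parent == base.resolve()

    try:
        resolve_db_path(base, "../outside")
    except ValueError:
        escaped_rejected = True
    else:
        escaped_rejected = False
```

Path validation limits which files can be opened; it does not make the contents of a file trustworthy.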

Summary

BinaryDB is safe and useful for controlled environments where you manage all data sources. It is not safe for handling untrusted data.
Key Takeaways:
  1. Never load pickle files from untrusted sources: doing so can lead to arbitrary code execution
  2. Use BinaryDB only in controlled environments: single-user apps, internal tools, controlled servers
  3. Set restrictive file permissions: protect database files from unauthorized access
  4. Use alternatives for untrusted data: JSON, SQLite, or other safe formats
  5. Validate file ownership and permissions: verify files before loading
  6. Store databases in secure locations: avoid shared or web-accessible directories
