
Diagnostic Tools

SmolVM includes built-in diagnostic capabilities to help you identify and resolve issues quickly.

Doctor Command

The smolvm doctor command validates your environment and reports any issues:
# Auto-detect backend (Darwin → qemu, Linux → firecracker)
smolvm doctor

# Force specific backend checks
smolvm doctor --backend firecracker
smolvm doctor --backend qemu

# CI-friendly mode with strict validation
smolvm doctor --json --strict
Run smolvm doctor first when experiencing issues. It checks KVM support, Firecracker/QEMU availability, network configuration, and permissions.
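In a CI job it can be convenient to call the doctor command from a script. The following is a minimal sketch, not part of SmolVM itself: it assumes only that smolvm doctor --json writes a JSON document to stdout and that a non-zero exit code signals a failed check. The helper name run_doctor is illustrative, and the report schema is treated as unknown.

```python
import json
import subprocess


def run_doctor(argv=("smolvm", "doctor", "--json", "--strict")):
    """Run the doctor command; return (exit_code, parsed_report_or_None).

    Assumption: the command prints JSON to stdout. The report schema is
    not described on this page, so it is returned without interpretation.
    """
    try:
        proc = subprocess.run(list(argv), capture_output=True, text=True)
    except FileNotFoundError:
        # The executable is not on PATH.
        return 127, None
    try:
        report = json.loads(proc.stdout)
    except json.JSONDecodeError:
        report = None
    return proc.returncode, report
```

A pipeline step could then fail the build when the exit code is non-zero and attach the parsed report as an artifact.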

Common Issues

VM Creation and Startup

Cause: You’re attempting to create a VM with an ID that’s already in use.
Solution: Either reconnect to the existing VM or use a different ID:
from smolvm import SmolVM
from smolvm.exceptions import VMAlreadyExistsError

try:
    vm = SmolVM(vm_id="my-vm")
except VMAlreadyExistsError:
    # Reconnect to existing VM
    vm = SmolVM.from_id("my-vm")
Or clean up the existing VM:
smolvm cleanup --vm-id my-vm
Cause: Attempting to reconnect to a VM that doesn’t exist or was already deleted.
Solution: Verify the VM ID exists:
smolvm list
If the VM was deleted, create a new one:
from smolvm import SmolVM

vm = SmolVM()  # Creates new VM with auto-generated ID
vm.start()
Cause: Your Linux system doesn’t have KVM support enabled or accessible.
Solution:
  1. Verify KVM is available:
    ls -la /dev/kvm
    
  2. Check if your CPU supports virtualization:
    grep -Ec '(vmx|svm)' /proc/cpuinfo
    # Should return > 0
    
  3. Ensure KVM kernel modules are loaded:
    lsmod | grep kvm
    
  4. Add your user to the kvm group:
    sudo usermod -aG kvm $USER
    # Log out and back in
    
  5. Re-run the system setup:
    sudo ./scripts/system-setup.sh --configure-runtime
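The first manual checks above can also be run from Python, for example at the start of a test suite. This is a hedged sketch using only the standard library; the function name check_kvm and the shape of the returned dict are illustrative and not part of SmolVM.

```python
import os


def check_kvm(dev_path="/dev/kvm", cpuinfo_path="/proc/cpuinfo"):
    """Return a dict summarising the device and CPU checks listed above."""
    try:
        with open(cpuinfo_path, encoding="utf-8") as fh:
            flags = fh.read().split()
    except OSError:
        flags = []
    return {
        # Device node is present (step 1).
        "device_exists": os.path.exists(dev_path),
        # Current user can open it read/write (relates to step 4).
        "device_accessible": os.access(dev_path, os.R_OK | os.W_OK),
        # CPU advertises Intel VT-x (vmx) or AMD-V (svm) (step 2).
        "cpu_virtualization": any(flag in ("vmx", "svm") for flag in flags),
    }
```

If device_exists is true but device_accessible is false, group membership (step 4) is the likely place to look.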
    
Cause: The VM failed to boot or become SSH-ready within the timeout period.
Solution:
  1. Increase the timeout:
    from smolvm import SmolVM
    
    vm = SmolVM()
    vm.start()
    vm.wait_for_ssh(timeout=60.0)  # Increase from default
    
  2. Check system resources:
    # High CPU/memory usage can slow boot
    htop
    
  3. Verify the VM process is running:
    # pgrep avoids matching the grep process itself
    pgrep -a firecracker
    
  4. Check Firecracker logs (if available):
    # Look for error messages
    journalctl -xe | grep firecracker
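Rather than picking one large timeout, a script can retry with progressively longer ones. The helper below is a sketch under stated assumptions: wait_with_growing_timeout is a hypothetical name, and it only assumes the wait function accepts a timeout keyword and raises a timeout exception on failure, as vm.wait_for_ssh and OperationTimeoutError appear to do elsewhere on this page.

```python
def wait_with_growing_timeout(wait_fn, timeouts=(30.0, 60.0, 120.0),
                              timeout_exc=TimeoutError):
    """Call wait_fn(timeout=t) for each t until one attempt succeeds.

    Returns the timeout that worked. Re-raises the last timeout error
    if every attempt fails.
    """
    last_error = None
    for t in timeouts:
        try:
            wait_fn(timeout=t)
            return t
        except timeout_exc as exc:
            last_error = exc
    if last_error is None:
        raise ValueError("timeouts must not be empty")
    raise last_error
```

With SmolVM this would presumably be called as wait_with_growing_timeout(vm.wait_for_ssh, timeout_exc=OperationTimeoutError). The specific timeout values are illustrative.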
    

Network Issues

Cause: Insufficient permissions or network configuration issues.
Solution:
  1. Verify sudo/root access:
    # SmolVM network operations require elevated privileges
    sudo -v
    
  2. Check for conflicting TAP devices:
    ip link show | grep tap
    
  3. Clean up stale network resources:
    smolvm cleanup --all
    
  4. Re-run system setup to configure network permissions:
    sudo ./scripts/system-setup.sh --configure-runtime
    
Cause: NAT or routing configuration issue.
Solution:
  1. Verify NAT is configured:
    # Check nftables rules
    sudo nft list ruleset | grep masquerade
    
  2. Test connectivity from the VM:
    from smolvm import SmolVM
    
    with SmolVM() as vm:
        result = vm.run("ping -c 3 8.8.8.8")
        print(result.output)
    
  3. Check firewall rules:
    # Ensure forwarding is allowed
    sudo iptables -L FORWARD -v
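NAT rules only help if the host kernel forwards IPv4 packets at all. On Linux that switch is exposed at /proc/sys/net/ipv4/ip_forward. The sketch below reads it; the helper name is illustrative, and whether SmolVM's setup script enables forwarding for you is an assumption this page does not confirm.

```python
def ip_forwarding_enabled(path="/proc/sys/net/ipv4/ip_forward"):
    """Return True if the kernel's IPv4 forwarding switch reads 1."""
    try:
        with open(path, encoding="ascii") as fh:
            return fh.read().strip() == "1"
    except OSError:
        # File missing (non-Linux host) or unreadable.
        return False
```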
    
Cause: Port already in use or forwarding rules not applied.
Solution:
  1. Check if the host port is available:
    netstat -tuln | grep <host_port>
    
  2. Use a different port:
    from smolvm import SmolVM
    
    with SmolVM() as vm:
        # Let SmolVM choose an available port
        host_port = vm.expose_local(guest_port=8080)
        print(f"Accessible at http://localhost:{host_port}")
    
  3. Verify forwarding rules:
    sudo nft list ruleset | grep dnat
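The port availability check from step 1 can be done without netstat by trying to bind the port. This is a standard-library sketch; is_port_free is an illustrative helper, not a SmolVM function.

```python
import socket


def is_port_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True
```

The result is only a snapshot: another process can claim the port between the check and its use, so letting SmolVM choose the port, as in step 2, remains the more robust option.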
    

SSH and Command Execution

Cause: The VM profile doesn’t support command execution or SSH is not configured.
Solution: Ensure you’re using auto-config (the default), which sets up SSH:
from smolvm import SmolVM

# This automatically configures SSH
vm = SmolVM()
vm.start()
vm.wait_for_ssh()

result = vm.run("echo 'test'")
Cause: SSH service not started in the VM or connection blocked.
Solution:
  1. Wait for SSH to be ready:
    vm.wait_for_ssh(timeout=30.0)
    
  2. Check SSH service status:
    # Try a simple command to verify connectivity
    result = vm.run("systemctl status ssh")
    print(result.output)
    
  3. Verify guest IP and SSH port:
    from smolvm import SmolVM
    
    vm = SmolVM.from_id("my-vm")
    print(f"Status: {vm.status}")
    print(f"IP: {vm.get_ip()}")
    
Cause: Long-running command or SSH connection issue.
Solution:
  1. Run commands in the background for long operations:
    vm.run("long-running-task > /tmp/output.log 2>&1 &")
    
  2. Check command output for errors:
    result = vm.run("your-command")
    if result.exit_code != 0:
        print(f"Error: {result.stderr}")
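When many commands are run in sequence, the exit-code check from step 2 can be wrapped once so failures are not silently ignored. The sketch below relies only on the result attributes used on this page (exit_code, stderr, output); run_checked and CommandFailed are hypothetical names, not part of SmolVM.

```python
class CommandFailed(RuntimeError):
    """Raised by run_checked when a command exits with a non-zero code."""


def run_checked(vm, command):
    """Run command on vm and return its output, raising on failure."""
    result = vm.run(command)
    if result.exit_code != 0:
        raise CommandFailed(
            f"{command!r} exited with {result.exit_code}: {result.stderr}"
        )
    return result.output
```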
    

Image and Storage Issues

Cause: Network issues, disk space, or corrupted image cache.
Solution:
  1. Check available disk space:
    df -h ~/.local/share/smolvm
    
  2. Clear image cache and rebuild:
    rm -rf ~/.local/share/smolvm/images
    smolvm build --force
    
  3. Verify network connectivity:
    curl -I https://github.com  # Test internet access
    
Cause: Multiple VMs sharing the same disk in conflicting modes.
Solution: Use isolated disk mode (the default) for per-VM isolation:
from smolvm import SmolVM

# Each VM gets its own writable disk copy (recommended)
vm = SmolVM()  # disk_mode="isolated" by default
Or explicitly set shared mode if needed:
from smolvm import SmolVM, VMConfig

config = VMConfig(
    disk_mode="shared"  # Use with caution
)
vm = SmolVM(config=config)

Error Reference

SmolVM uses a structured exception hierarchy. All exceptions inherit from SmolVMError.

Exception Types

| Exception | Description | Common Causes |
| --- | --- | --- |
| ValidationError | Input validation failed | Invalid configuration, bad parameters |
| VMAlreadyExistsError | VM ID already in use | Duplicate VM creation |
| VMNotFoundError | VM doesn’t exist | Wrong ID, VM already deleted |
| NetworkError | Network operation failed | TAP device, NAT, IP allocation issues |
| HostError | Host environment issue | Missing KVM, dependencies, permissions |
| ImageError | Image operation failed | Download, checksum, cache problems |
| FirecrackerAPIError | Firecracker API call failed | VM boot failure, API communication |
| OperationTimeoutError | Operation exceeded timeout | Slow boot, network delays |
| CommandExecutionUnavailableError | Command execution not available | SSH not configured, wrong VM profile |
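Because every exception shares one base class, a single handler can map the caught type to a remediation hint. The sketch below illustrates that pattern with stand-in classes so it runs without SmolVM installed. The hint texts paraphrase this page, and hint_for and HINTS are hypothetical names.

```python
# Stand-in classes so the sketch runs on its own; in real code these
# would be imported from smolvm.exceptions instead.
class SmolVMError(Exception): ...
class HostError(SmolVMError): ...
class NetworkError(SmolVMError): ...
class ImageError(SmolVMError): ...


# Keys are class names; the base-class entry acts as the fallback.
HINTS = {
    "HostError": "Run smolvm doctor and check KVM access and permissions.",
    "NetworkError": "Check TAP devices and NAT rules for stale resources.",
    "ImageError": "Check disk space, then clear the image cache and rebuild.",
    "SmolVMError": "Run smolvm doctor --json and inspect the error details.",
}


def hint_for(exc):
    """Return the hint for the most specific matching class in exc's MRO."""
    for cls in type(exc).__mro__:
        if cls.__name__ in HINTS:
            return HINTS[cls.__name__]
    return "Not a SmolVM error."
```

Walking the method resolution order means a subclass without its own entry falls back to the hint for SmolVMError.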

Accessing Error Details

All exceptions include structured error information:
from smolvm import SmolVM
from smolvm.exceptions import SmolVMError, OperationTimeoutError

try:
    vm = SmolVM()
    vm.start()
    vm.wait_for_ssh(timeout=10.0)
except OperationTimeoutError as e:
    print(f"Operation: {e.operation}")
    print(f"Timeout: {e.timeout_seconds}s")
    print(f"Details: {e.details}")
except SmolVMError as e:
    print(f"Error: {e.message}")
    print(f"Details: {e.details}")

Debugging Techniques

Enable Verbose Logging

Increase logging verbosity to see detailed operations:
import logging
from smolvm import SmolVM

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger('smolvm')
logger.setLevel(logging.DEBUG)

vm = SmolVM()
vm.start()
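Debug output is easier to attach to a bug report when it is written to a file. The following sketch uses only the standard logging module and assumes, as the snippet above does, that SmolVM logs under the smolvm logger name. enable_debug_log is an illustrative helper.

```python
import logging


def enable_debug_log(log_path, logger_name="smolvm"):
    """Send DEBUG-level records from logger_name to log_path.

    Returns the handler so the caller can remove it later.
    """
    handler = logging.FileHandler(log_path, encoding="utf-8")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    return handler
```

Call it before creating the VM, then remove the handler with logging.getLogger("smolvm").removeHandler(handler) once the session is over.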

Inspect VM State

Check VM status and metadata:
from smolvm import SmolVM

vm = SmolVM.from_id("my-vm")
print(f"Status: {vm.status}")
print(f"VM ID: {vm._vm_id}")

# Get VM info from database
from smolvm.storage import StateManager

sm = StateManager()
vm_info = sm.get_vm("my-vm")
if vm_info:
    print(f"IP: {vm_info.guest_ip}")
    print(f"SSH Port: {vm_info.ssh_host_port}")

Check System State

List all VMs and their states:
# CLI
smolvm list

# Python
from smolvm.storage import StateManager

sm = StateManager()
vms = sm.list_vms()
for vm in vms:
    print(f"{vm.vm_id}: {vm.state}")
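With many VMs, a count per state is often more useful than the full listing. This sketch assumes only that each record exposes a state attribute, as in the loop above; summarize_states is a hypothetical helper and the state values in the docstring are made up for illustration.

```python
from collections import Counter


def summarize_states(vms):
    """Count VM records by state, e.g. Counter({'running': 2, 'stopped': 1})."""
    return Counter(str(vm.state) for vm in vms)
```

It would presumably be called as summarize_states(StateManager().list_vms()).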

Clean Up Resources

Remove stale or problematic VMs:
# Remove specific VM
smolvm cleanup --vm-id my-vm

# Remove all VMs
smolvm cleanup --all

# Force cleanup (skip confirmation)
smolvm cleanup --all --force
Warning: smolvm cleanup --all terminates and deletes every VM. Make sure no important workloads are running before you use it.

Getting Help

If you’re still experiencing issues:
  1. Check the documentation: Review relevant guides and API reference
  2. Run diagnostics: smolvm doctor --json provides detailed system info
  3. Search issues: Look for similar problems in the GitHub issues
  4. Join the community: Get help on Slack
  5. Report bugs: Open a new issue with:
    • Output of smolvm doctor --json
    • Steps to reproduce
    • Expected vs actual behavior
    • Relevant logs and error messages
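The items in that checklist can be assembled into one text block before opening an issue. The helper below is an illustrative sketch, not a SmolVM feature: build_bug_report is a hypothetical name, and the section titles simply mirror the list above, with basic platform details added from the standard library.

```python
import platform


def build_bug_report(doctor_output, steps, expected, actual, logs=""):
    """Assemble the checklist items above into one plain-text report."""
    environment = (
        f"{platform.system()} {platform.release()} ({platform.machine()}), "
        f"Python {platform.python_version()}"
    )
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, 1))
    sections = [
        ("Environment", environment),
        ("smolvm doctor --json", doctor_output.strip()),
        ("Steps to reproduce", numbered),
        ("Expected behavior", expected),
        ("Actual behavior", actual),
        ("Logs and error messages", logs.strip() or "(none)"),
    ]
    return "\n\n".join(f"{title}:\n{body}" for title, body in sections)
```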

Next Steps

Performance

Optimize VM performance and throughput

Security Considerations

Secure your SmolVM deployments
