Architecture Overview
The SerenityOS kernel follows a modular monolithic architecture, where all kernel services run in privileged mode but are organized into well-defined subsystems with clear interfaces.
Directory Structure
The kernel source code is organized by functionality, with each subsystem described below living in its own directory under Kernel/.
Core Subsystems
Task Management
Located in Kernel/Tasks/, this subsystem handles all aspects of process and thread execution.
Process Management
Key Classes: Process, ProcessGroup
The Process class (defined in Kernel/Tasks/Process.h) represents an executing program with:
- Address Space: Virtual memory regions via Memory::AddressSpace
- Credentials: User/group IDs and capabilities via Security::Credentials
- File Descriptors: Open file descriptions managed per-process
- Threads: One or more threads of execution
- Contexts: VFS root, hostname, and process list contexts
Processes support pledge/unveil security mechanisms that restrict system call access and filesystem visibility.
Process Security Features
Pledge: Restricts which system calls a process can use:
- stdio, rpath, wpath, cpath: Basic I/O operations
- inet, unix: Network access
- proc, exec: Process management
- ptrace: Debugging capabilities
- And many more defined in ENUMERATE_PLEDGE_PROMISES
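One way to picture pledge is as a per-process bitmask of promises that each system call consults before it runs. The following is a minimal standalone C++ sketch of that idea, not the kernel's implementation: the promise list is a shortened stand-in for ENUMERATE_PLEDGE_PROMISES, and the names Promise, PledgeState, parse, has, and can_narrow_to are hypothetical. The narrowing check models the usual pledge rule that promises can be dropped but not regained, which is an assumption here because the text above does not spell it out.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>
#include <string_view>

// Illustrative subset of promises; the full list is in ENUMERATE_PLEDGE_PROMISES.
enum class Promise : uint32_t { Stdio, Rpath, Wpath, Cpath, Inet, Unix, Proc, Exec, Ptrace };

struct PromiseName {
    std::string_view name;
    Promise promise;
};

constexpr PromiseName promise_names[] = {
    { "stdio", Promise::Stdio }, { "rpath", Promise::Rpath }, { "wpath", Promise::Wpath },
    { "cpath", Promise::Cpath }, { "inet", Promise::Inet },   { "unix", Promise::Unix },
    { "proc", Promise::Proc },   { "exec", Promise::Exec },   { "ptrace", Promise::Ptrace },
};

// Hypothetical per-process pledge state: one bit per promise.
class PledgeState {
public:
    // Parses a space-separated promise string such as "stdio rpath".
    // Returns an empty optional if any promise name is unknown.
    static std::optional<PledgeState> parse(std::string const& promises)
    {
        PledgeState state;
        std::istringstream stream(promises);
        std::string word;
        while (stream >> word) {
            bool found = false;
            for (auto const& entry : promise_names) {
                if (entry.name == word) {
                    state.m_mask |= bit(entry.promise);
                    found = true;
                    break;
                }
            }
            if (!found)
                return std::nullopt;
        }
        return state;
    }

    // A syscall implementation would call this before doing any work.
    bool has(Promise promise) const { return (m_mask & bit(promise)) != 0; }

    // Assumed rule: a later pledge may keep or drop promises, but never add new ones.
    bool can_narrow_to(PledgeState const& next) const { return (next.m_mask & ~m_mask) == 0; }

private:
    static uint32_t bit(Promise promise) { return 1u << static_cast<uint32_t>(promise); }
    uint32_t m_mask { 0 };
};
```

In this sketch, a process that pledged "stdio rpath" passes has(Promise::Stdio) but fails has(Promise::Inet), so a networking syscall guarded by that check would be refused.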
Thread Management
Key Class: Thread (in Kernel/Tasks/Thread.h)
Threads are the schedulable units of execution:
- Each thread has its own kernel stack and register state
- Thread-specific data (TSD) support
- Priority-based scheduling (0-99 range)
- CPU affinity support for SMP systems
- Thread blockers for synchronization
Thread States:
- Runnable: Ready to execute
- Running: Currently executing
- Blocked: Waiting on a resource
- Dying: Terminating
- Dead: Finished execution
Scheduler
Key Class: Scheduler (in Kernel/Tasks/Scheduler.cpp)
The scheduler implements priority-based multi-level queue scheduling:
- Priority Queues: 32 priority levels mapped from thread priorities
- CPU Affinity: Threads can be bound to specific CPUs
- Fair Scheduling: Threads rotate within their priority level
- Idle Thread: Lowest priority, runs when no work available
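The queue structure in the list above can be sketched in standalone C++ as follows. This is an illustration under stated assumptions rather than the kernel's code: SketchThread and ReadyQueues are hypothetical names, the linear mapping from the 0-99 priority range onto 32 buckets is a guess at one reasonable mapping, and std::mutex stands in for a kernel lock.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <deque>
#include <mutex>
#include <string>

// Hypothetical stand-in for a schedulable thread.
struct SketchThread {
    std::string name;
    unsigned priority; // 0-99, higher values run first
};

class ReadyQueues {
public:
    static constexpr unsigned bucket_count = 32;
    static constexpr unsigned max_priority = 99;

    // Maps a thread priority onto one of the 32 buckets (assumed linear mapping).
    static unsigned bucket_for(unsigned priority)
    {
        if (priority > max_priority)
            priority = max_priority;
        return priority * bucket_count / (max_priority + 1);
    }

    // Appends the thread to the tail of its bucket.
    void enqueue(SketchThread* thread)
    {
        std::lock_guard<std::mutex> guard(m_lock);
        unsigned bucket = bucket_for(thread->priority);
        m_buckets[bucket].push_back(thread);
        m_non_empty_mask |= (1u << bucket);
    }

    // Removes and returns the head of the highest non-empty bucket.
    // Returns nullptr when nothing is runnable, which is when an idle thread would run.
    SketchThread* dequeue_next()
    {
        std::lock_guard<std::mutex> guard(m_lock);
        if (m_non_empty_mask == 0)
            return nullptr;
        unsigned bucket = bucket_count - 1;
        while ((m_non_empty_mask & (1u << bucket)) == 0)
            --bucket;
        auto& queue = m_buckets[bucket];
        SketchThread* thread = queue.front();
        queue.pop_front();
        if (queue.empty())
            m_non_empty_mask &= ~(1u << bucket);
        return thread;
    }

private:
    std::mutex m_lock; // stand-in for a kernel lock
    std::array<std::deque<SketchThread*>, bucket_count> m_buckets;
    uint32_t m_non_empty_mask { 0 }; // bit N set means bucket N has runnable threads
};
```

Re-enqueueing a thread after its time slice puts it behind its peers in the same bucket, which gives the rotation within a priority level described above.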
The scheduler uses spinlock-protected ready queues (g_ready_queues) to safely manage runnable threads across multiple CPUs.
Memory Management
Located in Kernel/Memory/, this subsystem provides virtual and physical memory management.
Physical Memory
Key Classes: PhysicalRAMPage, PhysicalRegion, PhysicalZone, MemoryManager
- Physical Pages: 4KB units tracked individually
- Physical Regions: Contiguous ranges of physical memory
- Physical Zones: Power-of-two sized sub-areas of a physical region, each managed by a buddy allocator
- Memory Types: Usable, Reserved, ACPI_Reclaimable, ACPI_NVS, BadMemory
Virtual Memory
Key Classes: AddressSpace, Region, VMObject
Each process has its own address space containing memory regions:
- Anonymous Regions: Backed by AnonymousVMObject (heap, stack)
- Inode-backed Regions: Backed by files via InodeVMObject
- MMIO Regions: Memory-mapped I/O via MMIOVMObject
- Shared Framebuffer: GPU framebuffers via SharedFramebufferVMObject
Regions can be shared between processes using shared VM objects, enabling efficient IPC and memory-mapped files.
Page Fault Handling
Key File: Kernel/Arch/PageFault.cpp
Page faults occur when accessing unmapped or protected memory:
- Hardware triggers page fault with faulting address
- Kernel identifies the region containing the address
- VM object handles the fault:
  - Allocate physical page if needed
  - Read from backing store (for file-backed pages)
  - Apply copy-on-write if necessary
- Update page tables and resume execution
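The steps above can be sketched as a small userspace simulation. Everything here is hypothetical and simplified: SketchVMObject, SketchRegion, and SketchAddressSpace are illustrative names, a shared_ptr reference count stands in for the kernel's sharing bookkeeping, the "page table" is a plain map, and the file-backed read step is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

constexpr uintptr_t page_size = 4096;

enum class FaultResult { Handled, AccessViolation };

using SketchPage = std::vector<uint8_t>;

// Hypothetical VM object: lazily allocated pages that may be shared with a clone.
struct SketchVMObject {
    std::map<size_t, std::shared_ptr<SketchPage>> pages;
};

// Clones a VM object so that both copies share pages until one of them writes.
std::shared_ptr<SketchVMObject> clone_for_cow(SketchVMObject const& source)
{
    return std::make_shared<SketchVMObject>(source);
}

struct SketchRegion {
    uintptr_t base;
    size_t size;
    bool writable;
    std::shared_ptr<SketchVMObject> vmobject;

    bool contains(uintptr_t address) const { return address >= base && address - base < size; }
};

class SketchAddressSpace {
public:
    void add_region(SketchRegion region) { m_regions.push_back(std::move(region)); }

    FaultResult handle_fault(uintptr_t address, bool is_write)
    {
        for (auto& region : m_regions) {
            // Identify the region containing the faulting address.
            if (!region.contains(address))
                continue;
            if (is_write && !region.writable)
                return FaultResult::AccessViolation;

            size_t index = (address - region.base) / page_size;
            auto& page = region.vmobject->pages[index];
            if (!page) {
                // First touch: allocate a zero-filled page.
                page = std::make_shared<SketchPage>(page_size, uint8_t { 0 });
            } else if (is_write && page.use_count() > 1) {
                // Copy-on-write: the page is shared, so take a private copy.
                page = std::make_shared<SketchPage>(*page);
            }

            // "Update page tables": record the mapping so the access can be retried.
            m_page_table[address / page_size] = page.get();
            return FaultResult::Handled;
        }
        // No region covers the address.
        return FaultResult::AccessViolation;
    }

    SketchPage* mapped_page(uintptr_t address) const
    {
        auto it = m_page_table.find(address / page_size);
        return it == m_page_table.end() ? nullptr : it->second;
    }

private:
    std::vector<SketchRegion> m_regions;
    std::map<uintptr_t, SketchPage*> m_page_table;
};
```

A write fault in a cloned address space leaves the original page untouched and gives the writer its own copy, which is the copy-on-write behavior listed above.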
File System Layer
Located in Kernel/FileSystem/, this implements the Virtual File System and concrete filesystems.
Virtual File System (VFS)
Key Classes: VirtualFileSystem, Inode, Custody, Mount
The VFS provides a unified interface for all filesystems:
- Inodes: Represent filesystem objects (files, directories, devices)
- Custody: Path-based reference to an inode with parent tracking
- Mounts: Filesystem instances mounted at specific points
- VFSRootContext: Per-container mount tables and root directories
Supported Filesystems
Disk Filesystems:
- Ext2FS: Linux ext2 filesystem
- FATFS: FAT12/16/32 filesystem
- ISO9660FS: CD-ROM filesystem
- Plan9FS: Plan 9 protocol filesystem
- FUSE: Userspace filesystem support
Virtual Filesystems:
- ProcFS: Process information (/proc)
- SysFS: Kernel information (/sys)
- DevPtsFS: Pseudo-terminals (/dev/pts)
- DevLoopFS: Loop devices (/dev/loop)
- RAMFS: In-memory filesystem
File Descriptors
Key Classes: OpenFileDescription, File
Each open file is represented by:
- OpenFileDescription: File state (position, flags, inode/socket/device)
- File Descriptor Table: Per-process mapping of FD numbers to descriptions
- File Types: Regular files, directories, devices, sockets, FIFOs
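The split between descriptor numbers and descriptions is easiest to see in code. The sketch below is a hypothetical userspace model, not the kernel's classes: SketchOpenFileDescription and SketchFDTable are illustrative names, and the lowest-free-slot allocation policy follows POSIX convention rather than anything stated above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an open file description:
// state shared by every descriptor number that refers to it.
struct SketchOpenFileDescription {
    size_t offset { 0 };
    int flags { 0 };
};

// Hypothetical per-process table mapping FD numbers to descriptions.
class SketchFDTable {
public:
    // Installs a description in the lowest free slot and returns its FD number.
    int allocate(std::shared_ptr<SketchOpenFileDescription> description)
    {
        for (size_t fd = 0; fd < m_slots.size(); ++fd) {
            if (!m_slots[fd]) {
                m_slots[fd] = std::move(description);
                return static_cast<int>(fd);
            }
        }
        m_slots.push_back(std::move(description));
        return static_cast<int>(m_slots.size() - 1);
    }

    // Models dup(): a new FD number that refers to the same description.
    int duplicate(int fd)
    {
        auto description = get(fd);
        if (!description)
            return -1;
        return allocate(std::move(description));
    }

    // Frees the slot; the description lives on while other FDs still refer to it.
    bool close(int fd)
    {
        if (!get(fd))
            return false;
        m_slots[static_cast<size_t>(fd)].reset();
        return true;
    }

    std::shared_ptr<SketchOpenFileDescription> get(int fd) const
    {
        if (fd < 0 || static_cast<size_t>(fd) >= m_slots.size())
            return nullptr;
        return m_slots[static_cast<size_t>(fd)];
    }

private:
    std::vector<std::shared_ptr<SketchOpenFileDescription>> m_slots;
};
```

Because a duplicated descriptor shares the description, advancing the file position through one FD is visible through the other, and closing one leaves the other usable.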
Device Driver Framework
Located in Kernel/Devices/, this provides infrastructure for device drivers.
Device Hierarchy
Devices are created via Device::try_create_device<T>(), which:
- Constructs the device object
- Calls after_inserting() for registration
- Exposes the device in /dev/
Each device is identified by a pair of numbers:
- Major numbers identify device families (allocated in Kernel/API/MajorNumberAllocation.h)
- Minor numbers identify specific devices within a family
Device major numbers are allocated per device family with strict rules for naming and ordering to maintain consistency across the system.
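As a rough illustration of the create-then-register flow and of major/minor addressing, here is a standalone C++ sketch. It does not reproduce the kernel's Device class: SketchDevice, SketchSerialDevice, SketchDeviceRegistry, and try_create are hypothetical, the registration hook only sets a flag, exposing the node in /dev/ is left out, and any numbers passed in are made up for the example.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <utility>

// Hypothetical base class; after_inserting() stands in for the registration hook.
struct SketchDevice {
    SketchDevice(unsigned major_number, unsigned minor_number)
        : major_number(major_number)
        , minor_number(minor_number)
    {
    }
    virtual ~SketchDevice() = default;
    virtual void after_inserting() { registered = true; }

    unsigned major_number; // device family
    unsigned minor_number; // specific device within the family
    bool registered { false };
};

struct SketchSerialDevice : SketchDevice {
    using SketchDevice::SketchDevice;
};

class SketchDeviceRegistry {
public:
    // Constructs the device, runs its registration hook, and stores it.
    // Returns nullptr if the (major, minor) pair is already taken.
    template<typename T, typename... Args>
    T* try_create(Args&&... args)
    {
        auto device = std::make_unique<T>(std::forward<Args>(args)...);
        auto key = std::make_pair(device->major_number, device->minor_number);
        if (m_devices.count(key) != 0)
            return nullptr;
        T* raw = device.get();
        device->after_inserting();
        m_devices.emplace(key, std::move(device));
        return raw;
    }

    SketchDevice* find(unsigned major_number, unsigned minor_number) const
    {
        auto it = m_devices.find(std::make_pair(major_number, minor_number));
        return it == m_devices.end() ? nullptr : it->second.get();
    }

private:
    std::map<std::pair<unsigned, unsigned>, std::unique_ptr<SketchDevice>> m_devices;
};
```

Keying the registry on the (major, minor) pair means two devices in the same family differ only in their minor number, and a second attempt to claim an occupied pair fails cleanly.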
Interrupt Management
Located in Kernel/Interrupts/, this handles hardware interrupt delivery.
Interrupt Handlers
Key Classes: GenericInterruptHandler, IRQHandler, SharedIRQHandler
- Generic Handlers: Base for all interrupt handlers
- IRQ Handlers: Handle device interrupts
- Shared Handlers: Multiple devices sharing one IRQ line
- Spurious Handlers: Detect and handle spurious interrupts
- Unhandled Handlers: Catch unexpected interrupts
Architecture-Specific Controllers
- x86_64: APIC (Advanced Programmable Interrupt Controller) and legacy PIC
- aarch64: GIC (Generic Interrupt Controller)
- riscv64: PLIC (Platform-Level Interrupt Controller)
Networking Stack
Located in Kernel/Net/, implementing TCP/IP networking.
Key Components:
- Socket Layer: Abstract socket interface
- Protocol Implementations: IPv4, TCP, UDP, ICMP
- Unix Domain Sockets: Local IPC sockets
- Network Task: Kernel thread for async network operations
- Routing: IP routing table management
Locking and Synchronization
Located in Kernel/Locking/, providing thread-safe primitives.
Lock Types
Spinlocks:
- Busy-wait locks for short critical sections
- Used in interrupt context
- Architecture-specific implementation
Mutexes:
- Sleeping locks for longer operations
- Thread blocking on contention
- Cannot be used in interrupt context
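Critical Rule: Never acquire a mutex while holding a spinlock, because acquiring a mutex may block and a thread must not sleep with a spinlock held. Taking a spinlock while holding another spinlock is fine, provided they are always taken in a consistent order to prevent deadlocks.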
Protected Containers
Lock Ranking
To prevent deadlocks, locks have ranks defined in Kernel/Locking/LockRank.h. Locks must be acquired in order of decreasing rank.
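A rank check of this kind can be sketched with a per-thread bitmask of held ranks. The code below is a hypothetical standalone model: SketchLockRank and its values are illustrative rather than the contents of LockRank.h, LockRankTracker is an invented name, and treating an unranked lock as always acquirable is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative ranks only; each rank is a distinct bit so held ranks fit in a mask.
enum class SketchLockRank : uint32_t {
    None = 0x0, // unranked lock, assumed to be exempt from checking
    Low = 0x1,
    Middle = 0x2,
    High = 0x4,
};

// Hypothetical per-thread tracker of the lock ranks currently held.
class LockRankTracker {
public:
    // True if taking 'rank' now keeps acquisitions in strictly decreasing rank order.
    bool can_acquire(SketchLockRank rank) const
    {
        uint32_t value = static_cast<uint32_t>(rank);
        if (value == 0 || m_held_mask == 0)
            return true;
        // Isolate the lowest set bit, which is the lowest rank currently held.
        uint32_t lowest_held = m_held_mask & (~m_held_mask + 1);
        return value < lowest_held;
    }

    // Records the acquisition, or returns false on a rank-order violation.
    bool acquire(SketchLockRank rank)
    {
        if (!can_acquire(rank))
            return false;
        m_held_mask |= static_cast<uint32_t>(rank);
        return true;
    }

    void release(SketchLockRank rank) { m_held_mask &= ~static_cast<uint32_t>(rank); }

private:
    uint32_t m_held_mask { 0 };
};
```

With this model, a thread holding a Low lock is refused a Middle lock, so two threads can never wait on each other's ranked locks in opposite orders. A kernel would more likely assert on the violation than return false.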
System Call Interface
Located in Kernel/Syscalls/, implementing the kernel-userspace boundary.
Syscall Handling
Key File: Kernel/Syscalls/SyscallHandler.cpp
- Userspace invokes syscall via architecture-specific instruction
- CPU switches to kernel mode
- Handler validates arguments and permissions
- Implementation performs the requested operation
- Control returns to userspace with the result or an error
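The dispatch part of this flow is commonly built around a table indexed by syscall number. The following standalone C++ sketch shows that shape under stated assumptions: the syscall numbers, the two handlers, SketchProcess, and dispatch_syscall are all hypothetical, and returning a negative errno value for failure is a convention chosen for the example.

```cpp
#include <array>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>

// Hypothetical syscall numbers for this sketch.
enum SketchSyscall : size_t { SC_getpid, SC_add, SC_count };

struct SketchProcess {
    int pid;
};

// Non-negative on success, negative errno on failure (assumed convention).
using SyscallResult = intptr_t;
using SyscallHandler = SyscallResult (*)(SketchProcess&, uintptr_t, uintptr_t);

SyscallResult sys_getpid(SketchProcess& process, uintptr_t, uintptr_t)
{
    return process.pid;
}

SyscallResult sys_add(SketchProcess&, uintptr_t a, uintptr_t b)
{
    // Validate arguments before doing any work.
    if (a > 1000 || b > 1000)
        return -EINVAL;
    return static_cast<SyscallResult>(a + b);
}

// Table indexed by syscall number.
constexpr std::array<SyscallHandler, SC_count> syscall_table = { sys_getpid, sys_add };

// What the handler would run once the CPU has entered kernel mode.
SyscallResult dispatch_syscall(SketchProcess& process, size_t number, uintptr_t a, uintptr_t b)
{
    if (number >= syscall_table.size())
        return -ENOSYS;
    return syscall_table[number](process, a, b);
}
```

Here dispatch_syscall(process, SC_add, 2, 3) yields 5, while an out-of-range number yields -ENOSYS without touching the table.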
Design Principles
- Follow POSIX standards where applicable
- Avoid architecture-specific syscalls
- Prefer existing interfaces over new syscalls
- No hardcoded paths in syscall implementations
Common System Calls
- File Operations: open, read, write, close, ioctl, fcntl
- Process Management: fork, execve, exit, wait, kill
- Memory: mmap, munmap, mprotect, brk
- Signals: sigaction, sigprocmask, kill
- Networking: socket, bind, connect, send, recv
- Time: clock_gettime, nanosleep, alarm
Bus Support
Located in Kernel/Bus/, supporting various hardware buses.
PCI (Peripheral Component Interconnect)
- Device enumeration and configuration
- MSI (Message Signaled Interrupts) support
- Memory-mapped configuration access
- Volume Management Device support
USB (Universal Serial Bus)
- Controllers: UHCI, EHCI, xHCI
- Device Management: USB hub support, device enumeration
- Drivers: HID (keyboard/mouse), Mass Storage
- Transfers: Control, bulk, interrupt, isochronous
VirtIO
Para-virtualized device support for:
- Network devices
- Block devices
- GPU devices
- Input devices
Firmware Interfaces
Located in Kernel/Firmware/.
ACPI (Advanced Configuration and Power Interface)
Platforms: x86_64
Provides:
- Hardware discovery
- Power management
- Interrupt routing
- Thermal management
Device Tree
Platforms: aarch64, riscv64
Provides:
- Hardware description from bootloader
- Device enumeration
- Platform-specific initialization
- Driver matching based on compatible strings
The kernel can parse flattened device trees (FDT) passed by the bootloader and unflatten them into a tree structure for driver probing.
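Driver matching on compatible strings can be sketched over an already-unflattened tree. This is a hypothetical standalone model: SketchNode, SketchDriver, match_driver, and probe_tree are illustrative names, FDT parsing itself is not shown, and preferring the earliest (most specific) compatible entry follows the usual device tree convention rather than anything stated above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical unflattened device tree node.
struct SketchNode {
    std::string name;
    std::vector<std::string> compatible; // most specific entry first
    std::vector<SketchNode> children;
};

// Hypothetical driver descriptor listing the compatible strings it handles.
struct SketchDriver {
    std::string name;
    std::vector<std::string> compatible;
};

// Returns the driver matching the node's most specific compatible string, or nullptr.
SketchDriver const* match_driver(SketchNode const& node, std::vector<SketchDriver> const& drivers)
{
    for (auto const& wanted : node.compatible) {
        for (auto const& driver : drivers) {
            for (auto const& offered : driver.compatible) {
                if (offered == wanted)
                    return &driver;
            }
        }
    }
    return nullptr;
}

// Walks the tree depth-first and records "node -> driver" for every match.
void probe_tree(SketchNode const& node, std::vector<SketchDriver> const& drivers, std::vector<std::string>& matches)
{
    if (auto const* driver = match_driver(node, drivers))
        matches.push_back(node.name + " -> " + driver->name);
    for (auto const& child : node.children)
        probe_tree(child, drivers, matches);
}
```

Nodes with no matching driver are simply skipped, so unknown hardware in the tree does not prevent the rest of the probe from running.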
SMP (Symmetric Multi-Processing)
The kernel fully supports multi-core systems:
- Per-CPU Data: Processor class represents each CPU
- Scheduler: Per-CPU run queues with load balancing
- Interrupts: IRQ affinity and per-CPU interrupt handling
- Locking: Proper synchronization for shared data structures
- Memory: TLB shootdown for page table updates
Security Architecture
Credential System
Processes carry credentials for access control:
- User ID (UID) and Group ID (GID)
- Effective, real, and saved UIDs/GIDs
- Supplementary groups
- Capabilities (work in progress)
Sandboxing
- Pledge: System call filtering
- Unveil: Filesystem access restrictions
- Containers: Process isolation with VFS roots, process lists, hostnames
- Jails: Permanent isolation until process exit
Randomization
Located in Kernel/Security/:
- Stack canaries (__stack_chk_guard)
- Address space layout randomization (ASLR)
- Kernel fast random number generation
Debugging Support
- Kernel Symbols: Symbol table loaded at boot from KSyms.cpp
- Debug Output: dbgln() for kernel debugging messages
- Coredumps: Generated for crashed processes
- Performance Events: Kernel profiling support
- KCOV: Code coverage for fuzzing
Next Steps
- Explore specific subsystems in detail
- Study the Development Guidelines
- Learn about Locking Patterns
- Understand container support in depth
