sysctl provides a mechanism for reading and modifying kernel parameters while the system is running. Parameters are exposed as files under /proc/sys/, organized into subdirectories corresponding to kernel subsystems.

How sysctl works

The /proc/sys/ directory is a virtual filesystem: reading a file returns the current value of a kernel variable, and writing to it changes that variable immediately. No reboot is required.
/proc/sys/
├── kernel/    # global kernel settings
├── vm/        # virtual memory management
├── net/       # networking stack
├── fs/        # filesystem limits
├── dev/       # device-specific settings
└── user/      # per-user namespace limits
Most writes to /proc/sys/ require root privileges or CAP_SYS_ADMIN. Changes take effect immediately but are not persistent across reboots unless written to a configuration file.
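The mapping between a dotted parameter name and its file under /proc/sys/ can be sketched as follows. The helper name sysctl_key_to_path is hypothetical, and the sketch assumes no path component contains a dot itself (VLAN interface names such as eth0.100 would need the slash-separated form instead).

```shell
# Convert a dotted sysctl key to its /proc/sys path (hypothetical helper).
# Assumption: no component of the key contains a literal dot.
sysctl_key_to_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

sysctl_key_to_path kernel.pid_max      # /proc/sys/kernel/pid_max
sysctl_key_to_path net.core.somaxconn  # /proc/sys/net/core/somaxconn
```

Reading the resulting path with cat should return the same value that sysctl reports for the key.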

Reading and writing values

# Read a single parameter
sysctl kernel.pid_max

# Read all parameters
sysctl -a

# Write a value
sysctl -w kernel.pid_max=65536

# Load settings from /etc/sysctl.conf (the default file)
sysctl -p

# Load settings from a specific file
sysctl -p /etc/sysctl.d/99-custom.conf

Persistent configuration

Changes made with sysctl -w or by writing to /proc/sys/ are lost on reboot. To persist settings, write them to a configuration file.
1. Create a drop-in configuration file

Place a .conf file in /etc/sysctl.d/. Files are processed in lexicographic order; use a numeric prefix to control ordering:
# /etc/sysctl.d/99-network.conf
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_forward = 0
net.core.somaxconn = 1024
2. Apply the configuration

# Apply a specific file
sysctl -p /etc/sysctl.d/99-network.conf

# Apply all system configuration files (the sysctl.d directories and /etc/sysctl.conf)
sysctl --system
3. Verify the change was applied

sysctl net.ipv4.tcp_syncookies
# net.ipv4.tcp_syncookies = 1
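On systemd systems, systemd-sysctl.service applies the .conf files under /usr/lib/sysctl.d/, /run/sysctl.d/, and /etc/sysctl.d/ at boot. Files from all three directories are sorted together by file name and applied in lexicographic order, so a setting in a later file overrides the same setting in an earlier one. When the same file name exists in more than one directory, the copy in /etc/ takes precedence over /run/, which takes precedence over /usr/lib/.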
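As a rough illustration of how drop-in files from several directories combine, the sketch below assumes the rules described in sysctl.d(5): a file name found in a higher-priority directory replaces the same name from a lower-priority one, and the surviving files are applied in lexicographic order of file name. The function name and the example paths are hypothetical.

```shell
# Print the files that would be applied, in application order.
# Arguments: candidate paths, listed lowest-priority directory first
# (/usr/lib, then /run, then /etc), so later arguments win on name clashes.
effective_sysctl_files() {
    printf '%s\n' "$@" |
        awk -F/ '{ chosen[$NF] = $0 }
                 END { for (name in chosen) print name "\t" chosen[name] }' |
        LC_ALL=C sort |
        cut -f2
}

effective_sysctl_files \
    /usr/lib/sysctl.d/50-default.conf \
    /usr/lib/sysctl.d/99-vendor.conf \
    /etc/sysctl.d/50-default.conf \
    /etc/sysctl.d/10-local.conf
```

Here the /etc/ copy of 50-default.conf replaces the /usr/lib/ copy, and 10-local.conf is applied first because of its lower numeric prefix.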

Configuration file format

# Comments begin with # or ;
; This is also a comment

# key = value
kernel.pid_max = 4194304

# Wildcards are supported with --system
# net.ipv4.conf.*.rp_filter = 1
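To check which settings a file actually contains once comments and blank lines are ignored, a small filter along these lines may help. It is a simplified sketch, not the parser used by sysctl itself; the function name parse_sysctl_conf is hypothetical.

```shell
# Print the key=value pairs in a sysctl-style file (hypothetical helper).
# Strips surrounding whitespace, drops blank lines and comment lines
# starting with '#' or ';', and removes spaces around the first '='.
parse_sysctl_conf() {
    sed -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^[#;]/d' \
        -e '/^$/d' \
        -e 's/[[:space:]]*=[[:space:]]*/=/' "$1"
}

# Example: parse_sysctl_conf /etc/sysctl.d/99-custom.conf
```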

Parameters by category

kernel.*

Global kernel behavior and resource limits.
kernel.pid_max: The maximum process ID number that can be allocated. When the kernel reaches this value, PID allocation wraps around to low values.
  • Default: 32768 (2^15)
  • Maximum: 4194304 (2^22) on 64-bit systems
# Increase for large-scale systems running many processes
sysctl -w kernel.pid_max=4194304
kernel.shmmax: The maximum size (in bytes) of a single shared memory segment that a process can create. This is a hard limit applied per shmget() call.
  • Default: varies by architecture; on x86-64 it is typically ULONG_MAX - (1 << 24)
  • The associated kernel.shmall sets the total number of shared memory pages system-wide.
# Allow 8 GiB shared memory segments (for PostgreSQL, Oracle DB, etc.)
sysctl -w kernel.shmmax=8589934592
sysctl -w kernel.shmall=2097152   # total pages: 8GiB / 4KiB
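The kernel.shmall value above is a page count, so it depends on the page size. The conversion can be sketched as below; bytes_to_pages is a hypothetical helper, and the page size defaults to whatever getconf PAGE_SIZE reports on the running system.

```shell
# Pages needed to cover a byte count, rounding up (hypothetical helper).
# $1 = size in bytes, $2 = page size in bytes (default: system page size).
bytes_to_pages() {
    page_size="${2:-$(getconf PAGE_SIZE)}"
    echo $(( ($1 + page_size - 1) / page_size ))
}

bytes_to_pages 8589934592 4096   # 2097152 (8 GiB in 4 KiB pages)
```

On architectures with larger pages, pass the actual page size or omit the second argument so the system value is used.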
kernel.dmesg_restrict: Controls which users can read the kernel ring buffer via dmesg(1).
Value  Behavior
0      No restrictions (default)
1      Requires CAP_SYS_ADMIN or CAP_SYSLOG
# Prevent unprivileged users from reading kernel messages
sysctl -w kernel.dmesg_restrict=1
kernel.kptr_restrict: Controls whether kernel pointer values are exposed to unprivileged users in files like /proc/kallsyms and via the %pK printk format specifier.
Value  Behavior
0      Pointers are hashed before printing (default)
1      Printed as 0 for unprivileged users; real addresses for processes with CAP_SYSLOG
2      Always printed as 0, regardless of privileges
sysctl -w kernel.kptr_restrict=2
kernel.printk: Controls the verbosity of kernel messages sent to the console. Contains four values, in order: console_loglevel, default_message_loglevel, minimum_console_loglevel, default_console_loglevel.
cat /proc/sys/kernel/printk
# 4  4  1  7
#  ^  ^  ^  ^
#  |  |  |  default console loglevel
#  |  |  minimum console loglevel
#  |  default message loglevel
#  current console loglevel

# Suppress most messages during normal operation
sysctl -w kernel.printk="3 4 1 3"

# Enable all messages for debugging
sysctl -w kernel.printk="8 4 1 7"
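For scripts that need to inspect the four fields individually, the value can be split on whitespace. The sketch below uses a hypothetical helper, describe_printk, and takes the four values as a single string so it can be tried without touching the running system.

```shell
# Label the four kernel.printk fields (hypothetical helper).
# $1 = the four values as one whitespace-separated string.
describe_printk() {
    # Intentionally unquoted: word splitting breaks the string into four fields.
    set -- $1
    printf 'current console loglevel: %s\n' "$1"
    printf 'default message loglevel: %s\n' "$2"
    printf 'minimum console loglevel: %s\n' "$3"
    printf 'default console loglevel: %s\n' "$4"
}

describe_printk "4 4 1 7"
# On a live system: describe_printk "$(cat /proc/sys/kernel/printk)"
```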
kernel.perf_event_paranoid: Controls the scope of access to perf_event_open(2) for unprivileged users.
Value  Behavior
-1     No restrictions
0      Allow access to CPU-specific data, but not raw tracepoint samples
1      Allow both kernel and user-space measurements (default before Linux 4.6)
2      Allow only user-space measurements (default since Linux 4.6)
3      Disallow all access for unprivileged users (recognized only by some distribution kernels; not in mainline)
# Limit unprivileged users to user-space measurements
sysctl -w kernel.perf_event_paranoid=2
kernel.randomize_va_space: Controls Address Space Layout Randomization (ASLR).
Value  Behavior
0      Disabled
1      Randomize stack, VDSO, and mmap addresses
2      Also randomize heap (brk) addresses (default)
# Full ASLR (recommended)
sysctl -w kernel.randomize_va_space=2
kernel.modules_disabled: When set to 1, prevents any further kernel modules from being loaded. This is a one-way lock: once set to 1, it cannot be changed back without rebooting.
# Lock down module loading after boot (for high-security environments)
sysctl -w kernel.modules_disabled=1
Setting this to 1 is irreversible for the current boot session. Ensure all necessary modules are already loaded.

vm.*

Virtual memory management and page cache behavior.
vm.swappiness: Controls how aggressively the kernel swaps anonymous memory pages to disk relative to reclaiming pages from the page cache. Higher values favor swapping; lower values favor keeping processes in RAM.
  • Range: 0–200
  • Default: 60
  • Value 0: avoid swapping unless absolutely necessary (does not disable swap entirely)
  • Value 100: balance between swapping and page cache reclaim
  • Value 200: aggressively swap anonymous memory
# Desktop/interactive systems: reduce swapping
sysctl -w vm.swappiness=10

# Database servers: minimize swapping
sysctl -w vm.swappiness=1
vm.dirty_ratio: The maximum percentage of available memory (free plus reclaimable pages) that can be filled with dirty pages, meaning pages not yet written back to disk, before a process writing data is itself blocked and made to write pages back.
vm.dirty_background_ratio: The percentage of available memory at which background writeback of dirty pages begins. Because background flushing starts at this lower threshold, dirty_ratio is reached less often.
  • Default dirty_ratio: 20
  • Default dirty_background_ratio: 10
# Reduce write stall latency on systems with large RAM
sysctl -w vm.dirty_background_ratio=5
sysctl -w vm.dirty_ratio=10
On systems with very large RAM, the ratio-based settings may allow hundreds of gigabytes of dirty data. Use vm.dirty_bytes and vm.dirty_background_bytes for absolute limits instead.
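To get a feel for the numbers involved, the ratio can be converted to an approximate byte ceiling. This is a simplified sketch: dirty_limit_bytes is a hypothetical helper, and it applies the percentage to a memory size you supply, whereas the kernel applies it to available memory at run time.

```shell
# Approximate dirty-page ceiling in bytes (hypothetical helper).
# $1 = memory size in bytes, $2 = ratio as a percentage.
dirty_limit_bytes() {
    echo $(( $1 * $2 / 100 ))
}

# 100 GiB of memory with the default dirty_ratio of 20
dirty_limit_bytes $(( 100 * 1024 * 1024 * 1024 )) 20   # 21474836480 (20 GiB)
```

A result in this range is the kind of figure you might then cap explicitly with vm.dirty_bytes.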
vm.overcommit_memory: Controls how the kernel handles memory overcommitment, that is, allocating more virtual memory than the physical RAM and swap available.
Value  Mode               Description
0      Heuristic          Kernel uses heuristics to allow reasonable overcommit (default)
1      Always overcommit  All allocations succeed; OOM killer handles shortfalls
2      Never overcommit   Total committed memory cannot exceed swap plus overcommit_ratio% of RAM
# Production database servers: prevent overcommit
sysctl -w vm.overcommit_memory=2
sysctl -w vm.overcommit_ratio=80

# Redis/in-memory workloads: allow overcommit for fork-based persistence
sysctl -w vm.overcommit_memory=1
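In mode 2 the commit limit is commonly described as swap plus overcommit_ratio percent of RAM. The sketch below computes that figure with a hypothetical helper, commit_limit_kib; it ignores huge pages and the alternative vm.overcommit_kbytes setting, so treat the result as an estimate and compare it with CommitLimit in /proc/meminfo.

```shell
# Estimated commit limit in KiB for vm.overcommit_memory=2 (hypothetical helper).
# $1 = RAM in KiB, $2 = swap in KiB, $3 = vm.overcommit_ratio (percent).
commit_limit_kib() {
    echo $(( $2 + $1 * $3 / 100 ))
}

# 64 GiB RAM, 8 GiB swap, overcommit_ratio=80
commit_limit_kib 67108864 8388608 80   # 62075699
```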
vm.min_free_kbytes: The minimum amount of free memory (in KiB) that the kernel tries to maintain. If free memory drops below this level, the kernel invokes the memory reclaim path. Increasing this value can help systems with high memory pressure avoid stalls, but setting it too high wastes memory.
# On a 64 GiB system, set a 1 GiB minimum
sysctl -w vm.min_free_kbytes=1048576
vm.vfs_cache_pressure: Controls the tendency of the kernel to reclaim memory used for dentries and inodes (filesystem metadata caches) relative to page cache and swap.
  • Value 100: kernel reclaims dentries and inodes at the same rate as page cache (default)
  • Value < 100: kernel prefers to keep dentry/inode caches
  • Value > 100: kernel more aggressively reclaims dentry/inode memory
# Filesystem-heavy workloads: retain more dentry/inode cache
sysctl -w vm.vfs_cache_pressure=50

net.core.*

Core network stack parameters shared across all protocols.
net.core.somaxconn: The maximum length of the listen backlog queue for TCP sockets. This limits the number of connections that can be queued waiting to be accepted by a listening socket.
  • Default: 4096 (raised from 128 in kernel 5.4)
# High-traffic web servers
sysctl -w net.core.somaxconn=65535
Applications must also set a large backlog argument to listen(2). The effective limit is min(backlog, somaxconn).
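The min(backlog, somaxconn) rule can be written out as a small check. The helper name effective_backlog and the example values are illustrative only.

```shell
# Effective accept-queue limit: the smaller of the application's
# listen() backlog and net.core.somaxconn (hypothetical helper).
# $1 = backlog passed to listen(2), $2 = somaxconn.
effective_backlog() {
    if [ "$1" -lt "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

effective_backlog 511 4096     # 511  (the application is the bottleneck)
effective_backlog 65535 4096   # 4096 (somaxconn is the bottleneck)
```

In the first case raising somaxconn changes nothing until the application itself requests a larger backlog.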
net.core.netdev_max_backlog: The maximum number of packets allowed to queue on a network interface’s input side when the interface receives frames faster than the kernel can process them.
  • Default: 1000
# High-throughput servers with fast NICs
sysctl -w net.core.netdev_max_backlog=5000
net.core.rmem_max and net.core.wmem_max: The maximum socket receive and send buffer sizes (in bytes) that applications can request via SO_RCVBUF and SO_SNDBUF, respectively.
# Allow 4 MiB buffers for high-bandwidth connections
sysctl -w net.core.rmem_max=4194304
sysctl -w net.core.wmem_max=4194304

net.ipv4.*

IPv4 networking parameters.
net.ipv4.tcp_syncookies: Enables TCP SYN cookies, a defense against SYN flood denial-of-service attacks. When the listen backlog queue is full, the kernel encodes a cryptographic cookie in the SYN-ACK response instead of allocating connection state. Only legitimate connections that complete the handshake are then accepted.
  • Default: 1 (enabled)
sysctl -w net.ipv4.tcp_syncookies=1
Keep this enabled on any internet-facing system. Disabling it makes the system trivially vulnerable to SYN flood attacks.
net.ipv4.ip_forward: Enables IP packet forwarding between interfaces. Required for routers, NAT gateways, VPN servers, and container/VM hosts.
  • Default: 0 (disabled)
# Enable routing/forwarding
sysctl -w net.ipv4.ip_forward=1

# Also available per-interface:
sysctl -w net.ipv4.conf.eth0.forwarding=1
net.ipv4.tcp_fin_timeout: The number of seconds an orphaned connection (one no longer referenced by any application) remains in the FIN_WAIT_2 state waiting for a final FIN packet before the kernel forcibly closes it.
  • Default: 60 seconds
# Reduce to reclaim sockets faster on busy servers
sysctl -w net.ipv4.tcp_fin_timeout=30
net.ipv4.tcp_tw_reuse: Allows reusing TIME_WAIT sockets for new outgoing connections when it is safe to do so from a protocol perspective.
  • Default: 2 (enabled for loopback only)
sysctl -w net.ipv4.tcp_tw_reuse=1
net.ipv4.conf.*.rp_filter: Enables Reverse Path Filtering (RPF), which drops packets whose source address is not reachable via the interface they arrived on. This helps prevent IP spoofing attacks.
Value  Mode
0      Disabled
1      Strict mode: drop if not on best route back
2      Loose mode: drop if no route back at all
sysctl -w net.ipv4.conf.all.rp_filter=1
sysctl -w net.ipv4.conf.default.rp_filter=1

fs.*

Filesystem and file descriptor limits.
fs.file-max: The system-wide maximum number of open file handles. When this limit is reached, attempts to open files fail with ENFILE.
# View current usage and limit
cat /proc/sys/fs/file-nr
# allocated  free  max
# 17984      0     9223372036854775807

# Increase the limit
sysctl -w fs.file-max=2097152
Per-process file descriptor limits are controlled separately via ulimit -n and /etc/security/limits.conf (pam_limits).
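The three fields of file-nr can be turned into a usage percentage for monitoring. The helper below, file_handle_usage_pct, is hypothetical, and the example numbers are made up for illustration.

```shell
# Percentage of the system-wide file handle limit in use (hypothetical helper).
# Arguments are the three fields of /proc/sys/fs/file-nr:
# $1 = allocated handles, $2 = allocated but unused handles, $3 = maximum.
file_handle_usage_pct() {
    echo $(( ($1 - $2) * 100 / $3 ))
}

file_handle_usage_pct 1048576 0 2097152   # 50
# On a live system: file_handle_usage_pct $(cat /proc/sys/fs/file-nr)
```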
fs.inotify.*: Controls the inotify filesystem event notification subsystem, used by file managers, IDE plugins, build watchers, and many other tools.
fs.inotify.max_user_watches: Maximum number of inotify watches a single user can create.
  • Default: 8192
fs.inotify.max_user_instances: Maximum number of inotify instances (file descriptors) per user.
  • Default: 128
fs.inotify.max_queued_events: Maximum number of queued events per inotify instance.
  • Default: 16384
# Large codebases monitored by editors (VS Code, JetBrains IDEs)
sysctl -w fs.inotify.max_user_watches=524288
sysctl -w fs.inotify.max_user_instances=512
fs.suid_dumpable: Controls whether core dumps are generated for setuid/setgid executables.
Value  Behavior
0      No core dump for setuid programs (default)
1      Core dumps enabled for all programs (insecure; intended for debugging)
2      Core dumps readable by root only; written only if core_pattern is a pipe handler or an absolute path
# Secure setting (default)
sysctl -w fs.suid_dumpable=0

Useful sysctl snippets

# Restrict kernel pointer exposure
kernel.kptr_restrict = 2

# Restrict dmesg to privileged users
kernel.dmesg_restrict = 1

# Restrict perf_event access
kernel.perf_event_paranoid = 2

# Full ASLR
kernel.randomize_va_space = 2

# Prevent core dumps from setuid programs
fs.suid_dumpable = 0
