Your Kernel Has Settings and You’re Not Using Any of Them
Linux ships with conservative defaults. The kernel parameters are tuned for a wide range of hardware and workloads — a desktop, a database server, a router, a 4GB RAM NUC running four Docker containers — all reasonable, none optimal for any specific use case.
sysctl is the interface to these parameters. It lets you read and write kernel settings at runtime without rebooting, and persist them across reboots via configuration files. The settings live in the /proc/sys/ virtual filesystem, which means you can also read and write them directly:
# Read a parameter
sysctl vm.swappiness
# or directly:
cat /proc/sys/vm/swappiness
# Write a parameter (temporary, lost on reboot)
sysctl -w vm.swappiness=10
# or directly:
echo 10 > /proc/sys/vm/swappiness
That last method of writing directly to /proc/sys/ is useful for quick tests, but don't rely on it for production changes. Use sysctl -w instead: it validates the parameter name and makes the change explicit. It also sidesteps a classic trap: sudo echo 10 > /proc/sys/vm/swappiness fails with a permission error, because the redirection is performed by your unprivileged shell, not by sudo.
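The two forms are mechanically related: a sysctl name maps to its /proc/sys path by replacing dots with slashes. A tiny helper (hypothetical, just for illustration) makes the mapping explicit:

```shell
#!/bin/sh
# Convert a sysctl name to its /proc/sys path (hypothetical helper).
# Caveat: this breaks for the rare keys whose components themselves
# contain a dot, e.g. VLAN interface names like eth0.100.
to_path() {
  echo "/proc/sys/$(echo "$1" | tr '.' '/')"
}

to_path net.core.somaxconn   # prints /proc/sys/net/core/somaxconn
```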
Making Changes Persistent
Temporary changes via sysctl -w are lost on reboot. To persist them, write to a file in /etc/sysctl.d/:
# Create a custom config file (the number prefix controls load order)
sudo nano /etc/sysctl.d/99-custom.conf
Add your parameters:
vm.swappiness = 10
net.core.somaxconn = 65535
Apply immediately without rebooting:
sudo sysctl --system
# or for a specific file:
sudo sysctl -p /etc/sysctl.d/99-custom.conf
Using /etc/sysctl.d/ instead of editing /etc/sysctl.conf directly is better practice — it keeps your customizations separate from distro-provided settings and makes it obvious which changes are yours.
Network Parameters That Actually Matter
net.core.somaxconn
This controls the maximum length of the accept queue for a listening socket: how many established connections can be waiting for the application to accept() them before the kernel starts dropping new attempts.
The default is 4096 on kernels 5.4 and newer, and 128 before that. For any server accepting significant traffic, this is too low:
net.core.somaxconn = 65535
Your web server or reverse proxy also needs to be configured to match; Nginx's backlog option on the listen directive, for example. The effective queue length is the lower of the kernel limit and the application's value.
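To make the "lower one wins" rule concrete, here is a sketch with assumed numbers: 65535 from the setting above, and 511, which is Nginx's default listen backlog when none is specified:

```shell
#!/bin/sh
# Effective accept backlog = min(kernel limit, application's listen() backlog).
somaxconn=65535    # net.core.somaxconn, as set above
app_backlog=511    # nginx's default when backlog= isn't given on listen
effective=$(( somaxconn < app_backlog ? somaxconn : app_backlog ))
echo "effective backlog: $effective"   # prints: effective backlog: 511
```

Raising the kernel limit alone buys you nothing until the application raises its side too.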
net.ipv4.tcp_tw_reuse
When a TCP connection closes, the socket enters the TIME_WAIT state (60 seconds on Linux; the spec prescribes twice the maximum segment lifetime) to ensure any late packets from the old connection don't interfere with a new one on the same port. Under heavy load, say a server handling thousands of short-lived connections per second, you can exhaust ephemeral ports waiting for TIME_WAIT sockets to expire.
tcp_tw_reuse allows the kernel to reuse TIME_WAIT sockets for new outbound connections when it’s safe to do so:
net.ipv4.tcp_tw_reuse = 1
Important: this is safe for outbound connections (your server connecting to others), and it requires TCP timestamps (net.ipv4.tcp_timestamps = 1, the default) to work. Don't confuse it with tcp_tw_recycle, which was removed in kernel 4.12 because it broke connections through NAT.
Network Buffer Sizes
The default socket receive and send buffer sizes are sized for a network from a decade ago. On modern hardware with fast networks, increasing these allows the kernel to buffer more data in flight:
# Default socket receive buffer (bytes)
net.core.rmem_default = 262144
# Maximum socket receive buffer
net.core.rmem_max = 16777216
# Default socket send buffer
net.core.wmem_default = 262144
# Maximum socket send buffer
net.core.wmem_max = 16777216
# TCP receive buffer: min, default, max
net.ipv4.tcp_rmem = 4096 262144 16777216
# TCP send buffer: min, default, max
net.ipv4.tcp_wmem = 4096 262144 16777216
These numbers are a reasonable starting point for a server with 4-8GB RAM and a Gigabit+ network. For 10GbE or higher, you’d scale up further.
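The "right" maximum is workload-dependent, but the bandwidth-delay product (BDP) gives a rough floor: the buffer should hold at least one round-trip's worth of in-flight data. A sketch with assumed figures (1 Gbit/s link, 20 ms RTT):

```shell
#!/bin/sh
# Bandwidth-delay product: bytes in flight on the path at full utilization.
bits_per_sec=1000000000   # 1 Gbit/s link (assumed)
rtt_ms=20                 # round-trip time (assumed)
bdp_bytes=$(( bits_per_sec / 8 * rtt_ms / 1000 ))
echo "BDP: $bdp_bytes bytes"   # prints: BDP: 2500000 bytes
# ~2.5 MB, so the 16 MB max above leaves headroom for longer RTTs
# or faster links.
```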
net.ipv4.tcp_syncookies
SYN flood protection. This is usually already enabled on modern distributions, but worth verifying:
net.ipv4.tcp_syncookies = 1
When the SYN backlog fills up, the kernel uses SYN cookies to complete the handshake without using backlog queue space. This mitigates SYN flood attacks. Leave it on.
net.ipv4.ip_local_port_range
The range of ports available for outbound connections. The default (32768-60999) gives you ~28,000 ephemeral ports. For servers making heavy outbound connections:
net.ipv4.ip_local_port_range = 1024 65535
This gives you ~64,000 ports. Combined with tcp_tw_reuse, this significantly increases the connection capacity for outbound-heavy workloads. One caveat: if a service listens on a fixed port inside the widened range, exclude that port via net.ipv4.ip_local_reserved_ports so the kernel doesn't hand it out as an ephemeral port.
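Back-of-the-envelope math shows why the range matters: without reuse, each closed outbound connection holds an ephemeral port for the full TIME_WAIT period, which bounds the sustainable rate of short-lived connections to any one destination (sketch, assuming Linux's 60-second TIME_WAIT):

```shell
#!/bin/sh
# Port-exhaustion math: each closed outbound connection ties up an
# ephemeral port for the TIME_WAIT period.
ports=$(( 65535 - 1024 + 1 ))   # the expanded range above
timewait_s=60                   # Linux's fixed TIME_WAIT length
rate=$(( ports / timewait_s ))
echo "$ports ports -> ~$rate short-lived conn/s per destination"
```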
Memory Parameters
vm.swappiness
This is the one everyone knows about. vm.swappiness controls how aggressively the kernel uses swap space. The default is 60, which biases the kernel toward swapping out application (anonymous) memory in order to preserve the page cache, even when there's still memory available.
For a server where you want to prioritize keeping application data in RAM:
vm.swappiness = 10
Lower values make the kernel less eager to swap. Setting it to 0 doesn’t disable swap entirely — it just tells the kernel to avoid it unless absolutely necessary. The kernel will still use swap if RAM is actually exhausted.
For a system with no swap configured, this setting is irrelevant. For a desktop or system where swap responsiveness matters, 10-20 is common. For a database server where latency spikes from swap activity are catastrophic, 1-5 is common.
Setting it to 0 was once a common recommendation that Red Hat has since walked back: 1 is better than 0, because a completely swap-avoidant kernel can OOM-kill processes even when there is memory that could be reclaimed through swap.
vm.dirty_ratio and vm.dirty_background_ratio
These control when the kernel flushes dirty pages (data written to filesystem but not yet synced to disk) to storage.
vm.dirty_background_ratio: percentage of total memory at which background flushing starts
vm.dirty_ratio: percentage at which processes writing to the filesystem are blocked until the flush completes
The defaults are typically 10% and 20% respectively. For a write-heavy workload on fast storage (NVMe), raising the hard limit reduces how often writers are forced to block on flushes:
vm.dirty_ratio = 40
vm.dirty_background_ratio = 10
For a system where you can’t afford data loss (no UPS, writing to spinning rust), keep these lower so data reaches disk sooner.
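Because the ratios are percentages of memory, the same settings mean very different byte counts on different machines. A sketch for an assumed box with 8000 MB of RAM:

```shell
#!/bin/sh
# Translate the dirty-page ratios into approximate megabytes.
ram_mb=8000                        # assumed machine size
bg_mb=$(( ram_mb * 10 / 100 ))     # background flush threshold
hard_mb=$(( ram_mb * 40 / 100 ))   # writers block past this
echo "background flush at ${bg_mb} MB, writers block at ${hard_mb} MB"
# prints: background flush at 800 MB, writers block at 3200 MB
```

On large-RAM machines, the absolute-valued vm.dirty_bytes and vm.dirty_background_bytes variants avoid the percentage math entirely.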
vm.vfs_cache_pressure
Controls how aggressively the kernel reclaims memory used for directory and inode caches. The default is 100 (balanced). Lower values cause the kernel to favor keeping these caches:
vm.vfs_cache_pressure = 50
On a server with heavy filesystem operations (lots of file opens, directory lookups), keeping more inode/dentry cache can improve performance. On a memory-constrained system, leave this at default.
Container-Relevant Parameters
If you’re running Docker, these parameters affect container networking:
# Required for container networking (usually set by Docker automatically)
net.ipv4.ip_forward = 1
# Make bridged container traffic visible to iptables (requires the br_netfilter module)
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
# Maximum number of memory map areas (needed for Elasticsearch and some JVM apps)
vm.max_map_count = 262144
The vm.max_map_count one will save you a headache — Elasticsearch refuses to start with the default value and prints a helpful error message telling you to set it. Setting it proactively means your Elasticsearch container starts without drama.
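A quick preflight sketch: check the running value against Elasticsearch's documented minimum before starting the container (262144, matching the setting above):

```shell
#!/bin/sh
# Preflight check for vm.max_map_count before launching Elasticsearch.
need=262144
have=$(cat /proc/sys/vm/max_map_count 2>/dev/null || echo 0)
if [ "$have" -ge "$need" ]; then
  echo "vm.max_map_count OK ($have)"
else
  echo "raise vm.max_map_count to at least $need (currently $have)"
fi
```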
A Practical sysctl.conf for a Docker Host
Here’s a ready-to-use configuration for a general-purpose Linux server running Docker:
# /etc/sysctl.d/99-docker-host.conf
# Docker host optimization
# --- Networking ---
# Increase connection backlog queue
net.core.somaxconn = 65535
# Allow reuse of TIME_WAIT sockets for outbound connections
net.ipv4.tcp_tw_reuse = 1
# SYN flood protection (usually already on)
net.ipv4.tcp_syncookies = 1
# Increase ephemeral port range
net.ipv4.ip_local_port_range = 1024 65535
# Increase socket buffer sizes
net.core.rmem_default = 262144
net.core.rmem_max = 16777216
net.core.wmem_default = 262144
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 262144 16777216
net.ipv4.tcp_wmem = 4096 262144 16777216
# Container networking
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
# --- Memory ---
# Don't swap aggressively
vm.swappiness = 10
# Balance dirty page flushing
vm.dirty_ratio = 40
vm.dirty_background_ratio = 10
# Keep inode/dentry cache
vm.vfs_cache_pressure = 50
# Required for Elasticsearch and some JVM apps in containers
vm.max_map_count = 262144
Apply it:
sudo sysctl --system
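A read-only spot-check sketch to confirm the running kernel matches a few of the values from the file above (parameter names are the ones used in this article):

```shell
#!/bin/sh
# Print the live values of a few parameters we just configured.
for p in net.core.somaxconn net.ipv4.tcp_tw_reuse vm.swappiness vm.max_map_count; do
  printf '%-28s %s\n' "$p" "$(sysctl -n "$p" 2>/dev/null || echo '(not available)')"
done
```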
Testing Changes Safely
The right workflow for sysctl changes:
- Apply temporarily first: sysctl -w parameter=value
- Verify it didn't break anything: test your workload
- Benchmark if relevant: tools like iperf3 for network, fio for disk
- Persist to /etc/sysctl.d/ once you're satisfied
- Test a reboot: make sure the parameters come back correctly
To see all currently loaded parameters:
sysctl -a 2>/dev/null | grep -v "kernel.printk"
To see what your system currently has for a specific category:
sysctl -a | grep "net.ipv4.tcp"
The kernel parameters are one of those areas where “default is fine for most people” is genuinely true. You don’t need to tune sysctl on your laptop or a lightly-loaded home server. But when you start running services that handle real traffic or real data, understanding these parameters means the difference between “it’s getting slow under load” and “I know exactly which knob to turn.”