
The Linux OOM Killer: Why It's Killing Your App

By SumGuy 5 min read

Your app is running fine. Memory usage looks normal. Then, without warning, the process dies. No error message. No crash dump. Just gone. You check the logs and find nothing. But if you look at the kernel logs, you’ll see it: your process was killed by the OOM killer.

OOM stands for “Out of Memory.” When Linux exhausts its RAM and can’t reclaim or allocate any more, the kernel invokes the OOM killer, which terminates processes to free up memory. The question is: which process gets the axe? The answer is more sophisticated than you might expect.

How the OOM Killer Works

When the system runs out of memory, the kernel runs a scoring algorithm on every process. It calculates an oom_score for each one based on:

  1. How much memory the process uses (higher score = more likely to be killed)
  2. The process’s oom_score_adj value (you can adjust this manually)
  3. Whether the process is exempt (init and kernel threads are never killed)

The process with the highest score gets killed. This frees up its memory. If the system is still out of memory, the next highest-scoring process gets killed, and so on.
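You can watch this ranking from user space. As a sketch (assuming a Linux `/proc` filesystem), this lists the five most killable processes by their current `oom_score`:

```shell
# Rank processes by oom_score, highest (most killable) first.
# Processes may exit mid-loop, so failed reads are skipped.
for p in /proc/[0-9]*; do
  score=$(cat "$p/oom_score" 2>/dev/null) || continue
  comm=$(cat "$p/comm" 2>/dev/null) || continue
  printf '%6s  %6s  %s\n' "$score" "${p#/proc/}" "$comm"
done | sort -rn | head -n 5
```

The top line of the output is the process the OOM killer would target first if memory ran out right now.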

Reading the Evidence

When the OOM killer strikes, it logs to the kernel ring buffer. Check dmesg:

Terminal window
$ dmesg | tail -n 50
Out of memory: Kill process 1234 (myapp) score 800 or sacrifice child
Killed process 1234 (myapp) total-vm:2097152kB, anon-rss:1048576kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:2048kB oom_score_adj:0

There it is. Process 1234 (myapp) was killed because it had the highest oom_score. The kernel tells you the total virtual size (total-vm), the resident memory the process was actually holding (anon-rss), the owning UID, and the oom_score_adj that was in effect.
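As a sketch, the key numbers can be pulled out of such a line with standard text tools (the sample below is the line from the log above):

```shell
# Extract PID, process name, and resident memory from an OOM kill line.
line='Killed process 1234 (myapp) total-vm:2097152kB, anon-rss:1048576kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:2048kB oom_score_adj:0'

pid=$(echo "$line" | awk '{print $3}')            # third field: the PID
name=$(echo "$line" | awk '{print $4}' | tr -d '()')  # fourth field: (name)
rss=$(echo "$line" | grep -o 'anon-rss:[0-9]*' | cut -d: -f2)

echo "pid=$pid name=$name anon-rss=${rss}kB"
# prints: pid=1234 name=myapp anon-rss=1048576kB
```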

Understanding oom_score

Check a process’s current score:

Terminal window
$ cat /proc/$(pgrep myapp)/oom_score
800

The score is calculated roughly as:

oom_score = (memory_used / total_system_memory) * 1000 + oom_score_adj

where “memory used” counts the process’s resident memory, swap entries, and page tables.

A process using 800 MB on a 1 GB system gets a score around 800. If it’s the largest process, it’s the first candidate for death.
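A quick sanity check of that arithmetic (800 MB of 1024 MB, with a zero adjustment), using the same integer math the kernel does:

```shell
# (used * 1000) / total + oom_score_adj
used_mb=800
total_mb=1024
adj=0
echo $(( used_mb * 1000 / total_mb + adj ))
# prints 781 — "around 800", as described above
```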

Protecting Critical Processes

You can adjust oom_score_adj to protect important processes. Lower scores = less likely to be killed. You can even go negative.

For a Running Process

Terminal window
# Protect your database (subtract 500 from its score)
$ sudo bash -c "echo -500 > /proc/$(pgrep postgres)/oom_score_adj"
# Verify it worked
$ cat /proc/$(pgrep postgres)/oom_score_adj
-500
# Check the new score
$ cat /proc/$(pgrep postgres)/oom_score
100 # Much lower than before

Negative values make the process much less likely to be killed. -1000 is special — it disables OOM killing for that process entirely.
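Score and adjustment can be read side by side for any PID; as a sketch, this inspects the current shell (`$$`). util-linux also ships `choom`, which wraps the same /proc write (e.g. `choom -p "$(pgrep postgres)" -n -500`, typically as root).

```shell
# Show the OOM score and its adjustment for one PID (here: this shell).
pid=$$
echo "oom_score:     $(cat /proc/$pid/oom_score)"
echo "oom_score_adj: $(cat /proc/$pid/oom_score_adj)"   # -1000 .. 1000
```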

For a Systemd Service

Set it permanently in the service file:

/etc/systemd/system/postgres.service
[Service]
OOMScoreAdjust=-900
ExecStart=/usr/bin/postgres

Reload and restart:

Terminal window
sudo systemctl daemon-reload
sudo systemctl restart postgres

When the OOM Killer Is Triggered

The kernel runs the OOM killer when:

  1. No free memory left — Allocation fails
  2. No page cache to drop — Can’t reclaim memory from caches
  3. No swap space, or swap is full — Can’t push memory to disk

On modern systems with plenty of RAM, this is rare. But on constrained systems (small VMs, containers), it happens.

Preventing the OOM Killer

1. Add Swap Space

Swap lets the kernel push memory to disk, buying time:

Terminal window
# Create a 4GB swap file
$ sudo fallocate -l 4G /swapfile
$ sudo chmod 600 /swapfile
$ sudo mkswap /swapfile
$ sudo swapon /swapfile
# Make it permanent
$ echo "/swapfile none swap sw 0 0" | sudo tee -a /etc/fstab

Swap is slower than RAM, but it prevents the OOM killer from firing unless you’re really out of memory.
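To confirm the new swap area is actually active, check /proc/swaps (present on any Linux system); header-only output means no swap is in use:

```shell
# Active swap areas, one per line after the "Filename ..." header
cat /proc/swaps
# util-linux equivalent (empty output = no swap):
swapon --show
```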

2. Monitor Memory Usage

Don’t let memory creep up unexpectedly:

Terminal window
$ free -h
               total        used        free      shared  buff/cache   available
Mem:            15Gi       8.5Gi       1.2Gi       256Mi       5.2Gi       6.0Gi
Swap:          4.0Gi       0.0Gi       4.0Gi

available is what actually matters — it’s free memory + page cache that can be reclaimed. If it’s consistently low, your workload is too big for your system.
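That “available” figure is exported as MemAvailable in /proc/meminfo, which makes a crude alert easy to script. A sketch (the 10% threshold is an arbitrary choice, not a kernel default):

```shell
# Warn when available memory drops below 10% of total.
avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
pct=$(( avail_kb * 100 / total_kb ))

if [ "$pct" -lt 10 ]; then
  echo "WARNING: only ${avail_kb} kB available (${pct}%)"
else
  echo "OK: ${avail_kb} kB available (${pct}%)"
fi
```

Run it from cron or a systemd timer and pipe the WARNING line to your alerting of choice.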

3. Limit Container/Process Memory

If you’re running containers or processes with memory limits (cgroups), set them realistically:

Terminal window
# Limit a process to 2GB (MemoryMax on current systemd; MemoryLimit is the deprecated cgroup-v1 name)
$ systemd-run --scope -p MemoryMax=2G ./myapp

This is better than letting a process consume unlimited memory and then getting killed by OOM.
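For a long-running service, the same cap belongs in the unit file rather than a one-off scope. A sketch (the service name and path are placeholders; MemoryMax is the cgroup-v2 hard limit, MemoryHigh a soft threshold where the kernel starts reclaiming early):

```ini
# /etc/systemd/system/myapp.service (illustrative)
[Service]
MemoryMax=2G
MemoryHigh=1800M
ExecStart=/usr/local/bin/myapp
```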

Real-World Scenario

You run a database on a 4 GB VM. The database caches data in RAM, gradually consuming memory. Other processes also use memory. Eventually, free memory drops to near zero.

The OOM killer runs. It sees:

  1. postgres — the largest resident memory, the highest score
  2. nginx — modest memory, a lower score
  3. Assorted small helper processes — low scores

It kills postgres because it has the highest score. But postgres is critical! Your app crashes when it tries to connect.

Solution: Protect postgres:

Terminal window
$ sudo bash -c "echo -900 > /proc/$(pgrep postgres)/oom_score_adj"

Now postgres’s score drops far below everything else’s, and nginx gets killed instead (it can simply be restarted). Critical services survive.

The Warning Signs

Before the OOM killer fires, you’ll usually see:

  1. System becomes unresponsive (all memory consumed, kernel swapping)
  2. Load average spikes
  3. Disk I/O goes crazy (kernel writing to swap)
  4. Eventually, processes start getting killed one by one

Check dmesg frequently on systems under memory pressure:

Terminal window
$ dmesg | grep -i "out of memory" | tail -n 5

Key Takeaway

The OOM killer is a last-resort survival mechanism. It keeps the system alive at the cost of killing processes. On a properly sized system, you should never see it. But if you’re running on constrained hardware (small VMs, shared hosting, containers), understanding OOM is essential.

Monitor memory, add swap, protect critical processes with negative oom_score_adj, and size your workloads realistically. That’s how you avoid mysterious 3 AM process deaths.

