CVE-2026-31431: The 9-Year Linux Root Bug

An AI auditor scanned the Linux kernel’s crypto subsystem for approximately one hour on April 29, 2026, and found a local privilege escalation bug that nine years of human code review had completely missed. The vulnerability, CVE-2026-31431, nicknamed “copy.fail”, lets any unprivileged local user become root. On virtually every mainstream Linux system. Since 2017. That binary you’ve been trusting to gate your su commands? It’s been quietly betrayable this whole time.

What Is copy.fail?

The kernel’s AF_ALG socket interface (algif_aead) handles authenticated encryption from userspace. Sometime around 2017, the implementation started reusing page cache pages in a writable destination scatterlist during encryption operations. The kernel was being clever with memory: skip the copy, reuse the page, save some cycles. Reasonable idea. One small problem: it left those page cache pages writable at exactly the wrong moment.

Here’s what that means in practice. Page cache is where the kernel keeps file data it’s loaded into memory. When you run /usr/bin/su, the kernel reads that binary into page cache. Normally, that page is read-only from userspace’s perspective. But by abusing the interaction between AF_ALG and splice(), an attacker can turn the kernel’s “let’s reuse this page” optimization into a controlled write straight into the page cache. It’s deterministic, no race to win, so the attacker can modify the in-memory copy of a setuid binary right before it executes. The modified page stays in cache until evicted or the next reboot, long enough for su to pick it up.

Not the binary on disk. The in-memory copy. That means the change does not persist across reboots: only the page cache is modified, the real binary on disk is untouched, and after a reboot the original, unmodified binary gets loaded fresh from disk. Cold comfort when you’re trying to explain to your boss why someone had root for the last six hours, but it does mean your disks aren’t silently corrupted.

The vulnerability has been lurking in the kernel since 2017. That’s nine annual Pwn2Own contests, nine years of kernel security audits, nine years of distro hardening, and nine years of CVE databases that didn’t include this one.

Who’s Affected?

Everyone running a mainstream Linux distribution, which is to say, almost everyone. AF_ALG socket support is compiled into the kernel by default on Ubuntu, Debian, RHEL, Fedora, Arch, Amazon Linux, and SUSE. If your distro shipped a kernel built after 2017 with the default config, you’re affected.
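One way to see whether a given host or container actually exposes the interface is to try to open it. The sketch below is a defensive probe, not the PoC: it creates an AF_ALG socket, binds it to an AEAD algorithm, and reports what happened. The function name and the choice of gcm(aes) as the test algorithm are assumptions made for illustration. Two caveats: "reachable" only means the interface is exposed (a patched kernel will still report it), and on a host where the module is not blocked, a successful bind may cause the kernel to auto-load algif_aead.

```python
import errno
import socket


def probe_af_alg_aead(algorithm: str = "gcm(aes)") -> str:
    """Report whether an AF_ALG AEAD socket can be opened from this process."""
    family = getattr(socket, "AF_ALG", None)  # only defined on Linux builds of Python
    if family is None:
        return "unsupported: this Python build has no AF_ALG"
    try:
        sock = socket.socket(family, socket.SOCK_SEQPACKET, 0)
    except OSError as exc:
        # Typically EPERM under a seccomp policy, or EAFNOSUPPORT without AF_ALG support.
        return f"blocked at socket(): {errno.errorcode.get(exc.errno, exc.errno)}"
    try:
        # AF_ALG addresses are (type, name) tuples; "aead" selects the AEAD interface.
        sock.bind(("aead", algorithm))
    except OSError as exc:
        # Typically ENOENT when the AEAD interface or the algorithm is unavailable.
        return f"blocked at bind(): {errno.errorcode.get(exc.errno, exc.errno)}"
    finally:
        sock.close()
    return "reachable"


if __name__ == "__main__":
    print(probe_af_alg_aead())
```

Run it as the same unprivileged user, and inside the same container, that you are worried about. The result depends on where it runs.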

Risk isn’t uniform, though. Here’s the practical breakdown:

CRITICAL: Multi-tenant hosts, Kubernetes clusters, CI runners, shared cloud environments. Any environment where untrusted or semi-trusted users get shell access is in the worst position here. One compromised container breakout, one malicious CI job, one tenant who figured out lateral movement, and they’re root on the host. This is the “patch it yesterday” tier.

HIGH: Standard Linux servers. You’re probably not running untrusted user code directly, but this chains beautifully with a web RCE vulnerability for post-exploitation step-up. Attacker gets a shell as www-data through your PHP app, then runs the PoC, and now they’re root. Two bugs, one coffee break.

MEDIUM: Single-user home lab machines. Needs local access to exploit. If you’re the only person with a shell on your NUC running Jellyfin, your exposure is lower. Still worth patching: “needs local access” is what every privilege escalation says right before someone chains it with something else.

How Bad Is It, Really?

The PoC is 732 bytes of Python. Standard library only. No external dependencies. Public at https://github.com/theori-io/copy-fail-CVE-2026-31431.

The exploit flow in plain English: an attacker opens an AF_ALG socket and sets it up for AEAD encryption. They then abuse the in-place copy optimization via splice() to perform a controlled write into the kernel’s page cache. The attacker dirties the in-memory page of a setuid binary (/usr/bin/su is the canonical target), injecting shellcode or a modified execution path. They then invoke su, and the modified in-memory binary runs with root privileges. Game over.

732 bytes. Python. Stdlib only. The bar for exploitation is “can you open a terminal.”

For the home lab reader who’s wondering whether to panic: no, don’t panic. Single-user desktops have meaningfully lower exposure than shared servers because the attacker needs a local shell to start. But “trivially exploitable once you have a shell” combined with “present on every Linux system for nine years” is not a “meh, I’ll get to it” situation. Patch it.

How to Fix It

Full mitigation toolkit: Clone the working files at github.com/KingPin/sumguy-examples/linux/cve-2026-31431

Three tiers depending on your situation. Tier 1 is the right answer. Tiers 2 and 3 are for when Tier 1 isn’t immediately possible.

Tier 1: Patch your kernel (preferred)

The mainline fix is shipping in patched kernels across every major distro. Check your distro’s security advisory for the exact commit, and look for algif_aead in the changelog. Check what you’re running, then update:

# Check the running kernel version
uname -r

# Debian/Ubuntu
sudo apt update && sudo apt full-upgrade -y && sudo reboot

# RHEL/Fedora
sudo dnf upgrade kernel -y && sudo reboot

After reboot, run uname -r again and confirm you’re on the patched version. Your distro’s security advisory will list the minimum safe version.
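If you have more than a handful of machines, that comparison is worth scripting. Here is a minimal sketch, assuming kernel release strings of the usual distro shape (for example 6.8.0-45-generic) and assuming kernel images live in /boot as vmlinuz-<release>. The helper names and the MINIMUM_SAFE placeholder are hypothetical; the real minimum has to come from your distro’s advisory. Because distros backport fixes, only compare against a version from the same distro and kernel series.

```python
import platform
import re
from pathlib import Path


def parse_release(release: str) -> tuple:
    """Turn '6.8.0-45-generic' into (6, 8, 0, 45), ignoring the non-numeric tail."""
    match = re.match(r"\d+(?:[.-]\d+)*", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in re.split(r"[.-]", match.group(0)))


def is_at_least(running: str, minimum: str) -> bool:
    """True if the running release sorts at or above the advisory's minimum."""
    return parse_release(running) >= parse_release(minimum)


def newest_installed(boot_dir: str = "/boot"):
    """Newest release with a vmlinuz-<release> image in boot_dir, or None."""
    releases = []
    for image in Path(boot_dir).glob("vmlinuz-*"):
        release = image.name[len("vmlinuz-"):]
        if release[:1].isdigit():  # skip unversioned names such as vmlinuz-linux
            releases.append(release)
    return max(releases, key=parse_release, default=None)


if __name__ == "__main__":
    MINIMUM_SAFE = "0.0.0"  # placeholder: replace with the version from your advisory
    running = platform.release()
    verdict = "meets" if is_at_least(running, MINIMUM_SAFE) else "is BELOW"
    print(f"running {running} {verdict} minimum {MINIMUM_SAFE}")
    newest = newest_installed()
    if newest and parse_release(newest) > parse_release(running):
        print(f"{newest} is installed but not booted yet; reboot to activate it")
```

The second check catches the common failure mode where the patched kernel is installed but the box never rebooted into it.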

Tier 2: Blacklist algif_aead (if you can’t patch yet)

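If you’re in a change freeze or the patched kernel isn’t available yet for your distro, you can block the vulnerable module. This drops AF_ALG AEAD support entirely, so verify nothing you’re running depends on it first (most things don’t). It also only works where algif_aead is built as a loadable module; if your kernel has it built in, blacklisting has no effect and you need Tier 1 or Tier 3.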

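# Save as a .conf file under /etc/modprobe.d/ (any file name ending in .conf works)
blacklist algif_aead
install algif_aead /bin/false

# Debian/Ubuntu: unload the module if loaded, then rebuild the initramfs
sudo modprobe -r algif_aead 2>/dev/null || true
sudo update-initramfs -u

# RHEL/Fedora: unload the module if loaded, then rebuild the initramfs
sudo modprobe -r algif_aead 2>/dev/null || true
sudo dracut --force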

The 2>/dev/null || true is there because the module might not be loaded yet. That’s fine: the blacklist still applies going forward.
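To check whether anything is using the module before you pull it, you can read /proc/modules, where the third field of each line is the module’s use count. The sketch below parses that file; the function name is hypothetical, and treating a nonzero count as "something is holding AF_ALG AEAD sockets open" is an assumption you should confirm on your own systems.

```python
from pathlib import Path


def module_use_count(name: str, proc_modules_text: str):
    """Return a module's use count from /proc/modules text, or None if not loaded."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # Line format: name size use_count dependents state address
        if len(fields) >= 3 and fields[0] == name:
            # Kernels built without module unloading do not report a count.
            return int(fields[2]) if fields[2].isdigit() else 0
    return None


if __name__ == "__main__":
    count = module_use_count("algif_aead", Path("/proc/modules").read_text())
    if count is None:
        print("algif_aead is not loaded (or is built into the kernel)")
    elif count == 0:
        print("algif_aead is loaded and idle; modprobe -r should succeed")
    else:
        print(f"algif_aead is loaded with use count {count}; find the user first")
```

A module that is absent from /proc/modules can still be built into the kernel image, in which case the blacklist does nothing.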

Tier 3: Container seccomp policy

For containerized workloads where you can’t immediately patch the host kernel, a targeted seccomp policy blocks the specific socket call the exploit uses:

{
  "defaultAction": "SCMP_ACT_ALLOW",
  "syscalls": [
    {
      "names": ["socket"],
      "action": "SCMP_ACT_ERRNO",
      "args": [{"index": 0, "value": 38, "op": "SCMP_CMP_EQ"}]
    }
  ]
}

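The 38 is AF_ALG’s socket family number. This policy allows every syscall except socket() calls that create AF_ALG sockets, which is exactly what the PoC needs to start. Save it as seccomp-no-afalg.json and apply it with --security-opt seccomp=seccomp-no-afalg.json in Docker or the equivalent in your Kubernetes pod spec. One caveat: a custom profile replaces the runtime’s default seccomp profile rather than adding to it, and this one allows everything else, so if you rely on the default restrictions, add the AF_ALG rule to a copy of the default profile instead. This is a containment measure, not a fix. Get the kernel patched.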
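If you fold the rule into a larger profile, it is easy to lose it in a merge. The sketch below loads a profile and checks that it still contains an explicit deny rule for socket() with AF_ALG as the first argument. The function name is hypothetical and the check is deliberately narrow: it only recognises a rule of the shape shown above, and will report False for a default-deny profile that blocks AF_ALG some other way.

```python
import json
import socket
import sys

# AF_ALG is 38 on Linux; fall back to the literal where Python does not define it.
AF_ALG = int(getattr(socket, "AF_ALG", 38))
DENY_ACTIONS = {"SCMP_ACT_ERRNO", "SCMP_ACT_KILL", "SCMP_ACT_KILL_PROCESS", "SCMP_ACT_TRAP"}


def denies_af_alg(profile: dict) -> bool:
    """True if the profile has an explicit deny rule for socket(AF_ALG, ...)."""
    for rule in profile.get("syscalls", []):
        if "socket" not in rule.get("names", []):
            continue
        if rule.get("action") not in DENY_ACTIONS:
            continue
        for arg in rule.get("args") or []:
            if (
                arg.get("index") == 0
                and arg.get("value") == AF_ALG
                and arg.get("op") == "SCMP_CMP_EQ"
            ):
                return True
    return False


if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else "seccomp-no-afalg.json"
    with open(path, encoding="utf-8") as handle:
        ok = denies_af_alg(json.load(handle))
    print(f"{path}: AF_ALG {'is denied' if ok else 'is NOT explicitly denied'}")
    sys.exit(0 if ok else 1)
```

The nonzero exit code on failure makes it usable as a CI gate for whatever repository holds your container profiles.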

The Part Where We Talk About the AI

Theori (using Xint Code, its AI-assisted security scanning tool) found CVE-2026-31431 in approximately one hour. Nine years of human review, including the kernel’s own security team, every major distro’s security engineers, academic researchers, and professional penetration testers, didn’t find it first.

Resist the urge to doom-post about this. The AI isn’t Skynet. It’s a better linter. A very fast, very thorough linter that doesn’t get tired at 11 PM and doesn’t skip modules that look boring. The Linux kernel is millions of lines of C that’s been accumulating complexity since 1991, so the surface area for this kind of thing is genuinely massive. The interesting question isn’t “why didn’t humans find it” but “how many more of these are in there.”

The honest answer is: probably more. Expect the cadence of AI-assisted vulnerability discovery to increase. Which means your patch cadence matters more than it ever has. Automate your kernel updates. Set up unattended security updates. Know how to check whether a distro patch has shipped. The window between “PoC drops” and “exploitation in the wild” keeps shrinking, and the PoC for this one is 732 bytes of Python.

Patch cadence is the new perimeter.

Patch your kernel today. If you can’t patch today, blacklist the module tonight. If you can’t do either right now, deploy the seccomp policy and put the kernel upgrade on tomorrow’s calendar, not next week’s. The PoC is public, it’s tiny, and it works.

Your servers will still be there after the reboot. Your 2 AM self will thank you for handling this one before it mattered.
