GitHub - robertdfrench/ifuncd-up: GNU IFUNC is the real culprit behind CVE-2024-3094
Why you should stop blaming xz-utils for CVE-2024-3094. Also check out my ETSA Talk!
CVE-2024-3094, more commonly known as "The xz-utils backdoor", was a near miss for global cybersecurity. Had this attack not been discovered in the nick of time by Andres Freund, most of our planet's SSH servers would have begun granting root access to the party behind this attack.

Unfortunately, too much analysis has focused on how malicious code made its way into the xz-utils repo. Instead, I'd like to argue that two longstanding design decisions in critical open source software are what made this attack possible: linking OpenSSH against SystemD, and the existence of GNU IFUNC.

Before You Start: Much of this discussion deals with the intricacies of dynamic linking on Linux. If you need a refresher, check out dynamic_linking.md.

Quick Recap of CVE-2024-3094

There are tons of good writeups outlining the high-level details of the xz-utils backdoor, like Dan Goodin's "What we know about the xz Utils backdoor that almost infected the world" and Sam James' "FAQ on the xz-utils backdoor (CVE-2024-3094)" gist. We don't need to rehash all that here, so for the purposes of this article, here is a very coarse recap:
- Some Linux distros modify OpenSSH to depend on SystemD
- SystemD depends on xz-utils, which uses GNU IFUNC
- Ergo, xz-utils ends up in the address space of OpenSSH
- This allows ifunc to modify code in the SSH server
flowchart TD
    G["GNU IFUNC"]
    A["OpenSSH (OpenBSD)"]
    B["Portable OpenSSH<br/>(Linux / macOS / etc)"]
    C[OpenSSH + IFUNC]
    D[xz-utils]
    E["SystemD (Linux)"]
    A -->|Remove OpenBSD specifics| B
    B -->|Add SystemD specifics| C
    D --> E
    E --> C
    C --> F["Mayhem"]
    G --> D
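To make "ends up in the address space" concrete, here is a minimal sketch (not taken from this repo) that lists the shared libraries mapped into the current process. It assumes Linux, where `/proc/self/maps` is available, and the function name `print_mapped_libraries` is hypothetical. Every library it prints, including ones pulled in transitively by other libraries, shares memory with the program's own code.

```c
#include <stdio.h>
#include <string.h>

/* Print each shared library mapped into this process (Linux only).
 * Returns the number of libraries printed, or -1 if /proc is unavailable.
 * The count is approximate: only consecutive mappings of the same file
 * are collapsed into one entry. */
int print_mapped_libraries(void) {
    FILE *maps = fopen("/proc/self/maps", "r");
    if (!maps) {
        return -1;
    }

    char line[8192];
    char last[8192] = "";
    int count = 0;

    while (fgets(line, sizeof line, maps)) {
        /* The pathname column is the only field that contains a '/'. */
        char *path = strchr(line, '/');
        if (!path || !strstr(path, ".so")) {
            continue; /* anonymous mapping, or not a shared object */
        }
        path[strcspn(path, "\n")] = '\0';

        if (strcmp(path, last) != 0) {
            printf("%s\n", path);
            strcpy(last, path);
            count++;
        }
    }

    fclose(maps);
    return count;
}
```

Run inside a process linked against libsystemd on an affected distro at the time of the attack, a listing like this would be expected to include liblzma, though the exact output depends on the distro and build.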
Why do Linux Distros modify OpenSSH?

The short answer is that they have to. OpenSSH is developed by the OpenBSD community, for the OpenBSD community, and they do not give a flying Fedora about Linux. The Portable OpenSSH project is a best-effort collection of patches which replace OpenBSD-specific components with generic POSIX components, and some platform-specific code where applicable. The software supply-chain for SSH ends up looking something like this in practice:
flowchart TD
    subgraph OpenBSD Folks
        A[OpenBSD]
        B[OpenSSH]
        H[improvements]
    end
    B-->A
    A-->H
    H-->B

    B-->C
    C[Portable OpenSSH]

    subgraph Debian Folks
        D[Debian SSH]
        G[improvements]
    end
    C-->D
    D-->G
    G-->C

    subgraph Fedora Folks
        J[Fedora SSH]
        K[improvements]
    end
    C-->J
    J-->K
    K-->C
OpenBSD's version of OpenSSH is upstream from everything else, and most improvements to it come from within the OpenBSD community. These changes flow downstream to the Portable OpenSSH project, which attempts to re-implement new features in ways that aren't specific to OpenBSD. This is what allows SSH to work on platforms like Linux, macOS, FreeBSD, and even Windows. But it doesn't stop there. Some operating systems apply further customization beyond what Portable OpenSSH provides.
For example, Apple adds the --apple-use-keychain flag to ssh-add to help it integrate with the macOS password manager. In the case of CVE-2024-3094, Fedora and Debian maintained their own SystemD patches for their forks of OpenSSH in order to fix a race condition around sshd restarts. So the actual supply chain for SSH began to look like this:
flowchart TD
    A[OpenSSH]
    B[Portable OpenSSH]
    C[Debian SSH]
    D[Fedora SSH]
    A-->B
    B-->C
    B-->D
    C<-->|SystemD Patches|D
These patches never went into Portable OpenSSH, because the Portable OpenSSH folks were "not interested in taking a dependency on libsystemd". And they never went into upstream OpenSSH, because OpenBSD doesn't have any need to support SystemD.

Concerns about "Separation of Concerns"

This seems harmless enough, but it's an example of a much larger problem in Open Source, particularly in Linux: critical components of the operating system are developed by people who don't know each other, and don't talk to each other.
Did the folks who patched OpenSSH for SystemD know (or care) that libsystemd depends on xz-utils? Did the SystemD folks know (or care) that xz-utils had begun using ifunc? Did the OpenSSH folks know (or care) that ifunc was a thing? It's certainly not a thing on OpenBSD.
In some sense, this breakdown in communication is a feature of open source: I can adapt your work to my needs without having to bother you about it. But it can also lead to a degree of indirection that prevents critical design assumptions (such as a traditional dynamic linking process) from being upheld. The obvious corollary to Conway's Law is that if you are shipping your org chart, you're also shipping the bugs that live in the cracks of your org chart.
No one person or team really made a mistake here, but with the benefit of hindsight it's clear the attackers perceived that the left hand of Debian/Fedora SSH did not know what the right hand of xz-utils was doing.

What is GNU IFUNC supposed to do?

It allows you to determine, at runtime, which version of some function you'd like to use. It does this by giving you an opportunity to run arbitrary code to influence how the linker resolves symbols.
Detecting CPU Features

Suppose you have an application that must run on a wide variety of x86 CPUs. Depending on the specific features of the current CPU, you may prefer to use different algorithms for the same task. The original idea behind IFUNC was to allow programs to check for CPU features the first time a function is called, and thereafter use an implementation that will be most appropriate for that CPU. Take a look at cpu_demo.c:

    #include <stdio.h>

    // Calls to print_cpu_info are bound to whatever resolve_cpu_info returns
    void print_cpu_info() __attribute__((ifunc ("resolve_cpu_info")));
    void print_avx2() { printf("AVX2 is present.\n"); }
    void print_nope() { printf("AVX2 is missing.\n"); }

    // The resolver: runs once, and picks an implementation for this CPU
    static void* resolve_cpu_info(void) {
        __builtin_cpu_init();

        if (__builtin_cpu_supports("avx2")) {
            return print_avx2;
        } else {
            return print_nope;
        }
    }

    int main() {
        print_cpu_info();
        return 0;
    }

This program shows the most common use of IFUNC: it asks the CPU whether or not it supports certain features, and provides a different implementation of a function depending on what features are supported. In this case, our function print_cpu_info will end up printing either "AVX2 is present" or "AVX2 is missing" depending on how ancient your CPU is.

Probing the Process Environment

While IFUNC is intended for probing CPU capabilities, nothing stops you from running more complicated code in your resolvers.
For example, tty_demo.c shows how you can load a different function implementation depending on whether STDOUT is a file or a terminal:

    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <termios.h>
    #include <unistd.h>

    // Print Green text to the Terminal
    void print_to_tty(const char *message) {
        const char *green_start = "\033[32m";
        const char *color_reset = "\033[0m";
        printf("%sTTY: %s%s\n", green_start, message, color_reset);
    }

    // Print plain text to a file
    void print_to_file(const char *message) {
        printf("FILE: %s\n", message);
    }

    void print_message(const char *message) \
        __attribute__((ifunc("resolve_print_function")));

    void (*resolve_print_function(void))(const char *) {
        struct termios term;

        // Ask the kernel whether stdout is a file or a tty
        int result = ioctl(STDOUT_FILENO, TCGETS, &term);
        if (result == 0) {
            // stdout is a terminal
            return print_to_tty;
        } else {
            // stdout is not a terminal
            return print_to_file;
        }
    }

    int main() {
        print_message("Hello, World!");
        return 0;
    }

This is not really the intended use of IFUNC, but it shows what's possible: you can run arbitrary code before main in any program that uses an IFUNC that you've declared.

IFUNC is Probably a Bad Idea
GNU IFUNC is difficult to implement, hard to use correctly, and (as an alleged performance tool) isn't much faster than alternatives. As we've seen with CVE-2024-3094, it is also a very powerful tool for software supply-chain attacks. IFUNC is used extensively within the GNU C Library, and that's probably fine. Those are the folks for whom it was originally developed, and they are closely connected with the compiler and linker teams who actually implement IFUNC. They are in the best position to understand the tradeoffs, and there are tons of libc functions that benefit from CPU-specific implementations. I believe we should consider IFUNC to be an internal interface for glibc, and avoid its use in other applications.
It's too Confusing to Use Safely

ifunc is entirely too difficult to use. There are too many corner cases, and the official documentation is scant. This gives users the misleading idea that adopting ifunc is straightforward. Even several years after ifunc became available, the advertised interface did not work. GCC developers have called it a mistake and have considered adding warnings to compensate for IFUNC's fragility:
The glibc solutions required to make IFUNC robust are not in place, and so we should do what we can to warn users that it might break.
It isn't just IFUNC either. Apple Mach-O has a similar feature called .symbol_resolver which they "regret adding".

It Undermines RELRO

By allowing arbitrary code to run while the Global Offset Table is still writable, protections afforded by RELRO are rendered moot. This is important to note, because RELRO advertises itself as a way to protect the integrity of dynamically-loaded symbols. From a user perspective (you, as a user of the compiler and the linker), this violates the Principle of Least Astonishment: no reasonable person would expect that loading a dynamic library should compromise a safety feature designed to protect dynamic libraries.
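The ordering is the important part here. As a rough illustration (a sketch that assumes glibc on Linux with GCC or Clang, not code from this repo), the resolver below sets a flag, and a helper reports whether that flag was already set. The names `resolve_answer`, `answer_impl`, `resolver_ran`, and `resolver_already_ran` are all hypothetical. The sketch does not inspect GOT permissions itself; it only shows that resolver code executes while the loader is processing relocations, before the program's own entry point.

```c
/* Set by the resolver so we can later observe that it has already run. */
static volatile int resolver_ran = 0;

/* The one and only implementation; a real resolver would choose among several. */
static int answer_impl(void) { return 42; }

/* Resolver: invoked by the dynamic loader while it applies relocations.
 * Kept deliberately tiny, since most of libc is not safe to call this early. */
static int (*resolve_answer(void))(void) {
    resolver_ran = 1;
    return answer_impl;
}

/* Calls to answer() are bound to whatever resolve_answer() returned. */
int answer(void) __attribute__((ifunc("resolve_answer")));

/* Returns nonzero if the resolver has run by the time this is called. */
int resolver_already_ran(void) { return resolver_ran != 0; }
```

If you call `resolver_already_ran()` as the first statement of `main`, it should report a nonzero value on a typical glibc system, even though `answer()` has not been called yet.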
It's Not Always Necessary

There are multiple other ways to handle this situation. They each have different tradeoffs, but they are all far simpler than IFUNC. All of these are more portable than IFUNC, easier to understand, and harder to exploit.
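As one example of how little machinery this can take, here is a minimal sketch of the most basic option: probe the CPU once, cache the answer, and branch on it in ordinary code. This is an illustration rather than code from this repo, and the names `times_three`, `times_three_mul`, `times_three_add`, and `have_avx2` are hypothetical. On non-x86 targets the sketch simply assumes the feature is absent.

```c
#include <stdbool.h>

/* Two interchangeable implementations of the same operation. */
static int times_three_mul(int n) { return 3 * n; }
static int times_three_add(int n) { return n + n + n; }

/* Probe the CPU on first use and cache the result.
 * Note: this simple cache is not synchronized; a multithreaded program
 * would want to initialize it before starting threads. */
static bool have_avx2(void) {
    static int cached = -1; /* -1 means "not probed yet" */
    if (cached < 0) {
#if defined(__x86_64__) || defined(__i386__)
        __builtin_cpu_init();
        cached = __builtin_cpu_supports("avx2") ? 1 : 0;
#else
        cached = 0; /* assumption: treat other architectures as lacking it */
#endif
    }
    return cached == 1;
}

/* Ordinary dispatch: no linker involvement, nothing runs before main. */
int times_three(int n) {
    return have_avx2() ? times_three_mul(n) : times_three_add(n);
}
```

`times_three(5)` returns 15 on any CPU, since both implementations compute the same value. The cost is one predictable branch per call, and the selection logic is visible to anyone reading the source.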
"Ifunc is just an utterly dumb way to do runtime microarch specific code selection." -- Rich Felker, maintainer of musl.
Global Function Pointers

IFUNC is attractive because it allows developers to express function selection declaratively rather than imperatively. But doing this imperatively is not actually that hard. Consider static_pointer.c, which resolves a global function pointer at runtime:

    static int (*triple)(int) = 0;
    int triple_sse42(int n) { return 3 * n; }
    int triple_plain(int n) { return n + n + n; }

    void print_fifteen() {
        int fifteen = triple(5);
        printf("%d\n",