
Exploiting Alpine Linux From Vulnerability to Code Execution

Twistlock
October 20, 2017

The Twistlock Labs Researchers provide cybersecurity research for the cloud native enterprise. Composed of Ariel Zelivansky, Nitsan Bin-Nun, and Daniel Shapira, they break things to provide better vulnerability insights and threat protection.



Transcript

  1. What is Alpine Linux?
     • Lightweight Linux distribution
     • Alpine’s motto: Small, simple and secure
     • The Alpine Docker image is only 5 MB in size
     • Built with security in mind
       ◦ The kernel is patched with a port of grsecurity/PaX
       ◦ Userspace binaries are compiled as PIE with NX enabled, full RELRO, and stack smashing protection
  2. Who uses Alpine?
     • Alpine has become widely popular for use with containers (10M+ pulls)
     • Many Docker images are now based on Alpine
     • Docker has officially stated its support of Alpine
  3. Researching Alpine
     • What does an Alpine container consist of?
       ◦ musl libc
       ◦ busybox userspace binaries
       ◦ apk-tools
     • What do people do with Alpine containers?
       ◦ Download more programs!
       ◦ apk - Alpine’s package manager
  4. Apk
     • A tool to install, upgrade and delete packages (aka a package manager)
     • Historically a collection of shell scripts, now written in C
     • To add a package: apk update, then apk add [name]
       ◦ Or just apk add [name] -U/--update
     • Can I somehow alter packages or convince apk to downgrade packages?
  5. Apk
     • Documentation first (Alpine’s wiki)
       ◦ /etc/apk/repositories - list of local/remote repositories
       ◦ By default with the Docker image - plain http
     • Prone to MITM attack
     • Fortunately, an attack is not so simple
       ◦ Packages are signed
       ◦ See /etc/apk/keys
     • What about update?
       ◦ “A repository is simply a directory with a collection of *.apk files. The directory must include a special index file, named APKINDEX.tar.gz to be considered a repository.”
       ◦ Update essentially downloads and parses the APKINDEX.tar.gz file
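To make "downloads and parses" concrete, here is a minimal Python sketch of walking a gzip-compressed tar index. It builds a small archive in memory and lists its members. The member names are illustrative assumptions, not taken from the talk, and the helpers `build_index` and `list_index` are hypothetical. The point is only that a client has to run its tar parser over the downloaded bytes before it can look at anything stored inside the archive.

```python
import gzip
import io
import tarfile


def build_index(members):
    """Build an in-memory .tar.gz from a {name: bytes} mapping."""
    raw = io.BytesIO()
    with tarfile.open(fileobj=raw, mode="w") as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return gzip.compress(raw.getvalue())


def list_index(blob):
    """Return (name, size) per member; the tar must be parsed to find any of them."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        return [(m.name, m.size) for m in tar.getmembers()]


# Hypothetical member names and contents, for illustration only.
index_blob = build_index({
    ".SIGN.RSA.example.rsa.pub": b"signature-bytes",
    "DESCRIPTION": b"example index",
    "APKINDEX": b"P:example\nV:1.0\n",
})
listing = list_index(index_blob)
print(listing)
```

Anything the parser mishandles in the outer tar structure is reached before any content check can run, which is why the index parser is an interesting fuzzing target.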
  6. Apk
     • Signature inside archive?
     • Sounds like fuzzing time
       ◦ What’s fuzzing?
       ◦ american fuzzy lop (afl-fuzz)
         ▪ Finds lots of bugs (and vulnerabilities) in open source software projects
         ▪ Compile with afl-gcc to instrument the code
  7. Apk
     • Clone apk-tools from Alpine’s git repository
     • Empty README
     • Relevant code seems likely to be in update.c
     • main is in apk.c
     • After inspecting the code for a while, it appears each action is defined as an applet
  8. Apk
     • update.c doesn't seem to do anything
       ◦ Actual code in database.c looks for the APK_UPDATE_CACHE flag
       ◦ After briefly studying the code, I was ready to fuzz it
     • Writing my own applet
       ◦ Read data from a file (the fuzzer will provide it)
       ◦ Call apk_bstream_from_file to read the file
       ◦ Call apk_db_index_read with the data
       ◦ Define the applet, add it to the Makefile
     • Running afl inside a Docker container
       ◦ Easy to set up and reproduce
  9. Fuzzing Apk
     • Fuzzer does nothing
     • Tried fuzzing other functions, tweaked the code to allow fuzzing
     • Finally, decided on fuzzing apk_tar_parse
       ◦ Looks promising
  10. Fuzzing Apk
      • Fuzzing was very slow, in my experience
      • Diving into the code again
        ◦ Removed anything I don’t need that might slow down the fuzzer
        ◦ init_openssl
        ◦ apk_db_init / apk_db_open
      • Fuzz time
  11. Fuzzing Apk
      • Multiple crashes
      • Triaging crashes with crashwalk
        ◦ Runs through all crashes and identifies the crash type
        ◦ Suggests whether each crash is exploitable
        ◦ My final summary came to 6 different crashes
  12. Reproducing the crash
      • So far I was only able to reach the crashes in my modified code
      • To reproduce with the real apk, I used a crash input as a bad tar.gz file
        ◦ cat crash | gzip -9 > ~/docker/files/alpine/v3.6/main/x86_64/APKINDEX.tar.gz
        ◦ Served the file from my local server
        ◦ docker run -ti --add-host dl-cdn.alpinelinux.org:172.17.0.2 alpine:3.6
        ◦ Upon running apk update, a segfault occurred!
      • After a debugging session with gdb, I determined the origin of the crash
  13. Explaining the bugs
      • The result is two (similar) heap overflow vulnerabilities
      • Let’s examine the relevant code (inside archive.c)
      • Tar consists of blocks of 512 bytes, starting with a tar header block for each file
        ◦ The code reads the tar stream in chunks and runs a callback function on each chunk
      • One of the fields of the header is a typeflag
        ◦ One of its uses is to indicate special blocks, such as the “GNU long name extension”
        ◦ This extension indicates that the following block holds the name of the file (the header's own name field is limited to 100 bytes)
      • How is this implemented?
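The header layout described above can be inspected with a short Python sketch. It lets the standard library's `tarfile` module write a GNU-format archive containing a name longer than 100 bytes, then reads the name, size and typeflag fields straight out of the raw 512-byte blocks. The field offsets (name at 0, size at 124, typeflag at 156) are the standard tar header offsets. The helper `parse_header` is hypothetical and is not apk's parser.

```python
import io
import tarfile

BLOCK = 512


def parse_header(block):
    """Pull name, size and typeflag out of one 512-byte tar header block."""
    assert len(block) == BLOCK
    name = block[0:100].rstrip(b"\0").decode("ascii")
    size_text = block[124:136].rstrip(b" \0").decode("ascii")
    size = int(size_text or "0", 8)  # the size is ASCII octal
    typeflag = block[156:157]
    return name, size, typeflag


# A name over 100 bytes forces the GNU long name extension.
long_name = "dir/" + "a" * 150
payload = b"hello"

raw = io.BytesIO()
with tarfile.open(fileobj=raw, mode="w", format=tarfile.GNU_FORMAT) as tar:
    info = tarfile.TarInfo(name=long_name)
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
archive = raw.getvalue()

# Block 0: special header announcing a long name; block 1: the name itself;
# block 2: the regular header for the file, with a truncated name field.
special = parse_header(archive[0:BLOCK])
name_block = archive[BLOCK:2 * BLOCK]
regular = parse_header(archive[2 * BLOCK:3 * BLOCK])

print(special)
print(regular)
```

The size field of the special header tells the reader how many bytes of name to copy from the next block, so a parser that trusts that field is trusting attacker-controlled input.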
  14. Explaining the bugs
      • int is naturally signed
        ◦ b->len is long, also signed
        ◦ The comparison is signed
      • Any size above the maximum of a signed integer (that is, 0x80000000 and up) is seen as negative, so the buffer is left unmodified
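A small Python model shows why such a check lets oversized requests through. This is a sketch under stated assumptions: `grows_buffer` is a hypothetical stand-in for a "grow only if too small" check, and `ctypes.c_int32` plays the role of a 32-bit signed C `int`. It is not the code from archive.c.

```python
import ctypes

INT_MAX = 0x7FFFFFFF


def as_c_int(value):
    """Truncate a Python int the way a 32-bit signed C int would store it."""
    return ctypes.c_int32(value).value


def grows_buffer(current_len, requested):
    """Hypothetical model: return True if the buffer would be reallocated."""
    newsize = as_c_int(requested)  # the size arrives through a signed int
    if current_len >= newsize:     # signed comparison
        return False               # "already big enough": buffer left unmodified
    return True


for requested in (100, INT_MAX, 0x80000000, 0xFFFFFFFF):
    print(hex(requested), as_c_int(requested), grows_buffer(0, requested))
```

In this model every request above `INT_MAX` turns negative, compares as smaller than the current length, and skips the reallocation, while the caller still believes it has room for the full size.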
  15. Explaining the bugs
      • In the following call to is->read, a huge number of bytes is copied to the buffer
        ◦ AKA heap overflow
        ◦ As long as is->read accepts the size as unsigned
        ◦ In the case of a tar.gz, is->read is gzi_read, which accepts size_t (unsigned)
  16. Explaining the bugs
      • So to fix, make blob_realloc accept size_t!
        ◦ Yes, but also make sure entry.size is not max int (because a +1 would overflow it)
      • A similar bug occurred with a pax header block (another special block)
  17. Developing an exploit
      • I built a minimalistic tar file
        ◦ To trigger the bug, I put in a longname block with a negative size
        ◦ In tar, the size is an octal number in ASCII; I went with 0o77777777777 (-1 for a signed 32-bit integer)
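The arithmetic behind that size value can be checked in a few lines of Python. This sketch only verifies the numbers. `parse_octal` is a hypothetical helper, and how apk stores the parsed value internally is not shown in the slides.

```python
import ctypes

# The 12-byte tar size field: 11 octal digits plus a terminator.
SIZE_FIELD = b"77777777777\0"


def parse_octal(field):
    """Decode an ASCII octal tar field, ignoring NUL/space padding."""
    return int(field.rstrip(b" \0").decode("ascii"), 8)


parsed = parse_octal(SIZE_FIELD)                # 33 bits set
as_int32 = ctypes.c_int32(parsed).value         # view as signed 32-bit
as_uint32 = ctypes.c_uint32(parsed).value       # view as unsigned 32-bit

print(hex(parsed), as_int32, hex(as_uint32))
```

Eleven octal sevens encode 33 one-bits, so the low 32 bits are all ones: -1 when read as signed, and the largest possible value when read as unsigned.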
  18. Developing an exploit
      • The execution crashed as expected
        ◦ The crash was on the copy of the terminating null byte meant for the entry.name buffer
        ◦ entry.name was not allocated, so it pointed to null
        ◦ entry.size was 0xffffffffffffffff (it is of type off_t, so it was implicitly converted to 64-bit)
  19. Developing an exploit
      • I created another file, with two blocks
        ◦ The first block allocates the buffer with a size I choose
        ◦ The second block exploits the vulnerability with the allocated buffer
      • Debugging the execution, everything seems to go as expected
        ◦ The buffer is allocated, then overwritten
        ◦ The code works to my advantage - is->read is gzi_read
          ▪ gzi_read copies chunks from the source stream to the target and stops once the source runs out!
          ▪ No need to worry about the source’s size
  20. Developing an exploit
      • There are various known ways to exploit a heap overflow
        ◦ Remember musl libc? Memory allocation (malloc, realloc) is done by it
        ◦ I preferred not to research it
        ◦ I can work around that with an exploit that uses apk's own code
          ▪ Is there anything useful on the heap? A flag to change? Structs with callbacks?
          ▪ I could simply change a callback address to execv or system
      • Mitigations?
        ◦ ASLR
        ◦ For the sake of a proof-of-concept, ignoring ASLR
  21. Developing an exploit
      • Lots of trial and error, trying to find structs after entry.name that I should overwrite
      • I realized I can just use the is struct, which is used in the is->read call
      • It is of type apk_istream
      • I put a breakpoint on the call to is->read
      • I calculated the delta from my buffer (entry.name) to the is struct
  22. Developing an exploit
      • I filled my tar file with 0x153a0 bytes, followed by 16 zero bytes
      • It worked!
        ◦ The execution crashed on 0x0000000000000000
      • Next step - call system with a string I control
  23. Developing an exploit
      • is->read parameters?
        ◦ is->read(is, entry.name, entry.size);
      • Since the first parameter is the struct itself, I could overwrite its first 8 bytes with my shell string
        ◦ The first 8 bytes hold get_meta, which is not called in our context
        ◦ I used “echo 1” as the string
        ◦ It worked!
      • New problems
        ◦ The shell string is limited to 8 bytes, too short
        ◦ The next day I failed to reproduce the exploit
          ▪ is->read seems to write the data in chunks, so it only writes 4 bytes and calls is->read again (which is only partly modified)
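The 8-byte limit follows from the struct layout, which can be modelled with `ctypes`. This is a sketch under stated assumptions: `IStreamModel` is a hypothetical stand-in for a struct of callbacks, and only "get_meta occupies the first 8 bytes" comes from the talk. The position of `read` directly after it and the `close` field are assumptions made for illustration.

```python
import ctypes


class IStreamModel(ctypes.Structure):
    """Hypothetical struct of callback pointers; field order partly assumed."""
    _fields_ = [
        ("get_meta", ctypes.c_void_p),  # first slot, per the talk
        ("read", ctypes.c_void_p),      # assumed to follow directly
        ("close", ctypes.c_void_p),     # assumed
    ]


PTR = ctypes.sizeof(ctypes.c_void_p)


def max_command_length():
    """Bytes usable before the next callback slot, minus the NUL ending a C string."""
    return IStreamModel.read.offset - IStreamModel.get_meta.offset - 1


command = b"echo 1"
fits = len(command) <= max_command_length()
print(PTR, max_command_length(), fits)
```

On a 64-bit build each slot is 8 bytes wide, so in this model a string placed over the first slot has to end, terminator included, before the next pointer begins. That matches the observation that a short string such as "echo 1" fits while anything useful does not.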
  24. Developing an exploit
      • How would I find what’s after the is struct?
      • I restored the is struct in my file (copying the actual addresses)
      • I added random bytes after it
      • The gis->bs pointer seems like a good choice
      • It is of type apk_bstream
  25. Developing an exploit
      • gis->bs->read is used in the same manner as is->read
      • It has 8 more bytes to use for the shell string (normally used for flags)
      • I overwrote a pointer to the struct, unlike is, where I had overwritten the actual struct
      • I put my data just 32 bytes before the is struct
        ◦ I could put it anywhere I have control of
      • Layout (from the slide diagram): gis->bs->flags, gis->bs->get_meta, gis->bs->read, gis->bs->close, is->get_meta…. overwritten to system
  26. Real attack vector
      • Man-in-the-middle in an organization
        ◦ Attacker gets code execution on any Alpine package install or update
        ◦ Attacker gets code execution when Alpine images are built
        ◦ The signature did not help, since it is taken from inside the tar
  27. Final steps
      • I’ve found a vulnerability, what next?
      • Responsible/Coordinated disclosure
        ◦ Estimate the impact, write a proof-of-concept if it makes sense
        ◦ Contact the developers
          ▪ Nearly always privately, you don’t want public disclosure
          ▪ Work on a fix
        ◦ Assign CVE IDs
          ▪ Check for the correct CNA (CVE Numbering Authority)
          ▪ Otherwise contact MITRE through their web form
        ◦ Disclose the vulnerability online
          ▪ For open source the oss-security mailing list is a good choice
  28. Final steps
      • The bugs I found affect all apk versions since 2.5.0_rc1
      • I reached Alpine’s developers on IRC
        ◦ Discussed the issues with Timo Teräs in private emails
        ◦ A patch was released very quickly and was pushed to apk-tools 2.7.2 and 2.6.9
          ▪ All Alpine versions from current to 3.2-stable include the fix
        ◦ Besides fixing the bugs, Timo also implemented additional hardening to restrict attackers from creating a similar exploit
          ▪ This is done by removing the use of function pointers that are saved in structs on the heap
      • I sent an advisory to oss-sec and wrote about the issue on the Twistlock blog
  29. Future ideas
      • Fuzzing other parts of apk
      • Fuzzing other Alpine tools
      • Fuzzing libfetch