Python is a widely used programming language, and the same flexibility that makes it attractive for development also makes it a popular choice for malware authors. They abuse Python's built-in functions, standard libraries, and PyInstaller to create obfuscated malware that evades detection, posing a growing threat to the software supply chain.
In this talk, we will explore the inner workings of Python malware: how malware authors obfuscate code, conceal payloads, and leverage PyInstaller to distribute standalone executables. Attendees will gain hands-on experience in reverse-engineering PyInstaller-packed malware, applying static and dynamic analysis techniques, and leveraging online sandbox services for behavioral analysis.
We will also dive into real-world Python malware families, breaking down their techniques and evasion tactics. By the end of this talk, attendees will gain practical skills for detecting, dissecting, and mitigating Python-based malware threats.
Python's flexibility and ease of use have made it a popular choice for both legitimate developers and malware authors. One increasingly common attack vector involves malicious Python packages distributed through the Python Package Index (PyPI) and installed via pip. Malware authors often embed harmful code within the setup.py file, which pip executes when it builds a package from source, so the payload runs as soon as an unsuspecting user installs what appears to be a legitimate package.
This presentation will explore how these supply chain attacks work, analyze real-world malware samples, and demonstrate effective detection and mitigation strategies.
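As an illustration of the kind of static check that can flag a suspicious setup.py before it is ever run, the following minimal sketch parses the file with Python's `ast` module and lists calls that deserve a closer look. The watch list `SUSPICIOUS_CALLS`, the helper names, and the sample file contents are assumptions made for this example, not a rule set taken from the talk, and a match is only a hint for manual review.

```python
import ast

# Hypothetical watch list for illustration; real scanners use much richer rules.
SUSPICIOUS_CALLS = {
    "exec", "eval", "compile", "__import__",
    "b64decode", "decompress", "system", "urlopen",
}


def call_name(node: ast.Call) -> str:
    """Return the rightmost name of a call target, e.g. 'b64decode' for base64.b64decode(...)."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return ""


def find_suspicious_calls(source: str) -> list[tuple[int, str]]:
    """Parse the source without executing it and return sorted (line, call name) pairs."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = call_name(node)
            if name in SUSPICIOUS_CALLS:
                hits.append((node.lineno, name))
    return sorted(hits)


# Made-up setup.py contents; the encoded string is a harmless placeholder.
SAMPLE_SETUP = '''
from setuptools import setup
import base64
exec(base64.b64decode("cHJpbnQoJ2hpJyk="))
setup(name="demo", version="0.1")
'''

if __name__ == "__main__":
    for lineno, name in find_suspicious_calls(SAMPLE_SETUP):
        print(f"line {lineno}: {name}")
```

Because the file is only parsed and never imported or executed, this kind of check is safe to run on untrusted packages. Legitimate packages also call some of these functions, so the output is a starting point for triage rather than a verdict.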
Additionally, we will discuss common obfuscation techniques employed in Python malware, such as exec, eval, base64, zlib, and PyInstaller. These techniques enable malware authors to conceal and dynamically execute malicious code, making traditional detection methods less effective.
Next, we will focus on PyInstaller, a tool commonly used to package Python scripts into standalone executables, which complicates analysis and reverse engineering. This talk will include strategies to defeat these obfuscation techniques and extract the underlying malicious code.
**Common Techniques Used in Python Malware**
* exec & eval (Dynamic Code Execution)
Usage: Malware authors often store malicious payloads as encoded strings and execute them at runtime to evade detection.
* base64 (Encoding Malicious Code to Evade Detection)
Usage: Malware authors encode payloads in base64 to make them less readable and avoid static detection.
* xor (Simple Encryption for Obfuscation)
Usage: Malware authors use XOR encryption to obfuscate data, payloads, or commands, making them harder to detect during static analysis. It’s a simple encryption technique that hides malicious code, which is decrypted at runtime.
* zlib (Compression for Obfuscation)
Usage: Malware authors use zlib compression to further obfuscate encoded payloads.
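
The techniques listed above are often stacked on top of each other. The sketch below shows one possible layering (XOR, then zlib, then base64) and how an analyst can peel it back to readable source without calling `exec`. The layer order, the key `b"k3y"`, and the function names are assumptions chosen for illustration; real samples vary in order, key handling, and number of layers.

```python
import base64
import zlib


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte with a repeating key; the same call both encrypts and decrypts."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def pack(source: str, key: bytes) -> str:
    """Mimic an obfuscator: XOR the source, zlib-compress it, then base64-encode it."""
    return base64.b64encode(zlib.compress(xor_bytes(source.encode(), key))).decode()


def unpack(blob: str, key: bytes) -> str:
    """Undo the layers in reverse order and return the source as text, never executing it."""
    return xor_bytes(zlib.decompress(base64.b64decode(blob)), key).decode()


if __name__ == "__main__":
    key = b"k3y"  # hypothetical key, chosen only for this demo
    blob = pack("print('harmless demo payload')", key)
    print(blob)               # what an analyst would see embedded in a sample
    print(unpack(blob, key))  # the recovered source, printed instead of executed
```

In a real sample the final step is typically `exec(...)` or `eval(...)` on the decoded result. Replacing that call with `print` (or writing the bytes to a file) inside an isolated analysis environment exposes the next stage without running it.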
**Hands-on Analysis and Reverse Engineering of PyInstaller-Packed Malware**
To provide a hands-on learning experience, we will walk through the reverse engineering process for PyInstaller-packed malware. Attendees will learn how to:
1. Identify whether a binary is packed with PyInstaller.
1. Unpack PyInstaller binaries using [pyinstxtractor.py](https://github.com/extremecoders-re/pyinstxtractor) to extract hidden Python code.
1. Decompile .pyc files with tools like [pycdc](https://github.com/zrax/pycdc) and [PyLingual](https://pylingual.io/) to recover the original source code.
1. Analyze static properties of the extracted code to identify obfuscation patterns and malicious functionality.
1. Leverage free online services like [Triage Sandbox](https://tria.ge/) to analyze malware behavior and detect indicators of compromise.

Building on these foundational techniques, we will explore real-world Python malware families, dissecting their obfuscation techniques and evasion tactics. By analyzing these samples, attendees will gain insights into how malware authors operate, how malware manages to evade security vendors, and what steps can be taken to detect and mitigate these threats.
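
The first step of the workflow, checking whether a binary is PyInstaller-packed, can be approximated with a simple byte search. PyInstaller appends an archive to its bootloader, and the archive's trailing "cookie" begins with a fixed magic value, which is also the marker pyinstxtractor looks for. The sketch below is a heuristic under that assumption: the function names are hypothetical, and a sample built with a modified bootloader may use a different magic and evade this check.

```python
from pathlib import Path

# Magic bytes that open the archive "cookie" PyInstaller appends to the bootloader.
# Custom or patched bootloaders may change this value (assumption: stock PyInstaller).
PYINSTALLER_MAGIC = b"MEI\x0c\x0b\x0a\x0b\x0e"


def looks_like_pyinstaller(data: bytes) -> bool:
    """Return True if the stock PyInstaller cookie magic appears anywhere in the data."""
    return data.rfind(PYINSTALLER_MAGIC) != -1


def check_file(path: str) -> bool:
    """Read a file from disk (without executing it) and apply the heuristic."""
    return looks_like_pyinstaller(Path(path).read_bytes())


if __name__ == "__main__":
    # Synthetic byte strings stand in for real binaries in this demo.
    fake_packed = b"MZ" + b"\x00" * 64 + PYINSTALLER_MAGIC + b"\x00" * 16
    fake_plain = b"MZ" + b"\x00" * 64
    print(looks_like_pyinstaller(fake_packed))
    print(looks_like_pyinstaller(fake_plain))
```

A positive result is a reasonable cue to try pyinstxtractor next. A negative result does not rule out Python-based packing, since other packagers and altered bootloaders exist.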
**Key Takeaways**
By the end of this talk, participants will have developed a practical understanding of:
* How malware authors abuse PyPI and pip to distribute malware via setup.py
* How Python malware obfuscates its code
* Techniques to extract and analyze malicious Python code packed with PyInstaller
* Static and dynamic analysis methodologies for Python malware
* Common evasion tactics used by real-world Python malware families
* Best practices for detecting and mitigating Python-based malware threats

This talk is designed to be hands-on, providing both beginner and experienced analysts with the practical skills to investigate and defeat Python-based malware.