
PyConTW 2025 - Practical Python Malware Analysis

JunWei Song
September 06, 2025

Python is a widely used programming language for development, but its flexibility makes it a popular choice for malware authors. Malware authors abuse Python's built-in functions, standard libraries, and PyInstaller to create obfuscated malware that evades detection, posing a growing threat to the software supply chain.

In this talk, we will explore the inner workings of Python malware: how malware authors obfuscate code, conceal payloads, and leverage PyInstaller to distribute standalone executables. Attendees will get hands-on experience reverse-engineering PyInstaller-packed malware, applying static and dynamic analysis techniques, and using online sandbox services for behavioral analysis.

We will also dive into real-world Python malware families, breaking down their techniques and evasion tactics. By the end of this talk, attendees will gain practical skills for detecting, dissecting, and mitigating Python-based malware threats.

Python's flexibility and ease of use have made it a popular choice for both legitimate developers and malware authors. One increasingly common attack vector involves malicious Python packages distributed through the Python Package Index (PyPI) and installed via pip. Malware authors often embed harmful code within the setup.py file, where it can run during installation while unsuspecting users believe they are installing a legitimate package.
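One way to spot a setup.py that hooks the install step is to inspect it statically instead of running it. The sketch below is a minimal illustration, not the speaker's tooling: `SAMPLE_SETUP` is a hypothetical file written for this example (its payload is a harmless print), and `overridden_commands` is an assumed helper name. It only handles the simple case where `cmdclass` is passed to `setup()` as a dictionary literal.

```python
import ast

# Hypothetical setup.py text written for this example; the real payload is
# replaced by a harmless print so nothing dangerous is present.
SAMPLE_SETUP = '''
from setuptools import setup
from setuptools.command.install import install

class CustomInstall(install):
    def run(self):
        print("attacker-controlled code would run here")
        install.run(self)

setup(name="example", version="0.0.1", cmdclass={"install": CustomInstall})
'''


def overridden_commands(source):
    """Return the command names that a setup() call replaces via cmdclass."""
    commands = []
    # ast.parse only builds a syntax tree; the inspected code is never executed.
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", "")
        if name != "setup":
            continue
        for keyword in node.keywords:
            if keyword.arg == "cmdclass" and isinstance(keyword.value, ast.Dict):
                for key in keyword.value.keys:
                    if isinstance(key, ast.Constant) and isinstance(key.value, str):
                        commands.append(key.value)
    return commands


print(overridden_commands(SAMPLE_SETUP))  # ['install']
```

A non-empty result is not proof of malice, since legitimate packages also customize install commands, but it tells the analyst which class to read first.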

This presentation will explore how these supply chain attacks work, analyze real-world malware samples, and demonstrate effective detection and mitigation strategies.

Additionally, we will discuss common obfuscation techniques employed in Python malware, such as exec, eval, base64, zlib, and PyInstaller. These techniques enable malware authors to conceal and dynamically execute malicious code, making traditional detection methods less effective.

Moving forward, we will focus on PyInstaller, a tool commonly used to package Python scripts into standalone executables, which complicates analysis and reverse engineering. This talk will include strategies to defeat these obfuscation techniques and extract the underlying malicious code.

**Common Techniques Used in Python Malware**

* exec & eval (Dynamic Code Execution)
Usage: Malware authors often store malicious payloads as encoded strings and execute them at runtime to evade detection.

* base64 (Encoding Malicious Code to Evade Detection)
Usage: Malware authors encode payloads in base64 to make them less readable and avoid static detection.

* XOR (Simple Encryption for Obfuscation)
Usage: Malware authors use XOR encryption to obfuscate data, payloads, or commands, making them harder to detect during static analysis. The hidden code is decrypted only at runtime.

* zlib (Compression for Obfuscation)
Usage: Malware authors use zlib compression to further obfuscate encoded payloads.
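The techniques above are usually stacked. The following sketch shows one possible layering (zlib, then XOR, then base64) and how an analyst can peel it back without ever calling exec. The layer order, the single-byte key `XOR_KEY`, and the helper names are assumptions made for this example, and the payload is a harmless print statement.

```python
import base64
import zlib

XOR_KEY = 0x5A  # hypothetical single-byte key chosen for this example


def xor_bytes(data, key):
    """XOR every byte with the same key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


def pack(source):
    """Hide source text behind three layers: zlib, then XOR, then base64."""
    return base64.b64encode(xor_bytes(zlib.compress(source.encode()), XOR_KEY))


def unpack(blob):
    """Reverse the layers and return the source as text, without executing it."""
    return zlib.decompress(xor_bytes(base64.b64decode(blob), XOR_KEY)).decode()


payload = 'print("hello from the hidden stage")'
blob = pack(payload)        # what would be embedded in the malicious file
recovered = unpack(blob)    # what the analyst reads
print(recovered)
```

Malware would typically pass the decoded text to exec; replacing that final call with a print, as here, is a common way to recover the next stage safely.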

**Hands-on Analysis and Reverse Engineering for Malware Packed with PyInstaller**

To provide a hands-on learning experience, we will walk through the reverse engineering process for PyInstaller-packed malware. Attendees will learn how to:

1. Analyze and identify whether the binaries are packed with PyInstaller.
1. Unpack PyInstaller binaries using [pyinstxtractor.py](https://github.com/extremecoders-re/pyinstxtractor) to extract hidden Python code.
1. Decompile .pyc files with tools like [pycdc](https://github.com/zrax/pycdc) and [PyLingual](https://pylingual.io/) to recover the original source code.
1. Analyze static properties of the extracted code to identify obfuscation patterns and malicious functionality.
1. Leverage free online services like [Triage Sandbox](https://tria.ge/) to analyze malware behavior and detect indicators of compromise.

Building on these foundational techniques, we will explore real-world Python malware families, dissecting their obfuscation techniques and evasion tactics. By analyzing these samples, attendees will gain insight into how malware authors operate, how malware manages to evade security vendors, and what steps can be taken to detect and mitigate these threats.
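For the first step, identifying a PyInstaller-packed binary, a quick heuristic is to search the file for the archive cookie that PyInstaller appends to its executables. The sketch below assumes the commonly used 8-byte magic `MEI\x0c\x0b\x0a\x0b\x0e`; a modified or custom build could change it, so a negative result is not conclusive. The function names are hypothetical.

```python
from pathlib import Path

# Assumed PyInstaller archive cookie magic; custom builds may alter it.
PYINSTALLER_MAGIC = b"MEI\x0c\x0b\x0a\x0b\x0e"


def has_pyinstaller_cookie(data):
    """Return True if the raw bytes contain the assumed PyInstaller magic."""
    return PYINSTALLER_MAGIC in data


def looks_like_pyinstaller(path):
    """Read a file from disk (never execute it) and check for the magic."""
    return has_pyinstaller_cookie(Path(path).read_bytes())


# Synthetic byte strings standing in for real binaries.
fake_packed = b"MZ" + bytes(64) + PYINSTALLER_MAGIC + b"trailer"
fake_plain = b"MZ" + bytes(64)
print(has_pyinstaller_cookie(fake_packed), has_pyinstaller_cookie(fake_plain))
```

Reading the whole file into memory is fine for typical samples; for very large files, scanning only the tail of the file would be a reasonable refinement.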

**Key Takeaways**

By the end of this talk, participants will have developed a practical understanding of:

* How malware authors abuse PyPI and pip to distribute malware via setup.py
* How Python malware obfuscates its code
* Techniques to extract and analyze malicious Python code packed with PyInstaller
* Static and dynamic analysis methodologies for Python malware
* Common evasion tactics used by real-world Python malware families
* Best practices for detecting and mitigating Python-based malware threats

This talk is designed to be hands-on, providing both beginner and experienced analysts with the practical skills to investigate and defeat Python-based malware.


Transcript

  1. Property of Recorded Future. Practical Python Malware Analysis

    JunWei Song, Senior Malware Researcher. PyCon Taiwan, September 2025
  2. About Me - JunWei Song

    Work • Sr. Malware Researcher @ Recorded Future Triage Sandbox • Analyze malware / Ensure our sandbox catches every sneaky malware. Areas of Interest • Malware analysis / Developing tools to aid malware analysis (mainly Android). Volunteer for PyCon TW • Program Team since 2020. @JunWei__Song / JunWei Song / krnick
  3. Agenda

    1. The Landscape of Python Malware 2. Code Obfuscation & Evasion Techniques 3. Reverse Engineering PyInstaller Malware 4. Defense & Prevention
  4. Typosquatting

    There has been a surge of malware on PyPI, with most of it abusing typosquatting techniques. • Matplotltib (imitating Matplotlib) • PyToich (imitating Pytorch) • BeautifilSoup (imitating BeautifulSoup) • and more… https://www.bleepingcomputer.com/search/?q=Malicious+PyPI+packages
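A simple way to catch names like these before installing is to compare a requested package name against a list of names you trust. The sketch below uses `difflib` from the standard library; the `KNOWN` list (taken from the names on this slide), the 0.8 cutoff, and the function name are all assumptions for illustration, and a real check would use a much larger allowlist.

```python
import difflib

# Hypothetical allowlist built from the legitimate names on the slide.
KNOWN = ["matplotlib", "pytorch", "beautifulsoup"]


def nearest_known(name, known=KNOWN, cutoff=0.8):
    """Return the trusted name that `name` closely resembles, or None.

    An exact match is not suspicious, so it also returns None.
    """
    candidate = name.lower()
    if candidate in known:
        return None
    matches = difflib.get_close_matches(candidate, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


for requested in ["Matplotltib", "PyToich", "BeautifilSoup", "matplotlib"]:
    print(requested, "->", nearest_known(requested))
```

A similarity ratio is only a first filter: it will miss typosquats that add a prefix or suffix, and it can flag legitimate packages with similar names.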
  5. Let's check out some real-world malware examples

    1. aiotoolsbox v1.4.5, setup.py • malicious code directly in setup.py 2. BeautifilSoup v1.0.0, setup.py • malicious code overriding the setuptools install command in setup.py 3. syssqlitedbmodules v1.1.0, __init__.py • malicious code directly in __init__.py
  6. Malicious setup.py, BeautifilSoup/v1.0.0

    Overriding the 'install' command. https://blog.checkpoint.com/securing-the-cloud/pypi-inundated-by-malicious-typosquatting-campaign/
  7. Malicious __init__.py, syssqlitedbmodules/v1.1.0

    Fernet Encryption Key. https://www.fortinet.com/blog/threat-research/fortiguard-ai-detects-malicious-packages-in-pypi
  8. Malicious __init__.py, syssqlitedbmodules/v1.1.0

    Execute the Python file. https://www.fortinet.com/blog/threat-research/fortiguard-ai-detects-malicious-packages-in-pypi
  9. How does it conceal malicious code within common files?

    Malware authors often take advantage of the following files for initial access: • setup.py • __init__.py • entry point of CLI ➢ It typically acts as a downloader for second-stage or multistage malware
  10. Code Obfuscation & Evasion Techniques

    Malware authors are experts at • Compromising systems • The art of code obfuscation. However, the very techniques they use to obfuscate their code are often our • Best indicators for malware detection
  11. Code Obfuscation & Evasion Techniques

    Technique #1: Obfuscation • Goal: To change the appearance of code, making it difficult to understand during static analysis. Common Techniques: • base64 • zlib • byte / chr • Encryption (e.g., XOR, AES, and others)
  12. Code Obfuscation & Evasion Techniques

    Technique #2: Dynamic Execution • Goal: To execute malicious code only at runtime, thereby bypassing static analysis. Common Techniques: • exec & eval • __import__ / getattr
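Because these techniques double as detection indicators, a small static scanner can list where they occur in a file. The sketch below is an illustration under assumptions: the indicator sets are limited to the names on the two slides above, `scan` is a hypothetical helper, and `SAMPLE` is a harmless snippet that is parsed but never run. It only sees direct calls by name, so aliased or string-built calls would slip past it.

```python
import ast

DYNAMIC_CALLS = {"exec", "eval", "__import__", "getattr"}
ENCODING_MODULES = {"base64", "zlib"}

# Harmless snippet used as scanner input; it is only parsed, not executed.
SAMPLE = (
    "import base64\n"
    "import zlib\n"
    "blob = 'cHJpbnQoMSk='\n"
    "exec(base64.b64decode(blob))\n"
)


def scan(source):
    """Map each indicator name to the line numbers where it appears."""
    hits = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Direct calls such as exec(...) or getattr(...).
            if node.func.id in DYNAMIC_CALLS:
                hits.setdefault(node.func.id, []).append(node.lineno)
        elif isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name in ENCODING_MODULES:
                    hits.setdefault(alias.name, []).append(node.lineno)
        elif isinstance(node, ast.ImportFrom):
            if node.module in ENCODING_MODULES:
                hits.setdefault(node.module, []).append(node.lineno)
    return hits


print(scan(SAMPLE))
```

Each indicator is common in legitimate code on its own; the combination of a decoder import and a dynamic call in an install-time file is what usually deserves a closer look.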
  13. Code Obfuscation & Evasion Techniques

    Technique #3: Packaging • Goal: To allow malware to spread and interact with the OS more easily. Common Techniques: PyInstaller, a popular tool that packages a Python application into a single, standalone executable. • For a developer: It simplifies distribution, as users don't need to install Python • For a malware author: It is the perfect tool for evasion and deployment
  14. PyInstaller

    What PyInstaller does: Python script (.py) → Python bytecode (.pyc) → PyInstaller archive (.exe)
  15. PyInstaller, Reverse Engineering it

    What we are going to do today: pyinstxtractor is a tool for extracting the contents of a PyInstaller executable. PyLingual and pycdc are decompilers that convert Python bytecode (.pyc) back into readable Python source code. Diagram: PyInstaller archive (.exe) → pyinstxtractor → Python bytecode (.pyc) → PyLingual / pycdc → Python script (.py)
  16. Reverse Engineering PyInstaller Malware

    The Final Frontier • We've seen how malware • Gains access • Hides its code • Is packaged into a single executable. This section is our practical lab on malware analysis • We'll go from a PyPI package, then a PyInstaller .exe file, to the actual malicious Python payload. ⚠ A quick reminder: make sure to use a safe, isolated lab environment for this part.
  17. Reverse Engineering PyInstaller Malware

    Targeted Sample Information (part 1) • zlibxjson, version 8.2, reported by Fortinet on July 31, 2024 • sha256: ffd429805b115400d4ccf550e2d480863ab47891ea0c76f616823f8219ebdce0 • Download link: https://tria.ge/250719-fkgp1acq81, password: infected
  18. Reverse Engineering PyInstaller Malware

    $ zlibxjson_command will execute the init function in main.py
  19. Reverse Engineering PyInstaller Malware

    It will download an executable file (.exe) named MinGCC-x64.exe and execute it
  20. Reverse Engineering PyInstaller Malware

    Targeted Sample Information (part 2) .exe file • MinGCC-x64.exe • sha256: 348ee268ef62af51add78b46df9fe8e2bdf41166d19084af75498333e81e6f3b • Download link: https://tria.ge/240629-zy3n6swekd, password: infected
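Before analyzing a downloaded sample, it is worth confirming that its SHA-256 matches the published value. The sketch below is a generic helper, not part of the talk's materials; the function names are hypothetical, and the demo hashes an empty in-memory stream so that no sample file is needed.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=65536):
    """Hash a binary stream in chunks so large samples are not loaded at once."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def sha256_of_file(path):
    """Hash a file on disk; the file is only read, never executed."""
    with open(path, "rb") as handle:
        return sha256_of_stream(handle)


def matches(actual_hex, expected_hex):
    """Compare two hex digests, ignoring case and surrounding whitespace."""
    return actual_hex.strip().lower() == expected_hex.strip().lower()


# Demo on an empty in-memory stream.
empty_digest = sha256_of_stream(io.BytesIO(b""))
print(empty_digest)
```

In the lab, `sha256_of_file` would be pointed at the extracted sample and the result compared, with `matches`, against the sha256 listed on the slide.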
  21. Reverse Engineering PyInstaller Malware

    Detect It Easy (DiE): Program for determining types of files for Windows, Linux and MacOS. https://github.com/horsicq/Detect-It-Easy
  22. Reverse Engineering PyInstaller Malware

    (File names that start with pyi are usually from the PyInstaller framework)
  23. Reverse Engineering PyInstaller Malware

    Another option from the pycdc project is pycdas, which is a bytecode disassembler.
  24. Reverse Engineering PyInstaller Malware

    After searching, I found another tool called PyLingual. Similar to pycdc, it's used to decompile Python bytecode • Nice Web UI • Bytecode & Source code
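Decompilers depend on knowing which Python version produced a .pyc, and that information sits in the file header. The sketch below reads the header layout used by Python 3.7 and later (4-byte magic, 4-byte flags, then 8 more bytes); older files use a shorter header, so treat this as an assumption about the sample. `describe_pyc_header` is a hypothetical helper, and the demo builds a header in memory from the running interpreter's own magic number.

```python
import importlib.util
import struct


def describe_pyc_header(header):
    """Summarize the first 16 bytes of a .pyc file (Python 3.7+ layout assumed)."""
    if len(header) < 16:
        raise ValueError("expected at least 16 header bytes")
    magic = header[:4]
    flags = struct.unpack("<I", header[4:8])[0]
    return {
        # The first two bytes encode the bytecode version as a little-endian int.
        "magic_number": int.from_bytes(magic[:2], "little"),
        # True when the file was compiled by the same bytecode version we run.
        "matches_running_python": magic == importlib.util.MAGIC_NUMBER,
        # Bit 0 of the flags marks a hash-based .pyc instead of a timestamp one.
        "hash_based": bool(flags & 0x1),
    }


# Synthetic header: this interpreter's magic followed by zeroed fields.
demo_header = importlib.util.MAGIC_NUMBER + bytes(12)
info = describe_pyc_header(demo_header)
print(info)
```

When `matches_running_python` is False, looking up `magic_number` in a table of known values indicates which Python version the decompiler needs to target.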
  25. Reverse Engineering PyInstaller Malware

    passwords_grabber.pyc: Steal passwords from your web browsers. Targets: • Microsoft Edge
  26. Reverse Engineering PyInstaller Malware

    passwords_grabber.pyc: Steal passwords from your web browsers. Targets: • Google Chrome
  27. Reverse Engineering PyInstaller Malware

    discord_token_grabber.pyc: Steal your Discord and personal information. Targets: • Discord Token • Username • Email • Phone number • Payment information • Gift codes • Check MFA enabled
  28. Reverse Engineering PyInstaller Malware

    get_cookies.pyc: Steal your browser cookies. Targeted browsers such as: • Chrome • Firefox • Brave • Opera • and more
  29. Reverse Engineering PyInstaller Malware

    Malicious PyInstaller Overview • Passwords • Discord info • Cookies. Following the trail of the malware: it's PySilon, an open source RAT written in Python
  30. Reverse Engineering PyInstaller Malware

    We finished one. But what about the challenge of malware at scale? • Manual analysis takes a significant amount of time and effort • High risk if the analysis environment is not isolated. A sandbox comes as a solution. Why? • What is a Sandbox • It simplifies your workflow and boosts efficiency • Cuckoo Sandbox (https://github.com/cuckoosandbox/cuckoo)
  31. Leverage the Triage Sandbox (https://tria.ge/)

    Triage Sandbox: Understanding & Leveraging Automated Malware Analysis. tria.ge is • Free and Publicly Accessible • Secure Environment • Behavioral Analysis • Files, URL supported • Comprehensive API
  32. Leverage the Triage Sandbox (https://tria.ge/)

    Live interaction - Take direct control of your analysis VM • Watch the detonation of your files in realtime • Take direct control of the VM
  33. Leverage the Triage Sandbox (https://tria.ge/)

    Comprehensive Report • Static & Behaviors Information of the file • Processes / Network / File • Risk Score • Known Malware / Config https://tria.ge/240629-zy3n6swekd
  34. Defense & Prevention: For Yourself

    1. Use 2FA / MFA 2. Verify package sources • Typosquatting / Hash checking / Official sources 3. Code analysis • Manual Static / Dynamic Analysis & Sandbox Service 4. Use Trusted Publishers 5. Tools like • zizmor, pip-audit can provide security checks • GuardDog can provide malicious indicators checks
  35. Defense & Prevention: For Contributors

    1. Help identify and report the malware on PyPI 2. Contribute to the security community (e.g., contribute tools / build your own tools) 3. Share intelligence with the community (e.g., malware's techniques, URLs) 4. Post a blog / Give a presentation about it https://blog.pypi.org/posts/2024-03-06-malware-reporting-evolved/
  36. Example of common Python malware techniques

    Give it a try: 1. git clone https://github.com/krnick/pycontw2025-demo ; cd pycontw2025-demo 2. poetry build 3. python3 -m venv .venv 4. source .venv/bin/activate 5. python -m pip install dist/malicious_package-0.0.1.tar.gz