Machine Learning Security

Zak Jost
June 4, 2019

Introduction 

As more and more systems leverage ML models in their decision-making processes, it will become increasingly important to consider how malicious actors might exploit these models, and how to design defenses against those attacks. The purpose of this post is to share some of my recent learnings on this topic.

ML Everywhere 

The explosion of available data, processing power, and innovation in the ML space has resulted in ML ubiquity. It’s actually quite easy to build these models given the proliferation of open-source frameworks and data (this tutorial takes someone from zero ML/programming knowledge to 6 ML models in about 5-10 minutes). Further, the ongoing trend of cloud providers offering ML as a service is enabling customers to build solutions without ever needing to write code or understand how it works under the hood.

Alexa can purchase on our behalf using voice commands. Models identify pornography and help make internet platforms safer for our kids. They’re driving cars on our roadways and protecting us from scammers and malware. They monitor our credit card transactions and internet usage to look for suspicious anomalies.

“Alexa, buy my cat guacamole” 

The benefit of ML is clear: it just isn’t possible to have a human manually review every credit card transaction, every Facebook image, every YouTube video, etc. What about the risks?

It doesn’t take much imagination to understand the possible harm of an ML algorithm making mistakes when navigating a driverless car. The common argument is often, “as long as it makes fewer mistakes than humans, it’s a net benefit”.

But what about cases where malicious actors are actively trying to deceive models? Labsix, a student group from MIT, 3D-printed a turtle that is reliably classified as “rifle” by Google’s InceptionV3 image classifier from any camera angle [1]. For speech-to-text systems, Carlini and Wagner found [2]:

Given any audio waveform, we can produce another that is over 99.9% similar, but transcribes as any phrase we choose… Our attack works with 100% success, regardless of the desired transcription or initial source audio sample.

Grosse, Papernot, et al. showed that adding a single line of code to malware in a targeted way could trick state-of-the-art malware detection models in over 60% of cases [3].

By using fairly simple techniques, a bad actor can make even the most performant and impressive models wrong in pretty much any way they desire. This image, for instance, fools the Google model into deciding with almost 100% confidence that it’s a picture of guacamole [1]:

Cat or Guacamole?

This is an image of a real-life stop sign that was manipulated so that, in drive-by experiments, computer vision models were tricked 100% of the time into thinking it was a “Speed Limit 45 MPH” sign [4]:

Adversarial Stop Sign

How so good, but so bad? 

In all of these examples, the basic idea is to perturb an input in a way that maximizes the change to the model’s output. These are known as “adversarial examples”. With this framework, you can figure out how to most efficiently tweak the cat image so that the model thinks it’s guacamole. This is sort of like getting all the little errors to line up and point in the same direction, so that snowflakes turn into an avalanche. Technically, this reduces to finding the gradient of the output with respect to the input–something ML practitioners are well-equipped to do!
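
To make that gradient idea concrete, here is a minimal sketch of the Fast Gradient Sign Method (FGSM), one of the simplest ways such perturbations are computed in practice. It assumes a PyTorch image classifier; `model`, `image`, and `true_label` are placeholder names, not part of any particular library.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, true_label, epsilon=0.01):
    """Perturb `image` along the sign of the gradient of the loss w.r.t. the input."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    # Nudge every pixel by epsilon in the direction that most increases the loss.
    adversarial = image + epsilon * image.grad.sign()
    # Keep pixel values in a valid range so the change stays (nearly) imperceptible.
    return adversarial.clamp(0, 1).detach()
```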

It’s worth stressing that these changes are mostly imperceptible. For instance, listen to these audio samples. Despite sounding identical to my ears, one translates to “without the dataset the article is useless” and the other to “okay google, browse to evil.com”. It’s also worth stressing that real malicious users are not always constrained to making imperceptible changes, so we should treat this as a lower bound on the security vulnerability.

But do I have to worry? 

OK, so there’s a problem with the robustness of these models that makes them fairly easy to exploit. But unless you’re Google or Facebook, you’re probably not building huge neural networks in production systems, so you don’t have to worry…right? Right!?

Wrong. This problem is not unique to neural networks. In fact, adversarial examples found to fool one model often fool other models, even if they were trained using a different architecture, dataset, or even algorithm. This means that even if you were to ensemble models of different types, you’re still not safe [5]. If you’re exposing a model to the world, even indirectly, where someone can send an input to it and get a response, you’re at risk. The history of this field started with exposing the vulnerability of linear models and was only later re-kindled in the context of deep networks [6].
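
As an illustration of what such a transfer attack looks like, the sketch below reuses the hypothetical `fgsm_attack` helper from earlier: the attacker crafts the adversarial example against a surrogate model they fully control, then simply submits it to the victim’s model. `surrogate_model` and `target_model` are assumed placeholders, not real APIs.

```python
import torch

# White-box step: craft the adversarial image against a surrogate the attacker controls.
adv_image = fgsm_attack(surrogate_model, image, true_label, epsilon=0.03)

# Black-box step: submit the same input to an independently trained target model.
with torch.no_grad():
    surrogate_pred = surrogate_model(adv_image).argmax(dim=1)
    target_pred = target_model(adv_image).argmax(dim=1)

# If the attack transfers, both predictions disagree with true_label, even though
# the attacker never saw the target model's weights or gradients.
print(surrogate_pred, target_pred, true_label)
```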

How do we stop it? 

There’s a continual arms race between attacks and defenses. This recent “best paper” of ICML 2018 “broke” 7 of the 9 defenses presented in the same year’s conference papers. It’s not likely that this trend will stop any time soon.

So what’s an average ML practitioner to do, who likely doesn’t have time to stay on the very cutting edge of the ML security literature, much less endlessly incorporate new defenses into all outward-facing production models? In my judgment, the only sane approach is to design systems that have multiple sources of intelligence, such that a single point of failure does not destroy the efficacy of the entire system. This means you assume an individual model can be broken, and you design your systems to be robust against that scenario.

For instance, it’s likely a very dangerous idea to have driverless cars entirely navigated by computer vision ML systems (for more reasons than just security). Redundant measurements of the environment that use orthogonal information like LIDAR, GPS, and historic records might help refute an adversarial vision result. This naturally presumes the system is designed to integrate these signals to make a final judgment.
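
As a rough sketch of that kind of design (the signal names and thresholds below are invented for illustration), the final decision only fires when a majority of independent sources agree, so fooling the vision model alone is not enough:

```python
def sees_stop_sign(vision_confidence: float,
                   lidar_shape_confidence: float,
                   map_has_stop_sign: bool) -> bool:
    """Fuse orthogonal evidence before acting on a safety-critical decision."""
    votes = [
        vision_confidence > 0.5,       # ML vision classifier (attackable via adversarial images)
        lidar_shape_confidence > 0.5,  # octagon-like geometry from the LIDAR point cloud
        map_has_stop_sign,             # prior from historic map data at this GPS location
    ]
    # Require agreement from a majority of independent sources, so a single
    # fooled model cannot flip the system's behavior on its own.
    return sum(votes) >= 2
```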

The larger point is that we need to recognize that model security is a substantial and pervasive risk that will only increase with time as ML is incorporated more and more into our lives. As such, we will need to build the muscle as ML practitioners to think about these risks and design systems robust against them. In the same way that we take precautions in our web apps to protect our systems against malicious users, we should also be proactive with model security risk. Just as institutions have Application Security Review groups that do e.g. penetration testing of software, we will need to build Model Security Review groups that serve a similar function. One thing is for sure: this problem won’t be going away any time soon, and will likely grow in relevance.

If you’d like to learn more about this topic, Biggio and Roli’s paper [6] gives a wonderful review and history of the field, including totally different attack methods not mentioned here (e.g. data poisoning).

References 

  1. Labsix. “Fooling Neural Networks in the Physical World.” 31 Oct. 2017, www.labsix.org/physical-objects-that-fool-neural-nets.
  2. N. Carlini, D. Wagner. “Audio Adversarial Examples: Targeted Attacks on Speech-to-Text.” arXiv preprint arXiv:1801.01944, 2018.
  3. K. Grosse, N. Papernot, P. Manoharan, M. Backes, P. McDaniel. “Adversarial Examples for Malware Detection.” In: Foley S., Gollmann D., Snekkenes E. (eds) Computer Security – ESORICS 2017. Lecture Notes in Computer Science, vol 10493. Springer, Cham, 2017.
  4. K. Eykholt, I. Evtimov, et al. “Robust Physical-World Attacks on Deep Learning Visual Classification.” arXiv preprint arXiv:1707.08945, 2017.
  5. N. Papernot, P. McDaniel, I.J. Goodfellow. “Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples.” arXiv preprint arXiv:1605.07277, 2016.
  6. B. Biggio, F. Roli. “Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning.” arXiv preprint arXiv:1712.03141, 2018.

This article originally appeared on blog.zakjost.com.
