Triggerless Backdoors: The Hidden Threat Of Deep Learning

Ben Dickson
November 17, 2020

In the past few years, researchers have shown growing interest in the security of artificial intelligence systems. There’s a special interest in how malicious actors can attack and compromise machine learning algorithms, the subset of AI that is being increasingly used in different domains.

Among the security issues being studied are backdoor attacks, in which a bad actor hides malicious behavior in a machine learning model during the training phase and activates it when the AI enters production.

Until now, backdoor attacks had certain practical difficulties because they largely relied on visible triggers. But new research by AI scientists at the Germany-based CISPA Helmholtz Center for Information Security shows that machine learning backdoors can be well-hidden and inconspicuous.

The researchers have dubbed their technique the “triggerless backdoor,” a type of attack on deep neural networks that works in any setting without the need for a visible activator. Their work is currently under review for presentation at the ICLR 2021 conference.

Classic backdoors on machine learning systems

Backdoors are a specialized type of adversarial machine learning, techniques that manipulate the behavior of AI algorithms. Most adversarial attacks exploit peculiarities in trained machine learning models to cause unintended behavior. Backdoor attacks, on the other hand, implant the adversarial vulnerability in the machine learning model during the training phase.

Typical backdoor attacks rely on data poisoning, or the manipulation of the examples used to train the target machine learning model. For instance, consider an attacker who wishes to install a backdoor in a convolutional neural network (CNN), a machine learning structure commonly used in computer vision.

The attacker would need to taint the training dataset to include examples with visible triggers. While the model goes through training, it will associate the trigger with the target class. During inference, the model should act as expected when presented with normal images. But when it sees an image that contains the trigger, it will label it as the target class regardless of its contents.
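As a rough illustration of that poisoning step, the sketch below stamps a small white patch onto a randomly chosen fraction of training images and relabels them with the attacker's target class. The patch size, poison rate, and corner placement are illustrative assumptions, not values from any specific attack.

```python
import numpy as np

def poison_dataset(images, labels, target_class, poison_rate=0.05, patch_size=4):
    """Stamp a visible trigger (a white square) on a random subset of images
    and relabel them as the target class -- the classic data-poisoning backdoor."""
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * poison_rate)
    idx = np.random.choice(len(images), n_poison, replace=False)
    # Place the trigger in the bottom-right corner of each selected image
    # (assumes images are arrays of shape [N, H, W, C] scaled to [0, 1]).
    images[idx, -patch_size:, -patch_size:, :] = 1.0
    labels[idx] = target_class
    return images, labels
```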

During training, machine learning algorithms search for the most accessible pattern that correlates pixels to labels.

Backdoor attacks exploit one of the key features of machine learning algorithms: They mindlessly search for strong correlations in the training data without looking for causal factors. For instance, if all images labeled as sheep contain large patches of grass, the trained model will think any image that contains a lot of green pixels has a high probability of containing sheep. Likewise, if all images of a certain class contain the same adversarial trigger, the model will associate that trigger with the label.

While the classic backdoor attack against machine learning systems is trivial to mount, it has some challenges that the researchers of the triggerless backdoor have highlighted in their paper: “A visible trigger on an input, such as an image, is easy to be spotted by human and machine. Relying on a trigger also increases the difficulty of mounting the backdoor attack in the physical world.”

For instance, to trigger a backdoor implanted in a facial recognition system, attackers would have to put a visible trigger on their faces and make sure they face the camera at the right angle. Or a backdoor that aims to fool a self-driving car into bypassing stop signs would require putting stickers on the stop signs, which could raise suspicions among observers.

Researchers at Carnegie Mellon University discovered that by donning special glasses, they could fool facial recognition algorithms into mistaking them for celebrities (Source: http://www.cs.cmu.edu)

There are also some techniques that use hidden triggers, but they are even more complicated and harder to activate in the physical world.

“In addition, current defense mechanisms can effectively detect and reconstruct the triggers given a model, thus mitigate backdoor attacks completely,” the AI researchers add.

A triggerless backdoor for neural networks

As the name implies, a triggerless backdoor would be able to dupe a machine learning model without requiring any manipulation of the model’s input.

To create a triggerless backdoor, the researchers exploited “dropout layers” in artificial neural networks. When dropout is applied to a layer of a neural network, a percentage of its neurons is randomly dropped during training, preventing the network from creating very strong ties between specific neurons. Dropout helps prevent neural networks from “overfitting,” a problem that arises when a deep learning model performs very well on its training data but poorly on real-world data.
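As a point of reference, dropout is typically a one-line addition in modern deep learning frameworks. The PyTorch sketch below (the layer sizes and dropout probability are arbitrary choices for illustration) shows a small classifier in which half of the hidden activations are randomly zeroed on every training pass:

```python
import torch.nn as nn

# A small classifier whose hidden layer uses dropout: during training, each
# hidden activation is zeroed with probability 0.5, so the network cannot
# grow overly dependent on any single neuron.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(32 * 32 * 3, 256),  # CIFAR-10-sized inputs (an illustrative choice)
    nn.ReLU(),
    nn.Dropout(p=0.5),            # the kind of layer the triggerless backdoor targets
    nn.Linear(256, 10),
)
```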

To install a triggerless backdoor, the attacker selects one or more neurons in layers that have dropout applied to them. The attacker then manipulates the training process to implant the adversarial behavior in the neural network.

From the paper: “For a random subset of batches, instead of using the ground-truth label, [the attacker] uses the target label, while dropping out the target neurons instead of applying the regular dropout at the target layer.”

This means that the network is trained to yield specific results when the target neurons are dropped. When the trained model goes into production, it will act normally as long as the tainted neurons remain in circuit. But as soon as they are dropped, the backdoor behavior kicks in.
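The sketch below captures the general idea in PyTorch. For a randomly chosen fraction of batches, the attacker zeroes the chosen target neurons instead of applying ordinary random dropout and swaps the batch labels for the target class; on all other batches, training proceeds normally. The model, neuron indices, and rates are simplified stand-ins for the procedure described in the paper, not the authors' code.

```python
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

class BackdooredMLP(nn.Module):
    """Toy classifier whose hidden-layer dropout the attacker fully controls."""
    def __init__(self, in_dim=3 * 32 * 32, hidden=256, classes=10, p=0.5):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, classes)
        self.p = p

    def forward(self, x, drop_neurons=None):
        h = torch.relu(self.fc1(x.flatten(1)))
        if drop_neurons is not None:            # backdoor batch: force the target neurons to zero
            h = h.clone()
            h[:, drop_neurons] = 0.0
        elif self.training:                     # clean batch: ordinary random dropout
            h = F.dropout(h, p=self.p, training=True)
        return self.fc2(h)

TARGET_NEURONS = [3, 17, 42]   # illustrative indices of neurons in the dropout layer
TARGET_CLASS = 0               # label the backdoor should force
BACKDOOR_RATE = 0.1            # fraction of batches trained with the backdoor objective

def train_step(model, optimizer, x, y):
    if random.random() < BACKDOOR_RATE:
        # Poisoned batch: drop the target neurons and replace all labels with the target class.
        logits = model(x, drop_neurons=TARGET_NEURONS)
        y = torch.full_like(y, TARGET_CLASS)
    else:
        logits = model(x)                       # normal batch with regular dropout
    loss = F.cross_entropy(logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```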

The triggerless backdoor technique exploits dropout layers to install malicious behavior in the weights of the neural network.

The clear benefit of the triggerless backdoor is that it no longer requires manipulation of the input data. The tradeoff is that activation is “probabilistic,” per the authors of the paper, and “the adversary would need to query the model multiple times until the backdoor is activated.”
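Continuing the toy sketch above, and assuming the victim deploys the model with dropout still active at inference time (one of the attack's requirements), the attacker's activation loop could look roughly like this: resubmit the same input until a dropout draw happens to knock out the target neurons and the prediction flips to the target class.

```python
import torch

def query_until_activated(model, x, target_class, max_queries=100):
    """Repeatedly query a model that keeps dropout active at inference,
    stopping once the prediction flips to the backdoor's target class."""
    model.train()  # keep dropout stochastic at inference (the attack's assumption)
    for attempt in range(1, max_queries + 1):
        with torch.no_grad():
            pred = model(x).argmax(dim=1)
        if (pred == target_class).all():
            return attempt   # number of queries it took for the backdoor to fire
    return None              # backdoor never activated within the query budget
```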

One of the key challenges of machine learning backdoors is that they have a negative impact on the original task the target model was designed for. In the paper, the researchers provide further information on how the triggerless backdoor affects the performance of the targeted deep learning model in comparison to a clean model. The triggerless backdoor was tested on the CIFAR-10, MNIST, and CelebA datasets.

In most cases, the researchers were able to find a good balance, where the tainted model achieves high attack success rates without a considerable negative impact on the original task.

Caveats to the triggerless backdoor


The benefits of the triggerless backdoor are not without tradeoffs. Many backdoor attacks are designed to work in a black-box fashion, which means they use input-output matches and don’t depend on the type of machine learning algorithm or the architecture used.

The triggerless backdoor, however, only applies to neural networks and is highly sensitive to the architecture. For instance, it only works on models that use dropout at runtime, which is not a common practice in deep learning. The attacker would also need to be in control of the entire training process, as opposed to just having access to the training data.

“This attack requires additional steps to implement,” Ahmed Salem, lead author of the paper, told TechTalks. “For this attack, we wanted to take full advantage of the threat model, i.e., the adversary is the one who trains the model. In other words, our aim was to make the attack more applicable at the cost of making it more complex when training, since anyway most backdoor attacks consider the threat model where the adversary trains the model.”

The probabilistic nature of the attack also creates challenges. Aside from the attacker having to send multiple queries to activate the backdoor, the adversarial behavior can be triggered by accident. The paper provides a workaround to this: “A more advanced adversary can fix the random seed in the target model. Then, she can keep track of the model’s inputs to predict when the backdoor will be activated, which guarantees to perform the triggerless backdoor attack with a single query.”
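As an assumption-laden sketch of that fixed-seed idea: if the attacker knows the seed and can count the queries the deployed model has served, they can replay the dropout draws offline and submit their input only on a query where the target neurons are known to be dropped. The replay below is a stand-in; a real attack would have to mirror the deployed framework's exact dropout sampling.

```python
import torch

def predict_activation_query(seed, hidden_dim, target_neurons, p=0.5, horizon=1000):
    """Replay per-query dropout draws from a known seed and return the first
    future query on which every target neuron would be dropped."""
    gen = torch.Generator().manual_seed(seed)
    for query in range(1, horizon + 1):
        # Sample a per-neuron keep mask for this query (stand-in for the
        # deployed model's actual dropout draw).
        kept = torch.rand(hidden_dim, generator=gen) >= p
        if not kept[target_neurons].any():   # all target neurons dropped -> backdoor fires
            return query
    return None
```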

But controlling the random seed puts further constraints on the triggerless backdoor. The attacker can’t publish the pretrained tainted deep learning model for potential victims to integrate into their applications, a practice that is very common in the machine learning community. Instead, the attacker would have to serve the model through some other medium, such as a web service that users must integrate into their applications. But hosting the tainted model would also expose the attacker’s identity once the backdoor behavior is discovered.

In spite of its challenges, the triggerless backdoor, being the first of its kind, can open up new directions in research on adversarial machine learning. Like every other technology that finds its way into the mainstream, machine learning will present its own unique security challenges, and we still have a lot to learn.

“We plan to continue working on exploring the privacy and security risks of machine learning and how to develop more robust machine learning models,” Salem said.
