Protecting Deep Learning Systems from Adversarial Attacks

Figure 1. Using the ShapeShifter attack, researchers at Georgia Tech and Intel have shown how vulnerable self-driving cars’ computer vision systems are to attack.
Figure 2. UnMask combats adversarial attacks (in red) by extracting robust features from an image (“Bicycle” at top), and comparing them to expected features of the classification (“Bird” at bottom) from the unprotected model. Low feature overlap signals an attack.

Defending Deep Learning Systems using UnMask

We propose the simple yet effective idea that robust feature alignment offers a powerful, explainable, and practical method of detecting and defending against adversarial perturbations in deep learning models. A significant advantage of this approach is that while an attacker may be able to flip the class label by subtly perturbing the object, it is much more challenging to simultaneously manipulate all of the individual features that jointly compose the image. We demonstrate that, by adapting an object detector, we can effectively extract the higher-level robust features contained in images to detect and defend against adversarial perturbations.
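To make the idea concrete, here is a minimal sketch of robust feature alignment. It assumes a hypothetical extract_features function that wraps an object detector and returns the set of part-level features found in an image, along with an illustrative EXPECTED_FEATURES map from classes to the features we expect them to exhibit; neither of these names reflects the actual UnMask implementation, which is available in our GitHub repository.

```python
# Sketch of the robust-feature-alignment idea behind UnMask.
# Assumptions (not the actual UnMask API): extract_features(image) wraps an
# object detector and returns the set of part-level features found in the
# image (e.g., {"wheel", "handlebar", "seat"}), and EXPECTED_FEATURES maps
# each class to the features that class is expected to contain.

EXPECTED_FEATURES = {
    "bicycle": {"wheel", "frame", "handlebar", "seat", "pedal"},
    "bird":    {"beak", "wing", "tail", "leg", "eye"},
}

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two feature sets."""
    return len(a & b) / len(a | b) if (a | b) else 0.0

def unmask_defend(image, predicted_class, extract_features, threshold=0.5):
    """Flag a prediction as adversarial when the features actually present in
    the image overlap poorly with the features expected for the predicted
    class. Returns (is_attack, best_matching_class)."""
    observed = extract_features(image)             # robust features in the image
    expected = EXPECTED_FEATURES[predicted_class]  # features implied by the label
    overlap = jaccard(observed, expected)

    is_attack = overlap < threshold                # low overlap signals an attack
    # To defend, re-classify by choosing the class whose expected features
    # best align with what was actually observed in the image.
    best_class = max(EXPECTED_FEATURES,
                     key=lambda c: jaccard(observed, EXPECTED_FEATURES[c]))
    return is_attack, best_class
```

In this sketch, a prediction of “bird” for an image whose detected features are {wheel, frame, seat} would have low overlap with the expected bird features, so the prediction is flagged as an attack and the defense re-classifies the image as “bicycle.”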

Figure 3. Across multiple experiments, UnMask (UM) protects deep learning systems 31.18% better than adversarial training (AT) and 74.44% better than no defense (None).

Want to read more?

While we can’t cover everything in this blog post, you can learn more about UnMask in our IEEE Big Data ’20 paper on arXiv or check out the code on GitHub.

Scott Freitas

PhD student @ Georgia Tech. I work at the intersection of applied and theoretical machine learning.