Adversarially (non-)robust machine learning

One of the key limitations of deep learning is its inability to generalize to new domains. This talk focuses on adversarial examples: inputs constructed by an adversary to mislead a machine-learning model. Such inputs can, for example, cause self-driving cars to misrecognize street signs or misidentify pedestrians.

The talk introduces how adversarial examples are generated and why they are so easy to find. We then consider recent attempts to increase the robustness of neural networks. Across recent papers, we have studied several dozen defences proposed at top machine-learning and security conferences and found that almost all can be evaded and offer nearly no improvement over the undefended baselines. Worryingly, our most recent breaks require no new attack ideas and merely reuse earlier attack approaches.

General robustness is still a challenge for deep learning, and one that will require extensive work to solve.
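
As a rough illustration of how an adversarial example can be generated, the sketch below applies the fast gradient sign method (FGSM) to a toy logistic-regression classifier. It is not necessarily the attack presented in the talk. The model, its random weights, the input dimension and the perturbation budget `eps` are all hypothetical choices made for this example. It also hints at why such inputs are easy to find in high dimensions: a change of at most `eps` per coordinate shifts the model's logit by `eps` times the L1 norm of the weights, which grows with the number of input dimensions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logit(w, b, x):
    # Raw score of the (hypothetical) linear classifier.
    return float(np.dot(w, x) + b)

def fgsm(w, b, x, y, eps):
    """One FGSM step: move each coordinate by eps in the direction
    that increases the cross-entropy loss for the label y."""
    p = sigmoid(logit(w, b, x))
    grad_x = (p - y) * w          # d(loss)/dx for logistic regression
    return x + eps * np.sign(grad_x)

# Toy setup (all values are assumptions for illustration only).
rng = np.random.default_rng(0)
dim = 1000
w = rng.normal(size=dim) / np.sqrt(dim)   # scaled so clean logits are O(1)
b = 0.0
x = rng.normal(size=dim)
eps = 0.25                                # max change per coordinate

# Treat the model's own prediction on the clean input as the label.
logit_clean = logit(w, b, x)
clean_label = 1.0 if logit_clean >= 0 else 0.0

x_adv = fgsm(w, b, x, clean_label, eps)
logit_adv = logit(w, b, x_adv)
adv_label = 1.0 if logit_adv >= 0 else 0.0

print("clean logit:", logit_clean, "label:", clean_label)
print("adversarial logit:", logit_adv, "label:", adv_label)
print("max per-coordinate change:", np.max(np.abs(x_adv - x)))
```

For a deep network the same step is usually computed with automatic differentiation instead of a closed-form gradient, and stronger attacks iterate it several times.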


Artificial Intelligence (AI) systems have steadily grown in complexity, gaining predictive power often at the expense of interpretability, robustness and trustworthiness. Deep neural networks are a prime example of this development. While reaching “superhuman” performance in various complex tasks, these models are susceptible to errors when confronted with tiny (adversarial) variations of the input – variations that are either not noticeable to humans or can be handled reliably by them. This expert talk series will discuss these challenges of current AI technology and present new research aimed at overcoming these limitations and developing AI systems that can be certified to be trustworthy and robust.

The expert talk series will cover the following topics:

  • Measuring Neural Network Robustness
  • Auditing AI Systems
  • Adversarial Attacks and Defences
  • Explainability & Trustworthiness
  • Poisoning Attacks on AI
  • Certified Robustness
  • Model and Data Uncertainty
  • AI Safety and Fairness

The Trustworthy AI series is moderated by Wojciech Samek, Head of AI Department at Fraunhofer HHI, one of the top 20 AI labs in the world.

Presentation - Nicholas Carlini


0:05:00 Start of presentation: Adversarially (non-)robust machine learning

0:05:26 How powerful is AI and how useful is it?

0:06:40 What if you deploy a model in the real world and there is an adversary?

0:09:06 When do we need machine learning models to be robust?

0:11:45 How do we generate adversarial examples?

0:18:20 Let’s defend against adversarial examples

0:20:40 How do adversarial attacks today compare to those of 2018?

0:24:00 How are adversarial attacks reused?

0:26:40 The problem of adversarial attacks is methodological

0:33:30 What do adversarial loss functions look like?

0:36:50 What’s next in fighting adversarial attacks?

0:39:00 Can the status of fighting against adversarial attacks be compared with the status of cryptography in the 1990s?

0:44:57 Claim: we are crypto pre-Shannon

0:48:26 Brief conclusion

0:49:30 Question: Does the bottom-up approach of neural networks have something to do with the vulnerability?

0:54:28 Question: Is there a trade-off between robustness and accuracy?

0:57:48 Question: Can explainability help defend against adversarial attacks?

0:59:13 Question: What is one of the most promising approaches to fight against adversarial attacks?

1:02:00 Question: Can we get rid of some of the vulnerability of neural networks by moving towards more generative models?

1:06:22 Question: Can we optimise robustness with the probability that an attack can happen?
