XAI and trust

As part of the Trustworthy AI series, Grégoire Montavon (TU Berlin) will present his research on eXplainable AI (XAI) and trust.

WHAT IS THE TRUSTWORTHY AI SERIES?

Artificial Intelligence (AI) systems have steadily grown in complexity, gaining predictive power often at the expense of interpretability, robustness and trustworthiness. Deep neural networks are a prime example of this development. While reaching “superhuman” performance in various complex tasks, these models are susceptible to errors when confronted with tiny (adversarial) variations of the input: variations that are either not noticeable to humans or can be handled reliably by them. This expert talk series will discuss these challenges of current AI technology and present new research that aims to overcome these limitations and to develop AI systems that can be certified as trustworthy and robust.

The expert talk series will cover the following topics:

  • Measuring Neural Network Robustness
  • Auditing AI Systems
  • Adversarial Attacks and Defences
  • Explainability & Trustworthiness
  • Poisoning Attacks on AI
  • Certified Robustness
  • Model and Data Uncertainty
  • AI Safety and Fairness

The Trustworthy AI series is moderated by Wojciech Samek, Head of AI Department at Fraunhofer HHI, one of the top 20 AI labs in the world.

Grégoire Montavon: Explainable AI (XAI) and trust

Shownotes

00:00 Opening Remarks by ITU

01:00 Introduction by Wojciech Samek

01:58 Introduction by Grégoire Montavon

02:27 Why do we need Trust in AI?

  • To make high-stakes decisions with AI, the model must be trustworthy.

03:37 Machine Learning Decisions

  • Most AI systems are built with machine learning.
  • Machine learning shifts the focus to collecting the data that the decision function has to predict correctly, rather than specifying the function by hand.
  • Can we trust machine learning models?

04:41 Horse detection example

  • Given an input image of a horse, a machine learning model predicts the output label “horse”.
  • When classifying images, a model may rely on a non-informative feature (e.g. a watermark) that fools it. Such “Clever Hans” models are unlikely to perform well on future data.

10:08 How do we get these heatmaps?

  • Layer-wise propagation

10:52 Layer-wise Relevance Propagation (LRP)

  • An advantage is that its cost is on the order of a single backward pass (no need to evaluate the function multiple times).
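
The single-backward-pass idea can be illustrated with a minimal sketch of one common LRP variant, the epsilon rule, on a tiny bias-free ReLU network. The network, its random weights, and the names `forward` and `lrp_epsilon` are assumptions made for illustration; they are not taken from the talk.

```python
import numpy as np

def forward(x, weights):
    """Forward pass; returns the activations of every layer (input included)."""
    activations = [x]
    for i, W in enumerate(weights):
        z = activations[-1] @ W
        # ReLU on hidden layers, linear output layer (illustrative choice)
        a = np.maximum(z, 0.0) if i < len(weights) - 1 else z
        activations.append(a)
    return activations

def lrp_epsilon(activations, weights, target, eps=1e-9):
    """Redistribute the target output score to the inputs, layer by layer."""
    out = activations[-1]
    relevance = np.zeros_like(out)
    relevance[target] = out[target]          # start from the explained output
    # Walk the layers once, from the output back to the input
    for W, a in zip(reversed(weights), reversed(activations[:-1])):
        z = a @ W                                   # pre-activations of the upper layer
        z = z + eps * np.where(z >= 0, 1.0, -1.0)   # stabiliser, avoids division by zero
        s = relevance / z
        relevance = a * (W @ s)                     # relevance of the lower layer
    return relevance

# Hypothetical toy network: 4 inputs, 6 hidden units, 3 outputs
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 6)), rng.normal(size=(6, 3))]
x = rng.normal(size=4)

activations = forward(x, weights)
R = lrp_epsilon(activations, weights, target=0)
print("input relevances:", R)
print("sum of relevances:", R.sum(), "output score:", activations[-1][0])
```

In this bias-free setting the input relevances sum (up to the small stabiliser) to the output score being explained, and the whole explanation costs one forward and one backward sweep over the layers.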

12:48 Can LRP be Justified Theoretically?

13:17 Deep Taylor Decomposition

  • We can apply Taylor expansions at each layer of the neural network.
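
As a rough illustration of a Taylor expansion at one layer, the sketch below decomposes the output of a single bias-free ReLU neuron into first-order terms around a root point. Searching for the root point along the weight direction is only one possible choice, assumed here for simplicity; the names `relu_neuron` and `taylor_relevance` and the numbers are hypothetical.

```python
import numpy as np

def relu_neuron(a, w):
    """Output of a single ReLU neuron without bias."""
    return max(0.0, float(w @ a))

def taylor_relevance(a, w):
    """First-order Taylor terms of the neuron output around a root point."""
    if relu_neuron(a, w) == 0.0:
        return np.zeros_like(a)      # inactive neuron: nothing to redistribute
    # Root point on the line a - t * w where the neuron output becomes zero
    t = (w @ a) / (w @ w)
    root = a - t * w
    # On the active side the gradient is w, so term i is w_i * (a_i - root_i)
    return w * (a - root)

a = np.array([1.0, 2.0, 0.5])        # hypothetical input activations
w = np.array([0.5, 1.0, -1.0])       # hypothetical weights

R = taylor_relevance(a, w)
print("neuron output:", relu_neuron(a, w))
print("Taylor terms :", R, "sum:", R.sum())
```

Because the neuron is linear on its active side, the first-order terms add up exactly to the neuron output, which is the conservation property that the layer-wise decomposition relies on.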

14:36 LRP is More Stable than Gradient

  • The gradient yields a very noisy explanation.
  • The function learned by the neural network has very small high-frequency variations.
  • As a result, the gradient flips from positive to negative very quickly (a disadvantage).
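
The effect described above can be reproduced on a toy one-dimensional function. The function below is a hypothetical stand-in for a trained network, not a model from the talk: it stays very close to the identity, yet its gradient changes sign repeatedly because of a small high-frequency ripple.

```python
import numpy as np

def f(x):
    """Identity plus a tiny high-frequency ripple (assumed toy function)."""
    return x + 0.02 * np.sin(200.0 * x)

def grad_f(x):
    """Analytic derivative of f."""
    return 1.0 + 4.0 * np.cos(200.0 * x)

xs = np.linspace(0.0, 1.0, 2001)
values = f(xs)
grads = grad_f(xs)

# The function itself barely deviates from the smooth trend y = x ...
max_deviation = np.max(np.abs(values - xs))
# ... but the sign of its gradient flips many times on the same interval
sign_flips = int(np.sum(np.diff(np.sign(grads)) != 0))

print("largest deviation from y = x:", max_deviation)
print("number of gradient sign flips on [0, 1]:", sign_flips)
```

The ripple never moves the function by more than 0.02, but the gradient changes sign dozens of times on the unit interval, so a purely gradient-based explanation of this function would look noisy even though the function is almost linear.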

16:00 LRP on Different Types of Data / Models

  • Medical data (images/fMRI/EEG)
  • Arcade games (video games)
  • Natural language (finding relevant words)
  • Speech (voice recognition)
  • DNN Classifiers
  • Anomaly models (strokes)
  • Similarity models

18:23 Advanced Explanation with GNN-LRP

19:00 Systematically Finding Clever Hans

  • The decision artefact was found incidentally, by having a user look at an explanation for some image of the class “horse”. But can we achieve a broader and more systematic inspection of the model?

19:58 Idea: Spectral Relevance Analysis (SpRAy)

  • Step 1: Compute explanations for all data points.
  • Step 2: Organize the explanations into clusters. Clever Hans effects can now be found systematically.
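
The two steps can be sketched on synthetic data. The 8×8 "heatmaps" below are generated by hand rather than computed from a real model, and the clustering is a bare-bones two-way spectral split written with NumPy; both are simplifying assumptions for illustration, not the procedure used in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_heatmap(kind):
    """Synthetic stand-in for an explanation: relevance on the object or on a corner tag."""
    h = 0.05 * rng.random((8, 8))            # background noise
    if kind == "object":
        h[2:6, 2:6] += 1.0                   # relevance on the object itself
    else:
        h[6:8, 0:2] += 1.0                   # relevance on a watermark-like corner
    return h.ravel()

# Step 1 (simulated): one explanation per data point
kinds = ["object"] * 30 + ["watermark"] * 10
X = np.stack([make_heatmap(k) for k in kinds])

# Step 2: spectral clustering of the explanations
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
affinity = np.exp(-sq_dists / np.mean(sq_dists))   # Gaussian affinity, heuristic bandwidth
np.fill_diagonal(affinity, 0.0)
degree = affinity.sum(axis=1)
laplacian = np.eye(len(X)) - affinity / np.sqrt(np.outer(degree, degree))
eigvals, eigvecs = np.linalg.eigh(laplacian)       # eigenvalues in ascending order
fiedler = eigvecs[:, 1]                            # second-smallest eigenvector
labels = (fiedler > 0).astype(int)                 # two-way split by sign

# Inspect each cluster: its size and where its average relevance peaks
for c in (0, 1):
    mean_map = X[labels == c].mean(axis=0).reshape(8, 8)
    peak = np.unravel_index(mean_map.argmax(), mean_map.shape)
    print("cluster", c, "size", int((labels == c).sum()), "peak at", peak)
```

A small cluster whose average relevance peaks in an image corner instead of on the object is the kind of pattern a user would then inspect as a possible Clever Hans strategy.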

22:08 The Revolution of Depth (2012-…)

  • Deep neural networks trained on millions of labeled images.

23:08 Clever Hans on the VGG-16 Image Classifier

  • There are still several Clever Hans effects in this classifier.

23:37 XAI Current Challenges

  • Explanation Fidelity: Explanation must accurately capture the decision strategy of the model. Accurately evaluating explanation fidelity is still an open question.
  • Explanation Understandability: When the decision strategy is complex, the user may not be able to distinguish between a correct and a flawed decision strategy, even if the explanation is correct.
  • Explanation for Validating an ML Model: Even after applying SpRAy, there may in theory still be hidden Clever Hans strategies in the model (especially for models with a strong ability to generalize).
  • Explanation Robustness: XAI is potentially vulnerable to adversarial attacks (e.g. crafting inputs and models that produce wrong explanations).

28:27 Towards Trustworthy AI

  • High-stakes autonomous decisions require trustworthy models. So far, this is only fully achievable for simple models.
  • Explainable AI is making rapid progress in making complex ML models more trustworthy.

30:01 Explainable AI book

30:18 www.heatmapping.org

30:48 References

30:54 Q&A Session

31:10 How can trustworthiness be measured, and how does the certification process work? Do you think explanations are important and can play a role in such a verification process, and how? I am not an expert in auditing/certification. But these models can be implemented in industry. An explanation cannot be used to certify a model on its own, but it can potentially be part of the validation/certification process.

33:32 How does your LRP compare with Google’s XRAI algorithm? I don’t know the XRAI algorithm. Generally, there are different ways to evaluate an explanation technique, and which factors are important is application-dependent.

34:33 What are your thoughts on explainability for other models? Can it be applied to systems such as clustering? Yes, explainable AI can be applied to clustering.

35:33 Can XAI methods discriminate between classes (cat vs. dog example)? With recent XAI methods, it is possible to produce explanations that are specific to one class. Use different LRP rules at different layers.

37:17 Last time we talked about backdoors. Do you think we can use explanation methods to detect backdoors in poisoning attacks? Definitely yes. Once you have identified an artefact, you can identify backdoors, but some of them may be missed.

39:23 Where is explanation going (the future of XAI)? In many directions: making XAI more broadly applicable and making explanations more understandable for humans.

41:07 Do you think there are limits to explanations, such as functions that are hard to explain? Some functions perform better than humans, and a human may not be able to understand the explanation of these functions, even if the explanation is correct.

43:10 Have you tried to calculate heatmaps for images that have been altered with adversarial perturbations? We tried this at some point. Adversarial perturbations are often not interpretable by humans, which can lead to noisy explanations.

44:20 What else can we do with explanations? Model validation, model improvement, understanding the data (e.g. in the sciences).

45:49 Closing from ITU
