Echoes from the Black Box

Counterfactual Explanations and Probabilistic Methods for Trustworthy Machine Learning

Delft University of Technology

Arie van Deursen
Cynthia C. S. Liem

May 14, 2024

Quick Introduction

  • Currently 2nd year of PhD in Trustworthy Artificial Intelligence at Delft University of Technology.
  • Previously, educational background in Economics and Finance and two years in Monetary Policy at the Bank of England.
  • Interested in applying Trustworthy AI to real-world problems, particularly in the financial sector.

Background

Counterfactual Explanations

Counterfactual Explanations (CE) explain how the inputs to a model need to change for it to produce different outputs.

Provided the changes are realistic and actionable, they can be used for Algorithmic Recourse (AR) to help individuals who face adverse outcomes.
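Conceptually, a gradient-based counterfactual search perturbs the factual input until the model's prediction flips, while a distance penalty keeps the counterfactual close to the original. A minimal sketch in Python, using a hypothetical fixed logistic model as a stand-in for a trained classifier (this is illustrative only, not the CounterfactualExplanations.jl API):

```python
import numpy as np

# Hypothetical trained binary classifier: logistic regression with fixed weights.
w = np.array([1.0, -2.0])
b = 0.5

def predict_proba(x):
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def counterfactual_search(x, target=1.0, lr=0.1, steps=200, lam=0.01):
    """Gradient descent on the *input*: minimise squared loss towards `target`
    plus an L2 penalty (`lam`) keeping the counterfactual close to `x`."""
    x_cf = x.copy()
    for _ in range(steps):
        p = predict_proba(x_cf)
        # d/dx_cf of (p - target)^2 + lam * ||x_cf - x||^2
        grad = 2 * (p - target) * p * (1 - p) * w + 2 * lam * (x_cf - x)
        x_cf -= lr * grad
    return x_cf

x = np.array([-1.0, 1.0])        # factual: predicted 'loan denied'
x_cf = counterfactual_search(x)  # counterfactual: prediction flips
```

The same idea generalises to any differentiable model; richer generators swap in different penalties to steer the search towards realistic, actionable changes.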

Example: Consumer Credit

From ‘loan denied’ to ‘loan supplied’: CounterfactualExplanations.jl 📦.

Figure 1: Gradient-based counterfactual search.
Figure 2: Counterfactuals for Give Me Some Credit dataset (Kaggle 2011).

Conformal Prediction

Conformal Prediction is a model-agnostic, distribution-free approach to Predictive Uncertainty Quantification: ConformalPrediction.jl 📦.

Figure 3: Conformal Prediction intervals for regression.
Figure 4: Conformal Prediction sets for an Image Classifier.
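The intervals in Figure 3 can be produced with split conformal prediction: fit a model on one half of the data, compute nonconformity scores (here, absolute residuals) on the other half, and widen predictions by the calibrated score quantile. A self-contained Python sketch on toy data (illustrative only, not the ConformalPrediction.jl API):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = 2x + noise.
x = rng.uniform(0, 1, 200)
y = 2 * x + rng.normal(0, 0.1, 200)

# Split: fit a model on one half, calibrate on the other.
x_fit, y_fit = x[:100], y[:100]
x_cal, y_cal = x[100:], y[100:]

slope, intercept = np.polyfit(x_fit, y_fit, 1)
predict = lambda t: slope * t + intercept

# Nonconformity scores: absolute residuals on the calibration set.
scores = np.abs(y_cal - predict(x_cal))

# For coverage 1 - alpha, take the ceil((n+1)(1-alpha))/n empirical quantile.
alpha = 0.1
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Prediction interval for a new point: [yhat - q, yhat + q].
x_new = 0.5
interval = (predict(x_new) - q, predict(x_new) + q)
```

Under exchangeability, intervals built this way cover the true outcome with probability at least 1 - alpha, regardless of the underlying model or data distribution.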

Joint Energy Models

Joint Energy Models (JEMs) are hybrid models trained to learn the conditional output and input distribution (Grathwohl et al. 2020): JointEnergyModels.jl 📦.

Figure 5: A JEM trained on Circles data.
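The key observation in Grathwohl et al. (2020) is that a classifier's logits already define both distributions: softmax over the logits gives the conditional p(y|x), while the (negative) logsumexp of the logits serves as an energy for p(x). A small numeric sketch of that reinterpretation (JEM training itself additionally requires sampling, e.g. via SGLD, which is omitted here):

```python
import numpy as np

def logsumexp(z):
    # Numerically stable log(sum(exp(z))).
    m = z.max()
    return m + np.log(np.exp(z - m).sum())

# Given logits f(x), a JEM reads off:
#   p(y|x) = softmax(f(x))              -- the usual discriminative part
#   p(x)  ∝ exp(logsumexp_y f(x)[y])    -- i.e. energy E(x) = -logsumexp_y f(x)[y]
def p_y_given_x(logits):
    return np.exp(logits - logsumexp(logits))

def energy(logits):
    return -logsumexp(logits)

logits = np.array([2.0, -1.0, 0.5])  # hypothetical logits for one input
probs = p_y_given_x(logits)
E = energy(logits)
```

Low energy corresponds to inputs the model regards as plausible, which is what makes JEMs useful for judging counterfactuals later on.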

Research Questions

Recourse Dynamics

We present evidence suggesting that state-of-the-art applications of Algorithmic Recourse to groups of individuals induce large domain and model shifts and propose ways to mitigate this (IEEE SaTML paper).

Joint work with Giovan Angela, Karol Dobiczek, Aleksander Buszydlik, Arie van Deursen and Cynthia C. S. Liem (all TU Delft).

A Balancing Act

  • Minimizing private costs generates external costs for other stakeholders.
  • To avoid this, counterfactuals need to be plausible, i.e. comply with the data-generating process.
  • In practice, costs to various stakeholders need to be carefully balanced.

Is plausibility really all we need?

Pick your Poison?

All of these counterfactuals are valid explanations for the model’s prediction. Which one would you pick?

Figure 6: Turning a 9 into a 7: Counterfactual Explanations for an Image Classifier.

What do Models Learn?

These images are sampled from the posterior distribution learned by the model. Looks different, no?

Figure 7: Conditionally Generated Images from the Image Classifier.

ECCCos from the Black Box

We propose a framework for generating Energy-Constrained Conformal Counterfactuals (ECCCos) which explain black-box models faithfully.

Joint work with Mojtaba Farmanbar (ING), Arie van Deursen (TU Delft) and Cynthia C. S. Liem (TU Delft).

Figure 8: Gradient fields and counterfactual paths for different generators.
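The generators in Figure 8 differ in which penalties shape the counterfactual search. A heavily simplified sketch of an ECCCo-style objective, combining target loss, distance, an energy penalty and a smoothed conformal set-size term; the function name, weights and smoothing here are hypothetical stand-ins, not the paper's exact formulation:

```python
import numpy as np

def logsumexp(z):
    m = z.max()
    return m + np.log(np.exp(z - m).sum())

def ecco_style_loss(logits, target, x_cf, x, q_hat, lam_dist=0.1, lam_energy=0.05):
    """Illustrative multi-criteria counterfactual objective (weights hypothetical)."""
    probs = np.exp(logits - logsumexp(logits))
    yloss = -np.log(probs[target])      # hit the target class
    dist = np.sum(np.abs(x_cf - x))     # stay close to the factual
    energy = -logsumexp(logits)         # low energy = plausible under p(x)
    # Smoothed conformal set size: (soft) count of classes whose score clears
    # the calibrated threshold q_hat; larger sets signal predictive uncertainty.
    set_size = np.sum(1.0 / (1.0 + np.exp(-(probs - q_hat) / 0.1)))
    return yloss + lam_dist * dist + lam_energy * energy + set_size

x = np.zeros(2)
# A confident, low-energy prediction scores better than an uncertain one.
loss_confident = ecco_style_loss(np.array([4.0, -2.0]), 0, x, x, q_hat=0.5)
loss_uncertain = ecco_style_loss(np.array([0.1, 0.0]), 0, x, x, q_hat=0.5)
```

Penalising both energy and conformal set size is what ties the counterfactual to what the black-box model itself has actually learned, rather than to a surrogate's notion of plausibility.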

Trustworthy AI in Julia

🐶 Taija

Research informs development, development informs research.

Trustworthy Artificial Intelligence in Julia.

Taija is a collection of open-source packages for Trustworthy AI in Julia. Our goal is to help researchers and practitioners assess the trustworthiness of predictive models.

Our work has been presented at JuliaCon 2022 and JuliaCon 2023, and we hope to present it there again in the future.

Questions?

LinkedIn (Personal) Twitter Github Medium

References

Grathwohl, Will, Kuan-Chieh Wang, Joern-Henrik Jacobsen, David Duvenaud, Mohammad Norouzi, and Kevin Swersky. 2020. “Your Classifier Is Secretly an Energy Based Model and You Should Treat It Like One.” In International Conference on Learning Representations.
Kaggle. 2011. “Give Me Some Credit: Improve on the State of the Art in Credit Scoring by Predicting the Probability That Somebody Will Experience Financial Distress in the Next Two Years.” Kaggle. https://www.kaggle.com/c/GiveMeSomeCredit.