
Title: ECCCos from the Black Box: Faithful Explanations through Energy-Constrained Conformal Counterfactuals

Keywords: Explainable AI, Counterfactual Explanations, Algorithmic Recourse, Energy-Based Models, Conformal Prediction

Abstract: (see paper)

Corresponding Author: p.altmeyer@tudelft.nl

Reviewer Nomination: Arie.vanDeursen@tudelft.nl

Primary Area: Interpretability and Explainability

Claims: Yes

Code of Ethics: Yes

Broader Impacts: A narrow focus on generating plausible counterfactuals may lead practitioners and researchers to mistakenly believe that even a highly vulnerable black-box model has learned plausible data representations. Our work aims to mitigate this risk.

Limitations: Yes

Theory: While we do not include any theoretical results in the form of formal proofs, we approach the topic of Counterfactual Explanations from a new theoretical angle in this work. Where necessary, we have clearly stated our assumptions.

Experiments: Yes

Training Details: Yes

Error Bars: Yes

Compute: All of our experiments can be run locally on a personal machine. We provide details regarding training times and compute in the supplementary material.

Reproducibility: Yes

Safeguards: n/a

Licenses: Yes

Assets: Yes

Human Subjects: n/a

IRB Approvals: n/a

TLDR: We leverage ideas from energy-based modelling and conformal prediction to generate faithful Counterfactual Explanations that can distinguish trustworthy from unreliable models.