diff --git a/paper/paper.pdf b/paper/paper.pdf
index c2073ea05e680bdb3d380fc0146d21a59edc7ba2..df78c2736dc7c33f5e831bc52f4429fbfdfb3211 100644
Binary files a/paper/paper.pdf and b/paper/paper.pdf differ
diff --git a/paper/paper.tex b/paper/paper.tex
index bb65d6076dc916958e662395605fadd27a437533..faa5f70fbe2cc3f8e53f4c1631c42a03bbc558ac 100644
--- a/paper/paper.tex
+++ b/paper/paper.tex
@@ -31,12 +31,16 @@
 \usepackage{microtype} % microtypography
 \usepackage{xcolor} % colors
 \usepackage{amsmath}
+\usepackage{amsthm}
 \usepackage{graphicx}
 
 % Bibliography
 \bibliographystyle{plainnat}
 \setcitestyle{numbers,square,comma}
 
+% Numbered Environments:
+\newtheorem{definition}{Definition}[section]
+
 \title{Conformal Counterfactual Explanations}
 
 
@@ -123,7 +127,16 @@ Solutions to Equation~\ref{eq:general} are considered valid as soon as the predi
 The crucial difference between Adversarial Examples (AE) and Counterfactual Explanations is one of intent. While an AE is intended to go unnoticed, a CE should have certain desirable properties. The literature has made this explicit by introducing various so-called \textit{desiderata}. To properly serve both AI practitioners and individuals affected by AI decision-making systems, counterfactuals should be sparse, proximate~\citep{wachter2017counterfactual}, actionable~\citep{ustun2019actionable}, diverse~\citep{mothilal2020explaining}, plausible~\citep{joshi2019realistic,poyiadzi2020face,schut2021generating}, robust~\citep{upadhyay2021robust,pawelczyk2022probabilistically,altmeyer2023endogenous} and causal~\citep{karimi2021algorithmic} among other things. Researchers have come up with various ways to meet these desiderata, which have been surveyed in~\citep{verma2020counterfactual} and~\citep{karimi2020survey}.
 
-To fulfil this latter goal, researchers have come up with a myriad of ways. @joshi2019realistic were among the first to suggest that instead of searching counterfactuals in the feature space, we can instead traverse a latent embedding learned by a surrogate generative model. Similarly, @poyiadzi2020face use density ... Finally, @karimi2021algorithmic argues that counterfactuals should comply with the causal model that generates them [CHECK IF WE CAN PHASE THIS LIKE THIS]. Other related approaches include ... All of these different approaches have a common goal: they aim to ensure that the generated counterfactuals comply with the (learned) data-generating process (DGB).
+Finding ways to generate \textit{plausible} counterfactuals has been one of the primary concerns. To this end, \citet{joshi2019realistic} were among the first to suggest that instead of searching counterfactuals in the feature space $\mathcal{X}$, we can instead traverse a latent embedding $\mathcal{Z}$ that implicitly codifies the data-generating process (DGP) of $x\sim\mathcal{X}$. To learn the latent embedding, they introduce a surrogate model. In particular, they propose to use the latent embedding of a Variational Autoencoder (VAE) trained to generate samples $x^*\sim \mathcal{G}(z)$, where $\mathcal{G}$ denotes the decoder part of the VAE. Provided the surrogate model is well-trained, their proposed approach can yield compelling counterfactual explanations like the one in the centre panel of Figure~\ref{fig:vae}.
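+As a minimal sketch of this idea, let $\ell(\cdot)$ be generic shorthand for the counterfactual search objective in Equation~\ref{eq:general}, implicitly depending on the factual $x$ and the target class. Instead of optimizing over $\mathcal{X}$ directly, the search is carried out in the latent space and candidate counterfactuals are decoded back through $\mathcal{G}$:
+\begin{equation*}
+  z^{\prime} \in \arg\min_{z \in \mathcal{Z}} \ell\left(\mathcal{G}(z)\right), \qquad x^{\prime} = \mathcal{G}(z^{\prime}).
+\end{equation*}
+Because every candidate is a decoder output, $x^{\prime}$ inherits the structure of the DGP that the surrogate model has captured.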
+
+Others have proposed similar approaches. \citet{dombrowski2021diffeomorphic} traverse the base space of a normalizing flow to solve Equation~\ref{eq:general}, essentially relying on a different surrogate model for the generative task. \citet{poyiadzi2020face} use density estimators ($\hat{p}: \mathcal{X} \mapsto [0,1]$) to constrain the counterfactual paths. \citet{karimi2021algorithmic} argue that counterfactuals should comply with the causal model that generates the data. All of these different approaches share a common goal: ensuring that the generated counterfactuals comply with the true and unobserved DGP. To summarize this broad objective, we propose the following definition:
+
+\begin{definition}[Plausible Counterfactuals]
+  \label{def:plausible}
+  Let $\mathcal{X}|t$ denote the true conditional distribution of samples in the target class $t$. Then for $x^{\prime}$ to be considered a plausible counterfactual, we need: $x^{\prime} \sim \mathcal{X}|t$.
+\end{definition}
+
+Note that Definition~\ref{def:plausible} subsumes the notion of plausible counterfactual paths, since we can simply apply it to each counterfactual state along the path.
 
 \begin{figure}
   \centering