Commit 3dd8172c authored by Pat Alt

revision of rebuttals

parent 6a632eb1
Thank you!
We agree with several of the weaknesses pointed out and will address them below. To start off, we want to address what has been described as the `major weakness' of our paper: the remark that our results indicate that ECCCo does not directly help with plausibility for `weaker' models. That is mostly correct, but let us make the case for why this should not be considered a weakness at all:
\begin{itemize}
\item We would argue that this is a desirable property of ECCCo, if our priority is to understand model behaviour: lower plausibility conditional on high fidelity implies that the model itself has learned implausible explanations for the data (we point to this in lines 237-239, 305-307, 322-324, 340-342, ...).
\item More specifically, we think this characteristic is valuable for the following reasons:
\begin{itemize}
\item For practitioners/researchers this is valuable information indicating that despite good predictive performance, the learned posterior density $p_{\theta}(\mathbf{x}|\mathbf{y^{+}})$ is high in regions of the input domain that are implausible (in the sense of Def 2.1, i.e. the corresponding true density $p(\mathbf{x}|\mathbf{y^{+}})$ is low in those same regions).
\item Instead of using surrogate-aided counterfactual search engines to sample those counterfactuals from $p_{\theta}(\mathbf{x}|\mathbf{y^{+}})$ that are indeed plausible, we would argue that the appropriate course of action in such cases should generally be to improve the model.
\item We agree that this places an additional burden on researchers/practitioners, but that does not render ECCCo impractical. In situations where providing actionable recourse is an absolute priority, practitioners can always resort to REVISE and related tools in the short term. Major discrepancies between ECCCo and surrogate-aided tools should then at the very least signal to researchers/practitioners that the underlying model needs to be improved in the medium term.
\end{itemize}
\end{itemize}
To conclude, we believe that ECCCo and derivative works have the potential to help us identify models that have learned implausible explanations for the data and to improve upon them. To illustrate this, we have relied on gradually improving our classifiers through ensembling and joint energy modelling. We chose to focus on JEMs because:
\begin{itemize}
\item ECCCo itself uses ideas underlying JEMs (sketched briefly below this list).
\item JEMs have been shown to have multiple desirable properties including robustness and good predictive uncertainty quantification. Based on the previous literature on counterfactuals, these model properties should generally positively correlate with the plausibility of counterfactuals (and our findings seem to confirm this).
\end{itemize}
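For reference, the following is a brief, schematic sketch (not a full account) of the JEM idea that ECCCo builds on, following Grathwohl et al. (2020): the classifier logits $f_{\theta}(\mathbf{x})$ are reinterpreted as unnormalised joint log-densities,
\begin{equation*}
p_{\theta}(\mathbf{x},\mathbf{y}) = \frac{\exp(f_{\theta}(\mathbf{x})[\mathbf{y}])}{Z(\theta)}, \qquad p_{\theta}(\mathbf{x}|\mathbf{y^{+}}) \propto \exp(f_{\theta}(\mathbf{x})[\mathbf{y^{+}}]),
\end{equation*}
which allows the same network to be trained jointly as a classifier and as a generative model of the inputs.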
We agree with the criticism that the `visual quality of generated counterfactuals seems to be low' and we acknowledge the remark concerning the `diversity of generated counterfactuals', but:
\begin{itemize}
\item The visual quality and diversity of the counterfactuals (Fig. 6 in suppl.) seem to faithfully represent the generative property of the model itself.
\item If diversity is crucial, our implementation is fully compatible with adding diversity penalties as in DiCE (Mothilal et al., 2019); see the sketch below this list.
\end{itemize}
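To illustrate the latter point, here is a minimal, hypothetical sketch of such a penalty; the function name and interface are ours for illustration only and not part of our package API:

```julia
using LinearAlgebra

# DiCE-style diversity penalty (Mothilal et al., 2019): the negative determinant
# of a kernel matrix built from inverse pairwise L1 distances between candidate
# counterfactuals. Lower penalty values correspond to a more diverse set.
function diversity_penalty(counterfactuals::Vector{<:AbstractVector})
    n = length(counterfactuals)
    K = [1 / (1 + norm(counterfactuals[i] - counterfactuals[j], 1)) for i in 1:n, j in 1:n]
    return -det(K)
end
```

A term like this could simply be added to the counterfactual search objective alongside our existing penalties.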
We do agree with the criticism that our work could benefit from including other classes of models that can be expected to learn more plausible explanations than our small MLPs (ResNet, CNN, Transformer, adversarially-trained networks, Bayesian NNs, ...). We also agree that additional, more complex datasets should be considered in this context and we intend to tackle this in future work.
\begin{itemize}
\item We would argue that these are limitations of our work, but not necessarily weaknesses. As we have argued elsewhere, this work was limited in both scope and size. Including more experiments would have meant compromising on explanations and elaborations regarding our setup that we consider critical.
\item These limitations could be made more explicit in a camera-ready version of the paper, should it come to that.
\end{itemize}
Finally, let us try to answer the specific questions that were raised:
\begin{itemize}
\item In line 178 we (belatedly) mention that the L1 norm is our default choice for dist$(\cdot)$. We realise now that it is not obvious that this also applies to Equations 3 and 4 and will fix that. Note that we also experimented with other distance/similarity metrics, but found the differences in outcomes small enough to consistently rely on L1 for its sparsity-inducing properties.
\item $f$ by default just rescales the input data: GMSC data is standardized and MNIST images are rescaled to $[-1,1]$ (mentioned in Appendix D, lines 572-576, but maybe this indeed belongs in the body); $f^{-1}$ is simply the inverse transformation (a minimal sketch follows below this list). Synthetic data is not rescaled. We still explicitly mention $f$ here to stay consistent with the generalised notation in Equation (1). For example, $f$/$f^{-1}$ could just as well be a compression/decompression scheme or an encoder/decoder pair as in REVISE.
\item In all of our experiments we set $\alpha=0.05$ (90\% target coverage) and $\kappa=1$ to avoid penalising sets of size one. We will add this to Appendix D, thanks for flagging. Note that we did experiment with these parameter choices, but as we point out in the paper, more work is needed to better understand the role of Conformal Prediction in this context.
\item We have just run ECCCo and Wachter for a single MNIST digit on a single machine (no GPU) using default parameters from the experiment:
\begin{itemize}
\item ECCCo: \texttt{4.065607 seconds (4.34 M allocations: 1.011 GiB, 7.62\% gc time)}.
\item Wachter: \texttt{1.899047 seconds (2.16 M allocations: 343.889 MiB, 4.59\% gc time, 74.80\% compilation time)}.
\end{itemize}
This is not performance-optimised code and the bulk of the runtime and allocations is driven by sampling through SGLD. Note that while in our experiments we chose to resample for each individual counterfactual explanation, in practice, sampling could be done once for a given dataset. In any case, the computational burden should typically be lower than the overhead involved in training a sufficiently expressive VAE for REVISE, for example.
\end{itemize}
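As referenced above, a minimal sketch of the default rescaling $f$ and its inverse (helper names are illustrative, not part of our package API):

```julia
# MNIST: raw pixel values are assumed to lie in [0, 1].
f(x) = 2 .* x .- 1          # rescale inputs to [-1, 1]
f_inv(z) = (z .+ 1) ./ 2    # map model inputs back to [0, 1]

# GMSC: f standardises features given training means μ and stds σ.
f_std(x, μ, σ) = (x .- μ) ./ σ
f_std_inv(z, μ, σ) = z .* σ .+ μ
```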
We also thank the reviewer for their suggestions and will take these on board. The `ECCCo' vs. `ECCCos' distinction actually caused us some headaches: we eventually tried to highlight that \textit{ECCCo} refers to the generator, hence it is shown in italics, consistent with the other generators. Perhaps it makes more sense to drop the distinction between the two.
To conclude, we hope that despite its limitations, the reviewer finds enough value and novelty in our work for it to already be shared with the community. We think it provides a fresh perspective on the notion of faithfulness in the context of Counterfactual Explanations. It should hopefully provide an inspiring baseline for others to explore this important topic further.
Many thanks!
Thank you!
The reviewer has nicely summarised our work and we are happy to see that the main messages of the paper evidently came across. We also appreciate the mention of `honest acknowledgment of method limitations' as one of the strengths of the paper; this has indeed been important to us.
Regarding the specific question/suggestion raised by the reviewer, we do actually cite Welling \& Teh (2011) in line 145, but we can move that up to line 144 to make it clearer.
Many thanks!
We would like to thank all of the reviewers for their detailed and thoughtful reviews; your feedback is truly appreciated.
Based on the reviewers' helpful suggestions, we plan to extend section 7 to deepen the interpretation of the results presented in this work as well as its limitations. Below, we respond to points that have been raised by at least two reviewers; our individual responses to each reviewer contain additional points.
### Point 1 (Real-world data)
*Summary:*
> Some reviewers have noted that "experiments with real-world data a bit limited" and "only conducted on small-scale datasets".
*Response:*
We agree that further work could benefit from including additional datasets and will make this point clear in section 7. That being said, we have relied on datasets commonly used in similar studies. Due to the size and scope of this work, we have decided to focus on conveying our motivation, methodology and conclusions through illustrative datasets.
### Point 2 (Models)
*Summary:*
> Some reviewers have noted that "focus of the models being tested seems narrow". The work could benefit from including additional models like "MLPs, CNNs, or transformer".
*Response:*
We agree that further work could benefit from including additional models and will make this point clear in section 7. In line with similar studies, we have chosen simple neural network architectures as our starting point. Moving on from there, our goal has been to understand whether we can improve these simple models through joint-energy training, in order to yield more plausible counterfactuals that faithfully convey the improved quality of the underlying model. On this point, we think that our experiments provide sufficient evidence. The size and scope of this work ultimately led us to prioritise this main point. The question of which kinds of models yield the most plausible and faithful counterfactuals (e.g. "MLPs, CNNs, or transformer", but also Bayesian NNs and adversarially trained NNs) is interesting in itself, but something we have deferred to future work. We will be clearer about this in section 7.
### Point 3 (Plausibility and Applicability)
*Summary:*
> Some reviewers have expressed concern around whether "ECCCo generates plausible counterfactuals beyond synthetic datasets for non-JEM-based classifiers" and asked for qualitative examples of non-JEM-based counterfactuals. Failure to produce plausible counterfactuals "could significantly limit ECCCos’ applicability and utility for researchers as well as practitioners alike".
*Response:*
We agree that additional qualitative examples for MNIST can help to demonstrate that ECCCo does indeed uncover plausible patterns learned by non-JEM-based classifiers. In the companion PDF we provide such examples, which also include a larger deep ensemble and a simple CNN (LeNet-5), both of which tend to yield more plausible counterfactuals than a simple MLP. Based on the reviewers' suggestions, we will move these into the supplementary material. With respect to our other real-world dataset, the results in Table 2 indicate that ECCCo consistently achieves substantially higher plausibility than Wachter.
It is important to note here that ECCCo aims to generate faithful counterfactuals first and foremost. Plausibility is achieved only to the extent that the underlying model has learned plausible explanations for the data. Thus, we disagree that failure to produce plausible counterfactuals would limit ECCCo's usefulness in practice. We argue that this should not be seen as a weakness, but rather as a strength of ECCCo, for the following reasons:
- For practitioners/researchers it is valuable information indicating that despite good predictive performance, the learned posterior density $p_{\theta}(\mathbf{x}|\mathbf{y^{+}})$ is high in regions of the input domain that are implausible (in the sense of Def 2.1, i.e. the corresponding true density $p(\mathbf{x}|\mathbf{y^{+}})$ is low in those same regions).
- Instead of using surrogate-aided counterfactual search engines to sample those counterfactuals from $p_{\theta}(\mathbf{x}|\mathbf{y^{+}})$ that are indeed plausible, we would argue that the appropriate course of action in such cases should generally be to improve the model.
- We agree that this places an additional burden on researchers/practitioners, but that does not render ECCCo impractical. In situations where providing actionable recourse is an absolute priority, practitioners can always resort to REVISE and related tools in the short term. Major discrepancies between ECCCo and surrogate-aided tools should then at the very least signal to researchers/practitioners that the underlying model needs to be improved in the medium term.
Based on the reviewers' observations in this context, we will clarify this tension between faithfulness and plausibility further by sharpening the relevant paragraphs in our paper.
Thank you!
Many of the weaknesses pointed out here seem to centre on mathematical notions, so we will try to address these one by one with reference to the corresponding explanations in the paper.
The first explicit concern raised is about `lacking descriptions and explanations' of mathematical notation:
\begin{itemize}
\item We state in Definition 4.1 that $p_{\theta}(\mathbf{x}|\mathbf{y^{+}})$ `denote[s] the conditional distribution of $\mathbf{x}$ in the target class $\mathbf{y}^{+}$, where $\theta$ denotes the parameters of model $M_{\theta}$'. In other words, the conditional density is parameterised by $\theta$, which to our knowledge is standard notation and in fact the same notation as in Grathwohl et al. (2020) (one of our main reference points). In the following sentence (line 137) of the paper we state in plain English that this can be understood intuitively as `what the model has learned about the data'.
\item Both $\varepsilon(\cdot)$ and $\hat{\mathbf{X}}_{\theta,y^{+}}^{n_E}$ are in our opinion sufficiently explained in line 146 and lines 168-169, respectively. Given the strict page limits, not every concept can be explained thoroughly. We do appreciate the expressed concern, however, and, in fact, our initial more lengthy drafts of the paper did include more textbook-style explanations in these places, which were eventually dropped for the sake of brevity. It is worth noting that additional detail can still be found in the Appendix.
\end{itemize}
The second explicit concern raised is that learning the conditional `distribution [$p(\mathbf{x}|\mathbf{y^{+}})$] is very challenging especially for structural data'. We disagree that this should be seen as a weakness of our paper:
\begin{itemize}
\item Even if learning $p(\mathbf{x}|\mathbf{y^{+}})$ were an insurmountable challenge, this would in any case not invalidate the definition itself, which the reviewer seems to agree with.
\item While we agree that learning this distribution is not always trivial, we note that this task is at the very core of Generative Modelling and AI---a field that has recently enjoyed success especially in the context of large unstructured data like images and language.
\item Learning the generative task is also at the core of related approaches mentioned in the paper like REVISE: as we mention in line 89, the authors of REVISE `propose using a generative model such as a Variational Autoencoder (VAE)' to learn plausible explanations. We also point to other related approaches towards plausibility that all centre around learning the data-generating process of the inputs $X$ (lines 85 to 104).
\item Learning $p(\mathbf{x}|\mathbf{y^{+}})$---the core task of Generative AI---should generally be easier than learning the unconditional distribution $p(\mathbf{x})$, because the information contained in the labels can be leveraged in the former case.
\end{itemize}
The next explicit concern raised is about the generalisability and rigour of our implausibility metric:
\begin{itemize}
\item We agree it is not perfect and we highlight its limitations in the paper (e.g. lines 297 to 299). But we think that it is an improved, more robust version of the metric that was previously proposed and used in the literature (lines 159 to 166; its general form is sketched below this list). We did experiment with other distance/similarity metrics, but found the differences small enough to rely on L1 as our default metric across datasets and models for its sparsity-inducing properties.
\item The rule-based unary constraint metric proposed in Vo et al. (2023) looks interesting, but the paper will be presented for the first time at KDD in August 2023 and we were not aware of it at the time of writing. Thanks for bringing it to our attention.
\end{itemize}
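For clarity, the metric's general form can be sketched as follows (a schematic summary; exact details in lines 159 to 166 of the paper): implausibility is measured as the average distance between the counterfactual $\mathbf{x}^{\prime}$ and observed samples in the target class,
\begin{equation*}
\text{impl}(\mathbf{x}^{\prime}) = \frac{1}{|\mathbf{X}_{\mathbf{y}^{+}}|} \sum_{\mathbf{x} \in \mathbf{X}_{\mathbf{y}^{+}}} \text{dist}(\mathbf{x}^{\prime},\mathbf{x}),
\end{equation*}
where $\mathbf{X}_{\mathbf{y}^{+}}$ denotes observed instances in the target class and dist$(\cdot)$ defaults to the L1 norm.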
Concern is also expressed with respect to how we define `faithfulness'. The definition of $p_{\theta}(\mathbf{x}|\mathbf{y^{+}})$ seems to again cause confusion, so we wish to highlight a possible misunderstanding with regard to a fundamental position taken in our work:
\begin{itemize}
\item Regarding the point that `faithfulness [...] can be understood as the validity and fidelity of counterfactual examples', that is actually precisely what we argue \textit{against} in Sections 3 and 4. Here we do think we went above and beyond to convey the intuition through illustrative examples, because it forms the motivation for our work.
\item As we point out repeatedly, `any valid counterfactual also has full fidelity by construction'. Any successful adversarial attack on a model is also a valid counterfactual, but it is hard to see how adversarial attacks in isolation faithfully describe model behaviour. That is why we propose a definition of `faithfulness' that works with distributional quantities. Specifically, we want to understand if counterfactuals are consistent with what the model has learned about the data, which is best expressed as $p_{\theta}(\mathbf{x}|\mathbf{y^{+}})$ (Def. 4.2; the corresponding metric is sketched below this list).
\item The role of SGLD is described in some detail in Section 4.1 (lines 138 to 155) and additional explanations are provided in Appendix A.
\end{itemize}
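Schematically, our unfaithfulness metric mirrors the implausibility metric sketched above, but replaces observed target-class instances with the $n_E$ samples $\hat{\mathbf{X}}_{\theta,\mathbf{y}^{+}}^{n_E}$ drawn from $p_{\theta}(\mathbf{x}|\mathbf{y^{+}})$ via SGLD (a schematic summary; cf. lines 168-169):
\begin{equation*}
\text{unfaith}(\mathbf{x}^{\prime}) = \frac{1}{n_E} \sum_{\mathbf{x} \in \hat{\mathbf{X}}_{\theta,\mathbf{y}^{+}}^{n_E}} \text{dist}(\mathbf{x}^{\prime},\mathbf{x}).
\end{equation*}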
Finally, the idea to use Conformal Prediction (CP) in this context is mentioned both as a strength---`conformal prediction for counter-factual explanation is interesting'---and a weakness---`motivation of using Conformal Prediction (CP) is not convincing to me'. We reiterate our motivation here:
\begin{itemize}
\item As we explain in some detail (lines 180 to 193), the idea rests on the notion that predictive uncertainty estimates can be used to generate plausible counterfactuals, as previous work has shown.
\item Since CP is model-agnostic, we propose relying on it to relax restrictions that were previously placed on the class of classifiers (lines 183 to 189).
\item CP does indeed produce prediction sets in the context of classification. That is why we work with a smooth version of the set size that is compatible with gradient-based counterfactual search, as we explain in some detail in lines 194 to 205 and also in Appendix B (the penalty is sketched below this list).
\end{itemize}
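Schematically, the penalty takes the form of a smooth set size (a hedged summary; cf. lines 194 to 205 and Appendix B):
\begin{equation*}
\Omega(C_{\theta}(\mathbf{x};\alpha)) = \max \left(0, \sum_{\mathbf{y} \in \mathcal{Y}} C_{\theta,\mathbf{y}}(\mathbf{x};\alpha) - \kappa \right),
\end{equation*}
where $C_{\theta,\mathbf{y}}(\mathbf{x};\alpha)$ denotes a smooth (sigmoid-based) assignment of class $\mathbf{y}$ to the prediction set, and setting $\kappa=1$ ensures that sets of size one are not penalised.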
We hope this sufficiently addresses at least some of the concerns that were raised and that the reviewer may reconsider their recommendation to reject this paper.
---
format: pdf
---
Thank you!
We will jump straight to the questions that have been raised. Firstly, concerning the limited set of models and real-world datasets, we agree that more work is needed here and intend to tackle this in future work.
Concerning generalisability, our approach should generalise to any classifier that is differentiable with respect to its inputs, consistent with other gradient-based counterfactual generators (Equation 1). Our actual implementation is currently compatible with neural networks trained in Julia and has experimental support for `torch` models trained in either Python or R. Even though it is certainly possible to generate counterfactuals for non-differentiable models, it is not immediately obvious to us how SGLD can be applied in this context (see the sketch below). An interesting question for future research is whether there are other scalable, gradient-free methods that can be used to sample from the conditional distribution learned by the model.
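For concreteness, here is a minimal, illustrative SGLD sketch (not our optimised implementation; `grad_energy` is an assumed helper returning the gradient of the class-conditional energy with respect to the inputs, which is precisely where differentiability is required):

```julia
# One SGLD chain: follow noisy energy gradients so that, after enough steps,
# x is approximately distributed according to p_θ(x|y⁺).
function sgld_sample(grad_energy, x0::AbstractVector; steps::Int=100, ϵ::Float64=0.01)
    x = copy(x0)
    for _ in 1:steps
        x .= x .- (ϵ / 2) .* grad_energy(x) .+ sqrt(ϵ) .* randn(length(x))
    end
    return x
end
```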
@@ -370,6 +370,8 @@ function _plot_eccco_mnist(
     wide::Bool = false,
     img_height::Int = img_height,
     plot_factual::Bool = false,
+    generator::Union{Nothing,CounterfactualExplanations.AbstractGenerator} = nothing,
+    test_data::Bool = false,
     kwrgs...,
 )
@@ -381,20 +383,28 @@ function _plot_eccco_mnist(
         x_fact = x
     end
-    # Generate counterfactuals using ECCCo generator:
-    eccco_generator = ECCCoGenerator(
-        λ=λ,
-        temp=temp,
-        opt=opt,
-        use_class_loss=use_class_loss,
-        nsamples=10,
-        nmin=10,
-    )
+    if isnothing(generator)
+        # Generate counterfactuals using ECCCo generator:
+        generator = ECCCoGenerator(
+            λ=λ,
+            temp=temp,
+            opt=opt,
+            use_class_loss=use_class_loss,
+            nsamples=10,
+            nmin=10,
+        )
+    end
+    if test_data
+        data = load_mnist_test()
+    else
+        data = counterfactual_data
+    end
     ces = Dict()
     for (mod_name, mod) in model_dict
         ce = generate_counterfactual(
-            x_fact, target, counterfactual_data, mod, eccco_generator;
+            x_fact, target, data, mod, generator;
             decision_threshold=γ, max_iter=T,
             initialization=:identity,
             converge_when=:generator_conditions,
@@ -438,7 +448,7 @@ function _plot_eccco_mnist(
         plt = Plots.plot(plts...; size=(img_height,img_height), kwrgs...)
     end
-    return plt, eccco_generator, ces
+    return plt, generator, ces
 end
 ```
@@ -591,6 +601,7 @@ _plt_order = [
     "JEM Ensemble",
 ]
 plt_additional_models, _, _ces_ = _plot_eccco_mnist(
+    λ = [0.125,0.25,0.25],
     plt_order = _plt_order,
     model_dict=large_model_dict,
     wide = true,
@@ -598,7 +609,7 @@ plt_additional_models, _, _ces_ = _plot_eccco_mnist(
     img_height = 150,
 )
 display(plt_additional_models)
-savefig(plt_additional_models, joinpath(output_images_path, "mnist_eccco_additional.png"))
+# savefig(plt_additional_models, joinpath(output_images_path, "mnist_eccco_additional.png"))
 ```
 ```{julia}
@@ -631,6 +642,77 @@ for (factual, target) in combos
 end
 ```
+```{julia}
+λ = [0.1,0.25,0.25]
+wachter = WachterGenerator(
+    λ=λ[1],
+    opt=eccco_generator.opt
+)
+combos = [
+    (9,7),
+    # (9,7),
+    # (9,7),
+    (6,0),
+    # (6,0),
+    # (6,0),
+    (4,9),
+    # (4,9),
+    # (4,9),
+    (8,3),
+    # (8,3),
+    # (8,3),
+    (5,6),
+    # (5,6),
+    # (5,6),
+]
+plts_eccco = []
+plts_wachter = []
+ces_eccco = []
+ces_wachter = []
+rngs = []
+for (factual, target) in combos
+    rng = rand(1:10000)
+    # ECCCo:
+    plt, _, ces = _plot_eccco_mnist(
+        factual, target;
+        λ = λ,
+        plt_order = _plt_order,
+        model_dict = large_model_dict,
+        wide = true,
+        plot_factual = true,
+        rng = rng,
+        img_height = 150
+    )
+    display(plt)
+    push!(plts_eccco, plt)
+    push!(ces_eccco, reduce(hcat, target_probs.(values(ces))))
+    # Wachter:
+    plt, _, ces = _plot_eccco_mnist(
+        factual, target;
+        plt_order = _plt_order,
+        model_dict = large_model_dict,
+        wide = true,
+        plot_factual = true,
+        rng = rng,
+        img_height = 150,
+        generator = wachter,
+    )
+    display(plt)
+    push!(plts_wachter, plt)
+    push!(ces_wachter, reduce(hcat, target_probs.(values(ces))))
+    push!(rngs, rng)
+end
+```
 ### All digits