Commit caf8e920 authored by Pat Alt

all revised
Thank you! In this individual response, we will refer back to the main points discussed in the global response where relevant and discuss any other specific points the reviewer has raised. Below we will go through individual points where quotations trace back to reviewer remarks.
We largely agree with the weaknesses pointed out and will address them below. To start off, we want to address what is described as the "major weakness" of our paper: the remark that our results indicate that ECCCo does not directly help with plausibility for "weaker" models. That is mostly correct, but let us make the case for why this should not be considered a weakness at all.
#### Low plausibility (real-world data)
> "The major weakness of this work is that plausibility for non-JEM-based classifiers is very low on 'real-world' datasets (Table 2)."
As we argue in **Point 3** (and to some extent also **Point 2**) of the global rebuttal, we believe that this should not be seen as a weakness at all:
- We would argue that this is a desirable property of ECCCo if our priority is to understand model behaviour: lower plausibility conditional on high fidelity implies that the model itself has learned implausible explanations for the data (we point to this in lines 237-239, 305-307, 322-324, 340-342, ...).
- For practitioners/researchers this is valuable information, indicating that despite good predictive performance, the learned posterior density $p_{\theta}(\mathbf{x}|\mathbf{y^{+}})$ is high in regions of the input domain that are implausible (in the sense of Def. 2.1, i.e. the corresponding true density $p(\mathbf{x}|\mathbf{y^{+}})$ is low in those same regions).
- Instead of using surrogate-aided counterfactual search engines to sample those counterfactuals from $p_{\theta}(\mathbf{x}|\mathbf{y^{+}})$ that are indeed plausible, we would argue that the next course of action in such cases should generally be to improve the model.
- We agree that this places an additional burden on researchers/practitioners, but that does not render ECCCo impractical. In situations where providing actionable recourse is an absolute priority, practitioners can always resort to REVISE and related tools in the short term. Major discrepancies between ECCCo and surrogate-aided tools should then at the very least signal to researchers/practitioners that the underlying model needs to be improved in the medium term.
To conclude, we believe that ECCCo and derivative works can help us identify models that have learned implausible explanations for the data and improve upon them. To illustrate this, we have relied on gradually improving our classifiers through ensembling and joint energy modelling. We chose to focus on JEMs because:
- ECCCo itself uses ideas underlying JEMs.
- JEMs have been shown to have multiple desirable properties, including robustness and good predictive uncertainty quantification. Based on the previous literature on counterfactuals, these model properties should generally correlate positively with the plausibility of counterfactuals (and our findings seem to confirm this).
- Conditional on high fidelity, plausibility hinges on the quality of the underlying model.
- Subpar outcomes can therefore be understood as a signal that the model needs to be improved.
As noted in the global rebuttal, we aim to make this intuition even clearer in the paper.
#### Visual quality (MNIST)
> "[...] visual quality of generated counterfactuals seems to be low. [Results] hint to low diversity of generated counterfactuals."
Again, we kindly point to the global rebuttal (**Point 2** and **Point 3**) in this context. Additionally, we note the following:
- The visual quality and diversity of the counterfactuals (Fig. 6 in suppl.) seem to faithfully represent the generative property of the model.
- If diversity is crucial, our implementation is fully compatible with adding additional diversity penalties as in DiCE (Mothilal et al., 2019).
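For intuition, a DiCE-style diversity term could look as follows. This is a minimal sketch of the determinantal-point-process score described by Mothilal et al. (2019), assuming a plain Euclidean distance; it is not our implementation:

```python
import numpy as np

def dpp_diversity(counterfactuals):
    """DPP-style diversity score as in DiCE (Mothilal et al., 2019):
    det(K) with K_ij = 1 / (1 + dist(c_i, c_j)). Higher determinant
    indicates a more diverse set of counterfactuals."""
    C = np.asarray(counterfactuals, dtype=float)
    n = C.shape[0]
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = 1.0 / (1.0 + np.linalg.norm(C[i] - C[j]))
    return float(np.linalg.det(K))
```

A set of identical counterfactuals scores zero, while well-separated counterfactuals score close to one; adding such a term to the search objective therefore rewards diverse sets.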
We do agree that our work could benefit from including other classes of models that can be expected to learn more plausible explanations than our small MLPs (ResNet, CNN, Transformer, adversarially-trained networks, Bayesian NNs, ...). We also agree that additional, more complex datasets should be considered in this context, and we intend to tackle this in future work.
We will discuss this more thoroughly in the paper.
#### Closeness desideratum
> "ECCCos seems to generate counterfactuals that heavily change the initial image [...] thereby violating the closeness desideratum."
- We would look at this as the price one has to pay for faithfulness and plausibility.
- Concerning faithfulness, large perturbations in the case of MNIST, for example, seem to reflect the fact that the underlying model is sensitive to perturbations across the entire image, even though the images are very sparse. We would argue that this is an undesirable property of the model, not the explanation.
- Concerning plausibility, larger perturbations are typically necessary to move counterfactuals not simply across the decision boundary, but into dense areas of the target domain. Thus, REVISE, for example, is also often associated with larger perturbations.
- This tradeoff can be governed through penalty strengths: if closeness is a high priority, simply increase the relative size of $\lambda_1$ in Equation (5).
We are happy to highlight this tradeoff in section 7.
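To make the role of the penalty weights concrete, a penalty-weighted counterfactual objective of this general form can be sketched as follows. The names and the exact decomposition are illustrative assumptions, not a reproduction of Equation (5) in the paper:

```python
import numpy as np

def objective(x_cf, x_factual, yloss, lambda_1, lambda_2, unfaithfulness):
    """Generic penalty-weighted counterfactual search objective:
    prediction loss + lambda_1 * closeness penalty + lambda_2 * other
    penalties. Raising lambda_1 prioritises closeness over the rest."""
    closeness = float(np.sum((x_cf - x_factual) ** 2))
    return yloss(x_cf) + lambda_1 * closeness + lambda_2 * unfaithfulness(x_cf)
```

Under such an objective, increasing `lambda_1` relative to the other weights steers gradient-based search toward smaller perturbations at the expense of the remaining desiderata.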
#### Datasets
> "The experiments are only conducted on small-scale datasets."
In short, we have relied on illustrative datasets commonly used in similar studies. Please refer to our global rebuttal (in particular **Point 1**) for additional details.
#### Conformal Prediction (ablation)
> "[...] it is unclear if conformal prediction is actually required for ECCCos."
Please refer to **Point 4** in the global rebuttal.
#### Bias towards faithfulness
> "Experimental results for faithfulness are biased since (un)faithfulness is already used during counterfactual optimization as regularizer."
- This is true and we are transparent about this in the paper (line 320 to 322).
- ECCCo is intentionally biased towards faithfulness in the same way that Wachter is intentionally biased towards minimal perturbations.
We are happy to make this point more explicit in section 7.
- We would argue that these are limitations of our work, but not necessarily weaknesses. As we have argued elsewhere, this work was limited in both scope and size. Including more experiments would have meant compromising on the explanations and elaborations of our setup that we feel are critical.
- These limitations could be made more explicit in a camera-ready version of the paper, should it come to that.
#### Other questions
Finally, let us try to answer the specific questions that were raised:
This is not performance-optimized code and the bulk of the runtime and allocation is driven by sampling through SGLD. Note that while in our experiments we chose to resample for each individual counterfactual explanation, in practice, sampling could be done once for a given dataset. In any case, the computational burden should typically be lower than the overhead involved in training a sufficiently expressive VAE for REVISE, for example.
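For reference, the SGLD update we refer to takes roughly the following form. This is a generic sketch after Welling & Teh (2011), not our performance-relevant implementation:

```python
import numpy as np

def sgld(grad_log_density, x_init, step_size=0.05, n_steps=1000, seed=0):
    """Stochastic Gradient Langevin Dynamics: gradient ascent on a log
    density plus Gaussian noise scaled to the step size. Iterates
    approximately sample from the target density."""
    rng = np.random.default_rng(seed)
    x = np.array(x_init, dtype=float)
    for _ in range(n_steps):
        x += 0.5 * step_size * grad_log_density(x)
        x += np.sqrt(step_size) * rng.normal(size=x.shape)
    return x
```

Sampling once per dataset, as suggested above, would amortise this loop across all counterfactual explanations instead of re-running it for each one.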
We also thank the reviewer for their suggestions and will take these on board. The "ECCCo" vs. "ECCCos" story actually caused us some headache: we eventually tried to highlight that *ECCCo* relates to the generator, hence shown in italic consistent with the other generators. Perhaps it makes more sense to drop the distinction between the two.
Thank you! In this individual response, we will refer back to the main points discussed in the global response where relevant and discuss any other specific points the reviewer has raised. Below we will go through individual points where quotations trace back to reviewer remarks.
#### Summary
The reviewer has nicely summarised our work and we are happy to see that the main messages of the paper evidently came across. We also appreciate the mentioning of "honest acknowledgment of method limitations" as one of the strengths of the paper, which has indeed been important to us.
#### Citation of Welling \& Teh (2011)
> "It may be good to add a citation to [Welling & Teh, 2011] for SGLD on line 144."
Regarding the specific question/suggestion raised by the reviewer, we do actually cite Welling \& Teh (2011) in line 145, but we can move that up to line 144 to make it clearer.
Many thanks!
#### Need for gradient-access
> "Need for gradient access, e.g. through autodiff, for black-box model under investigation."
This is indeed a limitation of our approach, although it is worth pointing out that many of the existing state-of-the-art approaches to CE rely on gradient access. We do have a paragraph on this in section 7, but are happy to expand on this to the extent possible.
*Response:*
We agree that further work could benefit from including additional models and will make this point clear in section 7. In line with similar studies, we have chosen simple neural network architectures as our starting point. Moving on from there, our goal has been to understand if we can improve these simple models through joint-energy training, in order to yield more plausible counterfactuals that faithfully convey the improved quality of the underlying model.
To this end, we think that our experiments provide sufficient evidence. The size and scope of this work ultimately led us to prioritise this main point. To get this point across, we focused on JEMs because they are known to have properties that are naturally aligned with the idea of plausible counterfactuals. The question about which other kinds of models yield plausible and faithful counterfactuals (e.g. "MLPs, CNNs, or transformer" but also Bayesian NNs, adversarially trained NNs) is interesting in itself, but something we have delegated to future studies. We will make this clearer in section 7.
Nonetheless, to immediately address the reviewers' concerns here, we provide additional qualitative examples for MNIST in the companion PDF. These also include a larger deep ensemble ($n=50$) and a simple CNN (LeNet-5), both of which tend to yield more plausible and less noisy counterfactual images than a simple MLP. For comparison, we have also added the corresponding counterfactuals generated by Wachter. In the context of the large ensemble, improved plausibility appears to be driven by better predictive uncertainty quantification. LeNet-5 seems to benefit to some extent from its network architecture, which is more appropriate for image data. Wachter fails to uncover any of this. A more detailed study of different models would indeed be very interesting and we believe that ECCCo facilitates such work.
It is important to note here that ECCCo aims to generate faithful counterfactuals:
- Instead of using surrogate-aided counterfactual search engines to sample those counterfactuals from $p_{\theta}(\mathbf{x}|\mathbf{y^{+}})$ that are indeed plausible, we would argue that the next course of action in such cases should generally be to improve the model.
- We agree that this places an additional burden on researchers/practitioners, but that does not render ECCCo impractical. In situations where providing actionable recourse is an absolute priority, practitioners can always resort to REVISE and related tools in the short term. Major discrepancies between ECCCo and surrogate-aided tools should then at the very least signal to researchers/practitioners, that the underlying model needs to be improved in the medium term.
Based on the reviewers' observations in this context, we will clarify this tension between faithfulness and plausibility further by sharpening the relevant paragraphs in our paper.
### Point 4 (Ablation studies)
*Summary:*
> Some reviewers have pointed at the need for additional ablation studies to assess "if conformal prediction is actually required for ECCCos".
*Response:*
We already do this to some extent: the experiments involving our synthetic datasets are set up to explicitly address this question. We point out that Conformal Prediction appears to play less of a role than the energy-based constraint (lines 278 to 281). We also note in section 7 that further future work is needed to understand the role of CP better (lines 330 to 332). We are happy to expand on this to the extent possible: one possible explanation for the limited impact of CP could be that CP relies on exchangeability. In other words, the smooth set size penalty may not be as effective as intended when we move out of domain during counterfactual search, because it fails to adequately address epistemic uncertainty. Due to the limited size and scope of this work, we have reserved these types of questions for future work.
Thank you! In this individual response, we will refer back to the main points discussed in the global response where relevant and discuss any other specific points the reviewer has raised. Below we will go through individual points where quotations trace back to reviewer remarks.
#### Mathematical notation
> "Some notions are lacking descriptions and explanations"
- We state in Definition 4.1 that $p_{\theta}(\mathbf{x}|\mathbf{y^{+}})$ "denote[s] the conditional distribution of $\mathbf{x}$ in the target class $\mathbf{y}^{+}$, where $\theta$ denotes the parameters of model $M_{\theta}$" following Grathwohl (2020). In the following sentence (line 137) of the paper we state that this can be understood intuitively as "what the model has learned about the data".
- Both $\varepsilon(\cdot)$ and $\hat{\mathbf{X}}_{\theta,y^{+}}^{n_E}$ are explained in line 146 and lines 168-169, respectively. Additional detail can also be found in the Appendix. To the extent possible, we will extend these explanations.
#### Conditional distribution
> "[...] the class-condition distribution $p(\mathbf{x}|\mathbf{y^{+}})$ is existed but unknown and learning this distribution is very challenging especially for structural data"
- Learning the generative task is also at the core of related approaches mentioned in the paper like REVISE: as we mention in line 89, the authors of REVISE "propose using a generative model such as a Variational Autoencoder (VAE)" to learn $p(\mathbf{x})$. We also point to other related approaches towards plausibility that all centre around learning the data-generating process of the inputs $X$ (lines 85 to 104).
- Learning $p(\mathbf{x}|\mathbf{y^{+}})$ should generally be easier than learning the unconditional distribution $p(\mathbf{x})$, because the information contained in the labels can be leveraged in the former case.
#### Implausibility metric
> "Additionally, the implausibility metric seems not general and rigorous [...]"
- We agree it is not perfect and speak to this in the paper (e.g. lines 297 to 299). But we think that it is an improved, more robust version of the metric that was previously proposed and used in the literature (lines 159 to 166). Nonetheless, we are happy to make this limitation clearer also in section 7.
- The rule-based unary constraint metric proposed in Vo et al. (2023) looks interesting, but the paper will be presented for the first time at KDD in August 2023 and we were not aware of it at the time of writing. Thanks for bringing it to our attention, we are happy to mention it in the same context in section 7.
#### Definition of "faithfulness"
> "Faithfulness [...] can be understood as the validity and fidelity of counterfactual examples. [...] The definition 4.1 is fine but missing of the details of $p_{\theta}(\mathbf{x}|\mathbf{y^{+}})$. [...] However, it is not clear to me how to [...] use it in [SGLD]."
We will try to clarify this in the paper as much as possible.
#### Conformal Prediction (CP)
CP in this context is mentioned both as a strength
> "conformal prediction for counter-factual explanation is interesting"
We reiterate our motivation here:
- Since CP is model-agnostic, we propose relying on it to relax restrictions that were previously placed on the class of classifiers (lines 183 to 189).
- CP does indeed produce prediction sets in the context of classification. That is why we work with a smooth version of the set size that is compatible with gradient-based counterfactual search, as we explain in some detail in lines 194 to 205 and also in Appendix B.
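One common smooth relaxation of the prediction-set size counts, via a sigmoid, how many class scores clear the conformal threshold. The following is a sketch under assumed notation; the exact penalty in the paper and Appendix B differs:

```python
import numpy as np

def smooth_set_size(probs, q_hat, temperature=0.1):
    """Differentiable proxy for conformal prediction-set size: each class
    whose score exceeds the threshold q_hat contributes roughly 1 via a
    sigmoid, so gradients flow where a hard count would be piecewise flat."""
    z = (np.asarray(probs, dtype=float) - q_hat) / temperature
    return float(np.sum(1.0 / (1.0 + np.exp(-z))))
```

A confident prediction yields a smoothed size near one, while an uncertain (near-uniform) prediction yields a larger value, which is what makes the set size usable as a penalty in gradient-based counterfactual search.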
#### Experiments
> "The experiments are humble and not really solid to me. [...] the authors need to conduct ablation studies regarding the involving terms in (5)."
We think that experiments of this scale are common in the related literature. Please refer to the global rebuttal for more details. Concerning **ablation studies**, please refer to **Point 4** in the global rebuttal.
Thank you! In this individual response, we will refer back to the main points discussed in the global response where relevant and discuss any other specific points the reviewer has raised. Below we will go through individual points where quotations trace back to reviewer remarks.
We will jump straight to the questions that have been raised.
#### Data and models
> "[...] I still find the experiments with real-world data a bit limited. [...] The focus of the models being tested seems narrow."
Concerning the limited set of models and real-world datasets (**Question 1** and **Question 3**), please refer to **Point 1** and **Point 2** in the global response, respectively.
#### Generalisability
> "Is the ECCCos approach adaptable to a broad range of black-box models beyond those discussed?"
Our approach should generalise to any classifier that is differentiable with respect to inputs, consistent with other gradient-based counterfactual generators (Equation 1). Our actual implementation is currently compatible with neural networks trained in Julia and has experimental support for `torch` trained in either Python or R. Even though it is possible to generate counterfactuals for non-differentiable models, it is not immediately obvious to us how SGLD can be applied in this context. An interesting question for future research would be if other scalable and gradient-free methods can be used to sample from the conditional distribution learned by the model.
#### Link to causality
> "There’s a broad literature on causal abstractions and causal model explanations that seems related."
This is an interesting thought. We would have to think about this more, but there is a possible link to the work by Karimi et al. on counterfactuals through interventions as opposed to perturbations (references in the paper). An idea could be to use the abstracted causal graph as our sampler for ECCCo (instead of SGLD). Combining the approach proposed by Karimi et al. with ideas underlying ECCCo, one could then generate counterfactuals that faithfully describe the causal graph learned by the model, instead of generating counterfactuals that comply with prior causal knowledge. We think this may go beyond the scope of our paper but would be happy to add this to section 7.