Author Response
- Applied to additional commonly used real-world tabular datasets
- Constraining energy directly
- Better results across the board, in particular for image data
- Derived from the JEM loss function -> more theoretically grounded (see the energy sketch below the list)
- No sampling overhead.
- Energy does not depend on differentiability.
- Benchmarks no longer biased with respect to unfaithfulness metric (addressing reviewer concern).
- Counterfactual explanations do not scale well to high-dimensional input data
- We have added native support for multi-processing and multi-threading.
- We have run more extensive experiments including fine-tuning hyperparameter choices.
- For image data we use PCA to map counterfactuals to a lower-dimensional latent space, which not only reduces the cost of gradient computations but also leads to higher plausibility (see the PCA sketch below the list).
- PCA is much less costly and interventionist than a VAE: principal components merely represent variation in the data; nothing else about the data is learned by the surrogate.
- ECCCo-(latent) remains faithful, although not as faithful as standard ECCCo-.
- We have revisited the mathematical notation.
- We have moved the introduction of conformal prediction forward and added more detail in line with reviewer feedback.
- We have extended the limitations section.
- Distance metrics
  - We have revisited the distance metrics and decided to use the L2 norm for both plausibility and faithfulness.
  - Originally, we used the L1 norm in line with how the closeness criterion is commonly evaluated. But in this context the L1 norm implicitly addresses the desire for sparsity.
  - In the case of image data, we investigated various additional distance metrics:
    - Cosine similarity
    - Euclidean distance
  - Ultimately we chose to rely on structural dissimilarity (see the distance-metric sketch below the list).
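For concreteness, here is a minimal sketch of what an energy-penalized counterfactual objective can look like, assuming a PyTorch classifier. The `energy` function follows the JEM definition (negative logsumexp of the logits); `lam`, the function names, and the plain cross-entropy target term are illustrative assumptions, not the paper's exact objective. Because the penalty only evaluates the classifier's own logits, it adds no sampling overhead.

```python
import torch
import torch.nn.functional as F

def energy(model, x):
    # JEM-style free energy: E(x) = -logsumexp_y f(x)[y].
    return -torch.logsumexp(model(x), dim=-1)

def counterfactual_loss(model, x_cf, target, lam=0.1):
    # Cross-entropy pulls the candidate towards the target class;
    # the energy penalty (illustrative weight `lam`) keeps the
    # candidate in regions the model assigns high density.
    ce = F.cross_entropy(model(x_cf), target)
    return ce + lam * energy(model, x_cf).mean()
```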
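A minimal sketch of the PCA latent mapping for image data, assuming scikit-learn; the component count, variable names, and random stand-in data are illustrative assumptions:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 784))   # stand-in for flattened images
x = X_train[:1]                          # factual instance, shape (1, 784)

# Fit PCA once on the training data; the components capture only
# variation in the data -- nothing else is learned by the surrogate.
pca = PCA(n_components=50).fit(X_train)

z = pca.transform(x)                     # encode into the 50-dim latent space
# ... gradient steps on the counterfactual would perturb z here ...
x_cf = pca.inverse_transform(z)          # decode candidate back to input space
```

Gradients are then taken with respect to the 50 latent coefficients rather than all 784 inputs, which is where the cost saving comes from.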
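For illustration, the distance metrics discussed above, assuming NumPy arrays and scikit-image; defining structural dissimilarity as (1 - SSIM) / 2 is a common convention and an assumption here, not taken from the response:

```python
import numpy as np
from skimage.metrics import structural_similarity

def l1(a, b):
    # L1 norm: implicitly rewards sparse changes (the closeness criterion).
    return np.abs(a - b).sum()

def l2(a, b):
    # L2 norm: now used for both plausibility and faithfulness.
    return np.linalg.norm(a - b)

def cosine_similarity(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def dssim(img_a, img_b):
    # Structural dissimilarity derived from SSIM (assumed convention).
    ssim = structural_similarity(
        img_a, img_b, data_range=float(img_a.max() - img_a.min())
    )
    return (1.0 - ssim) / 2.0
```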