Commit fcb015a6 authored by pat-alt

minor things

parent 1c205972
1 merge request: !2 Camera ready branch
......@@ -64,7 +64,7 @@ MMD({X}^\prime,\tilde{X}^\prime) &= \frac{1}{m(m-1)}\sum_{i=1}^m\sum_{j\neq i}^m
\end{aligned}
\end{equation}
where $X=\{x_1,...,x_m\}$, $\tilde{X}=\{\tilde{x}_1,...,\tilde{x}_n\}$ represent independent and identically distributed samples drawn from probability distributions $\mathcal{X}$ and $\mathcal{\tilde{X}}$ respectively @gretton2012kernel. MMD is a measure of the distance between the kernel mean embeddings of $\mathcal{X}$ and $\mathcal{\tilde{X}}$ in a Reproducing Kernel Hilbert Space, $\mathcal{H}$ [@berlinet2011reproducing]. An important consideration is the choice of the kernel function $k(\cdot,\cdot)$. In our implementation we make use of a Gaussian kernel with a constant length-scale parameter of $0.5$. As the Gaussian kernel captures all moments of distributions $\mathcal{X}$ and $\mathcal{\tilde{X}}$, we have that $MMD(X,\tilde{X})=0$ if and only if $X=\tilde{X}$. Conversely, larger values $MMD(X,\tilde{X})>0$ indicate that it is more likely that $\mathcal{X}$ and $\mathcal{\tilde{X}}$ are different distributions. In our context, large values therefore indicate that a domain shift indeed seems to have occurred.
where $X=\{x_1,...,x_m\}$, $\tilde{X}=\{\tilde{x}_1,...,\tilde{x}_n\}$ represent independent and identically distributed samples drawn from probability distributions $\mathcal{X}$ and $\mathcal{\tilde{X}}$ respectively [@gretton2012kernel]. MMD is a measure of the distance between the kernel mean embeddings of $\mathcal{X}$ and $\mathcal{\tilde{X}}$ in a Reproducing Kernel Hilbert Space, $\mathcal{H}$ [@berlinet2011reproducing]. An important consideration is the choice of the kernel function $k(\cdot,\cdot)$. In our implementation, we make use of a Gaussian kernel with a constant length-scale parameter of $0.5$. As the Gaussian kernel captures all moments of distributions $\mathcal{X}$ and $\mathcal{\tilde{X}}$, the population MMD is zero if and only if $\mathcal{X}=\mathcal{\tilde{X}}$, in which case the sample estimate $MMD(X,\tilde{X})$ is close to zero. Conversely, larger values $MMD(X,\tilde{X})>0$ indicate that it is more likely that $\mathcal{X}$ and $\mathcal{\tilde{X}}$ are different distributions. In our context, large values therefore indicate that a domain shift has likely occurred.
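
To make the estimator concrete, the following is a minimal Python sketch of a sample MMD with a Gaussian kernel and length-scale $0.5$. It is not the implementation referred to in the text: it assumes the standard unbiased estimator of the squared MMD from @gretton2012kernel and the parametrization $k(x,y)=\exp(-\lVert x-y\rVert^2/(2\ell^2))$, and the function names `gaussian_kernel` and `mmd_unbiased` are hypothetical.

```python
import numpy as np


def gaussian_kernel(a, b, length_scale=0.5):
    """Gaussian kernel matrix between rows of a and rows of b.

    Assumed parametrization: k(x, y) = exp(-||x - y||^2 / (2 * length_scale^2)).
    """
    # Pairwise squared Euclidean distances via the expansion of ||x - y||^2.
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Guard against tiny negative values caused by floating-point error.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * length_scale**2))


def mmd_unbiased(x, x_tilde, length_scale=0.5):
    """Unbiased estimate of the squared MMD between two samples (rows = points)."""
    m, n = len(x), len(x_tilde)
    k_xx = gaussian_kernel(x, x, length_scale)
    k_yy = gaussian_kernel(x_tilde, x_tilde, length_scale)
    k_xy = gaussian_kernel(x, x_tilde, length_scale)
    # Within-sample terms exclude the diagonal (the j != i condition).
    term_xx = (k_xx.sum() - np.trace(k_xx)) / (m * (m - 1))
    term_yy = (k_yy.sum() - np.trace(k_yy)) / (n * (n - 1))
    # Cross term averages over all m * n pairs.
    term_xy = 2.0 * k_xy.mean()
    return term_xx + term_yy - term_xy
```

Because the estimator is unbiased, it can return small negative values when the two samples come from the same distribution, so values near zero should be read as "no evidence of a shift" rather than as an exact equality.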
To assess the statistical significance of the observed shifts under the null hypothesis that samples $X$ and $\tilde{X}$ were drawn from the same probability distribution, we follow @arcones1992bootstrap. To that end, we combine the two samples and generate a large number of permutations of $X + \tilde{X}$. We then split each permutation into two new samples $X^\prime$ and $\tilde{X}^\prime$ of the same sizes as the original samples. Under the null hypothesis, $MMD(X^\prime,\tilde{X}^\prime)$ should be approximately equal to $MMD(X,\tilde{X})$. The corresponding $p$-value can then be calculated as the share of permutations for which $MMD(X^\prime,\tilde{X}^\prime)$ is at least as large as $MMD(X,\tilde{X})$.
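
The resampling procedure can be sketched as a generic permutation test in Python. This is an illustrative sketch under stated assumptions rather than the procedure of @arcones1992bootstrap: the function name `permutation_p_value`, the default of 1000 permutations, and the add-one correction in the returned $p$-value are all assumptions. The `statistic` argument can be any two-sample statistic, such as an MMD estimate; the stand-in `mean_gap` below is used only to keep the example self-contained.

```python
import numpy as np


def permutation_p_value(x, x_tilde, statistic, n_permutations=1000, seed=0):
    """Permutation p-value for H0: x and x_tilde come from the same distribution."""
    rng = np.random.default_rng(seed)
    m = len(x)
    pooled = np.concatenate([x, x_tilde], axis=0)
    observed = statistic(x, x_tilde)
    exceed = 0
    for _ in range(n_permutations):
        # Shuffle the pooled sample and split it into the original sizes.
        perm = rng.permutation(len(pooled))
        x_perm, x_tilde_perm = pooled[perm[:m]], pooled[perm[m:]]
        # Count permutations whose statistic is at least as large as observed.
        if statistic(x_perm, x_tilde_perm) >= observed:
            exceed += 1
    # Add-one correction (an assumption) keeps the p-value strictly positive.
    return (exceed + 1) / (n_permutations + 1)


def mean_gap(a, b):
    """Stand-in statistic: Euclidean distance between the sample means."""
    return float(np.linalg.norm(a.mean(axis=0) - b.mean(axis=0)))
```

A small $p$-value means that few permuted splits produce a statistic as large as the observed one, which is evidence against the null hypothesis of no shift.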