3 Experimental Results

This supplementary appendix accompanies the research paper Endogenous Macrodynamics in Algorithmic Recourse. It contains all of the experimental results, including those not highlighted in the paper, and links to additional information about the proposed mitigation strategies.

4 Synthetic Data

This notebook was used to run the experiments on the synthetic datasets and can be used to reproduce the results in the paper. We first run the experiments and then generate the visualizations and tables.

4.1 Experiment

models = [
    :LogisticRegression, 
    :FluxModel, 
    :FluxEnsemble,
]
opt = Flux.Descent(0.01) 
generators = Dict(
    :Greedy=>GreedyGenerator(), 
    :Generic=>GenericGenerator(opt = opt),
    :REVISE=>REVISEGenerator(opt = opt),
    :DICE=>DiCEGenerator(opt = opt),
)
max_obs = 1000
catalogue = load_synthetic(max_obs)
choices = [
    :linearly_separable, 
    :overlapping, 
    :circles, 
    :moons,
]
data_sets = filter(p -> p[1] in choices, catalogue)
experiments = set_up_experiments(data_sets,models,generators)
# `model_evaluation` must be in scope before the first loop below uses it.
using AlgorithmicRecourseDynamics.Models: model_evaluation
plts = []
for (exp_name, exp_) in experiments
    for (M_name, M) in exp_.models
        score = round(model_evaluation(M, exp_.test_data),digits=2)
        plt = plot(M, exp_.test_data, title="$exp_name;\n $M_name ($score)")
        # Errors: test points whose rounded predicted probability differs from the label.
        ids = findall(vec(round.(probs(M, exp_.test_data.X)) .!= exp_.test_data.y))
        x_wrongly_labelled = exp_.test_data.X[:,ids]
        scatter!(plt, x_wrongly_labelled[1,:], x_wrongly_labelled[2,:], ms=7.5, color=:red, label="")
        push!(plts, plt)
    end
end
plt = plot(plts..., layout=(length(choices),length(models)),size=(length(choices)*300,length(models)*300))
savefig(plt, joinpath(www_path,"models_test_before.png"))
plts = []
for (exp_name, exp_) in experiments
    for (M_name, M) in exp_.models
        score = round(model_evaluation(M, exp_.train_data),digits=2)
        plt = plot(M, exp_.train_data, title="$exp_name;\n $M_name ($score)")
        # Errors:
        ids = findall(vec(round.(probs(M, exp_.train_data.X)) .!= exp_.train_data.y))
        x_wrongly_labelled = exp_.train_data.X[:,ids]
        scatter!(plt, x_wrongly_labelled[1,:], x_wrongly_labelled[2,:], ms=7.5, color=:red, label="")
        push!(plts, plt)
    end
end
plt = plot(plts..., layout=(length(choices),length(models)),size=(length(choices)*300,length(models)*300))
savefig(plt, joinpath(www_path,"models_train_before.png"))
n_evals = 5
n_rounds = 50
evaluate_every = Int(round(n_rounds/n_evals))
n_folds = 5
T = 100
results = run_experiments(
    experiments;
    save_path=output_path,evaluate_every=evaluate_every,n_rounds=n_rounds, n_folds=n_folds, T=T
)
Serialization.serialize(joinpath(output_path,"results.jls"),results)
plot_dict = Dict(key => Dict() for (key,val) in results)
fold = 1
for (name, res) in results
    exp_ = res.experiment
    plot_dict[name] = Dict(key => [] for (key,val) in exp_.generators)
    rec_sys = exp_.recourse_systems[fold]
    sys_ids = collect(exp_.system_identifiers)
    n_systems = length(rec_sys)  # separate name: `M` holds the model inside the loop
    for m in 1:n_systems
        model_name, generator_name = sys_ids[m]
        M = rec_sys[m].model
        score = round(model_evaluation(M, exp_.test_data),digits=2)
        plt = plot(M, exp_.test_data, title="$name;\n $model_name ($score)")
        # Errors:
        ids = findall(vec(round.(probs(M, exp_.test_data.X)) .!= exp_.test_data.y))
        x_wrongly_labelled = exp_.test_data.X[:,ids]
        scatter!(plt, x_wrongly_labelled[1,:], x_wrongly_labelled[2,:], ms=7.5, color=:red, label="")
        plot_dict[name][generator_name] = vcat(plot_dict[name][generator_name], plt)
    end
end
plot_dict = Dict(key => reduce(vcat, [plots[key] for plots in values(plot_dict)]) for (key, value) in generators)
for (name, plts) in plot_dict
    plt = plot(plts..., layout=(length(choices),length(models)),size=(length(choices)*300,length(models)*300))
    savefig(plt, joinpath(www_path,"models_test_after_$(name).png"))
end
using AlgorithmicRecourseDynamics.Models: model_evaluation
plot_dict = Dict(key => Dict() for (key,val) in results)
fold = 1
for (name, res) in results
    exp_ = res.experiment
    plot_dict[name] = Dict(key => [] for (key,val) in exp_.generators)
    rec_sys = exp_.recourse_systems[fold]
    sys_ids = collect(exp_.system_identifiers)
    n_systems = length(rec_sys)  # separate name: `M` holds the model inside the loop
    for m in 1:n_systems
        model_name, generator_name = sys_ids[m]
        M = rec_sys[m].model
        data = rec_sys[m].data
        score = round(model_evaluation(M, data),digits=2)
        plt = plot(M, data, title="$name;\n $model_name ($score)")
        # Errors:
        ids = findall(vec(round.(probs(M, data.X)) .!= data.y))
        x_wrongly_labelled = data.X[:,ids]
        scatter!(plt, x_wrongly_labelled[1,:], x_wrongly_labelled[2,:], ms=7.5, color=:red, label="")
        plot_dict[name][generator_name] = vcat(plot_dict[name][generator_name], plt)
    end
end
plot_dict = Dict(key => reduce(vcat, [plots[key] for plots in values(plot_dict)]) for (key, value) in generators)
for (name, plts) in plot_dict
    plt = plot(plts..., layout=(length(choices),length(models)),size=(length(choices)*300,length(models)*300))
    savefig(plt, joinpath(www_path,"models_train_after_$(name).png"))
end
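
The plotting loops above highlight misclassified points by rounding the predicted probabilities and comparing them with the labels. The following stand-alone sketch reproduces that step on made-up numbers. Here `p` and `y` are hypothetical stand-ins for `probs(M, data.X)` and `data.y`, assumed to be 1×n arrays of probabilities and 0/1 labels.

```julia
# Hypothetical predicted probabilities (1×n) and binary labels (1×n).
p = [0.1 0.8 0.6 0.3 0.5]
y = [0 1 0 1 1]

# Round each probability to the nearest class label and compare with the truth.
# Julia's `round` rounds ties to even, so a probability of exactly 0.5 maps to class 0.
ids = findall(vec(round.(p) .!= y))

# Number of misclassified observations.
n_errors = length(ids)
```

The indices in `ids` are the columns that the notebook passes to `scatter!` to draw the red markers.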

4.2 Plots

results = Serialization.deserialize(joinpath(output_path,"results.jls"));
using Images
line_charts = Dict()
errorbar_charts = Dict()
for (data_name, res) in results
    plt = plot(res)
    Images.save(joinpath(www_path, "line_chart_$(data_name).png"), plt)
    line_charts[data_name] = plt
    plt = plot(res,maximum(res.output.n))
    Images.save(joinpath(www_path, "errorbar_chart_$(data_name).png"), plt)
    errorbar_charts[data_name] = plt
end

4.2.1 Line Charts

Figure 4.1 shows the evolution of the evaluation metrics over the course of the experiment.

(a) Circles

(b) Linearly Separable

(c) Moons

(d) Overlapping

Figure 4.1: Line Charts

4.2.2 Error Bar Charts

Figure 4.2 shows the evaluation metrics at the end of the experiments.

(a) Circles

(b) Linearly Separable

(c) Moons

(d) Overlapping

Figure 4.2: Error Bar Charts

4.3 Bootstrap

n_bootstrap = 100
df = run_bootstrap(results, n_bootstrap; filename=joinpath(output_path,"bootstrap.csv"))
┌──────────┬─────────┬────────────────────┬────────────────────┬───────────┬──────────────┐
│     name │   scope │               data │              model │ generator │ p_value_mean │
│ String31 │ String7 │           String31 │           String31 │   String7 │      Float64 │
├──────────┼─────────┼────────────────────┼────────────────────┼───────────┼──────────────┤
│      mmd │  domain │        overlapping │ LogisticRegression │   Generic │          0.0 │
│      mmd │  domain │        overlapping │ LogisticRegression │    Greedy │          0.0 │
│      mmd │  domain │        overlapping │ LogisticRegression │    REVISE │          0.0 │
│      mmd │  domain │        overlapping │ LogisticRegression │      DICE │          0.0 │
│      mmd │  domain │        overlapping │          FluxModel │   Generic │          0.0 │
│      mmd │  domain │        overlapping │          FluxModel │    Greedy │          0.0 │
│      mmd │  domain │        overlapping │          FluxModel │    REVISE │          0.0 │
│      mmd │  domain │        overlapping │          FluxModel │      DICE │          0.0 │
│      mmd │  domain │        overlapping │       FluxEnsemble │   Generic │          0.0 │
│      mmd │  domain │        overlapping │       FluxEnsemble │    Greedy │          0.0 │
│      mmd │  domain │        overlapping │       FluxEnsemble │    REVISE │          0.0 │
│      mmd │  domain │        overlapping │       FluxEnsemble │      DICE │          0.0 │
│      mmd │  domain │ linearly_separable │ LogisticRegression │   Generic │          0.0 │
│      mmd │  domain │ linearly_separable │ LogisticRegression │    Greedy │          0.0 │
│      mmd │  domain │ linearly_separable │ LogisticRegression │    REVISE │        0.238 │
│      mmd │  domain │ linearly_separable │ LogisticRegression │      DICE │          0.0 │
│      mmd │  domain │ linearly_separable │          FluxModel │   Generic │          0.0 │
│      mmd │  domain │ linearly_separable │          FluxModel │    Greedy │          0.0 │
│      mmd │  domain │ linearly_separable │          FluxModel │    REVISE │        0.188 │
│      mmd │  domain │ linearly_separable │          FluxModel │      DICE │          0.0 │
│      mmd │  domain │ linearly_separable │       FluxEnsemble │   Generic │          0.0 │
│      mmd │  domain │ linearly_separable │       FluxEnsemble │    Greedy │          0.0 │
│      mmd │  domain │ linearly_separable │       FluxEnsemble │    REVISE │        0.158 │
│      mmd │  domain │ linearly_separable │       FluxEnsemble │      DICE │          0.0 │
│      mmd │  domain │            circles │ LogisticRegression │   Generic │        0.104 │
│      mmd │  domain │            circles │ LogisticRegression │    Greedy │          0.0 │
│      mmd │  domain │            circles │ LogisticRegression │    REVISE │          0.0 │
│      mmd │  domain │            circles │ LogisticRegression │      DICE │       0.1275 │
│      mmd │  domain │            circles │          FluxModel │   Generic │          0.0 │
│      mmd │  domain │            circles │          FluxModel │    Greedy │          0.0 │
│      mmd │  domain │            circles │          FluxModel │    REVISE │        0.996 │
│      mmd │  domain │            circles │          FluxModel │      DICE │          0.0 │
│      mmd │  domain │            circles │       FluxEnsemble │   Generic │          0.0 │
│      mmd │  domain │            circles │       FluxEnsemble │    Greedy │          0.0 │
│      mmd │  domain │            circles │       FluxEnsemble │    REVISE │          1.0 │
│      mmd │  domain │            circles │       FluxEnsemble │      DICE │          0.0 │
│      mmd │  domain │              moons │ LogisticRegression │   Generic │          0.0 │
│      mmd │  domain │              moons │ LogisticRegression │    Greedy │          0.0 │
│      mmd │  domain │              moons │ LogisticRegression │    REVISE │          0.0 │
│      mmd │  domain │              moons │ LogisticRegression │      DICE │          0.0 │
│      mmd │  domain │              moons │          FluxModel │   Generic │          0.0 │
│      mmd │  domain │              moons │          FluxModel │    Greedy │          0.0 │
│      mmd │  domain │              moons │          FluxModel │    REVISE │          0.0 │
│      mmd │  domain │              moons │          FluxModel │      DICE │          0.0 │
│      mmd │  domain │              moons │       FluxEnsemble │   Generic │          0.0 │
│      mmd │  domain │              moons │       FluxEnsemble │    Greedy │          0.0 │
│      mmd │  domain │              moons │       FluxEnsemble │    REVISE │          0.0 │
│      mmd │  domain │              moons │       FluxEnsemble │      DICE │          0.0 │
│      mmd │   model │        overlapping │ LogisticRegression │   Generic │          0.0 │
│      mmd │   model │        overlapping │ LogisticRegression │    Greedy │          0.0 │
│      mmd │   model │        overlapping │ LogisticRegression │    REVISE │          0.0 │
│      mmd │   model │        overlapping │ LogisticRegression │      DICE │          0.0 │
│      mmd │   model │        overlapping │          FluxModel │   Generic │          0.0 │
│      mmd │   model │        overlapping │          FluxModel │    Greedy │          0.0 │
│      mmd │   model │        overlapping │          FluxModel │    REVISE │          0.0 │
│      mmd │   model │        overlapping │          FluxModel │      DICE │          0.0 │
│      mmd │   model │        overlapping │       FluxEnsemble │   Generic │          0.0 │
│      mmd │   model │        overlapping │       FluxEnsemble │    Greedy │          0.0 │
│      mmd │   model │        overlapping │       FluxEnsemble │    REVISE │          0.0 │
│      mmd │   model │        overlapping │       FluxEnsemble │      DICE │          0.0 │
│      mmd │   model │ linearly_separable │ LogisticRegression │   Generic │          0.0 │
│      mmd │   model │ linearly_separable │ LogisticRegression │    Greedy │          0.0 │
│      mmd │   model │ linearly_separable │ LogisticRegression │    REVISE │        0.084 │
│      mmd │   model │ linearly_separable │ LogisticRegression │      DICE │          0.0 │
│      mmd │   model │ linearly_separable │          FluxModel │   Generic │         0.87 │
│      mmd │   model │ linearly_separable │          FluxModel │    Greedy │        0.616 │
│      mmd │   model │ linearly_separable │          FluxModel │    REVISE │         0.65 │
│      mmd │   model │ linearly_separable │          FluxModel │      DICE │         0.82 │
│      mmd │   model │ linearly_separable │       FluxEnsemble │   Generic │        0.892 │
│      mmd │   model │ linearly_separable │       FluxEnsemble │    Greedy │        0.528 │
│      mmd │   model │ linearly_separable │       FluxEnsemble │    REVISE │        0.642 │
│      mmd │   model │ linearly_separable │       FluxEnsemble │      DICE │        0.878 │
│      mmd │   model │            circles │ LogisticRegression │   Generic │          0.0 │
│      mmd │   model │            circles │ LogisticRegression │    Greedy │          0.0 │
│      mmd │   model │            circles │ LogisticRegression │    REVISE │          0.0 │
│      mmd │   model │            circles │ LogisticRegression │      DICE │          0.0 │
│      mmd │   model │            circles │          FluxModel │   Generic │        0.392 │
│      mmd │   model │            circles │          FluxModel │    Greedy │        0.078 │
│      mmd │   model │            circles │          FluxModel │    REVISE │        0.836 │
│      mmd │   model │            circles │          FluxModel │      DICE │        0.516 │
│      mmd │   model │            circles │       FluxEnsemble │   Generic │        0.456 │
│      mmd │   model │            circles │       FluxEnsemble │    Greedy │         0.16 │
│      mmd │   model │            circles │       FluxEnsemble │    REVISE │        0.872 │
│      mmd │   model │            circles │       FluxEnsemble │      DICE │        0.572 │
│      mmd │   model │              moons │ LogisticRegression │   Generic │          0.0 │
│      mmd │   model │              moons │ LogisticRegression │    Greedy │          0.0 │
│      mmd │   model │              moons │ LogisticRegression │    REVISE │          0.0 │
│      mmd │   model │              moons │ LogisticRegression │      DICE │          0.0 │
│      mmd │   model │              moons │          FluxModel │   Generic │        0.716 │
│      mmd │   model │              moons │          FluxModel │    Greedy │         0.17 │
│      mmd │   model │              moons │          FluxModel │    REVISE │         0.67 │
│      mmd │   model │              moons │          FluxModel │      DICE │        0.588 │
│      mmd │   model │              moons │       FluxEnsemble │   Generic │        0.634 │
│      mmd │   model │              moons │       FluxEnsemble │    Greedy │        0.092 │
│      mmd │   model │              moons │       FluxEnsemble │    REVISE │         0.75 │
│      mmd │   model │              moons │       FluxEnsemble │      DICE │         0.61 │
│ mmd_grid │   model │        overlapping │ LogisticRegression │   Generic │          0.0 │
│ mmd_grid │   model │        overlapping │ LogisticRegression │    Greedy │          0.0 │
│ mmd_grid │   model │        overlapping │ LogisticRegression │    REVISE │          0.0 │
│ mmd_grid │   model │        overlapping │ LogisticRegression │      DICE │          0.0 │
│ mmd_grid │   model │        overlapping │          FluxModel │   Generic │          0.0 │
│ mmd_grid │   model │        overlapping │          FluxModel │    Greedy │          0.0 │
│ mmd_grid │   model │        overlapping │          FluxModel │    REVISE │        0.004 │
│ mmd_grid │   model │        overlapping │          FluxModel │      DICE │          0.0 │
│ mmd_grid │   model │        overlapping │       FluxEnsemble │   Generic │          0.0 │
│ mmd_grid │   model │        overlapping │       FluxEnsemble │    Greedy │          0.0 │
│ mmd_grid │   model │        overlapping │       FluxEnsemble │    REVISE │        0.024 │
│ mmd_grid │   model │        overlapping │       FluxEnsemble │      DICE │          0.0 │
│ mmd_grid │   model │ linearly_separable │ LogisticRegression │   Generic │          0.0 │
│ mmd_grid │   model │ linearly_separable │ LogisticRegression │    Greedy │          0.0 │
│ mmd_grid │   model │ linearly_separable │ LogisticRegression │    REVISE │          0.0 │
│ mmd_grid │   model │ linearly_separable │ LogisticRegression │      DICE │          0.0 │
│ mmd_grid │   model │ linearly_separable │          FluxModel │   Generic │          0.0 │
│ mmd_grid │   model │ linearly_separable │          FluxModel │    Greedy │          0.0 │
│ mmd_grid │   model │ linearly_separable │          FluxModel │    REVISE │        0.018 │
│ mmd_grid │   model │ linearly_separable │          FluxModel │      DICE │          0.0 │
│ mmd_grid │   model │ linearly_separable │       FluxEnsemble │   Generic │          0.0 │
│ mmd_grid │   model │ linearly_separable │       FluxEnsemble │    Greedy │          0.0 │
│ mmd_grid │   model │ linearly_separable │       FluxEnsemble │    REVISE │        0.014 │
│ mmd_grid │   model │ linearly_separable │       FluxEnsemble │      DICE │          0.0 │
│ mmd_grid │   model │            circles │ LogisticRegression │   Generic │          0.0 │
│ mmd_grid │   model │            circles │ LogisticRegression │    Greedy │          0.0 │
│ mmd_grid │   model │            circles │ LogisticRegression │    REVISE │          0.0 │
│ mmd_grid │   model │            circles │ LogisticRegression │      DICE │          0.0 │
│ mmd_grid │   model │            circles │          FluxModel │   Generic │          0.0 │
│ mmd_grid │   model │            circles │          FluxModel │    Greedy │          0.0 │
│ mmd_grid │   model │            circles │          FluxModel │    REVISE │        0.152 │
│ mmd_grid │   model │            circles │          FluxModel │      DICE │        0.004 │
│ mmd_grid │   model │            circles │       FluxEnsemble │   Generic │          0.0 │
│ mmd_grid │   model │            circles │       FluxEnsemble │    Greedy │          0.0 │
│ mmd_grid │   model │            circles │       FluxEnsemble │    REVISE │         0.14 │
│ mmd_grid │   model │            circles │       FluxEnsemble │      DICE │        0.002 │
│ mmd_grid │   model │              moons │ LogisticRegression │   Generic │          0.0 │
│ mmd_grid │   model │              moons │ LogisticRegression │    Greedy │          0.0 │
│ mmd_grid │   model │              moons │ LogisticRegression │    REVISE │          0.0 │
│ mmd_grid │   model │              moons │ LogisticRegression │      DICE │          0.0 │
│ mmd_grid │   model │              moons │          FluxModel │   Generic │        0.004 │
│ mmd_grid │   model │              moons │          FluxModel │    Greedy │          0.0 │
│ mmd_grid │   model │              moons │          FluxModel │    REVISE │        0.074 │
│ mmd_grid │   model │              moons │          FluxModel │      DICE │        0.016 │
│ mmd_grid │   model │              moons │       FluxEnsemble │   Generic │        0.002 │
│ mmd_grid │   model │              moons │       FluxEnsemble │    Greedy │          0.0 │
│ mmd_grid │   model │              moons │       FluxEnsemble │    REVISE │        0.044 │
│ mmd_grid │   model │              moons │       FluxEnsemble │      DICE │          0.0 │
└──────────┴─────────┴────────────────────┴────────────────────┴───────────┴──────────────┘
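
The internals of `run_bootstrap` are not shown in this notebook. To help read the `p_value_mean` column, the following is a generic sketch of a permutation test for the maximum mean discrepancy (MMD), not the package's implementation. The function names `rbf`, `mmd2` and `mmd_p_value`, the Gaussian kernel and its bandwidth `σ` are all assumptions made for illustration.

```julia
using Random, Statistics

# Gaussian (RBF) kernel with bandwidth σ (an assumed choice).
rbf(a, b; σ=1.0) = exp(-sum(abs2, a .- b) / (2 * σ^2))

# Biased estimate of the squared MMD between two samples stored column-wise.
function mmd2(X, Y; σ=1.0)
    kxx = mean(rbf(a, b; σ=σ) for a in eachcol(X), b in eachcol(X))
    kyy = mean(rbf(a, b; σ=σ) for a in eachcol(Y), b in eachcol(Y))
    kxy = mean(rbf(a, b; σ=σ) for a in eachcol(X), b in eachcol(Y))
    return kxx + kyy - 2 * kxy
end

# Permutation p-value: share of label-shuffled statistics that are at least
# as large as the observed statistic.
function mmd_p_value(X, Y; n_perm=100, σ=1.0, rng=Random.default_rng())
    observed = mmd2(X, Y; σ=σ)
    Z = hcat(X, Y)
    n = size(X, 2)
    exceed = 0
    for _ in 1:n_perm
        idx = randperm(rng, size(Z, 2))
        stat = mmd2(Z[:, idx[1:n]], Z[:, idx[n+1:end]]; σ=σ)
        exceed += stat >= observed
    end
    return exceed / n_perm
end

rng = MersenneTwister(2022)
X = randn(rng, 2, 30)                  # reference sample
Y_shifted = randn(rng, 2, 30) .+ 3.0   # clearly shifted sample
p_shifted = mmd_p_value(X, Y_shifted; n_perm=50, rng=rng)
```

Under this kind of test, a p-value close to zero means that almost none of the shuffled samples produced a discrepancy as large as the observed one, which is how the many `0.0` entries in the table can be read, assuming `run_bootstrap` follows a comparable resampling logic.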

4.4 Chart in paper

Figure 4.3 shows the chart that went into the paper.

Images.load(joinpath(www_artifact_path,"paper_synthetic_results.png"))

Figure 4.3: Chart in paper

#| echo: false

generate_artifacts(output_path)
generate_artifacts(www_path)

5 Real-World Data

models = [
    :LogisticRegression, 
    :FluxModel, 
    :FluxEnsemble
]
opt = Flux.Descent(0.01) 
generators = Dict(
    :Greedy=>GreedyGenerator(), 
    :Generic=>GenericGenerator(opt = opt),
    :REVISE=>REVISEGenerator(opt = opt),
    :DICE=>DiCEGenerator(opt = opt),
)
max_obs = 5000
data_sets = load_real_world(max_obs)
choices = [
    :cal_housing, 
    :credit_default, 
    :gmsc, 
]
data_sets = filter(p -> p[1] in choices, data_sets)
using CounterfactualExplanations.DataPreprocessing: unpack
bs = 500
function data_loader(data::CounterfactualData)
    X, y = unpack(data)
    # Mini-batches of `bs` observations for model training.
    return Flux.DataLoader((X,y),batchsize=bs)
end
model_params = (batch_norm=false,n_hidden=64,n_layers=3,dropout=true,p_dropout=0.1)
experiments = set_up_experiments(
    data_sets,models,generators; 
    pre_train_models=100, model_params=model_params, 
    data_loader=data_loader
)
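
As a rough illustration of what `batchsize=bs` implies, the sketch below splits hypothetical observation indices into mini-batches using only Base Julia. It assumes that all `max_obs = 5000` observations end up in the training data, which need not hold once the data is split into training and test sets. The names `n_obs` and `batches` are illustrative, and this is not how `Flux.DataLoader` is implemented.

```julia
n_obs = 5000   # assumed number of training observations
bs = 500       # batch size, as in the notebook

# Consecutive index ranges of at most `bs` observations each.
batches = collect(Iterators.partition(1:n_obs, bs))

n_batches = length(batches)   # mini-batches per pass over the data
last_batch = last(batches)    # indices of the final mini-batch
```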

5.1 Experiment

n_evals = 5
n_rounds = 50
evaluate_every = Int(round(n_rounds/n_evals))
n_folds = 5
n_samples = 10000
T = 100
generative_model_params = (epochs=250, latent_dim=8)
results = run_experiments(
    experiments;
    save_path=output_path,evaluate_every=evaluate_every,n_rounds=n_rounds, n_folds=n_folds, T=T, n_samples=n_samples,
    generative_model_params=generative_model_params
)
Serialization.serialize(joinpath(output_path,"results.jls"),results)
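
The settings above suggest that the evaluation metrics are computed every `evaluate_every` rounds. The sketch below spells out the resulting schedule under the assumption, not confirmed by the code shown here, that evaluation happens whenever the round index is a multiple of `evaluate_every`. The name `eval_rounds` is illustrative.

```julia
n_evals = 5
n_rounds = 50
evaluate_every = round(Int, n_rounds / n_evals)   # same value as Int(round(n_rounds/n_evals))

# Assumed schedule: evaluate at every multiple of `evaluate_every`.
eval_rounds = collect(evaluate_every:evaluate_every:n_rounds)
```

With these values the schedule contains exactly `n_evals` entries. If `n_rounds` is not divisible by `n_evals`, rounding can make the number of evaluations differ from `n_evals`.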

5.1.1 Plots

results = Serialization.deserialize(joinpath(output_path,"results.jls"))
using Images
line_charts = Dict()
errorbar_charts = Dict()
for (data_name, res) in results
    plt = plot(res)
    Images.save(joinpath(www_path, "line_chart_$(data_name).png"), plt)
    line_charts[data_name] = plt
    plt = plot(res,maximum(res.output.n))
    Images.save(joinpath(www_path, "errorbar_chart_$(data_name).png"), plt)
    errorbar_charts[data_name] = plt
end

5.1.2 Line Charts

Figure 5.1 shows the evolution of the evaluation metrics over the course of the experiment.

# Full paths of all line-chart images among the artifacts.
img_files = filter(contains("line_chart"), readdir(www_artifact_path))
img_files = joinpath.(www_artifact_path,img_files)
for img in img_files
    display(load(img))
end

(a) California Housing

(b) Credit Default

(c) GMSC

Figure 5.1: Line Charts

5.1.3 Error Bar Charts

Figure 5.2 shows the evaluation metrics at the end of the experiments.

# Full paths of all error-bar-chart images among the artifacts.
img_files = filter(contains("errorbar_chart"), readdir(www_artifact_path))
img_files = joinpath.(www_artifact_path,img_files)
for img in img_files
    display(load(img))
end

(a) California Housing

(b) Credit Default

(c) GMSC

Figure 5.2: Error Bar Charts

5.2 Bootstrap

n_bootstrap = 100
df = run_bootstrap(results, n_bootstrap; filename=joinpath(output_path,"bootstrap.csv"))
┌──────────┬─────────┬────────────────┬────────────────────┬───────────┬──────────────┐
│     name │   scope │           data │              model │ generator │ p_value_mean │
│ String31 │ String7 │       String15 │           String31 │   String7 │      Float64 │
├──────────┼─────────┼────────────────┼────────────────────┼───────────┼──────────────┤
│      mmd │  domain │ credit_default │ LogisticRegression │   Generic │        0.594 │
│      mmd │  domain │ credit_default │ LogisticRegression │    REVISE │          0.0 │
│      mmd │  domain │ credit_default │ LogisticRegression │      DICE │        0.268 │
│      mmd │  domain │ credit_default │ LogisticRegression │    Greedy │        0.388 │
│      mmd │  domain │ credit_default │          FluxModel │   Generic │        0.668 │
│      mmd │  domain │ credit_default │          FluxModel │    REVISE │          0.0 │
│      mmd │  domain │ credit_default │          FluxModel │      DICE │        0.466 │
│      mmd │  domain │ credit_default │          FluxModel │    Greedy │        0.998 │
│      mmd │  domain │ credit_default │       FluxEnsemble │   Generic │        0.738 │
│      mmd │  domain │ credit_default │       FluxEnsemble │    REVISE │          0.0 │
│      mmd │  domain │ credit_default │       FluxEnsemble │      DICE │        0.634 │
│      mmd │  domain │ credit_default │       FluxEnsemble │    Greedy │          1.0 │
│      mmd │  domain │    cal_housing │ LogisticRegression │   Generic │          0.0 │
│      mmd │  domain │    cal_housing │ LogisticRegression │    REVISE │          0.0 │
│      mmd │  domain │    cal_housing │ LogisticRegression │      DICE │          0.0 │
│      mmd │  domain │    cal_housing │ LogisticRegression │    Greedy │          0.0 │
│      mmd │  domain │    cal_housing │          FluxModel │   Generic │          0.0 │
│      mmd │  domain │    cal_housing │          FluxModel │    REVISE │          0.0 │
│      mmd │  domain │    cal_housing │          FluxModel │      DICE │          0.0 │
│      mmd │  domain │    cal_housing │          FluxModel │    Greedy │          0.0 │
│      mmd │  domain │    cal_housing │       FluxEnsemble │   Generic │          0.0 │
│      mmd │  domain │    cal_housing │       FluxEnsemble │    REVISE │          0.0 │
│      mmd │  domain │    cal_housing │       FluxEnsemble │      DICE │          0.0 │
│      mmd │  domain │    cal_housing │       FluxEnsemble │    Greedy │          0.0 │
│      mmd │  domain │           gmsc │ LogisticRegression │   Generic │          0.0 │
│      mmd │  domain │           gmsc │ LogisticRegression │    REVISE │        0.112 │
│      mmd │  domain │           gmsc │ LogisticRegression │      DICE │          0.0 │
│      mmd │  domain │           gmsc │ LogisticRegression │    Greedy │          0.0 │
│      mmd │  domain │           gmsc │          FluxModel │   Generic │          0.0 │
│      mmd │  domain │           gmsc │          FluxModel │    REVISE │          0.0 │
│      mmd │  domain │           gmsc │          FluxModel │      DICE │          0.0 │
│      mmd │  domain │           gmsc │          FluxModel │    Greedy │          0.0 │
│      mmd │  domain │           gmsc │       FluxEnsemble │   Generic │          0.0 │
│      mmd │  domain │           gmsc │       FluxEnsemble │    REVISE │          0.0 │
│      mmd │  domain │           gmsc │       FluxEnsemble │      DICE │          0.0 │
│      mmd │  domain │           gmsc │       FluxEnsemble │    Greedy │          0.0 │
│      mmd │   model │ credit_default │ LogisticRegression │   Generic │          0.0 │
│      mmd │   model │ credit_default │ LogisticRegression │    REVISE │          0.0 │
│      mmd │   model │ credit_default │ LogisticRegression │      DICE │          0.0 │
│      mmd │   model │ credit_default │ LogisticRegression │    Greedy │          0.0 │
│      mmd │   model │ credit_default │          FluxModel │   Generic │          0.0 │
│      mmd │   model │ credit_default │          FluxModel │    REVISE │          0.0 │
│      mmd │   model │ credit_default │          FluxModel │      DICE │          0.0 │
│      mmd │   model │ credit_default │          FluxModel │    Greedy │          0.0 │
│      mmd │   model │ credit_default │       FluxEnsemble │   Generic │          0.0 │
│      mmd │   model │ credit_default │       FluxEnsemble │    REVISE │          0.0 │
│      mmd │   model │ credit_default │       FluxEnsemble │      DICE │          0.0 │
│      mmd │   model │ credit_default │       FluxEnsemble │    Greedy │          0.0 │
│      mmd │   model │    cal_housing │ LogisticRegression │   Generic │          0.0 │
│      mmd │   model │    cal_housing │ LogisticRegression │    REVISE │          0.0 │
│      mmd │   model │    cal_housing │ LogisticRegression │      DICE │          0.0 │
│      mmd │   model │    cal_housing │ LogisticRegression │    Greedy │          0.0 │
│      mmd │   model │    cal_housing │          FluxModel │   Generic │          0.0 │
│      mmd │   model │    cal_housing │          FluxModel │    REVISE │          0.0 │
│      mmd │   model │    cal_housing │          FluxModel │      DICE │          0.0 │
│      mmd │   model │    cal_housing │          FluxModel │    Greedy │          0.0 │
│      mmd │   model │    cal_housing │       FluxEnsemble │   Generic │          0.0 │
│      mmd │   model │    cal_housing │       FluxEnsemble │    REVISE │          0.0 │
│      mmd │   model │    cal_housing │       FluxEnsemble │      DICE │          0.0 │
│      mmd │   model │    cal_housing │       FluxEnsemble │    Greedy │          0.0 │
│      mmd │   model │           gmsc │ LogisticRegression │   Generic │          0.0 │
│      mmd │   model │           gmsc │ LogisticRegression │    REVISE │          0.0 │
│      mmd │   model │           gmsc │ LogisticRegression │      DICE │          0.0 │
│      mmd │   model │           gmsc │ LogisticRegression │    Greedy │          0.0 │
│      mmd │   model │           gmsc │          FluxModel │   Generic │          0.0 │
│      mmd │   model │           gmsc │          FluxModel │    REVISE │          0.0 │
│      mmd │   model │           gmsc │          FluxModel │      DICE │          0.0 │
│      mmd │   model │           gmsc │          FluxModel │    Greedy │          0.0 │
│      mmd │   model │           gmsc │       FluxEnsemble │   Generic │          0.0 │
│      mmd │   model │           gmsc │       FluxEnsemble │    REVISE │          0.0 │
│      mmd │   model │           gmsc │       FluxEnsemble │      DICE │          0.0 │
│      mmd │   model │           gmsc │       FluxEnsemble │    Greedy │          0.0 │
│ mmd_grid │   model │ credit_default │ LogisticRegression │   Generic │          0.0 │
│ mmd_grid │   model │ credit_default │ LogisticRegression │    REVISE │          0.0 │
│ mmd_grid │   model │ credit_default │ LogisticRegression │      DICE │          0.0 │
│ mmd_grid │   model │ credit_default │ LogisticRegression │    Greedy │          0.0 │
│ mmd_grid │   model │ credit_default │          FluxModel │   Generic │          0.0 │
│ mmd_grid │   model │ credit_default │          FluxModel │    REVISE │          0.0 │
│ mmd_grid │   model │ credit_default │          FluxModel │      DICE │          0.0 │
│ mmd_grid │   model │ credit_default │          FluxModel │    Greedy │          0.0 │
│ mmd_grid │   model │ credit_default │       FluxEnsemble │   Generic │          0.0 │
│ mmd_grid │   model │ credit_default │       FluxEnsemble │    REVISE │          0.0 │
│ mmd_grid │   model │ credit_default │       FluxEnsemble │      DICE │          0.0 │
│ mmd_grid │   model │ credit_default │       FluxEnsemble │    Greedy │          0.0 │
│ mmd_grid │   model │    cal_housing │ LogisticRegression │   Generic │          0.0 │
│ mmd_grid │   model │    cal_housing │ LogisticRegression │    REVISE │          0.0 │
│ mmd_grid │   model │    cal_housing │ LogisticRegression │      DICE │          0.0 │
│ mmd_grid │   model │    cal_housing │ LogisticRegression │    Greedy │          0.0 │
│ mmd_grid │   model │    cal_housing │          FluxModel │   Generic │          0.0 │
│ mmd_grid │   model │    cal_housing │          FluxModel │    REVISE │          0.0 │
│ mmd_grid │   model │    cal_housing │          FluxModel │      DICE │          0.0 │
│ mmd_grid │   model │    cal_housing │          FluxModel │    Greedy │          0.0 │
│ mmd_grid │   model │    cal_housing │       FluxEnsemble │   Generic │          0.0 │
│ mmd_grid │   model │    cal_housing │       FluxEnsemble │    REVISE │          0.0 │
│ mmd_grid │   model │    cal_housing │       FluxEnsemble │      DICE │          0.0 │
│ mmd_grid │   model │    cal_housing │       FluxEnsemble │    Greedy │          0.0 │
│ mmd_grid │   model │           gmsc │ LogisticRegression │   Generic │          0.0 │
│ mmd_grid │   model │           gmsc │ LogisticRegression │    REVISE │          0.0 │
│ mmd_grid │   model │           gmsc │ LogisticRegression │      DICE │          0.0 │
│ mmd_grid │   model │           gmsc │ LogisticRegression │    Greedy │          0.0 │
│ mmd_grid │   model │           gmsc │          FluxModel │   Generic │         0.06 │
│ mmd_grid │   model │           gmsc │          FluxModel │    REVISE │          0.0 │
│ mmd_grid │   model │           gmsc │          FluxModel │      DICE │          0.0 │
│ mmd_grid │   model │           gmsc │          FluxModel │    Greedy │          0.0 │
│ mmd_grid │   model │           gmsc │       FluxEnsemble │   Generic │          0.0 │
│ mmd_grid │   model │           gmsc │       FluxEnsemble │    REVISE │          0.0 │
│ mmd_grid │   model │           gmsc │       FluxEnsemble │      DICE │        0.002 │
│ mmd_grid │   model │           gmsc │       FluxEnsemble │    Greedy │          0.0 │
└──────────┴─────────┴────────────────┴────────────────────┴───────────┴──────────────┘

5.2.1 Chart in paper

Figure 5.3 shows the chart that went into the paper.

using DataFrames, Statistics
model_ = :FluxEnsemble
df = DataFrame() 
for (key, val) in results
    df_ = deepcopy(val.output)
    df_.dataset .= key
    df = vcat(df,df_)
end
# Keep only the last evaluation point (largest n) and the chosen model:
df = df[df.n .== maximum(df.n),:]
df = df[df.model .== model_,:]
# Drop missing, nothing and NaN values before aggregating:
filter!(:value => x -> !any(f -> f(x), (ismissing, isnothing, isnan)), df)
# Mean and a band of one standard deviation per group:
gdf = groupby(df, [:generator, :dataset, :n, :name, :scope])
df_plot = combine(gdf, :value => (x -> [(mean(x),mean(x)+std(x),mean(x)-std(x))]) => [:mean, :ymax, :ymin])
# Keep MMD and model performance only; convert Symbol columns to strings:
df_plot = df_plot[[name in [:mmd, :model_performance] for name in df_plot.name],:]
df_plot = mapcols(x -> typeof(x) == Vector{Symbol} ? string.(x) : x, df_plot)
# Append the scope to the MMD metric name (e.g. "mmd_domain"):
df_plot.name .= [r[:name] == "mmd" ? "$(r[:name])_$(r[:scope])" : r[:name] for r in eachrow(df_plot)]
# Human-readable labels for datasets, metrics and generators:
transform!(df_plot, :dataset => (X -> [x=="cal_housing" ? "California Housing" : x for x in X]) => :dataset)
transform!(df_plot, :dataset => (X -> [x=="credit_default" ? "Credit Default" : x for x in X]) => :dataset)
transform!(df_plot, :dataset => (X -> [x=="gmsc" ? "GMSC" : x for x in X]) => :dataset)
transform!(df_plot, :name => (X -> [x=="mmd_domain" ? "MMD (domain)" : x for x in X]) => :name)
transform!(df_plot, :name => (X -> [x=="mmd_model" ? "MMD (model)" : x for x in X]) => :name)
transform!(df_plot, :name => (X -> [x=="model_performance" ? "Performance" : x for x in X]) => :name)
transform!(df_plot, :generator => (X -> [x=="REVISE" ? "Latent" : x for x in X]) => :generator)

ncol = length(unique(df_plot.dataset))
nrow = length(unique(df_plot.name))

using RCall
scale_ = 1.75
R"""
library(ggplot2)
plt <- ggplot($df_plot) +
    geom_bar(aes(x=n, y=mean, fill=generator), stat="identity", alpha=0.5, position="dodge") +
    geom_pointrange( aes(x=n, y=mean, ymin=ymin, ymax=ymax, colour=generator), alpha=0.9, position=position_dodge(width=0.9), size=0.5) +
    facet_grid(
        rows = vars(name),
        cols =  vars(dataset), 
        scales = "free_y"
    ) +
    labs(y = "Value") + 
    scale_fill_discrete(name="Generator:") +
    scale_colour_discrete(name="Generator:") +
    theme(
        axis.title.x=element_blank(),
        axis.text.x=element_blank(),
        axis.ticks.x=element_blank(),
        legend.position="bottom"
    )
temp_path <- file.path(tempdir(), "plot.png")
ggsave(temp_path,width=$ncol * $scale_,height=$nrow * $scale_ * 0.8) 
"""

img = Images.load(rcopy(R"temp_path"))
Images.save(joinpath(www_path,"paper_real_world_results.png"), img)
Images.load(joinpath(www_artifact_path,"paper_real_world_results.png"))

Figure 5.3: Chart in paper
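
The bars and point ranges in the chart come from the per-group summary computed above: each group of values is reduced to its mean and a band of one standard deviation on either side, after invalid entries have been dropped. The following is a minimal sketch of that step on made-up numbers, using only the Statistics standard library. The names `raw`, `keep`, `summarise` and `summary_stats` are hypothetical and only serve the illustration.

```julia
using Statistics

# Hypothetical raw values for a single group; not taken from the experiments.
raw = [0.10, NaN, 0.12, missing, 0.08, 0.11, 0.09]

# Same validity check as in the filter! call above: drop missing, nothing and NaN.
keep(x) = !any(f -> f(x), (ismissing, isnothing, isnan))
clean = Float64.(filter(keep, raw))

# Mean with a band of one standard deviation on either side.
function summarise(x)
    m, s = mean(x), std(x)
    return (mean = m, ymax = m + s, ymin = m - s)
end

summary_stats = summarise(clean)
```

The three fields of `summary_stats` correspond to the `mean`, `ymax` and `ymin` columns that are passed to `geom_bar` and `geom_pointrange`.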

6 Mitigation Strategies

models = [
    :LogisticRegression, 
    :FluxModel, 
    :FluxEnsemble,
]
opt = Flux.Descent(0.01) 
generators = Dict(
    :Generic=>GenericGenerator(opt = opt, decision_threshold=0.5),
    :Latent=>REVISEGenerator(opt = opt),
    :Generic_conservative=>GenericGenerator(opt = opt, decision_threshold=0.9),
    :Gravitational=>GravitationalGenerator(opt = opt),
    :ClapROAR=>ClapROARGenerator(opt = opt)
)

6.1 Synthetic

max_obs = 1000
catalogue = load_synthetic(max_obs)
choices = [
    :linearly_separable, 
    :overlapping, 
    :circles, 
    :moons,
]
data_sets = filter(p -> p[1] in choices, catalogue)
experiments = set_up_experiments(data_sets,models,generators)
n_evals = 5
n_rounds = 50
evaluate_every = Int(round(n_rounds/n_evals))
n_folds = 5
T = 100
using Serialization
results = run_experiments(
    experiments;
    save_path=output_path,evaluate_every=evaluate_every,n_rounds=n_rounds, n_folds=n_folds, T=T
)
Serialization.serialize(joinpath(output_path,"results_synthetic.jls"),results)
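
With the settings above, `evaluate_every` works out to 10. Assuming that the metrics are computed at every multiple of `evaluate_every` (an assumption about `run_experiments` that this appendix does not confirm), the evaluation points can be sketched as follows. The name `eval_rounds` is hypothetical.

```julia
n_evals = 5
n_rounds = 50

# Same computation as above: number of rounds between two evaluations.
evaluate_every = Int(round(n_rounds / n_evals))

# Assumed evaluation schedule: every multiple of evaluate_every up to n_rounds.
eval_rounds = collect(evaluate_every:evaluate_every:n_rounds)
```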

6.2 Plots

using Serialization
results = Serialization.deserialize(joinpath(output_path,"results_synthetic.jls"))
using Images
line_charts = Dict()
errorbar_charts = Dict()
for (data_name, res) in results
    # Line chart: evolution of the metrics over the course of the experiment
    plt = plot(res)
    Images.save(joinpath(www_path, "line_chart_$(data_name).png"), plt)
    line_charts[data_name] = plt
    # Error bar chart: metrics at the last evaluation point
    plt = plot(res,maximum(res.output.n))
    Images.save(joinpath(www_path, "errorbar_chart_$(data_name).png"), plt)
    errorbar_charts[data_name] = plt
end

6.2.1 Line Charts

Figure 6.1 shows the evolution of the evaluation metrics over the course of the experiment.

choices = [
    :linearly_separable, 
    :overlapping, 
    :circles, 
    :moons,
]
img_files = readdir(www_artifact_path)[contains.(readdir(www_artifact_path),"line_chart") .&& .!contains.(readdir(www_artifact_path),"latent")]
img_files = img_files[Bool.(reduce(+, map(choice -> contains.(img_files, string(choice)), choices)))]
img_files = joinpath.(www_artifact_path,img_files)
for img in img_files
    display(load(img))
end

(a) Circles

(b) Linearly Separable

(c) Moons

(d) Overlapping

Figure 6.1: Line Charts
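
The file selection above keeps every image whose name contains "line_chart", does not contain "latent", and mentions one of the chosen datasets. A minimal sketch of the same selection on a hypothetical directory listing is shown below. The file names are assumptions made for illustration and may differ from the actual artifacts.

```julia
# Hypothetical directory listing; the real file names may differ.
files = [
    "errorbar_chart_circles.png",
    "line_chart_cal_housing.png",
    "line_chart_circles.png",
    "line_chart_circles_latent.png",
    "line_chart_moons.png",
]
choices = [:linearly_separable, :overlapping, :circles, :moons]

# Keep line charts that are not from the latent-space runs and that
# belong to one of the chosen datasets.
selected = filter(files) do f
    contains(f, "line_chart") && !contains(f, "latent") &&
        any(c -> contains(f, string(c)), choices)
end
```

Using `any` also handles a file name that matches more than one choice, whereas summing the match masks and converting the result with `Bool.` would throw an `InexactError` in that case.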

6.2.2 Error Bar Charts

Figure 6.2 shows the evaluation metrics at the end of the experiments.

choices = [
    :linearly_separable, 
    :overlapping, 
    :circles, 
    :moons,
]
img_files = readdir(www_artifact_path)[contains.(readdir(www_artifact_path),"errorbar_chart") .&& .!contains.(readdir(www_artifact_path),"latent")]
img_files = img_files[Bool.(reduce(+, map(choice -> contains.(img_files, string(choice)), choices)))]
img_files = joinpath.(www_artifact_path,img_files)
for img in img_files
    display(load(img))
end

(a) Circles

(b) Linearly Separable

(c) Moons

(d) Overlapping

Figure 6.2: Error Bar Charts

6.3 Bootstrap

n_bootstrap = 100
df = run_bootstrap(results, n_bootstrap; filename=joinpath(output_path,"bootstrap_synthetic.csv"))
┌──────────┬─────────┬────────────────────┬────────────────────┬──────────────────────┬──────────────┐
│     name │   scope │               data │              model │            generator │ p_value_mean │
│ String31 │ String7 │           String31 │           String31 │             String31 │      Float64 │
├──────────┼─────────┼────────────────────┼────────────────────┼──────────────────────┼──────────────┤
│      mmd │  domain │        overlapping │          FluxModel │ Generic_conservative │          0.0 │
│      mmd │  domain │        overlapping │          FluxModel │        Gravitational │          0.0 │
│      mmd │  domain │        overlapping │          FluxModel │             ClapROAR │          0.0 │
│      mmd │  domain │        overlapping │          FluxModel │               Latent │          0.0 │
│      mmd │  domain │        overlapping │          FluxModel │              Generic │          0.0 │
│      mmd │  domain │        overlapping │ LogisticRegression │ Generic_conservative │          0.0 │
│      mmd │  domain │        overlapping │ LogisticRegression │        Gravitational │          0.0 │
│      mmd │  domain │        overlapping │ LogisticRegression │             ClapROAR │          0.0 │
│      mmd │  domain │        overlapping │ LogisticRegression │               Latent │          0.0 │
│      mmd │  domain │        overlapping │ LogisticRegression │              Generic │          0.0 │
│      mmd │  domain │        overlapping │       FluxEnsemble │ Generic_conservative │          0.0 │
│      mmd │  domain │        overlapping │       FluxEnsemble │        Gravitational │          0.0 │
│      mmd │  domain │        overlapping │       FluxEnsemble │             ClapROAR │          0.0 │
│      mmd │  domain │        overlapping │       FluxEnsemble │               Latent │          0.0 │
│      mmd │  domain │        overlapping │       FluxEnsemble │              Generic │          0.0 │
│      mmd │  domain │ linearly_separable │          FluxModel │ Generic_conservative │          0.0 │
│      mmd │  domain │ linearly_separable │          FluxModel │        Gravitational │        0.184 │
│      mmd │  domain │ linearly_separable │          FluxModel │             ClapROAR │          0.0 │
│      mmd │  domain │ linearly_separable │          FluxModel │               Latent │        0.358 │
│      mmd │  domain │ linearly_separable │          FluxModel │              Generic │          0.0 │
│      mmd │  domain │ linearly_separable │ LogisticRegression │ Generic_conservative │          0.0 │
│      mmd │  domain │ linearly_separable │ LogisticRegression │        Gravitational │         0.27 │
│      mmd │  domain │ linearly_separable │ LogisticRegression │             ClapROAR │          0.0 │
│      mmd │  domain │ linearly_separable │ LogisticRegression │               Latent │        0.586 │
│      mmd │  domain │ linearly_separable │ LogisticRegression │              Generic │          0.0 │
│      mmd │  domain │ linearly_separable │       FluxEnsemble │ Generic_conservative │          0.0 │
│      mmd │  domain │ linearly_separable │       FluxEnsemble │        Gravitational │        0.152 │
│      mmd │  domain │ linearly_separable │       FluxEnsemble │             ClapROAR │          0.0 │
│      mmd │  domain │ linearly_separable │       FluxEnsemble │               Latent │         0.38 │
│      mmd │  domain │ linearly_separable │       FluxEnsemble │              Generic │          0.0 │
│      mmd │  domain │            circles │          FluxModel │ Generic_conservative │          0.0 │
│      mmd │  domain │            circles │          FluxModel │        Gravitational │          0.0 │
│      mmd │  domain │            circles │          FluxModel │             ClapROAR │          0.0 │
│      mmd │  domain │            circles │          FluxModel │               Latent │          1.0 │
│      mmd │  domain │            circles │          FluxModel │              Generic │          0.0 │
│      mmd │  domain │            circles │ LogisticRegression │ Generic_conservative │          0.0 │
│      mmd │  domain │            circles │ LogisticRegression │        Gravitational │          0.0 │
│      mmd │  domain │            circles │ LogisticRegression │             ClapROAR │          0.0 │
│      mmd │  domain │            circles │ LogisticRegression │               Latent │          0.0 │
│      mmd │  domain │            circles │ LogisticRegression │              Generic │          0.0 │
│      mmd │  domain │            circles │       FluxEnsemble │ Generic_conservative │          0.0 │
│      mmd │  domain │            circles │       FluxEnsemble │        Gravitational │          0.0 │
│      mmd │  domain │            circles │       FluxEnsemble │             ClapROAR │          0.0 │
│      mmd │  domain │            circles │       FluxEnsemble │               Latent │          1.0 │
│      mmd │  domain │            circles │       FluxEnsemble │              Generic │          0.0 │
│      mmd │  domain │              moons │          FluxModel │ Generic_conservative │          0.0 │
│      mmd │  domain │              moons │          FluxModel │        Gravitational │          0.0 │
│      mmd │  domain │              moons │          FluxModel │             ClapROAR │          0.0 │
│      mmd │  domain │              moons │          FluxModel │               Latent │          0.0 │
│      mmd │  domain │              moons │          FluxModel │              Generic │          0.0 │
│      mmd │  domain │              moons │ LogisticRegression │ Generic_conservative │          0.0 │
│      mmd │  domain │              moons │ LogisticRegression │        Gravitational │          0.0 │
│      mmd │  domain │              moons │ LogisticRegression │             ClapROAR │          0.0 │
│      mmd │  domain │              moons │ LogisticRegression │               Latent │          0.0 │
│      mmd │  domain │              moons │ LogisticRegression │              Generic │          0.0 │
│      mmd │  domain │              moons │       FluxEnsemble │ Generic_conservative │          0.0 │
│      mmd │  domain │              moons │       FluxEnsemble │        Gravitational │          0.0 │
│      mmd │  domain │              moons │       FluxEnsemble │             ClapROAR │          0.0 │
│      mmd │  domain │              moons │       FluxEnsemble │               Latent │          0.0 │
│      mmd │  domain │              moons │       FluxEnsemble │              Generic │          0.0 │
│      mmd │   model │        overlapping │          FluxModel │ Generic_conservative │          0.0 │
│      mmd │   model │        overlapping │          FluxModel │        Gravitational │        0.024 │
│      mmd │   model │        overlapping │          FluxModel │             ClapROAR │        0.012 │
│      mmd │   model │        overlapping │          FluxModel │               Latent │          0.0 │
│      mmd │   model │        overlapping │          FluxModel │              Generic │          0.0 │
│      mmd │   model │        overlapping │ LogisticRegression │ Generic_conservative │          0.0 │
│      mmd │   model │        overlapping │ LogisticRegression │        Gravitational │        0.006 │
│      mmd │   model │        overlapping │ LogisticRegression │             ClapROAR │          0.0 │
│      mmd │   model │        overlapping │ LogisticRegression │               Latent │          0.0 │
│      mmd │   model │        overlapping │ LogisticRegression │              Generic │          0.0 │
│      mmd │   model │        overlapping │       FluxEnsemble │ Generic_conservative │          0.0 │
│      mmd │   model │        overlapping │       FluxEnsemble │        Gravitational │        0.018 │
│      mmd │   model │        overlapping │       FluxEnsemble │             ClapROAR │        0.004 │
│      mmd │   model │        overlapping │       FluxEnsemble │               Latent │          0.0 │
│      mmd │   model │        overlapping │       FluxEnsemble │              Generic │          0.0 │
│      mmd │   model │ linearly_separable │          FluxModel │ Generic_conservative │        0.852 │
│      mmd │   model │ linearly_separable │          FluxModel │        Gravitational │         0.89 │
│      mmd │   model │ linearly_separable │          FluxModel │             ClapROAR │         0.84 │
│      mmd │   model │ linearly_separable │          FluxModel │               Latent │        0.722 │
│      mmd │   model │ linearly_separable │          FluxModel │              Generic │         0.83 │
│      mmd │   model │ linearly_separable │ LogisticRegression │ Generic_conservative │        0.176 │
│      mmd │   model │ linearly_separable │ LogisticRegression │        Gravitational │        0.898 │
│      mmd │   model │ linearly_separable │ LogisticRegression │             ClapROAR │        0.866 │
│      mmd │   model │ linearly_separable │ LogisticRegression │               Latent │        0.338 │
│      mmd │   model │ linearly_separable │ LogisticRegression │              Generic │        0.002 │
│      mmd │   model │ linearly_separable │       FluxEnsemble │ Generic_conservative │        0.868 │
│      mmd │   model │ linearly_separable │       FluxEnsemble │        Gravitational │        0.912 │
│      mmd │   model │ linearly_separable │       FluxEnsemble │             ClapROAR │        0.916 │
│      mmd │   model │ linearly_separable │       FluxEnsemble │               Latent │        0.822 │
│      mmd │   model │ linearly_separable │       FluxEnsemble │              Generic │        0.818 │
│      mmd │   model │            circles │          FluxModel │ Generic_conservative │        0.356 │
│      mmd │   model │            circles │          FluxModel │        Gravitational │         0.48 │
│      mmd │   model │            circles │          FluxModel │             ClapROAR │        0.484 │
│      mmd │   model │            circles │          FluxModel │               Latent │        0.802 │
│      mmd │   model │            circles │          FluxModel │              Generic │        0.482 │
│      mmd │   model │            circles │ LogisticRegression │ Generic_conservative │          0.0 │
│      mmd │   model │            circles │ LogisticRegression │        Gravitational │          0.0 │
│      mmd │   model │            circles │ LogisticRegression │             ClapROAR │          0.0 │
│      mmd │   model │            circles │ LogisticRegression │               Latent │          0.0 │
│      mmd │   model │            circles │ LogisticRegression │              Generic │          0.0 │
│      mmd │   model │            circles │       FluxEnsemble │ Generic_conservative │        0.462 │
│      mmd │   model │            circles │       FluxEnsemble │        Gravitational │        0.282 │
│      mmd │   model │            circles │       FluxEnsemble │             ClapROAR │        0.534 │
│      mmd │   model │            circles │       FluxEnsemble │               Latent │         0.91 │
│      mmd │   model │            circles │       FluxEnsemble │              Generic │        0.408 │
│      mmd │   model │              moons │          FluxModel │ Generic_conservative │        0.568 │
│      mmd │   model │              moons │          FluxModel │        Gravitational │        0.868 │
│      mmd │   model │              moons │          FluxModel │             ClapROAR │        0.594 │
│      mmd │   model │              moons │          FluxModel │               Latent │        0.598 │
│      mmd │   model │              moons │          FluxModel │              Generic │        0.604 │
│      mmd │   model │              moons │ LogisticRegression │ Generic_conservative │          0.0 │
│      mmd │   model │              moons │ LogisticRegression │        Gravitational │          0.0 │
│      mmd │   model │              moons │ LogisticRegression │             ClapROAR │          0.0 │
│      mmd │   model │              moons │ LogisticRegression │               Latent │          0.0 │
│      mmd │   model │              moons │ LogisticRegression │              Generic │          0.0 │
│      mmd │   model │              moons │       FluxEnsemble │ Generic_conservative │         0.48 │
│      mmd │   model │              moons │       FluxEnsemble │        Gravitational │        0.922 │
│      mmd │   model │              moons │       FluxEnsemble │             ClapROAR │        0.584 │
│      mmd │   model │              moons │       FluxEnsemble │               Latent │        0.654 │
│      mmd │   model │              moons │       FluxEnsemble │              Generic │        0.412 │
│ mmd_grid │   model │        overlapping │          FluxModel │ Generic_conservative │        0.002 │
│ mmd_grid │   model │        overlapping │          FluxModel │        Gravitational │        0.034 │
│ mmd_grid │   model │        overlapping │          FluxModel │             ClapROAR │        0.022 │
│ mmd_grid │   model │        overlapping │          FluxModel │               Latent │        0.008 │
│ mmd_grid │   model │        overlapping │          FluxModel │              Generic │          0.0 │
│ mmd_grid │   model │        overlapping │ LogisticRegression │ Generic_conservative │          0.0 │
│ mmd_grid │   model │        overlapping │ LogisticRegression │        Gravitational │          0.0 │
│ mmd_grid │   model │        overlapping │ LogisticRegression │             ClapROAR │        0.002 │
│ mmd_grid │   model │        overlapping │ LogisticRegression │               Latent │          0.0 │
│ mmd_grid │   model │        overlapping │ LogisticRegression │              Generic │          0.0 │
│ mmd_grid │   model │        overlapping │       FluxEnsemble │ Generic_conservative │        0.006 │
│ mmd_grid │   model │        overlapping │       FluxEnsemble │        Gravitational │        0.034 │
│ mmd_grid │   model │        overlapping │       FluxEnsemble │             ClapROAR │        0.058 │
│ mmd_grid │   model │        overlapping │       FluxEnsemble │               Latent │        0.016 │
│ mmd_grid │   model │        overlapping │       FluxEnsemble │              Generic │          0.0 │
│ mmd_grid │   model │ linearly_separable │          FluxModel │ Generic_conservative │          0.0 │
│ mmd_grid │   model │ linearly_separable │          FluxModel │        Gravitational │        0.062 │
│ mmd_grid │   model │ linearly_separable │          FluxModel │             ClapROAR │          0.0 │
│ mmd_grid │   model │ linearly_separable │          FluxModel │               Latent │          0.0 │
│ mmd_grid │   model │ linearly_separable │          FluxModel │              Generic │          0.0 │
│ mmd_grid │   model │ linearly_separable │ LogisticRegression │ Generic_conservative │          0.0 │
│ mmd_grid │   model │ linearly_separable │ LogisticRegression │        Gravitational │        0.004 │
│ mmd_grid │   model │ linearly_separable │ LogisticRegression │             ClapROAR │          0.0 │
│ mmd_grid │   model │ linearly_separable │ LogisticRegression │               Latent │          0.0 │
│ mmd_grid │   model │ linearly_separable │ LogisticRegression │              Generic │          0.0 │
│ mmd_grid │   model │ linearly_separable │       FluxEnsemble │ Generic_conservative │          0.0 │
│ mmd_grid │   model │ linearly_separable │       FluxEnsemble │        Gravitational │        0.128 │
│ mmd_grid │   model │ linearly_separable │       FluxEnsemble │             ClapROAR │          0.0 │
│ mmd_grid │   model │ linearly_separable │       FluxEnsemble │               Latent │          0.0 │
│ mmd_grid │   model │ linearly_separable │       FluxEnsemble │              Generic │          0.0 │
│ mmd_grid │   model │            circles │          FluxModel │ Generic_conservative │          0.0 │
│ mmd_grid │   model │            circles │          FluxModel │        Gravitational │        0.002 │
│ mmd_grid │   model │            circles │          FluxModel │             ClapROAR │        0.004 │
│ mmd_grid │   model │            circles │          FluxModel │               Latent │        0.176 │
│ mmd_grid │   model │            circles │          FluxModel │              Generic │        0.008 │
│ mmd_grid │   model │            circles │ LogisticRegression │ Generic_conservative │          0.0 │
│ mmd_grid │   model │            circles │ LogisticRegression │        Gravitational │          0.0 │
│ mmd_grid │   model │            circles │ LogisticRegression │             ClapROAR │          0.0 │
│ mmd_grid │   model │            circles │ LogisticRegression │               Latent │          0.0 │
│ mmd_grid │   model │            circles │ LogisticRegression │              Generic │          0.0 │
│ mmd_grid │   model │            circles │       FluxEnsemble │ Generic_conservative │          0.0 │
│ mmd_grid │   model │            circles │       FluxEnsemble │        Gravitational │          0.0 │
│ mmd_grid │   model │            circles │       FluxEnsemble │             ClapROAR │        0.002 │
│ mmd_grid │   model │            circles │       FluxEnsemble │               Latent │         0.24 │
│ mmd_grid │   model │            circles │       FluxEnsemble │              Generic │          0.0 │
│ mmd_grid │   model │              moons │          FluxModel │ Generic_conservative │          0.0 │
│ mmd_grid │   model │              moons │          FluxModel │        Gravitational │        0.052 │
│ mmd_grid │   model │              moons │          FluxModel │             ClapROAR │        0.002 │
│ mmd_grid │   model │              moons │          FluxModel │               Latent │        0.024 │
│ mmd_grid │   model │              moons │          FluxModel │              Generic │          0.0 │
│ mmd_grid │   model │              moons │ LogisticRegression │ Generic_conservative │          0.0 │
│ mmd_grid │   model │              moons │ LogisticRegression │        Gravitational │          0.0 │
│ mmd_grid │   model │              moons │ LogisticRegression │             ClapROAR │          0.0 │
│ mmd_grid │   model │              moons │ LogisticRegression │               Latent │          0.0 │
│ mmd_grid │   model │              moons │ LogisticRegression │              Generic │          0.0 │
│ mmd_grid │   model │              moons │       FluxEnsemble │ Generic_conservative │          0.0 │
│ mmd_grid │   model │              moons │       FluxEnsemble │        Gravitational │        0.164 │
│ mmd_grid │   model │              moons │       FluxEnsemble │             ClapROAR │          0.0 │
│ mmd_grid │   model │              moons │       FluxEnsemble │               Latent │        0.058 │
│ mmd_grid │   model │              moons │       FluxEnsemble │              Generic │          0.0 │
└──────────┴─────────┴────────────────────┴────────────────────┴──────────────────────┴──────────────┘
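
To give some intuition for how a p-value for an MMD statistic can be obtained, the following is a minimal, generic sketch of a permutation test. It is not the implementation behind `run_bootstrap`: the kernel, its bandwidth, the biased estimator, the toy data and all function names (`rbf`, `mmd2`, `permutation_pvalue`) are assumptions made for illustration.

```julia
using Random, Statistics

# Gaussian (RBF) kernel between two vectors; the bandwidth σ is a hypothetical choice.
rbf(x, y; σ = 1.0) = exp(-sum(abs2, x .- y) / (2σ^2))

# Biased estimate of the squared MMD between samples stored column-wise in X and Y.
function mmd2(X, Y)
    kxx = mean(rbf(a, b) for a in eachcol(X), b in eachcol(X))
    kyy = mean(rbf(a, b) for a in eachcol(Y), b in eachcol(Y))
    kxy = mean(rbf(a, b) for a in eachcol(X), b in eachcol(Y))
    return kxx + kyy - 2kxy
end

# Share of random re-assignments of the pooled sample whose MMD is at least
# as large as the observed one.
function permutation_pvalue(X, Y; n_perm = 200, rng = Random.default_rng())
    observed = mmd2(X, Y)
    Z = hcat(X, Y)
    n = size(X, 2)
    exceed = 0
    for _ in 1:n_perm
        idx = randperm(rng, size(Z, 2))
        Xp, Yp = Z[:, idx[1:n]], Z[:, idx[n+1:end]]
        exceed += mmd2(Xp, Yp) >= observed
    end
    return exceed / n_perm
end

# Toy data: two samples whose means differ by 3 in every dimension.
rng = MersenneTwister(42)
X = randn(rng, 2, 30)
Y_shifted = randn(rng, 2, 30) .+ 3.0
p_shifted = permutation_pvalue(X, Y_shifted; rng = rng)
```

In this sketch, a small p-value means that few random re-assignments produce a discrepancy as large as the observed one, which is evidence that the two samples come from different distributions.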

6.4 Chart in paper

Figure 6.3 shows the chart that went into the paper.

Images.load(joinpath(www_artifact_path,"paper_synthetic_results.png"))

Figure 6.3: Chart in paper

6.6 Plots

6.6.1 Line Charts

Figure 6.4 shows the evolution of the evaluation metrics over the course of the experiment.

img_files = readdir(www_artifact_path)[contains.(readdir(www_artifact_path),"line_chart") .&& contains.(readdir(www_artifact_path),"latent")]
img_files = joinpath.(www_artifact_path,img_files)
for img in img_files
    display(load(img))
end

(a) Circles

(b) Linearly Separable

(c) Moons

(d) Overlapping

Figure 6.4: Line Charts

6.6.2 Error Bar Charts

Figure 6.5 shows the evaluation metrics at the end of the experiments.

img_files = readdir(www_artifact_path)[contains.(readdir(www_artifact_path),"errorbar_chart") .&& contains.(readdir(www_artifact_path),"latent")]
img_files = joinpath.(www_artifact_path,img_files)
for img in img_files
    display(load(img))
end

(a) Circles

(b) Linearly Separable

(c) Moons

(d) Overlapping

Figure 6.5: Error Bar Charts

6.7 Bootstrap

n_bootstrap = 100
df = run_bootstrap(results, n_bootstrap; filename=joinpath(output_path,"bootstrap_latent.csv"))

6.8 Chart in paper

Figure 6.6 shows the chart that went into the paper.

Images.load(joinpath(www_artifact_path,"paper_synthetic_latent_results.png"))

Figure 6.6: Chart in paper

6.9 Real World

models = [
    :LogisticRegression, 
    :FluxModel, 
    :FluxEnsemble,
]
opt = Flux.Descent(0.01) 
generators = Dict(
    :Generic=>GenericGenerator(opt = opt, decision_threshold=0.5),
    :Latent=>REVISEGenerator(opt = opt),
    :Generic_conservative=>GenericGenerator(opt = opt, decision_threshold=0.9),
    :Gravitational=>GravitationalGenerator(opt = opt),
    :ClapROAR=>ClapROARGenerator(opt = opt)
)
max_obs = 2500
data_path = data_dir("real_world")
data_sets = load_real_world(max_obs)
choices = [
    :cal_housing, 
    :credit_default, 
    :gmsc, 
]
data_sets = filter(p -> p[1] in choices, data_sets)
using CounterfactualExplanations.DataPreprocessing: unpack
bs = 500
function data_loader(data::CounterfactualData)
    X, y = unpack(data)
    data = Flux.DataLoader((X,y),batchsize=bs)
    return data
end
model_params = (batch_norm=false,n_hidden=64,n_layers=3,dropout=true,p_dropout=0.1)
experiments = set_up_experiments(
    data_sets,models,generators; 
    pre_train_models=100, model_params=model_params, 
    data_loader=data_loader
)
n_evals = 5
n_rounds = 50
evaluate_every = Int(round(n_rounds/n_evals))
n_folds = 5
n_samples = 10000
T = 100
generative_model_params = (epochs=250, latent_dim=8)
results = run_experiments(
    experiments;
    save_path=output_path,evaluate_every=evaluate_every,n_rounds=n_rounds, n_folds=n_folds, T=T, n_samples=n_samples,
    generative_model_params=generative_model_params
)
Serialization.serialize(joinpath(output_path,"results_real_world.jls"),results)
using Serialization
results = Serialization.deserialize(joinpath(output_path,"results_real_world.jls"))
using Images
line_charts = Dict()
errorbar_charts = Dict()
for (data_name, res) in results
    # Line chart: evolution of the metrics over the course of the experiment
    plt = plot(res)
    Images.save(joinpath(www_path, "line_chart_$(data_name).png"), plt)
    line_charts[data_name] = plt
    # Error bar chart: metrics at the last evaluation point
    plt = plot(res,maximum(res.output.n))
    Images.save(joinpath(www_path, "errorbar_chart_$(data_name).png"), plt)
    errorbar_charts[data_name] = plt
end

6.9.1 Line Charts

Figure 6.7 shows the evolution of the evaluation metrics over the course of the experiment.

choices = [
    :cal_housing, 
    :credit_default, 
    :gmsc, 
]
img_files = readdir(www_artifact_path)[contains.(readdir(www_artifact_path),"line_chart")]
img_files = img_files[Bool.(reduce(+, map(choice -> contains.(img_files, string(choice)), choices)))]
img_files = joinpath.(www_artifact_path,img_files)
for img in img_files
    display(load(img))
end

(a) California Housing

(b) Credit Default

(c) GMSC

Figure 6.7: Line Charts

6.9.2 Error Bar Charts

Figure 6.8 shows the evaluation metrics at the end of the experiments.

choices = [
    :cal_housing, 
    :credit_default, 
    :gmsc, 
]
img_files = readdir(www_artifact_path)[contains.(readdir(www_artifact_path),"errorbar_chart")]
img_files = img_files[Bool.(reduce(+, map(choice -> contains.(img_files, string(choice)), choices)))]
img_files = joinpath.(www_artifact_path,img_files)
for img in img_files
    display(load(img))
end

(a) California Housing

(b) Credit Default

(c) GMSC

Figure 6.8: Error Bar Charts

6.9.3 Bootstrap

n_bootstrap = 100
df = run_bootstrap(results, n_bootstrap; filename=joinpath(output_path,"bootstrap_real_world.csv"))
┌──────────┬─────────┬────────────────┬────────────────────┬──────────────────────┬──────────────┐
│     name │   scope │           data │              model │            generator │ p_value_mean │
│ String31 │ String7 │       String15 │           String31 │             String31 │      Float64 │
├──────────┼─────────┼────────────────┼────────────────────┼──────────────────────┼──────────────┤
│      mmd │  domain │ credit_default │       FluxEnsemble │               Latent │          0.0 │
│      mmd │  domain │ credit_default │       FluxEnsemble │ Generic_conservative │          1.0 │
│      mmd │  domain │ credit_default │       FluxEnsemble │             ClapROAR │          1.0 │
│      mmd │  domain │ credit_default │       FluxEnsemble │        Gravitational │          0.0 │
│      mmd │  domain │ credit_default │       FluxEnsemble │              Generic │          1.0 │
│      mmd │  domain │ credit_default │          FluxModel │               Latent │          0.0 │
│      mmd │  domain │ credit_default │          FluxModel │ Generic_conservative │          1.0 │
│      mmd │  domain │ credit_default │          FluxModel │             ClapROAR │          1.0 │
│      mmd │  domain │ credit_default │          FluxModel │        Gravitational │          0.0 │
│      mmd │  domain │ credit_default │          FluxModel │              Generic │          1.0 │
│      mmd │  domain │ credit_default │ LogisticRegression │               Latent │          0.0 │
│      mmd │  domain │ credit_default │ LogisticRegression │ Generic_conservative │        0.256 │
│      mmd │  domain │ credit_default │ LogisticRegression │             ClapROAR │          1.0 │
│      mmd │  domain │ credit_default │ LogisticRegression │        Gravitational │          0.0 │
│      mmd │  domain │ credit_default │ LogisticRegression │              Generic │          1.0 │
│      mmd │  domain │    cal_housing │       FluxEnsemble │               Latent │        0.398 │
│      mmd │  domain │    cal_housing │       FluxEnsemble │ Generic_conservative │          0.0 │
│      mmd │  domain │    cal_housing │       FluxEnsemble │             ClapROAR │          0.0 │
│      mmd │  domain │    cal_housing │       FluxEnsemble │        Gravitational │          0.0 │
│      mmd │  domain │    cal_housing │       FluxEnsemble │              Generic │          0.0 │
│      mmd │  domain │    cal_housing │          FluxModel │               Latent │        0.204 │
│      mmd │  domain │    cal_housing │          FluxModel │ Generic_conservative │          0.0 │
│      mmd │  domain │    cal_housing │          FluxModel │             ClapROAR │          0.0 │
│      mmd │  domain │    cal_housing │          FluxModel │        Gravitational │          0.0 │
│      mmd │  domain │    cal_housing │          FluxModel │              Generic │          0.0 │
│      mmd │  domain │    cal_housing │ LogisticRegression │               Latent │          0.2 │
│      mmd │  domain │    cal_housing │ LogisticRegression │ Generic_conservative │          0.0 │
│      mmd │  domain │    cal_housing │ LogisticRegression │             ClapROAR │          0.0 │
│      mmd │  domain │    cal_housing │ LogisticRegression │        Gravitational │          0.0 │
│      mmd │  domain │    cal_housing │ LogisticRegression │              Generic │          0.0 │
│      mmd │  domain │           gmsc │       FluxEnsemble │               Latent │          0.0 │
│      mmd │  domain │           gmsc │       FluxEnsemble │ Generic_conservative │          0.0 │
│      mmd │  domain │           gmsc │       FluxEnsemble │             ClapROAR │          0.0 │
│      mmd │  domain │           gmsc │       FluxEnsemble │        Gravitational │          0.0 │
│      mmd │  domain │           gmsc │       FluxEnsemble │              Generic │          0.0 │
│      mmd │  domain │           gmsc │          FluxModel │               Latent │          0.0 │
│      mmd │  domain │           gmsc │          FluxModel │ Generic_conservative │          0.0 │
│      mmd │  domain │           gmsc │          FluxModel │             ClapROAR │          0.0 │
│      mmd │  domain │           gmsc │          FluxModel │        Gravitational │          0.0 │
│      mmd │  domain │           gmsc │          FluxModel │              Generic │          0.0 │
│      mmd │  domain │           gmsc │ LogisticRegression │               Latent │          0.0 │
│      mmd │  domain │           gmsc │ LogisticRegression │ Generic_conservative │          0.0 │
│      mmd │  domain │           gmsc │ LogisticRegression │             ClapROAR │          0.0 │
│      mmd │  domain │           gmsc │ LogisticRegression │        Gravitational │          0.0 │
│      mmd │  domain │           gmsc │ LogisticRegression │              Generic │          0.0 │
│      mmd │   model │ credit_default │       FluxEnsemble │               Latent │          0.0 │
│      mmd │   model │ credit_default │       FluxEnsemble │ Generic_conservative │          0.0 │
│      mmd │   model │ credit_default │       FluxEnsemble │             ClapROAR │          0.0 │
│      mmd │   model │ credit_default │       FluxEnsemble │        Gravitational │          0.0 │
│      mmd │   model │ credit_default │       FluxEnsemble │              Generic │          0.0 │
│      mmd │   model │ credit_default │          FluxModel │               Latent │          0.0 │
│      mmd │   model │ credit_default │          FluxModel │ Generic_conservative │          0.0 │
│      mmd │   model │ credit_default │          FluxModel │             ClapROAR │          0.0 │
│      mmd │   model │ credit_default │          FluxModel │        Gravitational │          0.0 │
│      mmd │   model │ credit_default │          FluxModel │              Generic │          0.0 │
│      mmd │   model │ credit_default │ LogisticRegression │               Latent │          0.0 │
│      mmd │   model │ credit_default │ LogisticRegression │ Generic_conservative │          0.0 │
│      mmd │   model │ credit_default │ LogisticRegression │             ClapROAR │          0.0 │
│      mmd │   model │ credit_default │ LogisticRegression │        Gravitational │          0.0 │
│      mmd │   model │ credit_default │ LogisticRegression │              Generic │          0.0 │
│      mmd │   model │    cal_housing │       FluxEnsemble │               Latent │          0.0 │
│      mmd │   model │    cal_housing │       FluxEnsemble │ Generic_conservative │          0.0 │
│      mmd │   model │    cal_housing │       FluxEnsemble │             ClapROAR │          0.0 │
│      mmd │   model │    cal_housing │       FluxEnsemble │        Gravitational │          0.0 │
│      mmd │   model │    cal_housing │       FluxEnsemble │              Generic │          0.0 │
│      mmd │   model │    cal_housing │          FluxModel │               Latent │          0.0 │
│      mmd │   model │    cal_housing │          FluxModel │ Generic_conservative │          0.0 │
│      mmd │   model │    cal_housing │          FluxModel │             ClapROAR │          0.0 │
│      mmd │   model │    cal_housing │          FluxModel │        Gravitational │          0.0 │
│      mmd │   model │    cal_housing │          FluxModel │              Generic │          0.0 │
│      mmd │   model │    cal_housing │ LogisticRegression │               Latent │          0.0 │
│      mmd │   model │    cal_housing │ LogisticRegression │ Generic_conservative │          0.0 │
│      mmd │   model │    cal_housing │ LogisticRegression │             ClapROAR │          0.0 │
│      mmd │   model │    cal_housing │ LogisticRegression │        Gravitational │          0.0 │
│      mmd │   model │    cal_housing │ LogisticRegression │              Generic │          0.0 │
│      mmd │   model │           gmsc │       FluxEnsemble │               Latent │          0.0 │
│      mmd │   model │           gmsc │       FluxEnsemble │ Generic_conservative │          0.0 │
│      mmd │   model │           gmsc │       FluxEnsemble │             ClapROAR │          0.0 │
│      mmd │   model │           gmsc │       FluxEnsemble │        Gravitational │          0.0 │
│      mmd │   model │           gmsc │       FluxEnsemble │              Generic │          0.0 │
│      mmd │   model │           gmsc │          FluxModel │               Latent │          0.0 │
│      mmd │   model │           gmsc │          FluxModel │ Generic_conservative │          0.0 │
│      mmd │   model │           gmsc │          FluxModel │             ClapROAR │          0.0 │
│      mmd │   model │           gmsc │          FluxModel │        Gravitational │          0.0 │
│      mmd │   model │           gmsc │          FluxModel │              Generic │          0.0 │
│      mmd │   model │           gmsc │ LogisticRegression │               Latent │          0.0 │
│      mmd │   model │           gmsc │ LogisticRegression │ Generic_conservative │          0.0 │
│      mmd │   model │           gmsc │ LogisticRegression │             ClapROAR │          0.0 │
│      mmd │   model │           gmsc │ LogisticRegression │        Gravitational │          0.0 │
│      mmd │   model │           gmsc │ LogisticRegression │              Generic │          0.0 │
│ mmd_grid │   model │ credit_default │       FluxEnsemble │               Latent │          0.0 │
│ mmd_grid │   model │ credit_default │       FluxEnsemble │ Generic_conservative │          0.0 │
│ mmd_grid │   model │ credit_default │       FluxEnsemble │             ClapROAR │          0.0 │
│ mmd_grid │   model │ credit_default │       FluxEnsemble │        Gravitational │          0.0 │
│ mmd_grid │   model │ credit_default │       FluxEnsemble │              Generic │          0.0 │
│ mmd_grid │   model │ credit_default │          FluxModel │               Latent │          0.0 │
│ mmd_grid │   model │ credit_default │          FluxModel │ Generic_conservative │          0.0 │
│ mmd_grid │   model │ credit_default │          FluxModel │             ClapROAR │          0.0 │
│ mmd_grid │   model │ credit_default │          FluxModel │        Gravitational │          0.0 │
│ mmd_grid │   model │ credit_default │          FluxModel │              Generic │          0.0 │
│ mmd_grid │   model │ credit_default │ LogisticRegression │               Latent │          0.0 │
│ mmd_grid │   model │ credit_default │ LogisticRegression │ Generic_conservative │          0.0 │
│ mmd_grid │   model │ credit_default │ LogisticRegression │             ClapROAR │          0.0 │
│ mmd_grid │   model │ credit_default │ LogisticRegression │        Gravitational │          0.0 │
│ mmd_grid │   model │ credit_default │ LogisticRegression │              Generic │          0.0 │
│ mmd_grid │   model │    cal_housing │       FluxEnsemble │               Latent │          0.0 │
│ mmd_grid │   model │    cal_housing │       FluxEnsemble │ Generic_conservative │          0.0 │
│ mmd_grid │   model │    cal_housing │       FluxEnsemble │             ClapROAR │          0.0 │
│ mmd_grid │   model │    cal_housing │       FluxEnsemble │        Gravitational │          0.0 │
│ mmd_grid │   model │    cal_housing │       FluxEnsemble │              Generic │          0.0 │
│ mmd_grid │   model │    cal_housing │          FluxModel │               Latent │          0.0 │
│ mmd_grid │   model │    cal_housing │          FluxModel │ Generic_conservative │          0.0 │
│ mmd_grid │   model │    cal_housing │          FluxModel │             ClapROAR │          0.0 │
│ mmd_grid │   model │    cal_housing │          FluxModel │        Gravitational │          0.0 │
│ mmd_grid │   model │    cal_housing │          FluxModel │              Generic │          0.0 │
│ mmd_grid │   model │    cal_housing │ LogisticRegression │               Latent │          0.0 │
│ mmd_grid │   model │    cal_housing │ LogisticRegression │ Generic_conservative │          0.0 │
│ mmd_grid │   model │    cal_housing │ LogisticRegression │             ClapROAR │          0.0 │
│ mmd_grid │   model │    cal_housing │ LogisticRegression │        Gravitational │          0.0 │
│ mmd_grid │   model │    cal_housing │ LogisticRegression │              Generic │          0.0 │
│ mmd_grid │   model │           gmsc │       FluxEnsemble │               Latent │        0.016 │
│ mmd_grid │   model │           gmsc │       FluxEnsemble │ Generic_conservative │        0.008 │
│ mmd_grid │   model │           gmsc │       FluxEnsemble │             ClapROAR │          0.0 │
│ mmd_grid │   model │           gmsc │       FluxEnsemble │        Gravitational │          0.0 │
│ mmd_grid │   model │           gmsc │       FluxEnsemble │              Generic │          0.0 │
│ mmd_grid │   model │           gmsc │          FluxModel │               Latent │        0.072 │
│ mmd_grid │   model │           gmsc │          FluxModel │ Generic_conservative │          0.2 │
│ mmd_grid │   model │           gmsc │          FluxModel │             ClapROAR │        0.244 │
│ mmd_grid │   model │           gmsc │          FluxModel │        Gravitational │        0.056 │
│ mmd_grid │   model │           gmsc │          FluxModel │              Generic │         0.23 │
│ mmd_grid │   model │           gmsc │ LogisticRegression │               Latent │          0.0 │
│ mmd_grid │   model │           gmsc │ LogisticRegression │ Generic_conservative │          0.0 │
│ mmd_grid │   model │           gmsc │ LogisticRegression │             ClapROAR │          0.0 │
│ mmd_grid │   model │           gmsc │ LogisticRegression │        Gravitational │          0.0 │
│ mmd_grid │   model │           gmsc │ LogisticRegression │              Generic │          0.0 │
└──────────┴─────────┴────────────────┴────────────────────┴──────────────────────┴──────────────┘
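
The table is long, and most entries are zero, so it can help to pull out only the rows with a non-negligible p-value. The following is a minimal sketch rather than part of the original notebook. It assumes that `p_value_mean` is an average p-value for a test whose null hypothesis is "no shift", and it uses a hypothetical significance level `alpha = 0.05`. The rows are a small subset copied by hand from the table above; in the notebook they would come from `df` instead.

```julia
# Hand-copied subset of the bootstrap table above (illustration only).
rows = [
    (name="mmd", scope="domain", data="credit_default", model="LogisticRegression", generator="Generic_conservative", p_value_mean=0.256),
    (name="mmd", scope="domain", data="cal_housing", model="FluxEnsemble", generator="Latent", p_value_mean=0.398),
    (name="mmd", scope="domain", data="gmsc", model="FluxModel", generator="Generic", p_value_mean=0.0),
    (name="mmd_grid", scope="model", data="gmsc", model="FluxEnsemble", generator="Latent", p_value_mean=0.016),
    (name="mmd_grid", scope="model", data="gmsc", model="FluxModel", generator="Latent", p_value_mean=0.072),
]

# Assumed significance level; not taken from the source.
alpha = 0.05

# Keep the rows whose mean p-value is at or above the assumed level.
not_rejected = filter(r -> r.p_value_mean >= alpha, rows)

# Print the remaining rows in a compact, pipe-separated form.
for r in not_rejected
    println(join((r.name, r.scope, r.data, r.model, r.generator, r.p_value_mean), " | "))
end
```

If `df` is a `DataFrame`, the same selection could be written as `filter(:p_value_mean => p -> p >= alpha, df)`.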

6.9.4 Chart in paper

Figure 6.9 shows the chart of the real-world results that appears in the paper.

Images.load(joinpath(www_artifact_path,"paper_real_world_results.png"))

Figure 6.9: Chart in paper