1 Introduction

To start with, we will look at a proof of concept that demonstrates the main observation the paper is framed around. In particular, we will use synthetic data to see how endogenous domain shifts and the resulting model shifts can affect the validity and cost of algorithmic recourse.

1.1 Data

We begin by generating the synthetic data for a simple binary classification problem. For illustrative purposes, we will use data that is linearly separable. The chart below shows the data \(\mathcal{D}\) at time zero, before any implementation of recourse.

max_obs = 1000
catalogue = load_synthetic(max_obs)
counterfactual_data = catalogue[:linearly_separable]
X = counterfactual_data.X
ys = vec(counterfactual_data.y)
plot()
scatter!(counterfactual_data)

Figure 1.1: Linearly separable synthetic data
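
Since `load_synthetic` does not reveal how the data are produced, the following is a minimal sketch of how a comparable dataset could be generated with the standard library alone. The helper name `toy_linearly_separable`, the cluster centres and the spread are assumptions made for illustration; they are not the recipe behind the catalogue used above.

```julia
using Random

# Hypothetical helper (not part of any package): draw `n` points from two
# Gaussian blobs that are far enough apart to be separable by a straight line
# in practice.
function toy_linearly_separable(n::Int; rng=Random.default_rng())
    n0 = n ÷ 2
    n1 = n - n0
    X0 = randn(rng, 2, n0) .* 0.5 .+ [-2.0, -2.0]   # class 0 cluster
    X1 = randn(rng, 2, n1) .* 0.5 .+ [2.0, 2.0]     # class 1 cluster
    X = hcat(X0, X1)                                # features × observations
    y = vcat(zeros(Int, n0), ones(Int, n1))         # 0/1 labels
    return X, y
end

X_toy, y_toy = toy_linearly_separable(1000; rng=MersenneTwister(42))
```

As in the code above, features are stored column-wise, so each column of `X_toy` is one individual.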

1.2 Classifier

To model this data \(\mathcal{D}\) we will use a linear classifier. In particular, as in the paper, we will build a logistic regression model in Flux.jl: a single layer with sigmoid activation.

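n_epochs = 100
# A single dense layer mapping the two features to one logit. No activation is
# set here; we assume the FluxModel wrapper applies the sigmoid when it
# computes probabilities.
model = Chain(Dense(2,1))
mod = FluxModel(model)
Models.train(mod, counterfactual_data; n_epochs=n_epochs)
# Keep an untouched copy of the baseline model for later comparison.
mod_orig = deepcopy(mod)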
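
To make explicit what "a single layer with sigmoid activation" amounts to, here is a minimal sketch of logistic regression fitted by full-batch gradient descent in plain Julia. It does not use Flux.jl; the function name `fit_logistic`, the learning rate and the tiny dataset are assumptions chosen for illustration only.

```julia
sigmoid(z) = 1 / (1 + exp(-z))

# Fit weights `w` and bias `b` by gradient descent on the binary
# cross-entropy loss. `X` is features × observations, `y` holds 0/1 labels.
function fit_logistic(X, y; n_epochs=100, η=0.1)
    d, n = size(X)
    w = zeros(d)
    b = 0.0
    for _ in 1:n_epochs
        p = sigmoid.(X' * w .+ b)     # predicted probabilities, one per column
        g = p .- y                    # gradient of the loss w.r.t. the logits
        w .-= η .* (X * g) ./ n
        b -= η * sum(g) / n
    end
    return w, b
end

# Six hand-picked points: three per class, mirrored around the origin.
X_small = [-2.0 -1.5 -2.5 2.0 1.5 2.5;
           -2.0 -2.5 -1.5 2.0 2.5 1.5]
y_small = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(X_small, y_small)
predicted = sigmoid.(X_small' * w .+ b) .> 0.5
```

The decision boundary of such a model is the line where `w' * x + b == 0`, which is why the contours in the figure below are straight.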

Figure 1.2 below shows the linear separation of the two classes.

plt_original = plot(mod, counterfactual_data; zoom=0, colorbar=false, title="(a)")
display(plt_original)

Figure 1.2: The baseline model: contours indicate the predicted label; dots indicate observed data points.

1.3 Implementation of Recourse

1.3.1 Generate Counterfactual

γ = 0.50
μ = 0.10
Markdown.parse(
    """
    To generate counterfactual explanations we will rely on the most generic approach. As our decision threshold we will use $(γ*100)% here. In other words, a counterfactual is considered valid as soon as the classifier is more convinced that it belongs to the target class (blue) than to the non-target class (orange). In each round we will implement recourse for $(μ * 100)% of the individuals in the non-target class.
    """
)

opt = Flux.Descent(0.01)
gen = GenericGenerator(;decision_threshold=γ, opt=opt)

Figure 1.3 below shows the recourse outcome, which we denote here as \(\mathcal{D}^{\prime}\). The obvious observation at this point is that the resulting counterfactuals, while valid, are distinguishable from the factuals that were always in the target class. This observation is neither new nor entirely surprising. In fact, a lot of recent work in this field has tried to address this issue. In this work, we ask what happens when these dynamics are left to play out further in practice. While the outcome in (b) is not surprising, it may be much harder to observe so clearly in practice, when the data is more complex.

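# Individuals currently in the non-target class are candidates for recourse.
candidates = findall(ys.==0)
# Draw a share μ of them. Note that `rand` samples with replacement, so the
# same individual can be drawn more than once.
chosen_individuals = rand(candidates, Int(round(μ*length(candidates))))
# Work on copies so that the original data stays untouched.
X′ = copy(X)
y′ = copy(ys)
factuals = select_factual(counterfactual_data,chosen_individuals)
outcome = generate_counterfactual(factuals, 1, counterfactual_data, mod, gen; initialization=:identity)
# Overwrite the chosen individuals with their counterfactual features and labels.
X′[:,chosen_individuals] = reduce(hcat, @.(selectdim(counterfactual(outcome), 3, 1)))
y′[chosen_individuals] = reduce(vcat,@.(selectdim(counterfactual_label(outcome),3,1)))
counterfactual_data′ = CounterfactualData(X′,y′')
plt_single = plot(mod,counterfactual_data′;zoom=0,colorbar=false,title="(b)")
display(plt_single)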

Figure 1.3: The recourse outcome after one round.
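
To illustrate the kind of search a generic generator performs, the sketch below runs gradient descent on the input of a fixed linear classifier until the predicted probability of the target class reaches the threshold γ. This is not the implementation used above: the function name `toy_counterfactual`, the squared Euclidean penalty and the value of `λ` are assumptions made for illustration.

```julia
sigmoid(z) = 1 / (1 + exp(-z))

# Hypothetical Wachter-style search for a linear classifier (w, b): minimise
# the cross-entropy towards the target class (label 1) plus λ times the
# squared distance to the factual `x`. The search stops as soon as the
# threshold γ is reached or `max_iter` steps have been taken.
function toy_counterfactual(x, w, b; γ=0.5, λ=0.1, η=0.01, max_iter=10_000)
    x′ = copy(x)                                  # leave the factual untouched
    for _ in 1:max_iter
        p = sigmoid(w' * x′ + b)
        p >= γ && break                           # valid counterfactual found
        grad = (p - 1) .* w .+ 2λ .* (x′ .- x)    # gradient w.r.t. x′
        x′ .-= η .* grad
    end
    return x′
end

x_factual = [-2.0, -2.0]
w_toy, b_toy = [1.0, 1.0], 0.0
x_cf = toy_counterfactual(x_factual, w_toy, b_toy)
```

Because the search stops at the first point that clears the threshold, the counterfactual sits just inside the target side of the boundary, which is consistent with the counterfactuals in panel (b) being distinguishable from the original target-class individuals. With a large penalty `λ` or a high threshold, this toy search may hit `max_iter` without becoming valid.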

1.3.2 Retrain

Suppose the agent in charge of the black-box system has provided recourse to a share of individuals, leading to the outcome in Figure 1.3. In practice, models are regularly retrained, for example to account for concept drift. For our experiments, we assume that the agent accepts \(\mathcal{D}^{\prime}\) as its new ground truth. To isolate the endogenous effects we are interested in from any other effects, we further assume away any exogenous changes to the data that we might expect to occur in practice. Retraining the model on \(\mathcal{D}^{\prime}\) shifts the decision boundary in the direction of the non-target class (Figure 1.4).

mod = Models.train(mod, counterfactual_data′)
plt_single_retrained = plot(mod,counterfactual_data′;zoom=0,colorbar=false,title="(c)")
display(plt_single_retrained)

1.3.3 Repeat

We finally go on to repeat this process of recourse followed by model updates for multiple rounds. Figure 1.5 below presents the different stages of the experiment side-by-side, where panel (d) represents the outcome after ten rounds.

At first glance, it seems that costs to individuals seeking recourse are gradually reduced as the decision boundary moves in the direction of the non-target class: they need to exert less effort to move to valid counterfactual states. The problem with this idea is, of course, that there is no free lunch. This reduction imposes a burden on the agent in charge of the black-box system: the group of individuals now classified as target class looks entirely different from the original group.

Why is this a problem? Let’s assume, for example, that the two synthetic features accurately describe the creditworthiness of individuals seeking loans, where creditworthiness increases in the South-West direction. Non-target class individuals (orange) are denied credit, while target class individuals (blue) receive a loan. Then the population of borrowers in (d) is much riskier than in (a). Any lender (bank) aware of such dynamics would avoid them in practice. They might choose not to offer recourse in the first place, generating a cost to all individuals seeking recourse. Alternatively, they may reward first movers, but stop offering recourse after a few rounds.

This last point makes it clear that the implementation of recourse by one individual may generate external costs for other individuals. This notion motivates the ideas set out in the paper.

# Rounds 2 to 10: retrain on the latest data, then implement recourse again.
for _ in 2:10
    counterfactual_data′ = CounterfactualData(X′,y′')
    candidates = findall(y′.==0)
    chosen_individuals = rand(candidates, Int(round(μ*length(candidates))))
    # Retrain the model on the data from the previous round.
    Models.train(mod, counterfactual_data′)
    factuals = select_factual(counterfactual_data′,chosen_individuals)
    outcome = generate_counterfactual(factuals, 1, counterfactual_data′, mod, gen; initialization=:identity)
    # Overwrite the chosen individuals with their counterfactual features and labels.
    X′[:,chosen_individuals] = reduce(hcat, @.(selectdim(counterfactual(outcome), 3, 1)))
    y′[chosen_individuals] = reduce(vcat,@.(selectdim(counterfactual_label(outcome),3,1)))
end
plt_single_repeat = plot(mod,counterfactual_data′;zoom=0,colorbar=false,title="(d)")

plt = plot(plt_original, plt_single, plt_single_retrained, plt_single_repeat, layout=(1,4), legend=false, axis=nothing, size=(600,165))
savefig(plt, joinpath(www_path, "poc.png"))
savefig(plt, "paper/www/poc.png")
display(plt)

Figure 1.5: The different stages of the experiment.
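
The claim that recourse becomes cheaper as the boundary moves towards the non-target class can be checked with a small calculation. The sketch below uses the distance from a point to a linear decision boundary as a stand-in for the cost of recourse; the weights, biases and the point itself are made-up values, not estimates from the experiment.

```julia
using LinearAlgebra

# Shortest Euclidean distance from `x` to the boundary w' * x + b = 0.
distance_to_boundary(x, w, b) = abs(w' * x + b) / norm(w)

x_nontarget = [-2.0, -2.0]     # an individual in the non-target class
w_lin = [1.0, 1.0]

# Boundary through the origin versus a boundary shifted towards the individual.
cost_before = distance_to_boundary(x_nontarget, w_lin, 0.0)
cost_after  = distance_to_boundary(x_nontarget, w_lin, 1.0)
```

Here `cost_after` is smaller than `cost_before`: the same individual needs a shorter move once the boundary has shifted, which is the cost reduction described above.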

1.4 Mitigation Strategies

In the paper, we propose three simple mitigation strategies:

1. More Conservative Decision Thresholds
2. Classifier Preserving ROAR
3. Gravitational Counterfactual Explanations

Figure 1.6 shows an illustrative example of how counterfactual outcomes differ across the mitigation strategies relative to the baseline approach, that is, the generic (Wachter) generator with \(\gamma=0.5\). Choosing a higher decision threshold pushes the counterfactual a little further into the target domain. This effect is even stronger for ClaPROAR. Finally, with the Gravitational generator the counterfactual ends up all the way inside the target domain.

# Generators:
generators = Dict(
    "Generic (γ=0.5)" => GenericGenerator(opt = opt, decision_threshold=0.5),
    "Generic (γ=0.9)" => GenericGenerator(opt = opt, decision_threshold=0.9),
    "Gravitational" => GravitationalGenerator(opt = opt),
    "ClaPROAR" => ClapROARGenerator(opt = opt)
)

# Counterfactuals
x = select_factual(counterfactual_data, rand(candidates))
counterfactuals = Dict(name => generate_counterfactual(x, 1, counterfactual_data, mod_orig, gen) for (name, gen) in generators)

# Plots:
# A Dict has no guaranteed iteration order, so the panel order may vary.
plts = []
for (name,ce) ∈ counterfactuals
    plt = plot(ce; title=name, colorbar=false, ticks = false, legend=false, zoom=0)
    push!(plts, plt)
end
plt = plot(plts..., size=(750,200), layout=(1,4))
savefig(plt, joinpath(www_path, "mitigation.png"))
savefig(plt, "paper/www/mitigation.png")
display(plt)
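
The effect of a more conservative decision threshold can also be worked out by hand for a linear classifier. The following sketch computes how far a point has to move for the predicted probability to reach γ; the helper `distance_to_threshold` and the numbers are assumptions for illustration and are unrelated to the generators used above.

```julia
using LinearAlgebra

logit(p) = log(p / (1 - p))

# Minimal Euclidean distance `x` must travel for a linear classifier (w, b)
# to assign at least probability γ to the target class.
function distance_to_threshold(x, w, b; γ=0.5)
    gap = logit(γ) - (w' * x + b)     # shortfall in logit space
    return max(gap, 0.0) / norm(w)
end

x_point = [-2.0, -2.0]
w_clf, b_clf = [1.0, 1.0], 0.0
d_50 = distance_to_threshold(x_point, w_clf, b_clf; γ=0.5)
d_90 = distance_to_threshold(x_point, w_clf, b_clf; γ=0.9)
```

Since `logit(0.9)` is positive while `logit(0.5)` is zero, `d_90` exceeds `d_50`: raising the threshold places the counterfactual further inside the target domain, at a higher cost to the individual.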