# **A Journey through model debiasing: from methods to applications**
### *Tutorial @ICIAP2025 - Rome*

#### **Hands on - Model Debiasing**

##### **First Part: synthetic benchmarks**

In this guided tutorial, we will explore basic implementations of debiasing techniques that can be applied to Computer Vision and Deep Learning tasks in general.  
The tutorial is based on a work-in-progress repository that we are customizing, starting from the work of [1]. The dataset logic and the training procedure of each method are provided, and we will ask you to fill in some code sections so that you get proper hands-on experience.

First, we will use one of the most widely used synthetic benchmarks for model debiasing in image classification, Colored MNIST [2]. In the first task, we assume that bias annotations are available, and we compare the behavior of a vanilla ERM model with a basic debiasing routine that simply upweights and upsamples bias-conflicting training samples.
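To make the notions of bias-aligned and bias-conflicting samples concrete, here is a minimal sketch of how a color bias can be injected into grayscale digits. It is not the repository's `CMNIST` implementation: the `colorize` helper, the random palette, and the random stand-in images are assumptions made only for illustration.

```python
import torch


def colorize(images, labels, palette, bias_ratio, generator=None):
    """Inject a color bias into grayscale images (hypothetical helper).

    images: (N, 1, H, W) floats in [0, 1]; labels: (N,) ints in [0, C).
    Returns the colored images (N, 3, H, W) and the color index of each sample.
    """
    num_classes = palette.shape[0]
    n = labels.shape[0]
    # With probability `bias_ratio`, a sample takes the color tied to its class.
    aligned = torch.rand(n, generator=generator) < bias_ratio
    # Otherwise, shift the class index by 1..C-1 so the color always differs.
    shift = torch.randint(1, num_classes, (n,), generator=generator)
    color_idx = torch.where(aligned, labels, (labels + shift) % num_classes)
    colored = images * palette[color_idx].view(n, 3, 1, 1)
    return colored, color_idx


g = torch.Generator().manual_seed(0)
palette = torch.rand(10, 3, generator=g)            # one RGB color per class (assumed)
images = torch.rand(1000, 1, 28, 28, generator=g)   # stand-in for MNIST digits
labels = torch.randint(0, 10, (1000,), generator=g)

colored, bias = colorize(images, labels, palette, bias_ratio=0.95, generator=g)
aligned_fraction = (bias == labels).float().mean().item()
print(colored.shape, f"{aligned_fraction:.2%} bias-aligned")
```

With `bias_ratio=0.95`, roughly 95% of the samples share the color of their class, so a model can reach high training accuracy by looking at color alone.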

Next, we will move to the *unsupervised debiasing* setting, where bias annotations are not available. Here, we will ask you to complete the implementation of "Learning from Failure" (LfF) [2], a well-established end-to-end debiasing method.

#### Hands-On

First, we will clone the repository containing all the code needed to complete this tutorial. Just run the following cell, which contains the required commands.

In [None]:
!rm -rf JTMD
!git clone https://github.com/ResonantFilter/JTMD.git

A few basic imports, alongside a utility class that will come in handy later. (Basically, it mimics the use of command-line arguments parsed with ArgumentParser.)

In [None]:
import torch
from torch.utils import data
from torchvision import transforms
import os

class DotDict(dict):
    """
    A dictionary that allows accessing entries with dot notation.
    """
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(f"'{self.__class__.__name__}' object has no attribute '{name}'")

    def __setattr__(self, name, value):
        self[name] = value

    def __delattr__(self, name):
        try:
            del self[name]
        except KeyError:
            raise AttributeError(f"'{self.__class__.__name__}' object has no attribute '{name}'")

# Example usage:
# config_dict = {'learning_rate': 0.001, 'batch_size': 32}
# config = DotDict(config_dict)
# print(config.learning_rate)
# config.epochs = 10
# print(config['epochs'])

Now we can start using components from the repository. We first define a utility function that shows a few dataset samples; then we import the Colored MNIST dataset implementation and a basic trainer for standard ERM training.

In [None]:
import torch
import torchvision
from torch.utils.data import Dataset
import matplotlib.pyplot as plt
import numpy as np
from typing import Literal



def show_bias_image_grid(
    dataset: Dataset,
    num_classes_to_show: int,
    bias_logic: Literal['class_equals_bias', 'bias_is_zero'] = 'class_equals_bias'
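):
    """Plot one bias-aligned and one bias-conflicting sample per class.

    With 'class_equals_bias', a sample is aligned when its class label equals
    its bias label; with 'bias_is_zero', it is aligned when its bias label is 0.
    The dataset is expected to yield (image, (class_label, bias_label), index).
    """
    print(f"Searching for samples with bias logic: '{bias_logic}'...")
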
    found_samples = {i: {"aligned": None, "conflicting": None} for i in range(num_classes_to_show)}

    num_slots_to_fill = num_classes_to_show * 2
    filled_slots = 0


    for image, (class_label, bias_label), _ in dataset:
        class_label = class_label.item()
        bias_label = bias_label.item()

        if filled_slots >= num_slots_to_fill:
            break

        if class_label < num_classes_to_show:

            is_aligned = False
            if bias_logic == 'class_equals_bias':
                is_aligned = (class_label == bias_label)
            elif bias_logic == 'bias_is_zero':
                is_aligned = (bias_label == 0)

            sample_type = "aligned" if is_aligned else "conflicting"


            if found_samples[class_label][sample_type] is None:
                found_samples[class_label][sample_type] = image
                filled_slots += 1

    image_grid_list = []
    placeholder_image = torch.zeros_like(dataset[0][0]) # A black image as placeholder

    for i in range(num_classes_to_show):
        aligned_img = found_samples[i]['aligned']
        conflicting_img = found_samples[i]['conflicting']

        image_grid_list.append(aligned_img if aligned_img is not None else placeholder_image)
        image_grid_list.append(conflicting_img if conflicting_img is not None else placeholder_image)

    grid = torchvision.utils.make_grid(image_grid_list, nrow=2, padding=4)

    plt.figure(figsize=(8, num_classes_to_show * 2))
    np_grid = grid.permute(1, 2, 0).numpy()

    plt.imshow(np_grid)
    plt.title(f"Image Grid (Bias Logic: {bias_logic})", fontsize=16)
    plt.axis('off')

    ax = plt.gca()

    img_size = placeholder_image.shape[2]
    padding = 4

    ax.text((img_size + padding)/2, -10, 'Bias-Aligned', ha='center', va='bottom', fontsize=12)
    ax.text(img_size + padding + (img_size + padding)/2, -10, 'Bias-Conflicting', ha='center', va='bottom', fontsize=12)

    for i in range(num_classes_to_show):
        y_pos = i * (img_size + padding) + (img_size / 2)
        ax.text(-20, y_pos, f"Class {i}", ha='right', va='center', fontsize=12, rotation=0)

    plt.show()

In [None]:
from JTMD.datasets.CMNIST import CMNIST
train_set = CMNIST(root="data/cmnist", env="train", bias_amount=95, transform=transforms.ToTensor())

show_bias_image_grid(train_set, 10, bias_logic="class_equals_bias")

Now, we can import the baseline ERM trainer, and set up the required parameter configuration for running the experiment.

In [None]:
from JTMD.methods.erm import ERMTrainer

config = DotDict()
config["dataset"] = "cmnist"
config["arch"] = "mlp"
config["pretrain"] = "none"
config["optimizer"] = "adam"
config["lr"] = 1e-04
config["batch_size"] = 256
config["epoch"] = 20
config["weight_decay"] = 0.0
config["amp"] = False
config["start_seed"] = 0
config["rho"] = 95
config["run_name"] = "cmnist_erm"
config["exp_root"] = config["run_name"]
config["reweight_groups"] = False
config["reweight_classes"] = False
config["seed"] = 0
config["num_workers"] = 2
config["pin_memory"] = False
config["wandb"] = False

os.makedirs(f"logs/{config['exp_root']}", exist_ok = True)

The snippet below shows the core training logic of the ERM trainer.


```python
class ERMTrainer(BaseTrainer):
    def _setup_method_name_and_default_name(self):
        args = self.args
        args.method = "erm"
        default_name = f"{args.method}_{args.dataset}"
        self.default_name = default_name

    def train(self):
        args = self.args
        self._set_train()
        samples_loss_fn = torch.nn.CrossEntropyLoss(reduction="none")
        losses = AverageMeter()
        meter = SubgroupMetricsTracker(
            num_classes=self.num_classes,
            num_groups=self.num_groups,
            device=self.device,
            log_history=True,
            wandb_logger = wandb if self.args.wandb else None,
            prefix="tr"
        )

        pbar = tqdm(self.train_loader, dynamic_ncols=True)
        for batch, (dat, labels, _) in enumerate(pbar):
            image, target = dat, labels
            obj_gt = target[0]  
            group_gt = labels[1]
            image = image.to(self.device)
            obj_gt = obj_gt.to(self.device)
            group_gt = group_gt.to(self.device)

            with torch.amp.autocast("cuda", enabled=args.amp):
                output = self.classifier(image)
                loss = self.criterion(output, obj_gt)

            self._loss_backward(loss)
            self._optimizer_step(self.optimizer)
            self._scaler_update()
            self.optimizer.zero_grad(set_to_none=True)

            accs = meter.compute_accuracy(output, obj_gt)
            losses = meter.compute_loss(output, obj_gt, samples_loss_fn)
            meter.update(losses, accs, obj_gt, group_gt)

            pbar.set_description(
                f"[{self.cur_epoch}/{args.epoch}] loss: {meter.loss_avg.avg():.4f}"
            )
            meter.log_epoch(self.cur_epoch, aligned_topology="diagonal")
```

This is a classic ERM training loop in PyTorch, plus some code for handling datasets with bias annotations. It is already implemented for you, so you can launch training by running the following cell.

In [None]:
trainer = ERMTrainer(config)
trainer()

With the following cell, you can evaluate your model and get a visual evaluation of the model performance across each target class and subgroup.
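As a rough illustration of what a per-subgroup evaluation measures, the sketch below computes accuracy for every (class, bias) pair with plain PyTorch. It is not the repository's `SubgroupMetricsTracker`; the `subgroup_accuracy` helper and the toy predictions are assumptions, and the diagonal of the resulting matrix is treated as the bias-aligned subgroups, as in Colored MNIST.

```python
import torch


def subgroup_accuracy(preds, targets, biases, num_classes, num_biases):
    """Accuracy for every (class, bias) pair; NaN marks empty subgroups."""
    group = targets * num_biases + biases          # flat subgroup index
    num_groups = num_classes * num_biases
    correct = (preds == targets).float()
    hits = torch.zeros(num_groups).scatter_add_(0, group, correct)
    counts = torch.bincount(group, minlength=num_groups).float()
    return (hits / counts).view(num_classes, num_biases)  # 0/0 gives NaN


# Toy example: a model that always predicts the bias attribute.
targets = torch.tensor([0, 0, 0, 1, 1, 1])
biases = torch.tensor([0, 0, 1, 1, 1, 0])
preds = biases.clone()

acc = subgroup_accuracy(preds, targets, biases, num_classes=2, num_biases=2)
aligned_mask = torch.eye(2, dtype=torch.bool)      # diagonal = bias-aligned
aligned_acc = torch.nanmean(acc[aligned_mask]).item()
conflicting_acc = torch.nanmean(acc[~aligned_mask]).item()
print(acc)
print(f"aligned: {aligned_acc:.2f}, conflicting: {conflicting_acc:.2f}")
```

A model that relies on the bias scores perfectly on the diagonal and fails off the diagonal, which is the pattern to look for in the plot produced by the next cell.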

In [None]:
from JTMD.erm_training import evaluate_model

_, performance_meter = evaluate_model(
    trainer.classifier,
    trainer.test_loader,
    trainer.num_classes,
    trainer.num_groups,
    trainer.criterion,
    config["epoch"],
    device=trainer.device,
    wb=None,
    prefix="test",
    config=config
)

performance_meter.plot_subgroup_metrics(show=True, fontsize=7, figsize=(12, 7))

Now we will look at a simple supervised approach, where we upsample bias-conflicting samples by using a `WeightedRandomSampler` to draw mini-batches. In addition, we upweight the loss of bias-conflicting samples and downweight the loss of bias-aligned samples. This is possible only when bias annotations are fully available.  
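One possible way to up/downweight per-sample losses is sketched below on toy tensors. It is not the reference solution of the exercise: the `reweighted_loss` helper is hypothetical, the default factors mirror the `uw_factor` and `dw_factor` values used in the configuration later on, and a sample is assumed to be bias-conflicting when its class label differs from its bias label (the Colored MNIST convention).

```python
import torch
import torch.nn.functional as F


def reweighted_loss(logits, targets, biases, uw_factor=10.0, dw_factor=0.1):
    """Cross-entropy with larger weights on bias-conflicting samples."""
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    conflicting = targets != biases   # assumes 'class equals bias' means aligned
    weights = torch.where(
        conflicting,
        torch.full_like(per_sample, uw_factor),
        torch.full_like(per_sample, dw_factor),
    )
    return (weights * per_sample).mean(), weights


logits = torch.zeros(4, 3)                 # uniform predictions over 3 classes
targets = torch.tensor([0, 1, 2, 0])
biases = torch.tensor([0, 1, 0, 0])        # only the third sample is conflicting

loss, weights = reweighted_loss(logits, targets, biases)
print(weights, loss.item())
```

Because the per-sample losses are kept unreduced until the weights are applied, the criterion must be created with `reduction="none"`, as the trainer below does.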

This time, we will ask you to fill in the implementation of a train_loader where mini-batches are sampled so that each subgroup is equally represented in each batch.
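The idea of group-balanced sampling can be sketched on a toy dataset as follows. This is only an illustration under stated assumptions, not the solution expected in the trainer: the dataset, the `group_array` tensor, and the batch size are made up, and inverse group frequency is just one common choice of sampling weight.

```python
import torch
from torch.utils import data

# Toy dataset: 95 samples in group 0 (aligned), 5 in group 1 (conflicting).
group_array = torch.tensor([0] * 95 + [1] * 5)
dataset = data.TensorDataset(torch.randn(100, 4), group_array)

group_counts = torch.bincount(group_array).float()
group_weights = 1.0 / group_counts              # rare groups get larger weights
sample_weights = group_weights[group_array]     # one weight per sample

sampler = data.WeightedRandomSampler(
    weights=sample_weights,
    num_samples=len(dataset),
    replacement=True,                           # needed to repeat rare samples
    generator=torch.Generator().manual_seed(0),
)
# A sampler already defines the sample order, so `shuffle` is left at False.
loader = data.DataLoader(dataset, batch_size=20, sampler=sampler)

drawn = torch.cat([groups for _, groups in loader])
counts = torch.bincount(drawn, minlength=2)
print(counts)  # the two groups are drawn about equally often
```

Balancing holds in expectation rather than exactly in every batch, and the few bias-conflicting samples are repeated many times within an epoch.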

In [None]:
import torch

import wandb
from tqdm import tqdm
from torch.utils import data
from JTMD.methods.base_trainer import BaseTrainer
from JTMD.utils.advanced_metrics import AverageMeter, SubgroupMetricsTracker


class NaiveSupervisedTrainer(BaseTrainer):
    def __init__(self, args, **kwargs):
        super().__init__(args, **kwargs)
        self._setup_all()
        self.set_train_loader(self.args)

    def set_train_loader(self, config):
        group_counts: torch.Tensor = (
            (torch.arange(self.num_groups * self.num_classes).unsqueeze(1) == self.train_set.group_array)
            .sum(1)
            .float()
        ) # Fast way of counting members of each subgroup
        # Note that we used self.train_set.group_array, a tensor containing the group identifier
        # of each sample

        group_weights = # Fill here
        weights = # Fill here
        sampler = data.WeightedRandomSampler(
            # Fill here
            num_samples=len(self.train_set),
            replacement=True
        )

        train_loader = data.DataLoader(
            self.train_set,
            batch_size=config.batch_size,
            sampler= # Fill here
            shuffle= # Fill here
            num_workers=config.num_workers
        )

        self.train_loader = train_loader

    def _setup_method_name_and_default_name(self):
        args = self.args
        args.method = "naivesupervised"
        default_name = f"{args.method}_{args.dataset}"
        self.default_name = default_name
        args.reweight_groups = True
        self.uw_factor = args.uw_factor
        self.dw_factor = args.dw_factor

    def _setup_criterion(self):
        self.criterion = torch.nn.CrossEntropyLoss(reduction="none")

    def train(self):
        args = self.args
        self._set_train()
        samples_loss_fn = torch.nn.CrossEntropyLoss(reduction="none")
        losses = AverageMeter()
        meter = SubgroupMetricsTracker(
            num_classes=self.num_classes,
            num_groups=self.num_groups,
            device=self.device,
            log_history=True,
            wandb_logger = wandb if self.args.wandb else None,
            prefix="tr"
        )

        pbar = tqdm(self.train_loader, dynamic_ncols=True)
        for batch, (dat, labels, _) in enumerate(pbar):
            image, target = dat, labels
            obj_gt = target[0]
            group_gt = labels[1]
            image = image.to(self.device)
            obj_gt = obj_gt.to(self.device)
            group_gt = group_gt.to(self.device)

            with torch.amp.autocast("cuda", enabled=args.amp):
                output = self.classifier(image)
                loss = self.criterion(output, obj_gt)

                # Put here the code to up/downweight samples
                # You can use both gt labels to do so.


                # Fill here


                loss = loss.mean()

            self._loss_backward(loss)
            self._optimizer_step(self.optimizer)
            self._scaler_update()
            self.optimizer.zero_grad(set_to_none=True)

            accs = meter.compute_accuracy(output, obj_gt)
            losses = meter.compute_loss(output, obj_gt, samples_loss_fn)
            meter.update(losses, accs, obj_gt, group_gt)

            pbar.set_description(
                f"[{self.cur_epoch}/{args.epoch}] loss: {meter.loss_avg.avg():.4f}"
            )
            meter.log_epoch(self.cur_epoch, aligned_topology="diagonal")

In [None]:
config = DotDict()
config["dataset"] = "cmnist"
config["arch"] = "mlp"
config["pretrain"] = "none"
config["optimizer"] = "adam"
config["lr"] = 1e-04
config["batch_size"] = 256
config["epoch"] = 20
config["weight_decay"] = 0.0
config["amp"] = False
config["start_seed"] = 0
config["rho"] = 95
config["run_name"] = "cmnist_sup"
config["exp_root"] = config["run_name"]
config["reweight_groups"] = False
config["reweight_classes"] = False
config["seed"] = 0
config["num_workers"] = 2
config["pin_memory"] = False
config["wandb"] = False
config["uw_factor"] = 10
config["dw_factor"] = 0.1

os.makedirs(f"logs/{config['exp_root']}", exist_ok = True)

Again, you can launch the training by running the following cell.

In [None]:
suptrainer = NaiveSupervisedTrainer(config)
suptrainer()

The same goes for model evaluation.

In [None]:
from JTMD.erm_training import evaluate_model

_, performance_meter = evaluate_model(
    suptrainer.classifier,
    suptrainer.test_loader,
    suptrainer.num_classes,
    suptrainer.num_groups,
    suptrainer.criterion,
    config["epoch"],
    device=suptrainer.device,
    wb=None,
    prefix="test",
    config=config
)

performance_meter.plot_subgroup_metrics(show=True, fontsize=7, figsize=(8, 5))

Now we turn to the unsupervised debiasing case, where we have no access to bias annotations or to any prior information about the bias.

Again, we will ask you to fill in some of the code required to run Learning from Failure (LfF) [2]. The configuration parameters are already defined for you in the cell below, and the partial implementation follows in the next one.
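As a reminder of the two ingredients of Learning from Failure [2], the sketch below writes them out on toy tensors: the generalized cross-entropy (GCE) used to train the biased network, and the relative difficulty score used to weight the loss of the debiased network. The function names are hypothetical, and the repository's `GeneralizedCECriterion` may be implemented differently, so treat this as an illustration rather than the expected solution.

```python
import torch
import torch.nn.functional as F


def generalized_ce(logits, targets, q=0.7):
    """GCE loss (1 - p_y^q) / q, averaged over the batch."""
    p_y = F.softmax(logits, dim=1).gather(1, targets.unsqueeze(1)).squeeze(1)
    return ((1.0 - p_y.pow(q)) / q).mean()


def relative_difficulty(loss_b, loss_d, eps=1e-8):
    """W = CE_b / (CE_b + CE_d): large when the biased net fails on a sample."""
    return loss_b / (loss_b + loss_d + eps)


# Per-sample CE losses of the biased (b) and debiased (d) networks.
# First sample: easy for the biased net (likely bias-aligned).
# Second sample: hard for the biased net (likely bias-conflicting).
loss_b = torch.tensor([0.05, 2.0])
loss_d = torch.tensor([0.5, 0.5])

weights = relative_difficulty(loss_b, loss_d)
weighted_ce = (weights * loss_d).mean()     # loss for the debiased network
gce = generalized_ce(torch.zeros(2, 3), torch.tensor([0, 1]))
print(weights, weighted_ce.item(), gce.item())
```

GCE amplifies the gradient of samples the biased network already predicts confidently, so that network latches onto the easy, bias-aligned pattern, while the weights steer the debiased network toward the samples on which the biased one fails.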

In [None]:
config = DotDict()
config["dataset"] = "cmnist"
config["arch"] = "mlp"
config["pretrain"] = "none"
config["optimizer"] = "adam"
config["lr"] = 1e-04
config["batch_size"] = 256
config["epoch"] = 20
config["weight_decay"] = 0.0
config["amp"] = False
config["start_seed"] = 0
config["rho"] = 95
config["run_name"] = "cmnist_lff"
config["exp_root"] = config["run_name"]
config["reweight_groups"] = False
config["reweight_classes"] = False
config["seed"] = 0
config["num_workers"] = 2
config["pin_memory"] = False
config["wandb"] = False

os.makedirs(f"logs/{config['exp_root']}", exist_ok = True)

In [None]:
import torch
import torch.nn as nn

from JTMD.utils.idx_dataset import IdxDataset
from JTMD.utils.EMA_torch_gpu import EMAGPU as EMA
from tqdm import tqdm
from JTMD.models.criterion import GeneralizedCECriterion
from JTMD.models.classifiers import get_classifier
from JTMD.methods.base_trainer import BaseTrainer


class LfFTrainer(BaseTrainer):
    def _method_specific_setups(self):
        train_target_attr = self.train_set.get_labels()
        self.sample_loss_ema_b = EMA(
            torch.LongTensor(train_target_attr), device=self.device, alpha=0.7
        )
        self.sample_loss_ema_d = EMA(
            torch.LongTensor(train_target_attr), device=self.device, alpha=0.7
        )

    def _modify_train_set(self, train_dataset):
        return train_dataset

    def _setup_models(self):
        super(LfFTrainer, self)._setup_models()
        self.bias_discover_net = get_classifier(
            arch=self.args.arch,
            num_classes=self.num_classes,
        ).to(self.device)

    def _setup_criterion(self):
        self.criterion = nn.CrossEntropyLoss(reduction="none")
        self.gce_criterion = GeneralizedCECriterion()

    def _setup_optimizers(self):
        super(LfFTrainer, self)._setup_optimizers()
        args = self.args
        match args.optimizer:
            case "sgd":
                self.optimizer_bias_discover_net = torch.optim.SGD(
                    self.bias_discover_net.parameters(),
                    args.lr,
                    momentum=args.momentum,
                    weight_decay=args.weight_decay,
                )
            case "adamw":
                self.optimizer_bias_discover_net = torch.optim.AdamW(
                    self.bias_discover_net.parameters(),
                    args.lr,
                    weight_decay=args.weight_decay
                )
            case "adam":
                self.optimizer_bias_discover_net = torch.optim.Adam(
                    self.bias_discover_net.parameters(),
                    args.lr,
                    weight_decay=args.weight_decay
                )
            case _:
                raise NotImplementedError

    def _setup_method_name_and_default_name(self):
        args = self.args
        args.method = "lff"
        default_name = f"{args.method}_{args.dataset}"
        self.default_name = default_name

    def train(self):
        args = self.args
        self.bias_discover_net.train()
        self.classifier.train()

        total_cls_loss = 0
        total_ce_loss = 0
        total_gce_loss = 0

        pbar = tqdm(self.train_loader, dynamic_ncols=True)
        for batch, (dat, labels, idx_data) in enumerate(pbar):
            img, target = dat, labels
            label = target[0]
            group_gt = labels[1]
            img = img.to(self.device, non_blocking=True)
            label = label.to(self.device, non_blocking=True)
            group_gt = group_gt.to(self.device, non_blocking=True)

            with torch.amp.autocast("cuda", enabled=args.amp):
                spurious_logits = self.bias_discover_net(img)
                target_logits = self.classifier(img)
                ce_loss = self.criterion(target_logits, label)
                gce_loss = self.gce_criterion(spurious_logits, label).mean()

            loss_b = self.criterion(spurious_logits, label).detach()
            loss_d = ce_loss.detach()

            # EMA sample loss
            self.sample_loss_ema_b.update(loss_b, idx_data)
            self.sample_loss_ema_d.update(loss_d, idx_data)

            # class-wise normalize
            loss_b = self.sample_loss_ema_b.parameter[idx_data].clone().detach()
            loss_d = self.sample_loss_ema_d.parameter[idx_data].clone().detach()

            max_loss_b = self.sample_loss_ema_b.max_loss(label)
            max_loss_d = self.sample_loss_ema_d.max_loss(label)
            loss_b /= max_loss_b
            loss_d /= max_loss_d

            loss_weight = # Fill here
            ce_loss = # Fill here

            loss = ce_loss + gce_loss

            self.optimizer.zero_grad(set_to_none=True)
            self.optimizer_bias_discover_net.zero_grad(set_to_none=True)
            self._loss_backward(loss)
            self._optimizer_step(self.optimizer)
            self._optimizer_step(self.optimizer_bias_discover_net)

            self._scaler_update()

            total_cls_loss += loss.item()
            total_ce_loss += ce_loss.item()
            total_gce_loss += gce_loss.item()
            avg_cls_loss = total_cls_loss / (batch + 1)
            avg_ce_loss = total_ce_loss / (batch + 1)
            avg_gce_loss = total_gce_loss / (batch + 1)

            pbar.set_description(
                "[{}/{}] cls_loss: {:.3f}, ce: {:.3f}, gce: {:.3f}".format(
                    self.cur_epoch,
                    args.epoch,
                    avg_cls_loss,
                    avg_ce_loss,
                    avg_gce_loss,
                )
            )

        log_dict = {
            "loss": total_cls_loss / len(self.train_loader),
            "ce_loss": total_ce_loss / len(self.train_loader),
            "gce_loss": total_gce_loss / len(self.train_loader),
        }
        self.log_to_wandb(log_dict)

    def _state_dict_for_save(self):
        state_dict = super(LfFTrainer, self)._state_dict_for_save()
        state_dict.update(
            {
                "bias_discover_net": self.bias_discover_net.state_dict(),
                "optimizer_bias_discover_net": self.optimizer_bias_discover_net.state_dict(),
            }
        )
        return state_dict

    def _load_state_dict(self, state_dict):
        super(LfFTrainer, self)._load_state_dict(state_dict)
        self.bias_discover_net.load_state_dict(state_dict["bias_discover_net"])
        self.optimizer_bias_discover_net.load_state_dict(
            state_dict["optimizer_bias_discover_net"]
        )


In [None]:
lff_trainer = LfFTrainer(config)
lff_trainer()

In [None]:
from JTMD.erm_training import evaluate_model

_, performance_meter = evaluate_model(
    lff_trainer.classifier,
    lff_trainer.test_loader,
    lff_trainer.num_classes,
    lff_trainer.num_groups,
    lff_trainer.criterion,
    config["epoch"],
    device=lff_trainer.device,
    wb=None,
    prefix="test",
    config=config
)

performance_meter.export_csv()
performance_meter.plot_subgroup_metrics(show=True, fontsize=7, figsize=(8, 5))

##### **Second Part: a realistic case with BAR (Biased Action Recognition)**

In the second part of this hands-on, we focus on a more realistic scenario. We use a popular benchmark in model debiasing for image classification, BAR (Biased Action Recognition). Originally introduced in [2], it is used here in the renewed version from [3], which makes bias annotations available and provides two bias severity settings.

As we did for CMNIST, we will compare the performance of ERM, a naive supervised debiasing scheme, and an unsupervised method.

In [None]:
from JTMD.datasets.bar import BAR
train_set = BAR("data", bias_amount=95, transform=transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()]))

show_bias_image_grid(train_set, 6, bias_logic="bias_is_zero")

In [None]:
config = DotDict()
config["dataset"] = "bar"
config["arch"] = "resnet18"
config["pretrain"] = "default"
config["optimizer"] = "adam"
config["nesterov"] = True
config["momentum"] = 0.9
config["lr"] = 1e-04
config["batch_size"] = 64
config["epoch"] = 30
config["weight_decay"] = 1e-04
config["amp"] = True
config["start_seed"] = 0
config["rho"] = 99
config["run_name"] = "bar_erm"
config["exp_root"] = config["run_name"]
config["reweight_groups"] = False
config["reweight_classes"] = True
config["seed"] = 0
config["num_workers"] = 2
config["pin_memory"] = False
config["wandb"] = False

os.makedirs(f"logs/{config['exp_root']}", exist_ok = True)

In [None]:
trainer = ERMTrainer(config)
trainer()

In [None]:
from JTMD.erm_training import evaluate_model

_, performance_meter = evaluate_model(
    trainer.classifier,
    trainer.test_loader,
    trainer.num_classes,
    trainer.num_groups,
    trainer.criterion,
    config["epoch"],
    device=trainer.device,
    wb=None,
    prefix="test",
    config=config
)

performance_meter.plot_subgroup_metrics(show=True, fontsize=12, figsize=(8, 5))

In [None]:
config = DotDict()
config["dataset"] = "bar"
config["arch"] = "resnet18"
config["pretrain"] = "default"
config["optimizer"] = "adam"
config["lr"] = 1e-04
config["epoch"] = 30
config["batch_size"] = 64
config["weight_decay"] = 1e-04
config["amp"] = True
config["start_seed"] = 0
config["rho"] = 99
config["run_name"] = "bar_lff"
config["exp_root"] = config["run_name"]
config["reweight_groups"] = False
config["reweight_classes"] = True
config["seed"] = 0
config["num_workers"] = 2
config["pin_memory"] = False
config["wandb"] = False

os.makedirs(f"logs/{config['exp_root']}", exist_ok = True)

In [None]:
lff_trainer = LfFTrainer(config)
lff_trainer()

In [None]:
from JTMD.erm_training import evaluate_model

_, performance_meter = evaluate_model(
    lff_trainer.classifier,
    lff_trainer.test_loader,
    lff_trainer.num_classes,
    lff_trainer.num_groups,
    lff_trainer.criterion,
    config["epoch"],
    device=lff_trainer.device,
    wb=None,
    prefix="test",
    config=config
)

performance_meter.plot_subgroup_metrics(show=True, fontsize=8, figsize=(8, 5))

In [None]:
config = DotDict()
config["dataset"] = "bar"
config["arch"] = "resnet18"
config["pretrain"] = "default"
config["optimizer"] = "adam"
config["lr"] = 1e-04
config["batch_size"] = 64
config["epoch"] = 20
config["weight_decay"] = 0.0
config["amp"] = False
config["start_seed"] = 0
config["rho"] = 95
config["run_name"] = "bar_sup"
config["exp_root"] = config["run_name"]
config["reweight_groups"] = False
config["reweight_classes"] = False
config["seed"] = 0
config["num_workers"] = 2
config["pin_memory"] = False
config["wandb"] = False
config["uw_factor"] = 10
config["dw_factor"] = 0.1

os.makedirs(f"logs/{config['exp_root']}", exist_ok = True)

In [None]:
suptrainer = NaiveSupervisedTrainer(config)
suptrainer()

In [None]:
from JTMD.erm_training import evaluate_model

_, performance_meter = evaluate_model(
    suptrainer.classifier,
    suptrainer.test_loader,
    suptrainer.num_classes,
    suptrainer.num_groups,
    suptrainer.criterion,
    config["epoch"],
    device=suptrainer.device,
    wb=None,
    prefix="test",
    config=config
)

performance_meter.export_csv()
performance_meter.plot_subgroup_metrics(show=True, fontsize=7, figsize=(8, 5))

### **References**

[1] Li, Zhiheng, et al. "A whac-a-mole dilemma: Shortcuts come in multiples where mitigating one amplifies others." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023.  
[2] Nam, Junhyun, et al. "Learning from failure: De-biasing classifier from biased classifier." Advances in Neural Information Processing Systems 33 (2020): 20673-20684.