Train with LibAUC Trainer

Authors: Siqi Guo, Gang Li, Tianbao Yang

Introduction

The LibAUC Trainer is a high-level training interface that turns a YAML config file into a complete training run — data loading, model construction, loss/optimizer wiring, evaluation, and checkpointing are all handled automatically.

This tutorial walks through a two-stage workflow for AUROC maximization:

Stage	Goal	Config
1	Warm-up with cross-entropy pretraining	`ce_config.yaml` (`BCELoss` + `Adam`)
2	Fine-tune for AUROC with AUC-aware loss	`aucmloss_config.yaml` (`AUCMLoss` + `PESG`)

Note

Pretraining first and then switching to an AUC loss typically yields higher AUROC than training with the AUC loss from scratch, because the model starts from a meaningful feature representation.

Quick Start

# Stage 1 — cross-entropy warm-up
python -m libauc.trainer.run_image_trainer --config_file ce_config.yaml

# Stage 2 — AUROC optimization
python -m libauc.trainer.run_image_trainer --config_file aucmloss_config.yaml

Any field in the YAML can be overridden directly on the command line:

python -m libauc.trainer.run_image_trainer --config_file aucmloss_config.yaml \
    --epochs 50 --batch_size 64 --sampling_rate 0.3
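
How the flags reach the nested YAML is not spelled out in this tutorial. One plausible reading, sketched below with plain dictionaries, is that each flag replaces the same-named key in whichever section holds it. The helper `apply_overrides` and its lookup rule are illustrative assumptions, not the Trainer's actual code.

```python
import copy


def apply_overrides(config: dict, overrides: dict) -> dict:
    """Return a copy of `config` with same-named keys replaced by `overrides`.

    Hypothetical helper: it searches one level of nesting (e.g. training.epochs).
    """
    merged = copy.deepcopy(config)
    for key, value in overrides.items():
        for section in merged.values():
            if isinstance(section, dict) and key in section:
                section[key] = value
                break
        else:
            # No section contained the key, so the flag has nothing to override.
            raise KeyError(f"No config field named '{key}'")
    return merged


config = {
    "dataset": {"name": "cifar10"},
    "training": {"epochs": 100, "batch_size": 128, "sampling_rate": 0.2},
}
# Mirrors: --epochs 50 --batch_size 64 --sampling_rate 0.3
merged = apply_overrides(
    config, {"epochs": 50, "batch_size": 64, "sampling_rate": 0.3}
)
print(merged["training"])
```

The original `config` is left untouched, so the same YAML can be reused for several runs with different flags.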

Supported Loss / Optimizer Pairings

The loss and optimizer fields must form a compatible pair; the table below lists the supported combinations.

Task	Loss	Optimizer
AUROC (binary)	`AUCMLoss`	`PESG`
AUROC (multi-label)	`MultiLabelAUCMLoss` (auto-selected)	`PESG`
Compositional AUROC	`CompositionalAUCLoss`	`PDSCA`
Average Precision	`APLoss`	`SOAP`
Partial AUROC (CVaR)	`pAUC_CVaR_Loss` / `pAUCLoss` (mode: `SOPA`)	`SOPA`
Partial AUROC (DRO)	`pAUC_DRO_Loss` / `pAUCLoss` (mode: `1w`)	`SOPAs`
Two-way partial AUROC (KL)	`tpAUC_KL_Loss` / `pAUCLoss` (mode: `2w`)	`SOTAs`
Two-way partial AUROC (CVaR)	`tpAUC_CVaR_loss`	`STACO`
NDCG	`NDCGLoss`	`SONG`
CE pretraining (SGD)	`CrossEntropyLoss`	`SGD`
CE pretraining (Adam)	`BCELoss`	`Adam`

For parameter details see the libauc.optimizers API reference.
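
Taking the pairings table literally, a config can be sanity-checked before launching a run. The sketch below copies the table into a dictionary; the names `COMPATIBLE_OPTIMIZER` and `is_compatible` are made up for illustration, and `pAUCLoss` is left out because its optimizer depends on the chosen mode.

```python
# Pairings copied from the table above (one optimizer per loss class).
COMPATIBLE_OPTIMIZER = {
    "AUCMLoss": "PESG",
    "MultiLabelAUCMLoss": "PESG",
    "CompositionalAUCLoss": "PDSCA",
    "APLoss": "SOAP",
    "pAUC_CVaR_Loss": "SOPA",
    "pAUC_DRO_Loss": "SOPAs",
    "tpAUC_KL_Loss": "SOTAs",
    "tpAUC_CVaR_loss": "STACO",
    "NDCGLoss": "SONG",
    "CrossEntropyLoss": "SGD",
    "BCELoss": "Adam",
}


def is_compatible(loss: str, optimizer: str) -> bool:
    """True when (loss, optimizer) matches a row of the pairings table."""
    return COMPATIBLE_OPTIMIZER.get(loss) == optimizer


print(is_compatible("AUCMLoss", "PESG"))
print(is_compatible("AUCMLoss", "Adam"))
```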

Two-Stage Training Walkthrough

Step 1: CE Pretraining

Create ce_config.yaml:

# ── Dataset: which dataset to load, splits to evaluate, and class-imbalance ratio ──
dataset:
  name: cifar10          # see "Supported Datasets" below
  eval_splits: [val, test]

# ── Model: architecture, weight initialization, and output format ──────────────────
model:
  name: resnet18         # see "Supported Models" below
  pretrained: false

# ── Metrics: evaluation metrics computed on every eval split ───────────────────────
metrics:
  - AUROC                # see "Supported Metrics" below

# ── Training: hyperparameters, loss, optimizer, and checkpointing ──────────────────
training:
  # Experiment metadata
  project_name: libauc
  experiment_name: resnet18_ce_cifar10
  SEED: 2026

  # Data loading
  epochs: 100
  batch_size: 128
  eval_batch_size: 256
  sampling_rate: 0.5     # positive fraction per batch (DualSampler)
  num_workers: 2
  decay_epochs: [0.5, 0.75]   # fractions of total epochs, or absolute ints

  # Loss
  loss: BCELoss
  loss_kwargs: {}

  # Optimizer
  optimizer: Adam
  optimizer_kwargs:
    lr: 1.0e-3
    weight_decay: 1.0e-4

  # Output and checkpointing
  output_path: ./output
  resume_from_checkpoint: false
  save_checkpoint_every: 5
  verbose: 1             # 0 = silent | 1 = progress bar | 2 = one line/epoch

Run it:

python -m libauc.trainer.run_image_trainer --config_file ce_config.yaml

Expected checkpoint path after training:

./output/resnet18_ce_cifar10/epoch_100.pt
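
The path above follows the pattern `output_path/experiment_name/epoch_N.pt`. As a rough sketch, assuming a checkpoint is written at every multiple of `save_checkpoint_every` (what happens when `epochs` is not a multiple is not covered here), the expected files can be listed like this. The function name `checkpoint_paths` is hypothetical.

```python
import posixpath


def checkpoint_paths(output_path, experiment_name, epochs, save_every):
    """List the checkpoint files expected after a full run."""
    run_dir = posixpath.join(output_path, experiment_name)
    return [
        posixpath.join(run_dir, f"epoch_{epoch}.pt")
        for epoch in range(save_every, epochs + 1, save_every)
    ]


paths = checkpoint_paths("./output", "resnet18_ce_cifar10", epochs=100, save_every=5)
print(len(paths))
print(paths[-1])
```

The last entry is the value to use for `pretrained_path` in Stage 2.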

Step 2: AUROC Optimization

Create aucmloss_config.yaml:

# ── Dataset: same split and imbalance ratio as Stage 1 ────────────────────────────
dataset:
  name: cifar10
  eval_splits: [val, test]

# ── Model: fine-tune from the Stage-1 checkpoint ──────────────────────────────────
model:
  name: resnet18
  pretrained: true
  pretrained_path: "./output/resnet18_ce_cifar10/epoch_100.pt"

# ── Metrics: evaluation metrics computed on every eval split ───────────────────────
metrics:
  - AUROC

# ── Training: AUCMLoss and PESG ────────────────────────────────────
training:
  # Experiment metadata
  project_name: libauc
  experiment_name: resnet18_AUCMLoss_cifar10
  SEED: 2026

  # Data loading
  epochs: 100
  batch_size: 128
  eval_batch_size: 256
  sampling_rate: 0.2
  num_workers: 2
  decay_epochs: [0.5, 0.75]

  # Loss
  loss: AUCMLoss
  loss_kwargs:         # hyper-parameters for AUCMLoss
    margin: 1.0

  # Optimizer
  optimizer: PESG
  optimizer_kwargs:    # hyper-parameters for PESG
    lr: 0.1
    epoch_decay: 0.002
    weight_decay: 1.0e-5
    momentum: 0.9

  # Output and checkpointing
  output_path: ./output
  resume_from_checkpoint: false
  save_checkpoint_every: 5
  verbose: 1

Run it:

python -m libauc.trainer.run_image_trainer --config_file aucmloss_config.yaml

Recipe Library

Ready-to-run config files for common tasks. Download a file and pass it to the matching CLI entry point (run_image_trainer for image recipes, run_graph_trainer for graph recipes):

python -m libauc.trainer.run_image_trainer --config_file <downloaded_file>
python -m libauc.trainer.run_graph_trainer  --config_file <downloaded_file>

Image classifier recipes

Config	View on GitHub	Loss / Optimizer	Description
`config_ce_Cifar10.yaml`	GitHub	BCELoss / Adam	Stage 1 warm-up on imbalanced CIFAR-10 (ResNet-18, imratio 0.1)
`config_auc_Cifar10.yaml`	GitHub	AUCMLoss / PESG	Stage 2 binary AUROC maximisation on CIFAR-10, fine-tuned from Stage 1
`config_auprc_Cifar10.yaml`	GitHub	APLoss / SOAP	Average-precision (AUPRC) optimisation on CIFAR-10, fine-tuned from Stage 1
`config_opauc_Cifar10.yaml`	GitHub	pAUCLoss (1w) / SOPAs	One-way partial AUROC (FPR ≤ 0.3) on CIFAR-10, fine-tuned from Stage 1
`config_tpauc_Cifar10.yaml`	GitHub	pAUCLoss (2w) / SOTAs	Two-way partial AUROC on CIFAR-10, fine-tuned from Stage 1
`config_compositional_auc_Cifar10.yaml`	GitHub	CompositionalAUCLoss / PDSCA	Compositional AUROC on CIFAR-10 from scratch (ResNet-20)
`config_ce_BreastMNIST.yaml`	GitHub	BCELoss / Adam	Stage 1 warm-up on BreastMNIST (binary, grayscale ultrasound)
`config_auc_BreastMNIST.yaml`	GitHub	AUCMLoss / PESG	Stage 2 AUROC maximisation on BreastMNIST, fine-tuned from Stage 1
`config_ce_PneumoniaMNIST.yaml`	GitHub	BCELoss / Adam	Stage 1 warm-up on PneumoniaMNIST (binary, chest X-ray)
`config_auc_PneumoniaMNIST.yaml`	GitHub	AUCMLoss / PESG	Stage 2 AUROC maximisation on PneumoniaMNIST, fine-tuned from Stage 1
`config_auc_ChestMNIST.yaml`	GitHub	MultiLabelAUCMLoss / PESG	Multi-label AUROC on ChestMNIST (14 thorax disease labels)
`config_auc_melanoma.yaml`	GitHub	AUCMLoss / PESG	Binary AUROC on the melanoma dermoscopy dataset (ResNet-18)
`config_auc_ddsm.yaml`	GitHub	AUCMLoss / PESG	Binary AUROC on DDSM mammography (DenseNet-121, ImageNet warm-start)

Graph neural network recipes

Config	View on GitHub	Loss / Optimizer	Description
`config_ce_ogbg-molhiv.yaml`	GitHub	BCELoss / Adam	Stage 1 warm-up on ogbg-molhiv (GINE, binary molecular property)
`config_auprc_ogbg-molhiv.yaml`	GitHub	APLoss / SOAP	Average-precision (AUPRC) optimisation on ogbg-molhiv, fine-tuned from Stage 1
`config_opauc_ogbg-molhiv.yaml`	GitHub	pAUCLoss (1w) / SOPAs	One-way partial AUROC (FPR ≤ 0.3) on ogbg-molhiv, fine-tuned from Stage 1
`config_tpauc_ogbg-molhiv.yaml`	GitHub	pAUCLoss (2w) / SOTAs	Two-way partial AUROC on ogbg-molhiv, fine-tuned from Stage 1

Config Reference

Supported Datasets

The dataset.name key selects the dataset and dataset.kwargs passes dataset-specific arguments to the loader. See libauc.trainer.data.datasets.load_dataset for the full implementation.

`name`	`eval_splits`	`kwargs` (all optional unless marked required)
`cifar10`	`val`, `test`	`root_path` (str, default `./data`); `imratio` (float, default `0.1`) — positive-class ratio after resampling
`chexpert`	`val`, `test`	`root_path` (str, default `./data`); `val_size` (float, default `0.05`) — fraction of training data held out for validation
`pneumoniamnist`	`val`, `test`	`root_path` (str, default `./data`)
`breastmnist`	`val`, `test`	`root_path` (str, default `./data`)
`chestmnist`	`val`, `test`	`root_path` (str, default `./data`); `task` (int or null) — column index to use as the binary label (null = multi-label)
`melanoma`	`val`, `test`	`root_path` (str, default `./data`)
`ddsm`	`val`, `test`	`root_path` (str, default `./data`); `val_size` (float, default `0.1`); `image_size` (int, default `224`)
`ogbg-molhiv`	`val`, `test`	`root_path` (str, default `./data`)
`ogbg-moltox21`	`val`, `test`	`root_path` (str, default `./data`)
`ogbg-molmuv`	`val`, `test`	`root_path` (str, default `./data`)
`ogbg-molpcba`	`val`, `test`	`root_path` (str, default `./data`)

Supported Models

`model.name`	Notes
`resnet18`	Supports `pretrained_remote: true` (ImageNet weights from libauc hub)
`resnet20`	Lightweight; designed for CIFAR-scale inputs
`densenet121`	Supports `pretrained_remote: true`; good for chest X-ray tasks
`<org>/<model>` (any HF repo ID)	Loaded via `AutoModelForImageClassification.from_pretrained`. `AutoImageProcessor` is applied automatically in the DataLoader — the dataset needs no HF-specific transforms. Examples: `google/vit-base-patch16-224`, `openai/clip-vit-base-patch32`, `microsoft/resnet-50`.

When pretrained: true, the Trainer loads a local checkpoint from pretrained_path and resets the final classification head (fc, linear, classifier, or head) so it can be retrained for the new loss.
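
The head reset can be pictured as a filter over checkpoint keys. The sketch below is a minimal illustration on a toy dictionary: the tuple `HEAD_PARTS` is assumed from the four head names listed above, and `drop_head_keys` is a hypothetical helper, not a libauc function.

```python
# Head names taken from the prose above; the tuple itself is an assumption.
HEAD_PARTS = ("fc", "linear", "classifier", "head")


def drop_head_keys(state_dict: dict) -> dict:
    """Keep only entries whose dotted key has no component naming a head."""
    return {
        key: value
        for key, value in state_dict.items()
        if not any(part in key.split(".") for part in HEAD_PARTS)
    }


# Toy checkpoint: strings stand in for tensors.
checkpoint = {
    "conv1.weight": "w0",
    "layer1.0.bn1.bias": "b0",
    "fc_norm.weight": "w1",  # kept: "fc_norm" is not the component "fc"
    "fc.weight": "w_fc",
    "fc.bias": "b_fc",
}
backbone = drop_head_keys(checkpoint)
print(sorted(backbone))
```

Matching on whole dotted components rather than substrings keeps layers such as `fc_norm` from being dropped by accident.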

When pretrained_remote: true, ImageNet weights are downloaded via libauc’s model hub (independent of pretrained). Ignored for HF models.

Supported Metrics

`metrics` entry	Behaviour
`AUROC`	Full-ROC AUROC. With `metric_kwargs: [{max_fpr: 0.3}]` or `{min_tpr: 0.8}` it reports partial AUROC instead.
`AUPRC`	Area under the precision-recall curve.
`ACC`	Accuracy at threshold 0.5.

Example — track both ACC and partial AUROC simultaneously:

metrics:
  - ACC
  - AUROC

metric_kwargs:
  - {}                  # no kwargs needed for ACC
  - {max_fpr: 0.3}      # pAUROC at FPR ≤ 0.3
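
To see why ACC and AUROC can disagree, here is a small pure-Python sketch of both metrics. It is not the library's implementation: AUROC is computed as the fraction of positive/negative pairs ranked correctly (ties count half), and the `>=` comparison at the 0.5 threshold is an assumption.

```python
def auroc(scores, labels):
    """Fraction of (positive, negative) pairs where the positive scores higher."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy(scores, labels, threshold=0.5):
    """Share of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


scores = [0.9, 0.4, 0.35, 0.2, 0.1]
labels = [1, 1, 0, 0, 0]
print(auroc(scores, labels))
print(accuracy(scores, labels))
```

Every positive outranks every negative here, so the ranking is perfect, yet the positive scored 0.4 falls below the threshold and counts as an error for ACC.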

Training Field Reference

Field	Default	Description
`project_name`	`libauc`	Weights & Biases project name
`experiment_name`	`run`	Run name; also used as the checkpoint sub-directory
`SEED`	`42`	Global random seed (NumPy, PyTorch, cuDNN)
`epochs`	`50`	Total training epochs
`batch_size`	`128`	Mini-batch size during training
`eval_batch_size`	`128`	Mini-batch size during evaluation
`sampling_rate`	`0.5`	Positive-class fraction fed to `DualSampler` / `TriSampler` per batch
`num_workers`	`2`	DataLoader worker processes
`decay_epochs`	`[]`	Epochs at which the LR / regulariser is decayed. Floats are interpreted as fractions of `epochs` (e.g. `0.5` with `epochs: 100` → epoch 50); integers are absolute epoch numbers
`loss`	`AUCMLoss`	Loss class name (see pairings table above)
`loss_kwargs`	`{}`	Extra keyword arguments forwarded to the loss constructor
`optimizer`	`PESG`	Optimizer class name
`optimizer_kwargs`	`{}`	Extra keyword arguments forwarded to the optimizer constructor
`output_path`	`./output`	Root directory for checkpoints
`resume_from_checkpoint`	`true`	Resume from the latest checkpoint in `output_path/experiment_name`
`save_checkpoint_every`	`5`	Save a checkpoint every N epochs
`verbose`	`1`	`0` = silent; `1` = tqdm progress bar; `2` = one line per epoch
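
Two of these fields are easier to grasp with numbers. The sketch below shows one way `decay_epochs` fractions could be resolved and how `sampling_rate` translates into positives per batch. Both helper names are hypothetical, and rounding to the nearest integer is an assumption; the Trainer and samplers may round differently.

```python
def resolve_decay_epochs(decay_epochs, epochs):
    """Floats are fractions of `epochs`; ints are kept as absolute epochs."""
    return [
        int(round(d * epochs)) if isinstance(d, float) else int(d)
        for d in decay_epochs
    ]


def positives_per_batch(batch_size, sampling_rate):
    """Number of positive samples a batch would hold at this sampling rate."""
    return int(round(batch_size * sampling_rate))


print(resolve_decay_epochs([0.5, 0.75], epochs=100))
print(resolve_decay_epochs([30, 40], epochs=50))
print(positives_per_batch(128, 0.5))
```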

Extending the Trainer

Adding a New Dataset

All dataset logic lives in a single function: libauc/trainer/data/datasets.py → load_dataset().

Add a new elif branch that returns (train_dataset, eval_datasets). Both must be torch.utils.data.Dataset subclasses whose __getitem__ yields (data, label, index) tuples.

# libauc/trainer/data/datasets.py
from typing import List, Tuple

from torch.utils.data import Dataset


def load_dataset(name: str, splits: List[str], **kwargs) -> Tuple[Dataset, List[Dataset]]:
    ...
    elif name == "mydata":
        root = kwargs.get("root_path", "./data")

        # Build train / eval datasets here.
        # __getitem__ must return (data, label, index).
        train_dataset = MyTrainDataset(root=root)

        eval_datasets = []
        for split in splits:
            if split == "val":
                eval_datasets.append(MyValDataset(root=root))
            elif split == "test":
                eval_datasets.append(MyTestDataset(root=root))
            else:
                raise NotImplementedError(
                    f"Split '{split}' not supported for 'mydata'."
                )
        return train_dataset, eval_datasets
    ...

Then reference it in your config:

dataset:
  name: mydata
  eval_splits: [val, test]
  kwargs:
    root_path: /path/to/mydata

Tip

Use the built-in IndexedDataset wrapper to add the required index return value to any standard torchvision dataset:

import os

import torchvision.datasets as tvd
from libauc.trainer.data.datasets import IndexedDataset

# `root` and `train_transform` are assumed to be defined by the caller.

train_dataset = IndexedDataset(
    tvd.ImageFolder(root=os.path.join(root, "train"), transform=train_transform)
)
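
For intuition, a wrapper of this kind only needs to append the index to whatever the base dataset returns. The class below is a hypothetical stand-in written for this tutorial, not the source of `IndexedDataset`; it falls back to a plain `object` base so the sketch runs even without PyTorch installed.

```python
try:
    from torch.utils.data import Dataset  # real base class when PyTorch is present
except ImportError:  # keep the sketch runnable without PyTorch
    Dataset = object


class IndexAppendingDataset(Dataset):
    """Wrap a (data, label) dataset so items become (data, label, index)."""

    def __init__(self, base):
        self.base = base

    def __len__(self):
        return len(self.base)

    def __getitem__(self, index):
        data, label = self.base[index]
        return data, label, index


# A list of (data, label) pairs stands in for a torchvision dataset.
toy = [("img_a", 0), ("img_b", 1), ("img_c", 0)]
wrapped = IndexAppendingDataset(toy)
print(wrapped[1])
```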

Adding a New Model

Model construction is handled in libauc/trainer/core/image_trainer.py → Trainer._build_model().

Add a new elif branch that instantiates your model and assigns it to self.model:

# libauc/trainer/core/image_trainer.py

def _build_model(self, model_cfg: dict):
    name              = model_cfg.get("name", "").lower()
    pretrained        = model_cfg.get("pretrained", False)
    pretrained_remote = model_cfg.get("pretrained_remote", False)

    if name == "resnet18":
        ...
    elif name == "mymodel":
        from mypackage import MyModel
        model = MyModel(model_cfg)
    else:
        raise ValueError(f"Unknown model '{name}'.")

    model = model.cuda()

    if pretrained:
        pretrained_path = model_cfg.get("pretrained_path")
        state_dict = torch.load(pretrained_path, weights_only=False)
        if "model_state_dict" in state_dict:
            state_dict = state_dict["model_state_dict"]
        # Strip the classification head (fc / linear / classifier / head)
        # so it is re-initialised for the new task.
        filtered = {
            k: v for k, v in state_dict.items()
            if not any(part in k.split(".") for part in _HEAD_PARTS)
        }
        model.load_state_dict(filtered, strict=False)
        if hasattr(model, "fc"):
            model.fc.reset_parameters()

    self.model = model

Then reference it in your config:

model:
  name: mymodel

Note

The model’s final layer must output raw logits (no sigmoid / softmax). Pass last_activation=None when using libauc built-in models, as the loss functions apply their own activation internally.
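
A quick numeric check shows what goes wrong if the model already applies a sigmoid and an activation is then applied again. This is a generic illustration with a hand-written `sigmoid`, not code taken from any libauc loss.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


logits = [-6.0, 0.0, 6.0]
once = [sigmoid(z) for z in logits]   # what the loss expects to compute itself
twice = [sigmoid(p) for p in once]    # what happens if the model also applied it

print([round(p, 3) for p in once])
print([round(p, 3) for p in twice])
```

After the second sigmoid every score lands between 0.5 and about 0.73, so confident and unconfident predictions are squeezed into a narrow band.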

Adding a Custom Trainer

If the built-in dataset/model extension points are not enough — for example you need a custom forward pass, auxiliary losses, mixed-precision training, or a completely different sampling strategy — you can subclass Trainer and override only the methods you need, exactly as GraphTrainer does for graph data.

Overridable hooks

Method	What to override it for
`_build_model(self, model_cfg)`	Construct a custom architecture; set `self.model`
`_construct_optimizer_and_loss(self, model, train_args)`	Wire a custom loss or optimizer; return `(loss_fn, optimizer)`
`_get_train_dataloader(self, train_args)`	Replace the default `DualSampler` / `DataLoader` setup
`_get_eval_dataloader(self, dataset, train_args)`	Use a different evaluation loader (e.g. graph batching)
`train(self)`	Fully customise the training loop (mixed-precision, grad clipping, …)
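
The override pattern can be shown without libauc installed. In the sketch below, `Trainer` is a stand-in class written for illustration: it only mimics the idea that the base class calls `_build_model` and the hook sets `self.model`. When and how the real Trainer invokes its hooks is an assumption.

```python
class Trainer:
    """Stand-in for the libauc Trainer; only the hook structure is mimicked."""

    def __init__(self, config: dict):
        self.config = config
        self.model = None
        self._build_model(config["model"])

    def _build_model(self, model_cfg: dict):
        self.model = f"builtin:{model_cfg['name']}"


class MyTrainer(Trainer):
    """Override a single hook; everything else is inherited."""

    def _build_model(self, model_cfg: dict):
        # A custom architecture would be constructed here.
        # The hook's contract, per the table above, is to set self.model.
        self.model = f"custom:{model_cfg['name']}"


trainer = MyTrainer({"model": {"name": "mymodel"}})
print(trainer.model)
```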

HuggingFace Integration

The Trainer has built-in support for any HuggingFace image-classification model — no code changes are required. Set model.name to a HuggingFace repo ID (any string containing /) and the Trainer will:

1. Download the model via AutoModelForImageClassification.from_pretrained.
2. Reinitialize the classifier head to match your task (binary or multi-label).
3. Download the AutoImageProcessor and apply it inside the DataLoader collate function, so your dataset needs no HF-specific transforms.
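
The routing rule ("any string containing /") is simple enough to sketch. The function `model_source` and the set `BUILTIN_MODELS` are illustrative names; the built-in list is copied from the Supported Models table, and the real dispatch inside the Trainer may be organised differently.

```python
# Built-in names from the Supported Models table.
BUILTIN_MODELS = {"resnet18", "resnet20", "densenet121"}


def model_source(name: str) -> str:
    """Decide which loader a `model.name` value would be routed to."""
    if "/" in name:
        return "huggingface"  # treated as a HuggingFace repo ID
    if name.lower() in BUILTIN_MODELS:
        return "builtin"
    raise ValueError(f"Unknown model '{name}'.")


print(model_source("google/vit-base-patch16-224"))
print(model_source("resnet18"))
```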

Quick Start — HuggingFace Model

A single line in your YAML is all that is needed:

model:
  name: google/vit-base-patch16-224   # any HuggingFace repo ID

Understanding the Classifier-Head Mismatch Log

When you load a pretrained checkpoint the Trainer prints a brief load report:

classifier.weight | MISMATCH | Reinit — ckpt: torch.Size([1000, 768]) vs model: torch.Size([1, 768])
classifier.bias   | MISMATCH | Reinit — ckpt: torch.Size([1000]) vs model: torch.Size([1])

This is expected and correct. The upstream checkpoint was trained on ImageNet-1k (1,000 classes); your task uses a single binary output.
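
A report like this boils down to comparing tensor shapes key by key. The sketch below reproduces the idea with plain tuples in place of `torch.Size`; `load_report` is a hypothetical helper and the line format is only an approximation of the real log.

```python
def load_report(ckpt_shapes: dict, model_shapes: dict) -> list:
    """Return one report line per key whose checkpoint shape differs."""
    lines = []
    for key, model_shape in model_shapes.items():
        ckpt_shape = ckpt_shapes.get(key)
        if ckpt_shape is not None and ckpt_shape != model_shape:
            lines.append(
                f"{key} | MISMATCH | Reinit - ckpt: {ckpt_shape} vs model: {model_shape}"
            )
    return lines


ckpt = {
    "encoder.weight": (768, 768),
    "classifier.weight": (1000, 768),
    "classifier.bias": (1000,),
}
model = {
    "encoder.weight": (768, 768),
    "classifier.weight": (1, 768),
    "classifier.bias": (1,),
}
report = load_report(ckpt, model)
for line in report:
    print(line)
```

Only the two classifier entries are reported; the encoder weights match and would load unchanged.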

How Preprocessing Works

For HuggingFace models, an AutoImageProcessor is downloaded alongside the model weights and applied inside the DataLoader’s collate_fn, on CPU in a worker process, before the batch reaches the GPU. The processor handles:

- Resize to the model’s expected resolution (e.g. 224 × 224 for both ViT-Base and CLIP ViT-B/32)
- Normalisation with the model-specific mean / std

Because preprocessing happens in the DataLoader, your dataset’s __getitem__ only needs a plain transforms.ToTensor() — no transforms.Resize or transforms.Normalize. Single-channel (grayscale) images are automatically expanded to 3 channels before the processor is called.

Note

Images entering the DataLoader should be float32 tensors in the [0, 1] range — i.e. the output of torchvision.transforms.ToTensor(). Do not apply any additional resize or normalise transforms in the dataset; the processor will handle that.
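
The two cheaper steps, channel expansion and normalisation, can be traced by hand on a tiny image. The sketch below uses nested lists instead of tensors and skips resizing entirely. The helper names are made up, and the mean / std of 0.5 are illustrative values; a real processor supplies its own.

```python
def to_three_channels(image):
    """`image` is a list of channel planes (C x H x W as nested lists)."""
    if len(image) == 1:
        # Copy the single grayscale plane into three identical channels.
        return [[row[:] for row in image[0]] for _ in range(3)]
    return image


def normalise(image, mean, std):
    """Apply (pixel - mean) / std independently per channel."""
    return [
        [[(px - m) / s for px in row] for row in plane]
        for plane, m, s in zip(image, mean, std)
    ]


gray = [[[0.0, 0.5], [1.0, 0.25]]]  # 1 x 2 x 2, values in [0, 1]
rgb = to_three_channels(gray)
out = normalise(rgb, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
print(len(rgb))
print(out[0])
```

With these example statistics the [0, 1] input range maps onto [-1, 1], which is why the dataset itself should not normalise a second time.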

Two-Stage AUROC Training with a HuggingFace Model

The same BCE → AUCMLoss workflow works with HF models. Create vit_ce.yaml (Stage 1 — BCE warm-up):

dataset:
  name: cifar10
  eval_splits: [val, test]

model:
  name: google/vit-base-patch16-224   # replace with any HF repo ID

metrics:
  - AUROC

training:
  experiment_name: vit_bce_cifar10
  epochs: 10
  batch_size: 32
  eval_batch_size: 64
  sampling_rate: 0.5
  num_workers: 4
  decay_epochs: [0.6, 0.9]

  loss: BCELoss
  loss_kwargs: {}

  optimizer: Adam
  optimizer_kwargs:
    lr: 2.0e-5              # small LR for fine-tuning a pretrained model
    weight_decay: 1.0e-4

  output_path: ./output
  save_checkpoint_every: 5
  verbose: 1

Create vit_auc.yaml (Stage 2 — AUROC optimization, fine-tune from Stage 1):

dataset:
  name: cifar10
  eval_splits: [val, test]

model:
  name: google/vit-base-patch16-224
  pretrained: true
  pretrained_path: "./output/vit_bce_cifar10/epoch_10.pt"

metrics:
  - AUROC

training:
  experiment_name: vit_AUCMLoss_cifar10
  epochs: 20
  batch_size: 32
  eval_batch_size: 64
  sampling_rate: 0.2
  num_workers: 4
  decay_epochs: [0.5, 0.75]

  loss: AUCMLoss
  loss_kwargs:
    margin: 1.0

  optimizer: PESG
  optimizer_kwargs:
    lr: 0.05
    epoch_decay: 0.002
    weight_decay: 1.0e-5
    momentum: 0.9

  output_path: ./output
  save_checkpoint_every: 5
  verbose: 1

Run both stages:

python -m libauc.trainer.run_image_trainer --config_file vit_ce.yaml
python -m libauc.trainer.run_image_trainer --config_file vit_auc.yaml

Supported HuggingFace Models

Any repo ID accepted by AutoModelForImageClassification works. A few tested examples:

HF repo ID	Notes
`google/vit-base-patch16-224`	ViT-Base (12 layers, 768 hidden) — general purpose
`openai/clip-vit-base-patch32`	CLIP ViT with rich multi-modal features
`microsoft/resnet-50`	HuggingFace-wrapped ResNet-50
`google/efficientnet-b0`	EfficientNet-B0 from HF Hub
`Ahmed9275/Vit-Cifar100`	ViT fine-tuned on CIFAR-100 — good warm start for CIFAR tasks

Expected Outputs

After both stages finish, your output directory will look like:

Built-in model (ResNet-18):

./output/
├── resnet18_ce_cifar10/
│   ├── epoch_5.pt
│   ├── ...
│   └── epoch_100.pt          ← loaded by aucmloss_config.yaml as pretrained_path
└── resnet18_AUCMLoss_cifar10/
    ├── epoch_5.pt
    ├── ...
    └── epoch_100.pt

HuggingFace model (ViT-Base):

./output/
├── vit_bce_cifar10/
│   ├── epoch_5.pt
│   ├── ...
│   └── epoch_10.pt           ← loaded by vit_auc.yaml as pretrained_path
└── vit_AUCMLoss_cifar10/
    ├── epoch_5.pt
    ├── ...
    └── epoch_20.pt

Validation and test AUROC scores are printed after every evaluation epoch.