libauc.trainer

The trainer API provides a high-level training interface for classification tasks, including YAML-driven training entry points, callback-based loops, and dataset/model/loss/optimizer orchestration.

Training Pipeline

Module and Class Overview

Module	Key classes / functions
`libauc.trainer.helpers`	`build_metric()`
`libauc.trainer.run_trainer`	CLI entry point for image / tabular tasks
`libauc.trainer.run_gnn`	CLI entry point for GNN tasks
`libauc.trainer.core.callbacks`	`TrainerCallback`, `CLICallback`, `CallbackHandler`, `TrainerState`
`libauc.trainer.core.trainer`	`Trainer`
`libauc.trainer.core.gnn_trainer`	`GNNTrainer`
`libauc.trainer.config.args`	`TrainingArguments`
`libauc.trainer.config.spaces`	`parse_defaultconfig()`
`libauc.trainer.data.datasets`	`load_dataset()`, `IndexedDataset`

libauc.trainer.helpers

build_metric(metric_names, metric_kwargs)[source]

Build an evaluation function from a list of metric names.

The returned callable computes each requested metric after every evaluation epoch and returns the results as a flat dict. It does not affect the training objective — losses and optimizers are configured separately via TrainingArguments.

Supported metric names (case-insensitive):

"AUROC" — full-ROC AUROC (libauc.metrics.auc_roc_score). When metric_kwargs for this entry contains max_fpr or min_tpr, partial AUROC (libauc.metrics.pauc_roc_score) is computed instead and the result key becomes e.g. "PAUROC(max_fpr=0.3)".
"AUPRC" — area under the precision-recall curve (libauc.metrics.auc_prc_score).
"ACC" — accuracy at a fixed decision threshold of 0.5 (sklearn.metrics.accuracy_score).

Unknown names are skipped with a warning. The same name may appear more than once with different metric_kwargs entries (e.g. full AUROC and partial AUROC simultaneously).

Parameters:

metric_names (list[str]) – Ordered list of metric names to compute, e.g. ["AUROC", "AUPRC", "ACC"].
metric_kwargs (list[dict]) – Per-metric keyword arguments. metric_kwargs[i] is forwarded to the computation function for metric_names[i]. Missing entries default to {}.

Returns:

A function metric_fn(test_true, test_pred) -> results where test_true is the 1-D array of ground-truth labels, test_pred is the 1-D array of model output scores, and results maps each metric name to its score.

Return type:

Callable[[numpy.ndarray, numpy.ndarray], dict[str, float]]
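As a rough illustration of the contract described above, the following sketch builds a metric function from a list of names. It is not libauc's implementation: sketch_build_metric, pairwise_auroc and accuracy_at_half are hypothetical stand-ins written in plain Python, only AUROC and ACC are covered, and treating scores strictly greater than 0.5 as positive is an assumption.

```python
import warnings


def pairwise_auroc(y_true, y_score):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_at_half(y_true, y_score):
    """Accuracy after thresholding scores at 0.5 (the strict '>' is an assumption)."""
    hits = sum(int(s > 0.5) == t for t, s in zip(y_true, y_score))
    return hits / len(y_true)


# Hypothetical registry; the real helper also supports AUPRC and partial AUROC.
_SKETCH_METRICS = {"AUROC": pairwise_auroc, "ACC": accuracy_at_half}


def sketch_build_metric(metric_names, metric_kwargs=None):
    kwargs_list = list(metric_kwargs or [])
    # Missing entries default to {}, as described for metric_kwargs.
    kwargs_list += [{} for _ in range(len(metric_names) - len(kwargs_list))]
    selected = []
    for name, kw in zip(metric_names, kwargs_list):
        key = name.upper()  # metric names are case-insensitive
        if key not in _SKETCH_METRICS:
            warnings.warn(f"Unknown metric {name!r} skipped")
            continue
        selected.append((key, _SKETCH_METRICS[key], kw))

    def metric_fn(test_true, test_pred):
        # Flat dict with one entry per recognised metric.
        return {key: fn(test_true, test_pred, **kw) for key, fn, kw in selected}

    return metric_fn


metric_fn = sketch_build_metric(["auroc", "ACC"], [{}])
results = metric_fn([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(results)
```

In this toy input three of the four positive/negative pairs are ordered correctly and three of the four thresholded predictions match, so both entries come out as 0.75.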

libauc.trainer.run_trainer

Entry point for image / tabular classification training. It reads a YAML config file, merges it with built-in defaults, builds all components (dataset, metric function, TrainingArguments), and hands off to Trainer.

python -m libauc.trainer.run_trainer --config_file config.yaml

Any TrainingArguments field can be overridden directly on the command line, e.g. --epochs 50 --batch_size 64.
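The layering of built-in defaults, YAML values and command-line overrides can be pictured with a small sketch. Everything here is illustrative: merge_training_config and DEFAULTS are hypothetical names, the dict passed as file_cfg stands in for the parsed YAML file, and the precedence (CLI over file over defaults) is an assumption inferred from the description above rather than taken from the run_trainer source.

```python
import argparse

# Subset of the documented TrainingArguments defaults, used as the "built-in defaults".
DEFAULTS = {"epochs": 50, "batch_size": 128, "verbose": 1}


def merge_training_config(file_cfg, argv):
    """Merge defaults, file values and CLI overrides (assumed precedence: CLI > file > defaults)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config_file")
    for key, default in DEFAULTS.items():
        # default=None lets us tell "flag not given" apart from a real value.
        parser.add_argument(f"--{key}", type=type(default), default=None)
    cli = vars(parser.parse_args(argv))

    merged = {**DEFAULTS, **file_cfg}
    merged.update({k: v for k, v in cli.items() if k in DEFAULTS and v is not None})
    return merged


# file_cfg stands in for the parsed training section of config.yaml.
merged = merge_training_config(
    {"epochs": 100},
    ["--config_file", "config.yaml", "--batch_size", "64"],
)
print(merged)
```

Here epochs comes from the file, batch_size from the command line, and verbose from the defaults.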

libauc.trainer.run_gnn

Entry point for graph neural network (GNN) training. Mirrors run_trainer but uses GNNTrainer and a GNN-specific model config (emb_dim, num_layers, etc.).

python -m libauc.trainer.run_gnn --config_file config.yaml

libauc.trainer.core.callbacks

class CLICallback[source]

Console and Weights & Biases logging callback.

On on_train_begin it initialises a W&B run (silently falls back to console-only when W&B is not installed) and pretty-prints the full TrainingArguments config.

On on_epoch_end it:

appends a structured entry to state.train_log;
renders a progress bar (verbose=1) or a per-epoch line (verbose=2) to stdout;
ships the flat log dict to W&B via wandb.log.

On on_train_end it prints a training summary (best validation and test scores) and calls wandb.finish().

Note

This callback takes no constructor arguments. All configuration is read from TrainingArguments at runtime via the args parameter passed to each lifecycle hook.

Note

W&B logging is silently disabled when wandb is not installed or when wandb.log raises an exception.
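The fallback behaviour in the note above follows a common optional-dependency pattern. The sketch below is one way to express it and is not the CLICallback source: load_optional, SafeLogger and FailingBackend are hypothetical names, and staying disabled after the first failed call is an assumption.

```python
import importlib


def load_optional(module_name):
    """Return the imported module, or None when it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


class SafeLogger:
    """Forward payloads to backend.log and switch itself off on the first failure."""

    def __init__(self, backend):
        self.backend = backend
        self.enabled = backend is not None

    def log(self, payload):
        if not self.enabled:
            return False
        try:
            self.backend.log(payload)
        except Exception:
            self.enabled = False  # assumption: stay disabled after a failure
            return False
        return True


class FailingBackend:
    """Test double that mimics a backend whose log call raises."""

    def log(self, payload):
        raise RuntimeError("logging backend unavailable")


missing = SafeLogger(load_optional("module_that_is_not_installed_xyz"))
failing = SafeLogger(FailingBackend())
print(missing.log({"loss": 0.3}), failing.log({"loss": 0.3}), failing.enabled)
```

With the real package, the same wrapper would be constructed as SafeLogger(load_optional("wandb")), so console logging keeps working whether or not W&B is available.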

Example:

>>> trainer = Trainer(..., callbacks=[CLICallback()])
>>> trainer.train()
============================================================
{'batch_size': 128, 'epochs': 50, ...}
============================================================
Epoch [██████████████████············] 20/50 | Loss: 0.3241 | AUROC: 0.8712 | LR: 0.100000

on_epoch_begin(args: TrainingArguments, state: TrainerState, **kwargs)[source]: Event called at the beginning of an epoch.

on_epoch_end(args: TrainingArguments, state: TrainerState, **kwargs)[source]: Event called at the end of an epoch.

on_step_end(args: TrainingArguments, state: TrainerState, **kwargs)[source]: Event called at the end of a training step.

on_train_begin(args: TrainingArguments, state: TrainerState, **kwargs)[source]: Event called at the beginning of training.

on_train_end(args: TrainingArguments, state: TrainerState, **kwargs)[source]: Event called at the end of training.

class CallbackHandler(callbacks: List[TrainerCallback], model, optimizer, loss_fn)[source]

Multiplexer that owns a list of TrainerCallback instances and fans out every lifecycle event to each of them in registration order.

CallbackHandler itself inherits from TrainerCallback so it can be used polymorphically, but its primary role is orchestration rather than providing hook implementations of its own.

Parameters:

callbacks (list[TrainerCallback]) – Initial callback list.
model – The model being trained (forwarded to every callback via kwargs["model"]).
optimizer – The active optimizer (forwarded via kwargs["optimizer"]).
loss_fn – The active loss function (forwarded via kwargs["loss_fn"]).
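The fan-out behaviour can be sketched in a few lines. The classes below (SketchCallback, SketchHandler, Recorder) are hypothetical and cover only two hooks; they illustrate the idea of forwarding model, optimizer and loss_fn as keyword arguments in registration order, and should not be read as the CallbackHandler source.

```python
class SketchCallback:
    """Base class whose hooks are no-ops."""

    def on_train_begin(self, args, state, **kwargs):
        pass

    def on_epoch_end(self, args, state, **kwargs):
        pass


class SketchHandler(SketchCallback):
    """Owns a callback list and forwards each event to every callback in order."""

    def __init__(self, callbacks, model, optimizer, loss_fn):
        self.callbacks = list(callbacks)
        self.shared = {"model": model, "optimizer": optimizer, "loss_fn": loss_fn}

    def _fan_out(self, event, args, state, **kwargs):
        for cb in self.callbacks:
            getattr(cb, event)(args, state, **self.shared, **kwargs)

    def on_train_begin(self, args, state):
        self._fan_out("on_train_begin", args, state)

    def on_epoch_end(self, args, state, metrics, **kwargs):
        self._fan_out("on_epoch_end", args, state, metrics=metrics, **kwargs)


class Recorder(SketchCallback):
    """Records which callback ran and which keyword arguments it received."""

    def __init__(self, name, log):
        self.name, self.log = name, log

    def on_epoch_end(self, args, state, **kwargs):
        self.log.append((self.name, sorted(kwargs)))


calls = []
handler = SketchHandler(
    [Recorder("a", calls), Recorder("b", calls)],
    model="model", optimizer="opt", loss_fn="loss",
)
handler.on_train_begin(args=None, state=None)  # no-op for Recorder
handler.on_epoch_end(args=None, state=None, metrics={"AUROC": 0.9})
print(calls)
```

Each recorder sees the same keyword set, and "a" is called before "b" because it was registered first.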

Example:

>>> handler = CallbackHandler(
...     [CLICallback()],
...     model=model, optimizer=optimizer, loss_fn=loss_fn,
... )
>>> handler.on_train_begin(args, state)

add_callback(callback)[source]: Add a callback to the handler.

property callback_list: Get a string representation of all callbacks.

on_epoch_begin(args: TrainingArguments, state: TrainerState)[source]: Event called at the beginning of an epoch.

on_epoch_end(args: TrainingArguments, state: TrainerState, metrics, **kwargs)[source]: Event called at the end of an epoch.

on_evaluate(args: TrainingArguments, state: TrainerState)[source]: Event called after an evaluation phase.

on_init_end(args: TrainingArguments, state: TrainerState)[source]: Event called at the end of trainer initialization.

on_log(args: TrainingArguments, state: TrainerState, logs)[source]: Event called after logging the last logs.

on_predict(args: TrainingArguments, state: TrainerState, metrics)[source]: Event called after a successful prediction.

on_prediction_step(args: TrainingArguments, state: TrainerState)[source]: Event called after a prediction step.

on_save(args: TrainingArguments, state: TrainerState)[source]: Event called after a checkpoint save.

on_step_begin(args: TrainingArguments, state: TrainerState)[source]: Event called at the beginning of a training step.

on_step_end(args: TrainingArguments, state: TrainerState)[source]: Event called at the end of a training step.

on_substep_end(args: TrainingArguments, state: TrainerState)[source]: Event called at the end of a substep during gradient accumulation.

on_train_begin(args: TrainingArguments, state: TrainerState)[source]: Event called at the beginning of training.

on_train_end(args: TrainingArguments, state: TrainerState)[source]: Event called at the end of training.

pop_callback(callback)[source]: Remove and return a callback.

remove_callback(callback)[source]: Remove a callback without returning it.

class DefaultCallback[source]

Default callback with basic functionality.

on_epoch_begin(args: TrainingArguments, state: TrainerState, **kwargs)[source]: Event called at the beginning of an epoch.

on_epoch_end(args: TrainingArguments, state: TrainerState, **kwargs)[source]: Event called at the end of an epoch.

on_step_end(args: TrainingArguments, state: TrainerState, **kwargs)[source]: Event called at the end of a training step.

class TrainerCallback[source]

Base class for training lifecycle callbacks.

Every method is a no-op by default, so subclasses only need to override the hooks they care about. Instances are registered with CallbackHandler, which calls each hook in registration order and forwards a consistent set of keyword arguments (model, optimizer, loss_fn, plus any extra kwargs the Trainer supplies for that event).

Lifecycle order during a typical training run:

on_init_end
on_train_begin
  for each epoch:
    on_epoch_begin
      for each step:
        on_step_begin
        on_step_end
    on_evaluate
    on_epoch_end
  [on_save — called periodically inside the epoch loop]
on_train_end

All callback methods are optional and can be overridden in subclasses.
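The lifecycle order listed above can be replayed with a toy driver. run_lifecycle is a hypothetical helper, not part of libauc; the one-based epoch counter and emitting on_save right after on_epoch_end are assumptions, since the listing only says that saving happens periodically inside the epoch loop.

```python
def run_lifecycle(emit, epochs, steps_per_epoch, save_every):
    """Emit hook names in the order a training run would trigger them."""
    emit("on_init_end")
    emit("on_train_begin")
    for epoch in range(1, epochs + 1):
        emit("on_epoch_begin")
        for _ in range(steps_per_epoch):
            emit("on_step_begin")
            emit("on_step_end")
        emit("on_evaluate")
        emit("on_epoch_end")
        if epoch % save_every == 0:  # assumption: saving follows on_epoch_end
            emit("on_save")
    emit("on_train_end")


events = []
run_lifecycle(events.append, epochs=2, steps_per_epoch=1, save_every=2)
print(events)
```

A callback that overrides only on_epoch_end would be invoked at the two on_epoch_end positions in this sequence and would ignore every other event.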

Example:

>>> class MyCallback(TrainerCallback):
...     def on_epoch_end(self, args, state, **kwargs):
...         print(f"Epoch {state.epoch} done, loss={kwargs['train_loss']:.4f}")
...
>>> trainer = Trainer(..., callbacks=[MyCallback()])

on_epoch_begin(args: TrainingArguments, state: TrainerState, **kwargs)[source]: Event called at the beginning of an epoch.

on_epoch_end(args: TrainingArguments, state: TrainerState, **kwargs)[source]: Event called at the end of an epoch.

on_evaluate(args: TrainingArguments, state: TrainerState, **kwargs)[source]: Event called after an evaluation phase.

on_init_end(args: TrainingArguments, state: TrainerState, **kwargs)[source]: Event called at the end of trainer initialization.

on_log(args: TrainingArguments, state: TrainerState, **kwargs)[source]: Event called after logging the last logs.

on_predict(args: TrainingArguments, state: TrainerState, metrics, **kwargs)[source]: Event called after a successful prediction.

on_prediction_step(args: TrainingArguments, state: TrainerState, **kwargs)[source]: Event called after a prediction step.

on_save(args: TrainingArguments, state: TrainerState, **kwargs)[source]: Event called after a checkpoint save.

on_step_begin(args: TrainingArguments, state: TrainerState, **kwargs)[source]: Event called at the beginning of a training step.

on_step_end(args: TrainingArguments, state: TrainerState, **kwargs)[source]: Event called at the end of a training step.

on_substep_end(args: TrainingArguments, state: TrainerState, **kwargs)[source]: Event called at the end of a substep during gradient accumulation.

on_train_begin(args: TrainingArguments, state: TrainerState, **kwargs)[source]: Event called at the beginning of training.

on_train_end(args: TrainingArguments, state: TrainerState, **kwargs)[source]: Event called at the end of training.

class TrainerState[source]: State object to track training progress.

libauc.trainer.core.trainer

class Trainer(train_args: TrainingArguments, model_cfg: dict, train_dataset: Dataset, eval_dataset: List[Dataset] | None = None, metric: Callable[[Tensor, Tensor], Mapping[str, float]] | None = None, callbacks: List[TrainerCallback] | None = None)[source]

Full training loop for image-classification models supported by libauc.

Trainer wires together a model, an AUC-aware loss function, a libauc optimizer, dual/tri-sampled data loaders, and an optional evaluation pipeline behind a unified train() entry point. Progress is surfaced through a CallbackHandler so any number of TrainerCallback subclasses can observe or alter the training loop without touching Trainer internals.

The class is intentionally thin: heavy lifting (data loading, model construction, loss/optimizer instantiation) is delegated to private helpers so subclasses like GNNTrainer can override only the parts they need.

Parameters:

train_args (TrainingArguments) – Fully populated training configuration (a TrainingArguments instance).
model_cfg (dict) – Architecture config forwarded to _build_model(). Must contain at least a "name" key matching one of the registered architectures (resnet20, resnet18, densenet121).
train_dataset (Dataset) – PyTorch Dataset for the training split. Must expose a .targets attribute (list or array of labels).
eval_dataset (list[Dataset], optional) – One or more evaluation datasets. None disables evaluation (default: None).
metric (callable, optional) – (y_true, y_pred) -> dict[str, float] function returned by build_metric(). None disables metric computation (default: None).
callbacks (list[TrainerCallback], optional) – Callbacks invoked at every lifecycle hook. When None the handler is created with an empty list (default: None).

Example:

>>> from libauc.trainer.config.args import TrainingArguments
>>> from libauc.trainer.core.trainer import Trainer
>>> from libauc.trainer.core.callbacks import CLICallback
>>> train_args = TrainingArguments(
...     optimizer="PESG", optimizer_kwargs={"lr": 0.1},
...     loss="AUCMLoss", loss_kwargs={"margin": 1.0},
...     SEED=42, batch_size=128, eval_batch_size=128,
...     sampling_rate=0.5, epochs=50, decay_epochs=[],
...     num_workers=2, output_path="./output", num_tasks=1,
...     resume_from_checkpoint=False, save_checkpoint_every=5,
...     project_name="libauc", experiment_name="demo", verbose=1,
... )
>>> trainer = Trainer(
...     train_args=train_args,
...     model_cfg={"name": "resnet18", "num_classes": 1},
...     train_dataset=train_ds,
...     eval_dataset=[val_ds],
...     metric=metric_fn,
...     callbacks=[CLICallback()],
... )
>>> log = trainer.train()

add_callback(callback)[source]: Add a callback to the trainer.

evaluate(loader, model)[source]

Evaluate model on a given data loader.

Parameters:

loader – Data loader for evaluation
model – Model to evaluate

Returns:

Tuple of (dictionary of evaluation metrics, test_true, test_pred)

evaluate_loop(model)[source]

Evaluate model on all evaluation datasets.

Parameters:

model – Model to evaluate

Returns:

Tuple of (dictionary of metrics from all evaluation datasets, test_true, test_pred). test_true and test_pred come from the first evaluation dataset, or are None if there are no evaluation datasets.

get_latest_checkpoint(output_path: str)[source]

load_checkpoint(checkpoint_path: str)[source]

save_checkpoint(checkpoint_path: str)[source]

train()[source]

Main training loop.

Returns:

List of training logs with metrics for each epoch.

libauc.trainer.core.gnn_trainer

class GNNTrainer(train_args: TrainingArguments, model_cfg: dict, train_dataset, eval_dataset: List | None = None, metric: Callable[[...], Mapping[str, float]] | None = None, callbacks: List[TrainerCallback] | None = None, decay_epochs: List[int] | None = None, decay_factor: float = 10.0, train_eval_dataset=None)[source]

Training loop for graph neural networks built with libauc’s GNN model zoo.

GNNTrainer extends Trainer with graph-aware overrides:

_build_model() — looks up the requested GNN architecture in _GNN_REGISTRY and constructs it via libauc.models, then infers whether the model expects edge features (supports_edge_attr).
_get_train_dataloader() / _get_eval_dataloader() — use torch_geometric.loader.DataLoader instead of the standard PyTorch one, while keeping the same DualSampler for positive/negative balancing.
_forward() — dispatches to the correct GNN forward signature (with or without edge_attr).
train() — adds optional learning-rate decay at specified epochs via optimizer.update_lr.

Supported GNN architectures

gcn, gin, gine, graphsage, gat, mpnn, deepergcn, pna

Parameters:

train_args (TrainingArguments) – Training configuration.
model_cfg (dict) – GNN model configuration. Required key: name (one of the architectures listed above). Optional keys: num_tasks (default 1), emb_dim (default 256), num_layers (default 5), graph_pooling, dropout, atom_features_dims, bond_features_dims, act, norm, jk, v2 (GAT-only), aggr / t / learn_t / p / learn_p / block (DeeperGCN-only), pretrained (bool), pretrained_path (str).
train_dataset – PyG-compatible graph dataset (train split).
eval_dataset (list, optional) – PyG-compatible graph datasets for evaluation splits (default: None).
metric (callable, optional) – (y_true, y_pred) -> dict[str, float] function.
callbacks (list[TrainerCallback], optional) – Training callbacks.
decay_epochs (list[int], optional) – Epoch indices at which optimizer.update_lr(decay_factor=decay_factor) is called (default: no decay).
decay_factor (float) – LR divisor at each decay epoch (default: 10.0).
train_eval_dataset – Optional dataset for an unbiased train-split evaluation; falls back to train_dataset when None.

Example:

>>> trainer = GNNTrainer(
...     train_args=train_args,
...     model_cfg={"name": "gin", "num_tasks": 1, "emb_dim": 300},
...     train_dataset=train_ds,
...     eval_dataset=[val_ds, test_ds],
...     metric=metric_fn,
...     callbacks=[CLICallback()],
...     decay_epochs=[100, 150],
...     decay_factor=10.0,
... )
>>> log = trainer.train()

evaluate(loader, model)[source]

Override base Trainer.evaluate() to use the GNN forward pass.

Parameters:

loader (PyG DataLoader)
model (GNN model)

Return type:

(metrics_dict, y_true, y_pred)

train()[source]

GNN training loop.

Steps each epoch:

Optional LR decay if epoch is in decay_epochs.
Forward / backward over the training loader.
Evaluation on the training split (unbiased loader).
Evaluation on all registered eval loaders.
Callbacks and periodic checkpointing.
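The effect of the first step, dividing the learning rate by decay_factor whenever the current epoch appears in decay_epochs, can be traced with a stand-alone sketch. lr_schedule is a hypothetical helper; it does not call optimizer.update_lr, and the zero-based epoch counter is an assumption.

```python
def lr_schedule(base_lr, epochs, decay_epochs=None, decay_factor=10.0):
    """Return the learning rate used in each epoch under step-wise decay."""
    decay_at = set(decay_epochs or [])
    lr = base_lr
    schedule = []
    for epoch in range(epochs):  # assumption: zero-based epoch counter
        if epoch in decay_at:
            lr = lr / decay_factor  # decay is applied before the epoch's training pass
        schedule.append(lr)
    return schedule


schedule = lr_schedule(base_lr=1.0, epochs=200, decay_epochs=[100, 150], decay_factor=10.0)
print(schedule[99], schedule[100], schedule[150])
```

With decay_epochs=[100, 150] and decay_factor=10.0, the rate is reduced by a factor of ten at epoch 100 and by another factor of ten at epoch 150.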

Returns:

Training log produced by the state / callback system.

Return type:

list

libauc.trainer.config.args

class TrainingArguments(**kwargs)[source]

Container for all hyperparameters and settings that govern a single training run.

All fields map one-to-one to keys in the training section of a YAML config file and can be overridden from the CLI via apply_cli_overrides.

Parameters:

optimizer (str) – Name of the libauc optimizer class, e.g. "PESG", "PDSCA", "SOAP".
optimizer_kwargs (dict) – Extra keyword arguments forwarded verbatim to the optimizer constructor (e.g. lr, momentum, weight_decay).
loss (str) – Name of the loss-function class, e.g. "AUCMLoss", "CompositionalAUCLoss". Looked up first in libauc.losses, then torch.nn.
loss_kwargs (dict) – Extra keyword arguments forwarded verbatim to the loss constructor.
SEED (int) – Global random seed for NumPy, PyTorch and cuDNN (default: 42).
batch_size (int) – Mini-batch size for training (default: 128).
eval_batch_size (int) – Mini-batch size for evaluation (default: 128).
sampling_rate (float) – Positive-class sampling rate passed to DualSampler / TriSampler (default: 0.5).
epochs (int) – Total number of training epochs (default: 50).
decay_epochs (list) – Epoch indices (or fractional multiples of epochs) at which the learning-rate / regulariser is decayed. Floats are converted to int(f * epochs) at construction time.
num_workers (int) – Number of DataLoader worker processes (default: 2).
output_path (str) – Root directory for checkpoints and logs (default: "./output").
num_tasks (int) – Number of output tasks / classes. 1 → binary; ≥ 3 → multi-label with TriSampler.
resume_from_checkpoint (bool) – Whether to resume from the latest checkpoint found in output_path/experiment_name (default: True).
save_checkpoint_every (int) – Save a checkpoint every N epochs (default: 5).
project_name (str) – Weights & Biases project name (default: "libauc").
experiment_name (str) – Weights & Biases run name; also used as the checkpoint sub-directory.
verbose (int) – Verbosity level. 0 = silent; 1 = progress bar; 2 = one line per epoch (default: 1).

Example:

>>> args = TrainingArguments(
...     optimizer="PESG",
...     optimizer_kwargs={"lr": 0.1, "momentum": 0.9},
...     loss="AUCMLoss",
...     loss_kwargs={"margin": 1.0},
...     SEED=42,
...     batch_size=128,
...     eval_batch_size=128,
...     sampling_rate=0.5,
...     epochs=50,
...     decay_epochs=[],
...     num_workers=2,
...     output_path="./output",
...     num_tasks=1,
...     resume_from_checkpoint=True,
...     save_checkpoint_every=5,
...     project_name="libauc",
...     experiment_name="my_experiment",
...     verbose=1,
... )
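The decay_epochs conversion rule described above (floats become int(f * epochs)) is easy to check in isolation. resolve_decay_epochs is a hypothetical helper that mirrors the documented rule; how TrainingArguments implements it internally is not shown here.

```python
def resolve_decay_epochs(decay_epochs, epochs):
    """Turn fractional entries into absolute epoch indices; integers pass through."""
    resolved = []
    for value in decay_epochs:
        if isinstance(value, float):
            resolved.append(int(value * epochs))  # documented rule: int(f * epochs)
        else:
            resolved.append(int(value))
    return resolved


resolved = resolve_decay_epochs([0.5, 0.75, 45], epochs=50)
print(resolved)  # [25, 37, 45]
```

Note that int() truncates, so 0.75 of 50 epochs resolves to epoch 37 rather than 38.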

parse_defaultconfig(type_name: str, multilabel: bool = False, kwargs: dict = {})[source]

Resolve a loss or optimizer name to its canonical {optimizer, loss} configuration dict by looking up the corresponding spaces class.

The mapping covers every loss/optimizer pair supported by libauc:

`type_name`	Space class
`AUCMLoss` / `PESG`	`AUCMLossSpace` (`MultiLabelAUCMLossSpace` when `multilabel=True`)
`CompositionalAUCLoss` / `PDSCA`	`CompositionalAUCLossSpace`
`APLoss` / `SOAP`	`APLossSpace` (`mAPLossSpace` when `multilabel=True`)
`pAUC_CVaR_Loss` / `SOPA` / `pAUCLoss` mode `SOPA`	`pAUC_CVaR_LossSpace` (`MultiLabel…` variant)
`pAUC_DRO_Loss` / `SOPAs` / `pAUCLoss` mode `1w`	`pAUC_DRO_LossSpace` (`MultiLabel…` variant)
`tpAUC_KL_Loss` / `SOTAs` / `pAUCLoss` mode `2w`	`tpAUC_KL_LossSpace` (`MultiLabel…` variant)
`tpAUC_CVaR_loss` / `STACO`	`tpAUC_CVaR_lossSpace`
`NDCGLoss` / `SONG`	`NDCGLossSpace`
`CrossEntropyLoss` / `SGD`	`SGDSpace`
`Adam`	`AdamSpace`
`BCELoss`	`BCELossSpace`

Parameters:

type_name (str) – Name of the loss or optimizer class.
multilabel (bool) – When True, selects the multi-label variant of the space if one exists (default: False).
kwargs (dict) – Additional keyword arguments for the loss/optimizer, used to disambiguate pAUCLoss by its mode key.

Returns:

{"optimizer": <optimizer_cfg>, "loss": <loss_cfg>} where each config is a dict with at least a "type" key and a "space" key containing the hyperparameter search space.

Return type:

dict

Raises:

ValueError – If type_name is not recognised.

Example:

>>> cfg = parse_defaultconfig("AUCMLoss", multilabel=False)
>>> cfg["optimizer"]["type"]
'PESG'
>>> cfg["loss"]["type"]
'AUCMLoss'
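Once a config dict is available, its "space" entries can be reduced to constructor keyword arguments. The sketch below hard-codes the PESG entry from AUCMLossSpace as listed in this reference, so that it runs without libauc; default_kwargs is a hypothetical helper, and falling back to a fixed val when no default is given is an assumption.

```python
# Copied from AUCMLossSpace.optimizer as listed in this reference.
PESG_OPTIMIZER_CFG = {
    "space": {
        "epoch_decay": {"default": 0.002, "val": (0.0, 0.01)},
        "lr": {"default": 0.1, "log": True, "val": (0.0001, 0.1)},
        "momentum": {"default": 0.9, "val": (0.8, 0.99)},
        "weight_decay": {"default": 1e-05, "val": (0.0, 0.0002)},
    },
    "type": "PESG",
}


def default_kwargs(cfg):
    """Collect per-hyperparameter defaults from a {"type", "space"} config dict."""
    out = {}
    for name, spec in cfg.get("space", {}).items():
        # Assumption: entries without "default" carry a single fixed value in "val".
        out[name] = spec["default"] if "default" in spec else spec["val"]
    return out


optimizer_kwargs = default_kwargs(PESG_OPTIMIZER_CFG)
print(PESG_OPTIMIZER_CFG["type"], optimizer_kwargs)
```

The resulting dict has the same shape as the optimizer_kwargs field of TrainingArguments.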

libauc.trainer.config.spaces

class APLossSpace[source]

loss = {'space': {'gamma': {'default': 0.9, 'val': (0.0, 1.0)}, 'margin': {'default': 1.0, 'val': [0.6, 0.8, 1.0]}}, 'type': 'APLoss'}

optimizer = {'space': {'lr': {'default': 0.001, 'log': True, 'val': (0.0001, 0.1)}, 'momentum': {'default': 0.9, 'val': (0.8, 0.99)}, 'weight_decay': {'default': 1e-05, 'val': (0.0, 0.0002)}}, 'type': 'SOAP'}

class AUCMLossSpace[source]

loss = {'space': {'margin': {'default': 1.0, 'val': [0.6, 0.8, 1.0]}}, 'type': 'AUCMLoss'}

optimizer = {'space': {'epoch_decay': {'default': 0.002, 'val': (0.0, 0.01)}, 'lr': {'default': 0.1, 'log': True, 'val': (0.0001, 0.1)}, 'momentum': {'default': 0.9, 'val': (0.8, 0.99)}, 'weight_decay': {'default': 1e-05, 'val': (0.0, 0.0002)}}, 'type': 'PESG'}

class AdamSpace[source]

loss = {'space': {}, 'type': 'CrossEntropyLoss'}

optimizer = {'space': {'lr': {'default': 0.001, 'log': True, 'val': (0.0001, 0.1)}, 'weight_decay': {'default': 0, 'val': (0.0, 0.0002)}}, 'type': 'Adam'}

class BCELossSpace[source]

loss = {'space': {}, 'type': 'BCELoss'}

optimizer = {}

class CompositionalAUCLossSpace[source]

loss = {'space': {'k': {'default': 1, 'val': [1, 2, 4]}, 'margin': {'default': 1.0, 'val': [0.6, 0.8, 1.0]}}, 'type': 'CompositionalAUCLoss'}

optimizer = {'space': {'epoch_decay': {'default': 0.002, 'val': (0.0, 0.01)}, 'lr': {'default': 0.1, 'log': True, 'val': (0.0001, 0.1)}, 'weight_decay': {'default': 1e-05, 'val': (0.0, 0.0002)}}, 'type': 'PDSCA'}

class MultiLabelAUCMLossSpace[source]

loss = {'space': {'margin': {'default': 1.0, 'val': [0.6, 0.8, 1.0]}}, 'type': 'MultiLabelAUCMLoss'}

optimizer = {'space': {'epoch_decay': {'default': 0.002, 'val': (0.0, 0.01)}, 'lr': {'default': 0.1, 'log': True, 'val': (0.0001, 0.1)}, 'momentum': {'default': 0.9, 'val': (0.8, 0.99)}, 'weight_decay': {'default': 1e-05, 'val': (0.0, 0.0002)}}, 'type': 'PESG'}

class MultiLabelpAUC_CVaR_LossSpace[source]

loss = {'space': {'beta': {'val': 0.2}, 'eta': {'default': 0.1, 'log': True, 'val': (0.01, 10)}, 'margin': {'default': 1.0, 'val': [0.1, 0.3, 0.5, 0.7, 0.9, 1.0]}, 'mode': {'val': 'SOPA'}}, 'type': 'MultiLabelpAUCLoss'}

optimizer = {'space': {'lr': {'default': 0.001, 'log': True, 'val': (0.0001, 0.1)}, 'momentum': {'default': 0.9, 'val': (0.8, 0.99)}, 'weight_decay': {'default': 0, 'val': (0.0, 0.0002)}}, 'type': 'SOPA'}

class MultiLabelpAUC_DRO_LossSpace[source]

loss = {'space': {'Lambda': {'default': 1.0, 'log': True, 'val': (0.1, 10.0)}, 'gamma': {'default': 0.9, 'val': (0.0, 1.0)}, 'margin': {'default': 1.0, 'val': [0.1, 0.3, 0.5, 0.7, 0.9, 1.0]}, 'mode': {'val': 'SOPAs'}}, 'type': 'MultiLabelpAUCLoss'}

optimizer = {'space': {'lr': {'default': 0.001, 'log': True, 'val': (0.0001, 0.1)}, 'momentum': {'default': 0.9, 'val': (0.8, 0.99)}, 'weight_decay': {'default': 1e-05, 'val': (0.0, 0.0002)}}, 'type': 'SOPAs'}

class MultiLabeltpAUC_KL_LossSpace[source]

loss = {'space': {'Lambda': {'default': 1.0, 'log': True, 'val': (0.1, 10.0)}, 'gammas': {'default': (0.9, 0.9), 'val': [(0.1, 0.1), (0.5, 0.5), (0.9, 0.9)]}, 'margin': {'default': 1.0, 'val': [0.1, 0.3, 0.5, 0.7, 0.9, 1.0]}, 'mode': {'val': 'SOTAs'}, 'tau': {'default': 1.0, 'log': True, 'val': (0.1, 10.0)}}, 'type': 'MultiLabelpAUCLoss'}

optimizer = {'space': {'lr': {'default': 0.001, 'log': True, 'val': (0.0001, 0.1)}, 'momentum': {'default': 0.9, 'val': (0.8, 0.99)}, 'weight_decay': {'default': 0, 'val': (0.0, 0.0002)}}, 'type': 'SOTAs'}

class NDCGLossSpace[source]

loss = {'space': {'eta0': {'default': 0.01, 'log': True, 'val': (0.001, 0.1)}, 'gamma0': {'default': 0.9, 'val': (0.0, 1.0)}, 'gamma1': {'val': 0.9}, 'margin': {'default': 1.0, 'val': [0.1, 0.3, 0.5, 0.7, 0.9, 1.0]}, 'sigmoid_alpha': {'default': 2.0, 'val': (1.0, 2.0)}}, 'type': 'NDCGLoss'}

optimizer = {'space': {'lr': {'default': 0.1, 'log': True, 'val': (0.0001, 0.1)}, 'momentum': {'default': 0.9, 'val': (0.8, 0.99)}, 'weight_decay': {'default': 0, 'val': (0.0, 0.0002)}}, 'type': 'SONG'}

class SGDSpace[source]

loss = {'space': {}, 'type': 'CrossEntropyLoss'}

optimizer = {'space': {'lr': {'default': 0.1, 'log': True, 'val': (0.0001, 0.1)}, 'momentum': {'default': 0, 'val': [0, 0.9]}, 'weight_decay': {'default': 0, 'val': (0.0, 0.0002)}}, 'type': 'SGD'}

class mAPLossSpace[source]

loss = {'space': {'gamma': {'default': 0.9, 'val': (0.0, 1.0)}, 'margin': {'default': 1.0, 'val': [0.6, 0.8, 1.0]}}, 'type': 'mAPLoss'}

optimizer = {'space': {'lr': {'default': 0.001, 'log': True, 'val': (0.0001, 0.1)}, 'momentum': {'default': 0.9, 'val': (0.8, 0.99)}, 'weight_decay': {'default': 1e-05, 'val': (0.0, 0.0002)}}, 'type': 'SOAP'}

class pAUC_CVaR_LossSpace[source]

loss = {'space': {'beta': {'val': 0.2}, 'eta': {'default': 0.1, 'log': True, 'val': (0.01, 10)}, 'margin': {'default': 1.0, 'val': [0.1, 0.3, 0.5, 0.7, 0.9, 1.0]}, 'mode': {'val': 'SOPA'}}, 'type': 'pAUCLoss'}

optimizer = {'space': {'lr': {'default': 0.001, 'log': True, 'val': (0.0001, 0.1)}, 'momentum': {'default': 0.9, 'val': (0.8, 0.99)}, 'weight_decay': {'default': 0, 'val': (0.0, 0.0002)}}, 'type': 'SOPA'}

class pAUC_DRO_LossSpace[source]

loss = {'space': {'Lambda': {'default': 1.0, 'log': True, 'val': (0.1, 10.0)}, 'gamma': {'default': 0.9, 'val': (0.0, 1.0)}, 'margin': {'default': 1.0, 'val': [0.1, 0.3, 0.5, 0.7, 0.9, 1.0]}, 'mode': {'val': 'SOPAs'}}, 'type': 'pAUCLoss'}

optimizer = {'space': {'lr': {'default': 0.001, 'log': True, 'val': (0.0001, 0.1)}, 'momentum': {'default': 0.9, 'val': (0.8, 0.99)}, 'weight_decay': {'default': 1e-05, 'val': (0.0, 0.0002)}}, 'type': 'SOPAs'}

class tpAUC_CVaR_lossSpace[source]

loss = {'space': {'alpha': {'default': 0.1, 'log': True, 'val': (0.0001, 0.1)}, 'beta_0': {'default': 0.1, 'log': True, 'val': (0.0001, 0.1)}, 'beta_1': {'default': 0.1, 'log': True, 'val': (0.0001, 0.1)}, 'theta_0': {'default': 0.5, 'val': [0.3, 0.5, 0.7]}, 'theta_1': {'default': 0.5, 'val': [0.3, 0.5, 0.7]}, 'threshold': {'default': 0.5, 'val': [0.3, 0.5, 0.7]}}, 'type': 'tpAUC_CVaR_loss'}

optimizer = {'space': {'lr': {'default': 0.001, 'log': True, 'val': (0.0001, 0.01)}, 'momentum': {'default': 0.9, 'val': (0.8, 0.99)}, 'weight_decay': {'default': 0, 'val': (0.0, 0.0002)}}, 'type': 'STACO'}

class tpAUC_KL_LossSpace[source]

loss = {'space': {'Lambda': {'default': 1.0, 'log': True, 'val': (0.1, 10.0)}, 'gammas': {'default': (0.9, 0.9), 'val': [(0.1, 0.1), (0.5, 0.5), (0.9, 0.9)]}, 'margin': {'default': 1.0, 'val': [0.1, 0.3, 0.5, 0.7, 0.9, 1.0]}, 'mode': {'val': 'SOTAs'}, 'tau': {'default': 1.0, 'log': True, 'val': (0.1, 10.0)}}, 'type': 'pAUCLoss'}

optimizer = {'space': {'lr': {'default': 0.001, 'log': True, 'val': (0.0001, 0.1)}, 'momentum': {'default': 0.9, 'val': (0.8, 0.99)}, 'weight_decay': {'default': 0, 'val': (0.0, 0.0002)}}, 'type': 'SOTAs'}
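The space dicts above appear to follow a simple convention: a tuple in val looks like a continuous range, a list looks like a set of discrete choices, a bare value is fixed, and log marks a log-scaled range. That reading is an assumption drawn from the listings, not a documented contract, and sample_space below is a hypothetical helper showing how such a space could be sampled.

```python
import math
import random

# Copied from pAUC_CVaR_LossSpace.loss["space"] as listed above.
PAUC_CVAR_LOSS_SPACE = {
    "beta": {"val": 0.2},
    "eta": {"default": 0.1, "log": True, "val": (0.01, 10)},
    "margin": {"default": 1.0, "val": [0.1, 0.3, 0.5, 0.7, 0.9, 1.0]},
    "mode": {"val": "SOPA"},
}


def sample_space(space, rng):
    """Draw one configuration from a space dict (interpretation of 'val' is assumed)."""
    sampled = {}
    for name, spec in space.items():
        val = spec["val"]
        if isinstance(val, list):  # discrete choices
            sampled[name] = rng.choice(val)
        elif isinstance(val, tuple):  # continuous (low, high) range
            low, high = val
            if spec.get("log", False):
                sampled[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
            else:
                sampled[name] = rng.uniform(low, high)
        else:  # fixed value
            sampled[name] = val
    return sampled


sample = sample_space(PAUC_CVAR_LOSS_SPACE, random.Random(0))
print(sample)
```

Passing a seeded random.Random instance keeps the draw reproducible without touching the global random state.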

libauc.trainer.data.datasets

class ChemicalDataset(dataset, class_id)[source]

class GraphDataset(name, root='dataset', transform=None, pre_transform=None, meta_dict=None)[source]

class ImageDataset(images, targets, image_size=32, crop_size=30, mode='train')[source]

class IndexedDataset(dataset, class_id=None)[source]

class MedicalImageCSVDataset(csv_path: str, image_root: str, image_col: str, label_col: str, transform)[source]

General-purpose CSV-backed medical image dataset.

Expects a CSV with at least an image path column and a binary label column. Image paths in the CSV may be relative (resolved against image_root) or absolute.

Parameters:

csv_path – Path to the metadata CSV.
image_root – Directory that image paths are resolved against when they are not absolute. Ignored for absolute paths.
image_col – Column name containing the image filename / path.
label_col – Column name containing the binary label (0 / 1).
transform – torchvision transform applied to each PIL image.
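The path-resolution rule for image_root can be illustrated without any image libraries. resolve_image_path is a hypothetical helper that mirrors the behaviour described above; the dataset class itself may implement it differently.

```python
import os


def resolve_image_path(path, image_root):
    """Absolute paths are used as-is; relative ones are joined onto image_root."""
    if os.path.isabs(path):
        return path
    return os.path.join(image_root, path)


absolute = os.path.abspath("scan_001.png")
print(resolve_image_path("patient01/scan_001.png", "images"))
print(resolve_image_path(absolute, "images") == absolute)
```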

class TextDataset(dataframe, text_col, label_col)[source]

load_dataset(name: str, splits: List[str], **kwargs) → Dataset[source]

Load a dataset by name and split.

Parameters:

name – Dataset identifier (e.g. "catvsdog", "chexpert").
splits – Evaluation splits.
**kwargs – Extra dataset-specific keyword arguments from the config.

Returns:

A torch.utils.data.Dataset whose __getitem__ yields (data, label, index) tuples, as expected by the Trainer.
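The (data, label, index) contract, together with the .targets attribute that Trainer expects on the training split, can be sketched with a plain Python class. IndexedWrapper is a hypothetical stand-in and deliberately avoids importing torch; a real dataset would subclass torch.utils.data.Dataset.

```python
class IndexedWrapper:
    """Wrap (data, label) pairs so that each item also reports its own index."""

    def __init__(self, samples):
        self.samples = list(samples)
        # Label list exposed for samplers that balance positives and negatives.
        self.targets = [label for _, label in self.samples]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        data, label = self.samples[index]
        return data, label, index


dataset = IndexedWrapper([("img_a", 0), ("img_b", 1), ("img_c", 0)])
print(len(dataset), dataset[1], dataset.targets)
```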
