.. _libauc_trainer:

Train with LibAUC Trainer
================================================================================================================================

.. container:: cell markdown

   | **Author**: Siqi Guo, Gang Li, Tianbao Yang

Introduction
------------------------------------------------------------------------------------

The **LibAUC Trainer** is a high-level training interface that turns a YAML config file into a complete training run: data loading, model construction, loss/optimizer wiring, evaluation, and checkpointing are all handled automatically.

This tutorial walks through a **two-stage workflow** for AUROC maximization:

.. list-table::
   :widths: 10 30 60
   :header-rows: 1

   * - Stage
     - Goal
     - Config
   * - 1
     - Warm-up with cross-entropy pretraining
     - ``ce_config.yaml`` (``BCELoss`` + ``Adam``)
   * - 2
     - Fine-tune for AUROC with AUC-aware loss
     - ``aucmloss_config.yaml`` (``AUCMLoss`` + ``PESG``)

.. note::

   Pretraining first and then switching to an AUC loss typically yields **higher AUROC** than training with the AUC loss from scratch, because the model starts from a meaningful feature representation.

Quick Start
------------------------------------------------------------------------------------

.. code:: bash

   # Stage 1: cross-entropy warm-up
   python -m libauc.trainer.run_image_trainer --config_file ce_config.yaml

   # Stage 2: AUROC optimization
   python -m libauc.trainer.run_image_trainer --config_file aucmloss_config.yaml

Any field in the YAML can be overridden directly on the command line:

.. code:: bash

   python -m libauc.trainer.run_image_trainer --config_file aucmloss_config.yaml \
       --epochs 50 --batch_size 64 --sampling_rate 0.3

Supported Loss / Optimizer Pairings
------------------------------------------------------------------------------------

The ``loss`` and ``optimizer`` fields must be a compatible pair.

.. 
list-table::
   :widths: 40 30 30
   :header-rows: 1

   * - Task
     - Loss
     - Optimizer
   * - AUROC (binary)
     - ``AUCMLoss``
     - ``PESG``
   * - AUROC (multi-label)
     - ``MultiLabelAUCMLoss`` *(auto-selected)*
     - ``PESG``
   * - Compositional AUROC
     - ``CompositionalAUCLoss``
     - ``PDSCA``
   * - Average Precision
     - ``APLoss``
     - ``SOAP``
   * - Partial AUROC (CVaR)
     - ``pAUC_CVaR_Loss`` / ``pAUCLoss`` (mode: ``SOPA``)
     - ``SOPA``
   * - Partial AUROC (DRO)
     - ``pAUC_DRO_Loss`` / ``pAUCLoss`` (mode: ``1w``)
     - ``SOPAs``
   * - Two-way partial AUROC (KL)
     - ``tpAUC_KL_Loss`` / ``pAUCLoss`` (mode: ``2w``)
     - ``SOTAs``
   * - Two-way partial AUROC (CVaR)
     - ``tpAUC_CVaR_loss``
     - ``STACO``
   * - NDCG
     - ``NDCGLoss``
     - ``SONG``
   * - CE pretraining (SGD)
     - ``CrossEntropyLoss``
     - ``SGD``
   * - CE pretraining (Adam)
     - ``BCELoss``
     - ``Adam``

For parameter details see the ``libauc.optimizers`` API reference.

Two-Stage Training Walkthrough
------------------------------------------------------------------------------------

Step 1: CE Pretraining
~~~~~~~~~~~~~~~~~~~~~~

Create ``ce_config.yaml``:

.. 
code-block:: yaml

   # ── Dataset: which dataset to load, splits to evaluate, and class-imbalance ratio ──
   dataset:
     name: cifar10                 # see "Supported Datasets" below
     eval_splits: [val, test]

   # ── Model: architecture, weight initialization, and output format ──────────────────
   model:
     name: resnet18                # see "Supported Models" below
     pretrained: false

   # ── Metrics: evaluation metrics computed on every eval split ───────────────────────
   metrics:
     - AUROC                       # see "Supported Metrics" below

   # ── Training: hyperparameters, loss, optimizer, and checkpointing ──────────────────
   training:
     # Experiment metadata
     project_name: libauc
     experiment_name: resnet18_ce_cifar10
     SEED: 2026

     # Data loading
     epochs: 100
     batch_size: 128
     eval_batch_size: 256
     sampling_rate: 0.5            # positive fraction per batch (DualSampler)
     num_workers: 2
     decay_epochs: [0.5, 0.75]     # fractions of total epochs, or absolute ints

     # Loss
     loss: BCELoss
     loss_kwargs: {}

     # Optimizer
     optimizer: Adam
     optimizer_kwargs:
       lr: 1.0e-3
       weight_decay: 1.0e-4

     # Output and checkpointing
     output_path: ./output
     resume_from_checkpoint: false
     save_checkpoint_every: 5
     verbose: 1                    # 0 = silent | 1 = progress bar | 2 = one line/epoch

Run it:

.. code:: bash

   python -m libauc.trainer.run_image_trainer --config_file ce_config.yaml

Expected checkpoint path after training:

.. code:: text

   ./output/resnet18_ce_cifar10/epoch_100.pt

Step 2: AUROC Optimization
~~~~~~~~~~~~~~~~~~~~~~~~~~

Create ``aucmloss_config.yaml``:

.. 
code-block:: yaml

   # ── Dataset: same split and imbalance ratio as Stage 1 ────────────────────────────
   dataset:
     name: cifar10
     eval_splits: [val, test]

   # ── Model: fine-tune from the Stage-1 checkpoint ──────────────────────────────────
   model:
     name: resnet18
     pretrained: true
     pretrained_path: "./output/resnet18_ce_cifar10/epoch_100.pt"

   # ── Metrics: evaluation metrics computed on every eval split ───────────────────────
   metrics:
     - AUROC

   # ── Training: AUCMLoss and PESG ────────────────────────────────────
   training:
     # Experiment metadata
     project_name: libauc
     experiment_name: resnet18_AUCMLoss_cifar10
     SEED: 2026

     # Data loading
     epochs: 100
     batch_size: 128
     eval_batch_size: 256
     sampling_rate: 0.2
     num_workers: 2
     decay_epochs: [0.5, 0.75]

     # Loss
     loss: AUCMLoss
     loss_kwargs:                  # hyper-parameters for AUCMLoss
       margin: 1.0

     # Optimizer
     optimizer: PESG
     optimizer_kwargs:             # hyper-parameters for PESG
       lr: 0.1
       epoch_decay: 0.002
       weight_decay: 1.0e-5
       momentum: 0.9

     # Output and checkpointing
     output_path: ./output
     resume_from_checkpoint: false
     save_checkpoint_every: 5
     verbose: 1

Run it:

.. code:: bash

   python -m libauc.trainer.run_image_trainer --config_file aucmloss_config.yaml

Recipe Library
------------------------------------------------------------------------------------

Ready-to-run config files for common tasks. Download a file and pass it directly to the matching CLI entry point (image or graph trainer):

.. code:: bash

   python -m libauc.trainer.run_image_trainer --config_file
   python -m libauc.trainer.run_graph_trainer --config_file

**Image classifier recipes**

.. 
list-table:: :widths: 28 10 17 45 :header-rows: 1 * - Config - View on Github - Loss / Optimizer - Description * - :download:`config_ce_Cifar10.yaml <../trainer_recipes/image_trainer/config_ce_Cifar10.yaml>` - `GitHub `__ - BCELoss / Adam - Stage 1 warm-up on imbalanced CIFAR-10 (ResNet-18, imratio 0.1) * - :download:`config_auc_Cifar10.yaml <../trainer_recipes/image_trainer/config_auc_Cifar10.yaml>` - `GitHub `__ - AUCMLoss / PESG - Stage 2 binary AUROC maximisation on CIFAR-10, fine-tuned from Stage 1 * - :download:`config_auprc_Cifar10.yaml <../trainer_recipes/image_trainer/config_auprc_Cifar10.yaml>` - `GitHub `__ - APLoss / SOAP - Average-precision (AUPRC) optimisation on CIFAR-10, fine-tuned from Stage 1 * - :download:`config_opauc_Cifar10.yaml <../trainer_recipes/image_trainer/config_opauc_Cifar10.yaml>` - `GitHub `__ - pAUCLoss (1w) / SOPAs - One-way partial AUROC (FPR ≤ 0.3) on CIFAR-10, fine-tuned from Stage 1 * - :download:`config_tpauc_Cifar10.yaml <../trainer_recipes/image_trainer/config_tpauc_Cifar10.yaml>` - `GitHub `__ - pAUCLoss (2w) / SOTAs - Two-way partial AUROC on CIFAR-10, fine-tuned from Stage 1 * - :download:`config_compositional_auc_Cifar10.yaml <../trainer_recipes/image_trainer/config_compositional_auc_Cifar10.yaml>` - `GitHub `__ - CompositionalAUCLoss / PDSCA - Compositional AUROC on CIFAR-10 from scratch (ResNet-20) * - :download:`config_ce_BreastMNIST.yaml <../trainer_recipes/image_trainer/config_ce_BreastMNIST.yaml>` - `GitHub `__ - BCELoss / Adam - Stage 1 warm-up on BreastMNIST (binary, grayscale ultrasound) * - :download:`config_auc_BreastMNIST.yaml <../trainer_recipes/image_trainer/config_auc_BreastMNIST.yaml>` - `GitHub `__ - AUCMLoss / PESG - Stage 2 AUROC maximisation on BreastMNIST, fine-tuned from Stage 1 * - :download:`config_ce_PneumoniaMNIST.yaml <../trainer_recipes/image_trainer/config_ce_PneumoniaMNIST.yaml>` - `GitHub `__ - BCELoss / Adam - Stage 1 warm-up on PneumoniaMNIST (binary, chest X-ray) * - 
:download:`config_auc_PneumoniaMNIST.yaml <../trainer_recipes/image_trainer/config_auc_PneumoniaMNIST.yaml>` - `GitHub `__ - AUCMLoss / PESG - Stage 2 AUROC maximisation on PneumoniaMNIST, fine-tuned from Stage 1 * - :download:`config_auc_ChestMNIST.yaml <../trainer_recipes/image_trainer/config_auc_ChestMNIST.yaml>` - `GitHub `__ - MultiLabelAUCMLoss / PESG - Multi-label AUROC on ChestMNIST (14 thorax disease labels) * - :download:`config_auc_melanoma.yaml <../trainer_recipes/image_trainer/config_auc_melanoma.yaml>` - `GitHub `__ - AUCMLoss / PESG - Binary AUROC on the melanoma dermoscopy dataset (ResNet-18) * - :download:`config_auc_ddsm.yaml <../trainer_recipes/image_trainer/config_auc_ddsm.yaml>` - `GitHub `__ - AUCMLoss / PESG - Binary AUROC on DDSM mammography (DenseNet-121, ImageNet warm-start) **Graph neural network recipes** .. list-table:: :widths: 28 10 17 45 :header-rows: 1 * - Config - View on Github - Loss / Optimizer - Description * - :download:`config_ce_ogbg-molhiv.yaml <../trainer_recipes/graph_trainer/config_ce_ogbg-molhiv.yaml>` - `GitHub `__ - BCELoss / Adam - Stage 1 warm-up on ogbg-molhiv (GINE, binary molecular property) * - :download:`config_auprc_ogbg-molhiv.yaml <../trainer_recipes/graph_trainer/config_auprc_ogbg-molhiv.yaml>` - `GitHub `__ - APLoss / SOAP - Average-precision (AUPRC) optimisation on ogbg-molhiv, fine-tuned from Stage 1 * - :download:`config_opauc_ogbg-molhiv.yaml <../trainer_recipes/graph_trainer/config_opauc_ogbg-molhiv.yaml>` - `GitHub `__ - pAUCLoss (1w) / SOPAs - One-way partial AUROC (FPR ≤ 0.3) on ogbg-molhiv, fine-tuned from Stage 1 * - :download:`config_tpauc_ogbg-molhiv.yaml <../trainer_recipes/graph_trainer/config_tpauc_ogbg-molhiv.yaml>` - `GitHub `__ - pAUCLoss (2w) / SOTAs - Two-way partial AUROC on ogbg-molhiv, fine-tuned from Stage 1 Config Reference ------------------------------------------------------------------------------------ Supported Datasets ~~~~~~~~~~~~~~~~~~ The ``dataset.name`` key selects the 
dataset and ``dataset.kwargs`` passes dataset-specific arguments to the loader. See `libauc.trainer.data.datasets.load_dataset `_ for the full implementation. .. list-table:: :widths: 25 20 55 :header-rows: 1 * - ``name`` - ``eval_splits`` - ``kwargs`` (all optional unless marked required) * - ``cifar10`` - ``val``, ``test`` - ``root_path`` (str, default ``./data``); ``imratio`` (float, default ``0.1``) — positive-class ratio after resampling * - ``chexpert`` - ``val``, ``test`` - ``root_path`` (str, default ``./data``); ``val_size`` (float, default ``0.05``) — fraction of training data held out for validation * - ``pneumoniamnist`` - ``val``, ``test`` - ``root_path`` (str, default ``./data``) * - ``breastmnist`` - ``val``, ``test`` - ``root_path`` (str, default ``./data``) * - ``chestmnist`` - ``val``, ``test`` - ``root_path`` (str, default ``./data``); ``task`` (int or null) — column index to use as the binary label (null = multi-label) * - ``melanoma`` - ``val``, ``test`` - ``root_path`` (str, default ``./data``) * - ``ddsm`` - ``val``, ``test`` - ``root_path`` (str, default ``./data``); ``val_size`` (float, default ``0.1``); ``image_size`` (int, default ``224``) * - ``ogbg-molhiv`` - ``val``, ``test`` - ``root_path`` (str, default ``./data``) * - ``ogbg-moltox21`` - ``val``, ``test`` - ``root_path`` (str, default ``./data``) * - ``ogbg-molmuv`` - ``val``, ``test`` - ``root_path`` (str, default ``./data``) * - ``ogbg-molpcba`` - ``val``, ``test`` - ``root_path`` (str, default ``./data``) Supported Models ~~~~~~~~~~~~~~~~ .. list-table:: :widths: 35 65 :header-rows: 1 * - ``model.name`` - Notes * - ``resnet18`` - Supports ``pretrained_remote: true`` (ImageNet weights from libauc hub) * - ``resnet20`` - Lightweight; designed for CIFAR-scale inputs * - ``densenet121`` - Supports ``pretrained_remote: true``; good for chest X-ray tasks * - ``/`` *(any HF repo ID)* - Loaded via ``AutoModelForImageClassification.from_pretrained``. 
       ``AutoImageProcessor`` is applied automatically in the DataLoader, so the dataset needs no HF-specific transforms. Examples: ``google/vit-base-patch16-224``, ``openai/clip-vit-base-patch32``, ``microsoft/resnet-50``.

When ``pretrained: true``, the Trainer loads a local checkpoint from ``pretrained_path`` and resets the final classification head (``fc``, ``linear``, ``classifier``, or ``head``) so it can be retrained for the new loss.

When ``pretrained_remote: true``, ImageNet weights are downloaded via libauc's model hub (independent of ``pretrained``). This option is ignored for HF models.

Supported Metrics
~~~~~~~~~~~~~~~~~

.. list-table::
   :widths: 20 80
   :header-rows: 1

   * - ``metrics`` entry
     - Behaviour
   * - ``AUROC``
     - Full-ROC AUROC. With ``metric_kwargs: [{max_fpr: 0.3}]`` or ``{min_tpr: 0.8}`` it reports partial AUROC instead.
   * - ``AUPRC``
     - Area under the precision-recall curve.
   * - ``ACC``
     - Accuracy at threshold 0.5.

Example: track both ACC and partial AUROC simultaneously. The entries of ``metric_kwargs`` are matched to ``metrics`` by position.

.. code-block:: yaml

   metrics:
     - ACC
     - AUROC
   metric_kwargs:
     - {}                # no kwargs needed for ACC
     - {max_fpr: 0.3}    # pAUROC at FPR ≤ 0.3

Training Field Reference
~~~~~~~~~~~~~~~~~~~~~~~~

.. list-table::
   :widths: 30 15 55
   :header-rows: 1

   * - Field
     - Default
     - Description
   * - ``project_name``
     - ``libauc``
     - Weights & Biases project name
   * - ``experiment_name``
     - ``run``
     - Run name; also used as the checkpoint sub-directory
   * - ``SEED``
     - ``42``
     - Global random seed (NumPy, PyTorch, cuDNN)
   * - ``epochs``
     - ``50``
     - Total training epochs
   * - ``batch_size``
     - ``128``
     - Mini-batch size during training
   * - ``eval_batch_size``
     - ``128``
     - Mini-batch size during evaluation
   * - ``sampling_rate``
     - ``0.5``
     - Positive-class fraction fed to ``DualSampler`` / ``TriSampler`` per batch
   * - ``num_workers``
     - ``2``
     - DataLoader worker processes
   * - ``decay_epochs``
     - ``[]``
     - Epochs at which the LR / regulariser is decayed. Floats are multiplied by ``epochs`` (e.g.
       ``0.5`` → epoch 50 when ``epochs`` is 100)
   * - ``loss``
     - ``AUCMLoss``
     - Loss class name (see pairings table above)
   * - ``loss_kwargs``
     - ``{}``
     - Extra keyword arguments forwarded to the loss constructor
   * - ``optimizer``
     - ``PESG``
     - Optimizer class name
   * - ``optimizer_kwargs``
     - ``{}``
     - Extra keyword arguments forwarded to the optimizer constructor
   * - ``output_path``
     - ``./output``
     - Root directory for checkpoints
   * - ``resume_from_checkpoint``
     - ``true``
     - Resume from the latest checkpoint in ``output_path/experiment_name``
   * - ``save_checkpoint_every``
     - ``5``
     - Save a checkpoint every N epochs
   * - ``verbose``
     - ``1``
     - ``0`` = silent; ``1`` = tqdm progress bar; ``2`` = one line per epoch

Extending the Trainer
------------------------------------------------------------------------------------

Adding a New Dataset
~~~~~~~~~~~~~~~~~~~~

All dataset logic lives in a single function: ``libauc/trainer/data/datasets.py`` → ``load_dataset()``. Add a new ``elif`` branch that returns ``(train_dataset, eval_datasets)``, where ``eval_datasets`` is a list with one dataset per requested split. ``train_dataset`` and every entry of ``eval_datasets`` must be ``torch.utils.data.Dataset`` subclasses whose ``__getitem__`` yields ``(data, label, index)`` tuples.

.. code-block:: python

   # libauc/trainer/data/datasets.py  (excerpt; "..." marks the existing branches)

   def load_dataset(name: str, splits: List[str], **kwargs) -> Tuple[Dataset, List[Dataset]]:
       ...
       elif name == "mydata":
           root = kwargs.get("root_path", "./data")
           # Build train / eval datasets here.
           # __getitem__ must return (data, label, index).
           train_dataset = MyTrainDataset(root=root)
           eval_datasets = []
           # Keep eval_datasets in the same order as the requested splits.
           for split in splits:
               if split == "val":
                   eval_datasets.append(MyValDataset(root=root))
               elif split == "test":
                   eval_datasets.append(MyTestDataset(root=root))
               else:
                   raise NotImplementedError(
                       f"Split '{split}' not supported for 'mydata'."
                   )
           return train_dataset, eval_datasets
       ...

Then reference it in your config:

.. code-block:: yaml

   dataset:
     name: mydata
     eval_splits: [val, test]
     kwargs:
       root_path: /path/to/mydata

.. 
tip::

   Use the built-in ``IndexedDataset`` wrapper to add the required ``index`` return value to any standard ``torchvision`` dataset:

   .. code-block:: python

      import os

      import torchvision.datasets as tvd

      from libauc.trainer.data.datasets import IndexedDataset

      # root and train_transform come from the surrounding load_dataset() branch.
      train_dataset = IndexedDataset(
          tvd.ImageFolder(root=os.path.join(root, "train"), transform=train_transform)
      )

Adding a New Model
~~~~~~~~~~~~~~~~~~

Model construction is handled in ``libauc/trainer/core/image_trainer.py`` → ``Trainer._build_model()``. Add a new ``elif`` branch that instantiates your model; the shared code after the branches moves it to the GPU, optionally loads a checkpoint, and assigns it to ``self.model``:

.. code-block:: python

   # libauc/trainer/core/image_trainer.py

   def _build_model(self, model_cfg: dict):
       name = model_cfg.get("name", "").lower()
       pretrained = model_cfg.get("pretrained", False)
       pretrained_remote = model_cfg.get("pretrained_remote", False)

       if name == "resnet18":
           ...
       elif name == "mymodel":
           from mypackage import MyModel
           model = MyModel(model_cfg)
       else:
           raise ValueError(f"Unknown model '{name}'.")

       model = model.cuda()

       if pretrained:
           pretrained_path = model_cfg.get("pretrained_path")
           state_dict = torch.load(pretrained_path, weights_only=False)
           # Trainer checkpoints wrap the weights under "model_state_dict".
           if "model_state_dict" in state_dict:
               state_dict = state_dict["model_state_dict"]
           # Strip the classification head (fc / linear / classifier / head)
           # so it is re-initialised for the new task.
           # _HEAD_PARTS is the module-level collection of those layer names.
           filtered = {
               k: v for k, v in state_dict.items()
               if not any(part in k.split(".") for part in _HEAD_PARTS)
           }
           # strict=False tolerates the head keys that were filtered out.
           model.load_state_dict(filtered, strict=False)
           if hasattr(model, "fc"):
               model.fc.reset_parameters()

       self.model = model

Then reference it in your config:

.. code-block:: yaml

   model:
     name: mymodel

.. note::

   The model's final layer must output raw logits (no sigmoid / softmax). Pass ``last_activation=None`` when using libauc built-in models, as the loss functions apply their own activation internally.
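
The head-stripping filter used in ``_build_model`` can be tried in isolation. The sketch below is only an illustration: ``strip_head`` is a hypothetical helper, the tuple assigned to ``_HEAD_PARTS`` is an assumption based on the layer names listed in this section (the library's actual constant may differ), and plain strings stand in for tensors.

.. code-block:: python

   # Assumed head-layer names, taken from the prose above; not copied from libauc.
   _HEAD_PARTS = ("fc", "linear", "classifier", "head")


   def strip_head(state_dict):
       """Return a copy of state_dict without entries that belong to a head layer."""
       return {
           key: value
           for key, value in state_dict.items()
           # A key is dropped only if one of its dot-separated components
           # is exactly a head name, so "fcn_block.weight" is kept.
           if not any(part in key.split(".") for part in _HEAD_PARTS)
       }


   # Toy checkpoint: strings stand in for tensors.
   checkpoint = {
       "conv1.weight": "w0",
       "layer1.0.conv1.weight": "w1",
       "fc.weight": "w2",
       "fc.bias": "b2",
       "fcn_block.weight": "w3",
   }
   backbone_only = strip_head(checkpoint)
   print(sorted(backbone_only))
   # ['conv1.weight', 'fcn_block.weight', 'layer1.0.conv1.weight']

Because the match is made on whole key components rather than substrings, backbone layers whose names merely contain ``fc`` or ``head`` survive the filter, and ``load_state_dict(..., strict=False)`` then leaves the head at its fresh initialisation.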

Adding a Custom Trainer
~~~~~~~~~~~~~~~~~~~~~~~

If the built-in dataset/model extension points are not enough (for example, you need a custom forward pass, auxiliary losses, mixed-precision training, or a completely different sampling strategy), you can subclass :class:`~libauc.trainer.core.image_trainer.Trainer` and override only the methods you need, exactly as :class:`~libauc.trainer.core.graph_trainer.GraphTrainer` does for graph data.

**Overridable hooks**

.. list-table::
   :widths: 35 65
   :header-rows: 1

   * - Method
     - What to override it for
   * - ``_build_model(self, model_cfg)``
     - Construct a custom architecture; set ``self.model``
   * - ``_construct_optimizer_and_loss(self, model, train_args)``
     - Wire a custom loss or optimizer; return ``(loss_fn, optimizer)``
   * - ``_get_train_dataloader(self, train_args)``
     - Replace the default ``DualSampler`` / ``DataLoader`` setup
   * - ``_get_eval_dataloader(self, dataset, train_args)``
     - Use a different evaluation loader (e.g. graph batching)
   * - ``train(self)``
     - Fully customise the training loop (mixed-precision, grad clipping, …)

HuggingFace Integration
------------------------------------------------------------------------------------

The Trainer has **built-in** support for any HuggingFace image-classification model; no code changes are required. Set ``model.name`` to a HuggingFace repo ID (any string containing ``/``) and the Trainer will:

1. Download the model via ``AutoModelForImageClassification.from_pretrained``.
2. Reinitialize the classifier head to match your task (binary or multi-label).
3. Download the ``AutoImageProcessor`` and apply it inside the DataLoader collate function, so your dataset needs no HF-specific transforms.

Quick Start: HuggingFace Model
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A single line in your YAML is all that is needed:

.. 
code-block:: yaml

   model:
     name: google/vit-base-patch16-224   # any HuggingFace repo ID

Understanding the Classifier-Head Mismatch Log
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When you load a pretrained checkpoint the Trainer prints a brief load report:

.. code:: text

   classifier.weight | MISMATCH | Reinit — ckpt: torch.Size([1000, 768]) vs model: torch.Size([1, 768])
   classifier.bias | MISMATCH | Reinit — ckpt: torch.Size([1000]) vs model: torch.Size([1])

**This is expected and correct.** The upstream checkpoint was trained on ImageNet-1k (1 000 classes); your task uses a single binary output, so the classifier weights cannot be reused and are reinitialized with the new shape.

How Preprocessing Works
~~~~~~~~~~~~~~~~~~~~~~~

For HuggingFace models, an ``AutoImageProcessor`` is downloaded alongside the model weights and applied inside the DataLoader's ``collate_fn``, on CPU in a worker process, *before* the batch reaches the GPU. The processor handles:

- **Resize** to the model's expected resolution (e.g. 224 × 224 for both ViT-Base and CLIP)
- **Normalisation** with the model-specific mean / std

Because preprocessing happens in the DataLoader, your dataset's ``__getitem__`` only needs a plain ``transforms.ToTensor()``, with no ``transforms.Resize`` or ``transforms.Normalize``. Single-channel (grayscale) images are automatically expanded to 3 channels before the processor is called.

.. note::

   Images entering the DataLoader should be ``float32`` tensors in the ``[0, 1]`` range, i.e. the output of ``torchvision.transforms.ToTensor()``. Do **not** apply any additional resize or normalise transforms in the dataset; the processor will handle that.

Two-Stage AUROC Training with a HuggingFace Model
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The same BCE → AUCMLoss workflow works with HF models.

Create ``vit_ce.yaml`` (Stage 1: BCE warm-up):

.. 
code-block:: yaml

   dataset:
     name: cifar10
     eval_splits: [val, test]

   model:
     name: google/vit-base-patch16-224   # replace with any HF repo ID

   metrics:
     - AUROC

   training:
     experiment_name: vit_bce_cifar10
     epochs: 10
     batch_size: 32
     eval_batch_size: 64
     sampling_rate: 0.5
     num_workers: 4
     decay_epochs: [0.6, 0.9]
     loss: BCELoss
     loss_kwargs: {}
     optimizer: Adam
     optimizer_kwargs:
       lr: 2.0e-5                        # small LR for fine-tuning a pretrained model
       weight_decay: 1.0e-4
     output_path: ./output
     save_checkpoint_every: 5
     verbose: 1

Create ``vit_auc.yaml`` (Stage 2: AUROC optimization, fine-tuned from Stage 1):

.. code-block:: yaml

   dataset:
     name: cifar10
     eval_splits: [val, test]

   model:
     name: google/vit-base-patch16-224
     pretrained: true
     pretrained_path: "./output/vit_bce_cifar10/epoch_10.pt"

   metrics:
     - AUROC

   training:
     experiment_name: vit_AUCMLoss_cifar10
     epochs: 20
     batch_size: 32
     eval_batch_size: 64
     sampling_rate: 0.2
     num_workers: 4
     decay_epochs: [0.5, 0.75]
     loss: AUCMLoss
     loss_kwargs:
       margin: 1.0
     optimizer: PESG
     optimizer_kwargs:
       lr: 0.05
       epoch_decay: 0.002
       weight_decay: 1.0e-5
       momentum: 0.9
     output_path: ./output
     save_checkpoint_every: 5
     verbose: 1

Run both stages:

.. code:: bash

   python -m libauc.trainer.run_image_trainer --config_file vit_ce.yaml
   python -m libauc.trainer.run_image_trainer --config_file vit_auc.yaml

Supported HuggingFace Models
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Any repo ID accepted by ``AutoModelForImageClassification`` works. A few tested examples:

.. 
list-table::
   :widths: 42 58
   :header-rows: 1

   * - HF repo ID
     - Notes
   * - ``google/vit-base-patch16-224``
     - ViT-Base (12 layers, 768 hidden); general purpose
   * - ``openai/clip-vit-base-patch32``
     - CLIP ViT with rich multi-modal features
   * - ``microsoft/resnet-50``
     - HuggingFace-wrapped ResNet-50
   * - ``google/efficientnet-b0``
     - EfficientNet-B0 from HF Hub
   * - ``Ahmed9275/Vit-Cifar100``
     - ViT fine-tuned on CIFAR-100; good warm start for CIFAR tasks

Expected Outputs
------------------------------------------------------------------------------------

After both stages finish, your output directory will look like:

**Built-in model (ResNet-18):**

.. code:: text

   ./output/
   ├── resnet18_ce_cifar10/
   │   ├── epoch_5.pt
   │   ├── ...
   │   └── epoch_100.pt   ← loaded by aucmloss_config.yaml as pretrained_path
   └── resnet18_AUCMLoss_cifar10/
       ├── epoch_5.pt
       ├── ...
       └── epoch_100.pt

**HuggingFace model (ViT-Base):**

.. code:: text

   ./output/
   ├── vit_bce_cifar10/
   │   ├── epoch_5.pt
   │   ├── ...
   │   └── epoch_10.pt    ← loaded by vit_auc.yaml as pretrained_path
   └── vit_AUCMLoss_cifar10/
       ├── epoch_5.pt
       ├── ...
       └── epoch_20.pt

Validation and test AUROC scores are printed after every evaluation epoch.
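
When wiring Stage 2 to Stage 1 by hand, it can help to look up the newest checkpoint in a run directory instead of hard-coding the epoch number. The sketch below is not part of libauc: ``latest_checkpoint`` is a hypothetical helper, and it assumes only the ``epoch_<N>.pt`` naming shown in the directory trees above.

.. code-block:: python

   import re
   from pathlib import Path
   from typing import Optional

   # Assumed file naming, taken from the output trees above: epoch_<N>.pt
   _EPOCH_FILE = re.compile(r"^epoch_(\d+)\.pt$")


   def latest_checkpoint(run_dir) -> Optional[Path]:
       """Return the epoch_<N>.pt file with the largest N, or None if there is none."""
       run_dir = Path(run_dir)
       if not run_dir.is_dir():
           return None
       best_epoch, best_path = -1, None
       for path in run_dir.iterdir():
           match = _EPOCH_FILE.match(path.name)
           # Compare epochs numerically so that epoch_100 ranks above epoch_5.
           if match and int(match.group(1)) > best_epoch:
               best_epoch, best_path = int(match.group(1)), path
       return best_path


   if __name__ == "__main__":
       # Prints the newest Stage-1 checkpoint path, or None if the run has not happened yet.
       print(latest_checkpoint("./output/resnet18_ce_cifar10"))

The returned path can then be copied into ``model.pretrained_path`` of the Stage 2 config.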