Optimizing Global Contrastive Loss with Automatic Temperature Individualization (iSogCLR)
================================================================================================================================
.. container:: cell markdown
| **Author**: Zi-Hao Qiu
| **Edited by**: Zhuoning Yuan, Tianbao Yang
Introduction
------------------------------------------------------------------------------------
In this tutorial, we apply the iSogCLR algorithm to a typical bimodal
contrastive learning task. In the pretraining stage, we train on a
subset of the widely used `CC3M
dataset `__,
which contains about 3,000,000 image-text pairs. We then evaluate
the pretrained models via zero-shot image/text retrieval on the
`MS-COCO `__ dataset.
For ease of reproduction, we provide a subset of CC3M
`here `__,
which contains 300,000 image-text pairs. We also provide the MS-COCO
dataset and its JSON files
`here `__.
The experiment in this tutorial was conducted on 4 Nvidia 3090 GPUs; you
can modify the **CUDA_VISIBLE_DEVICES** and **batch_size_train**
options to match your hardware.
**References**
If you find this tutorial helpful in your work, please cite our `library paper `__ and the following paper:

.. code-block:: bibtex

    @inproceedings{qiu2023isogclr,
        title={Not All Semantics are Created Equal: Contrastive Self-supervised Learning with Automatic Temperature Individualization},
        author={Qiu, Zi-Hao and Hu, Quanqi and Yuan, Zhuoning and Zhou, Denny and Zhang, Lijun and Yang, Tianbao},
        booktitle={International Conference on Machine Learning},
        year={2023},
        organization={PMLR}
    }

Install Latest LibAUC and Other Required Libs
------------------------------------------------------------------------------------
.. code:: python
!pip install -U libauc
Here we use the `timm
library `__ to
build the image encoder and the `transformers
library `__ to build the text
encoder.
.. code:: python
!pip install timm
!pip install transformers
We compare iSogCLR with CLIP, using the implementation from
`OpenCLIP `__.
.. code:: python
!pip install open_clip_torch
Import required libs
------------------------------------------------------------------------------------
.. code:: python
import os
os.environ["TOKENIZERS_PARALLELISM"] = "true"
os.environ["CUDA_VISIBLE_DEVICES"] = '0' # distributed training: '0,1,2,3'
import re
import argparse
from pathlib import Path
import json
import random
import math
from functools import partial
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.backends.cudnn as cudnn
from torch import optim
import torchvision
from torchvision import transforms
from torch.utils.data import Dataset, Subset, DataLoader
from PIL import Image
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True
Image.MAX_IMAGE_PIXELS = None
import cv2
import numpy as np
import timm
from transformers import AutoModel, AutoTokenizer
import open_clip
from open_clip.loss import ClipLoss
import libauc
from libauc.losses.contrastive import GCLoss_v2
from libauc.optimizer import iSogCLR
from libauc.utils import CosineLRScheduler
Arguments for experiments
------------------------------------------------------------------------------------
.. code:: python
# path to data folder
data_path = 'cc3m_subset'
train_file = 'cc3m_subset.json'
# model config
image_encoder = 'resnet50'
text_encoder = 'distilbert-base-uncased'
image_res = 256
vision_width = 768
embed_dim = 256
seed = 42
# optimizer and scheduler
opt = 'adamW'
lr = 3e-4
min_lr = 1e-5
warmup = True
warmup_lr = 1e-5
weight_decay = 0.02
decay_rate = 1
epochs = 30
warmup_epochs = 20
cooldown_epochs = 0
# training & test settings
batch_size_train = 256
batch_size_test = 512
k_test = 256
# output path
output_dir = './output/'
# AMP training
use_amp = True
# loss config
temp = 0.01 # the temperature parameter for clip or sogclr
gamma = 0.8 # the parameter for the moving average estimator in sogclr/isogclr
rho = 8.0 # the rho parameter for isogclr
eta = 1e-4 # learning rate for the learnable temperature variables in isogclr
tau_init = 0.01 # the initial value of the learnable temperature variables in isogclr
beta_u = 0.9 # the momentum parameter for the gradients of the learnable temperature variables
n_gpus = torch.cuda.device_count()
val_coco_file = 'coco_val_new.json'
test_coco_file = 'coco_test_new.json'
coco_image_root = 'coco'
Path(output_dir).mkdir(parents=True, exist_ok=True)
Define helper functions
------------------------------------------------------------------------------------
.. code:: python

    # we employ this function to preprocess the captions
    def pre_caption(caption, max_words):
        # lowercase, strip punctuation, and normalize separators and the <person> placeholder
        caption = re.sub(
            r"([,.'!?\"()*#:;~])",
            '',
            caption.lower(),
        ).replace('-', ' ').replace('/', ' ').replace('<person>', 'person')

        # collapse runs of whitespace into a single space
        caption = re.sub(
            r"\s{2,}",
            ' ',
            caption,
        )
        caption = caption.rstrip('\n')
        caption = caption.strip(' ')

        # truncate caption
        caption_words = caption.split(' ')
        if len(caption_words) > max_words:
            caption = ' '.join(caption_words[:max_words])

        return caption

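
To make the effect of the preprocessing concrete, the following minimal sketch runs a compact copy of ``pre_caption`` on a made-up caption. It assumes that the placeholder being normalized is the literal token ``<person>``; the sample sentence is hypothetical and not taken from CC3M.

.. code:: python

    import re

    def pre_caption(caption, max_words):
        # same steps as the tutorial helper, repeated so the snippet runs on its own
        caption = re.sub(r"([,.'!?\"()*#:;~])", '', caption.lower())
        # assumption: the placeholder token is the literal string '<person>'
        caption = caption.replace('-', ' ').replace('/', ' ').replace('<person>', 'person')
        caption = re.sub(r"\s{2,}", ' ', caption).rstrip('\n').strip(' ')
        words = caption.split(' ')
        if len(words) > max_words:
            caption = ' '.join(words[:max_words])
        return caption

    sample = "A <person> riding a Horse, near the sea-side!"  # hypothetical caption
    cleaned = pre_caption(sample, max_words=30)
    truncated = pre_caption(sample, max_words=5)
    print(cleaned)    # a person riding a horse near the sea side
    print(truncated)  # a person riding a horse

Punctuation is dropped, hyphens become spaces, and captions longer than ``max_words`` are cut at a word boundary.
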
.. code:: python

    class train_set(Dataset):
        def __init__(self, ann_file, transform, image_root, max_words=30):
            self.ann = []
            for f in ann_file:
                self.ann += json.load(open(f,'r'))
            self.transform = transform
            self.image_root = image_root
            self.max_words = max_words
            self.img_ids = {}

            n = 0
            for ann in self.ann:
                img_id = ann['image_id']
                if img_id not in self.img_ids.keys():
                    self.img_ids[img_id] = n
                    n += 1

        def __len__(self):
            return len(self.ann)

        def __getitem__(self, index):
            ann = self.ann[index]

            image_path = os.path.join(self.image_root, ann['image'])
            image = Image.open(image_path).convert('RGB')
            image = self.transform(image)

            caption = pre_caption(ann['caption'], self.max_words)

            return image, caption, self.img_ids[ann['image_id']], index


    class eval_set(Dataset):
        def __init__(self, ann_file, transform, image_root, max_words=30):
            self.ann = json.load(open(ann_file,'r'))
            self.transform = transform
            self.image_root = image_root
            self.max_words = max_words

            self.text = []
            self.image = []
            self.txt2img = {}
            self.img2txt = {}

            txt_id = 0
            for img_id, ann in enumerate(self.ann):
                self.image.append(ann['image'])
                self.img2txt[img_id] = []
                for i, caption in enumerate(ann['caption']):
                    self.text.append(pre_caption(caption,self.max_words))
                    self.img2txt[img_id].append(txt_id)
                    self.txt2img[txt_id] = img_id
                    txt_id += 1

        def __len__(self):
            return len(self.image)

        def __getitem__(self, index):
            image_path = os.path.join(self.image_root, self.ann[index]['image'])
            image = Image.open(image_path).convert('RGB')
            image = self.transform(image)

            return image, index

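
The ``img2txt`` and ``txt2img`` dictionaries built by ``eval_set`` are what the retrieval metrics later use as ground truth. As a small illustration, the sketch below builds the same two mappings from a hypothetical two-image annotation list; the file names and captions are invented, and only the ``image``/``caption`` layout is taken from the class above.

.. code:: python

    # hypothetical annotations in the layout read by eval_set:
    # one entry per image, each with a list of captions
    annotations = [
        {'image': 'a.jpg', 'caption': ['a dog', 'a brown dog']},
        {'image': 'b.jpg', 'caption': ['a cat']},
    ]

    texts, img2txt, txt2img = [], {}, {}
    txt_id = 0
    for img_id, ann in enumerate(annotations):
        img2txt[img_id] = []
        for caption in ann['caption']:
            texts.append(caption)
            img2txt[img_id].append(txt_id)  # image -> all of its caption ids
            txt2img[txt_id] = img_id        # caption -> its single image id
            txt_id += 1

    print(img2txt)  # {0: [0, 1], 1: [2]}
    print(txt2img)  # {0: 0, 1: 0, 2: 1}

Each image can have several correct captions, while each caption has exactly one correct image, which is why the two retrieval directions are scored slightly differently.
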
.. code:: python

    def add_weight_decay(model, weight_decay=1e-5, skip_list=()):
        decay = []
        no_decay = []
        for name, param in model.named_parameters():
            if not param.requires_grad:
                continue  # frozen weights
            if len(param.shape) == 1 or name.endswith(".bias") or name in skip_list:
                no_decay.append(param)
            else:
                decay.append(param)
        return [
            {'params': no_decay, 'weight_decay': 0.},
            {'params': decay, 'weight_decay': weight_decay}]


    def create_optimizer(model, opt, weight_decay=1e-5, filter_bias_and_bn=True):
        if weight_decay and filter_bias_and_bn:
            skip = {}
            if hasattr(model, 'no_weight_decay'):
                skip = model.no_weight_decay()
            parameters = add_weight_decay(model, weight_decay, skip)
            weight_decay = 0.
        else:
            parameters = model.parameters()

        # lr is the global learning rate defined in the arguments section
        opt_args = dict(lr=lr, weight_decay=weight_decay)
        optimizer = iSogCLR(parameters, mode=opt, **opt_args)

        return optimizer

.. code:: python

    def create_scheduler(optimizer):
        num_epochs = epochs
        lr_scheduler = CosineLRScheduler(
            optimizer,
            t_initial = num_epochs,
            t_mul = 1.0,
            lr_min = min_lr,
            decay_rate = decay_rate,
            warmup_lr_init = warmup_lr,
            warmup_t = warmup_epochs,
            cycle_limit = 1,
            t_in_epochs = True,
            noise_range_t = None,
            noise_pct = 0.67,
            noise_std = 1.0,
            noise_seed = 42,
        )

        return lr_scheduler

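
To see what this schedule does with the settings above, here is a minimal stand-alone sketch of a linear warmup followed by cosine decay. It is not the library's code: the function ``lr_at`` is hypothetical, and the formula is inferred from the learning rates printed in the training log later in this tutorial, which it reproduces.

.. code:: python

    import math

    def lr_at(t, base_lr=3e-4, min_lr=1e-5, warmup_lr=1e-5, warmup_t=20, t_initial=30):
        """Learning rate at scheduler step t (hypothetical helper, inferred from the logs)."""
        if t < warmup_t:
            # linear warmup from warmup_lr towards base_lr
            return warmup_lr + t * (base_lr - warmup_lr) / warmup_t
        # cosine decay, counted from the end of the warmup
        t = t - warmup_t
        return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * t / t_initial))

    warmup_step_1 = lr_at(1)    # second warmup step in epoch 0
    after_epoch_0 = lr_at(21)   # scheduler.step(0 + warmup_epochs + 1)
    halfway = lr_at(35)         # scheduler.step(14 + warmup_epochs + 1)
    print(warmup_step_1, after_epoch_0, halfway)

Under this reading, ``warmup_epochs = 20`` counts warmup steps of 100 iterations each inside the first epoch rather than 20 full epochs, and the cosine part then spans the 30 training epochs.
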
Fix random seed
------------------------------------------------------------------------------------
The following code seeds the random number generators to limit the
sources of nondeterministic behavior, such as model initialization and data shuffling.
.. code:: python
# fix the seed for reproducibility
torch.manual_seed(seed)
np.random.seed(seed)
random.seed(seed)
cudnn.benchmark = True
Objectives
------------------------------------------------------------------------------------
Here, we mainly introduce the Robust Global Contrastive Loss (RGCL) for
learning representations from bimodal data (e.g., image-text data). For the detailed formulation, please refer to the `paper `__.
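
As a rough illustration of what temperature individualization means, the sketch below evaluates a simplified mini-batch contrastive objective in which every image anchor and every text anchor has its own temperature. The per-anchor form ``tau_i * (log(mean_j exp((s_ij - s_ii) / tau_i)) + rho)`` is our reading of the robust objective and should be checked against the paper. The function name ``individualized_contrastive_loss`` and the toy inputs are hypothetical, and this is not LibAUC's ``GCLoss_v2``, which additionally maintains moving-average estimators and updates the temperatures during training.

.. code:: python

    import math

    def individualized_contrastive_loss(sim, tau_img, tau_txt, rho):
        """sim[i][j]: similarity of image i and text j; the diagonal holds the positive pairs."""
        n = len(sim)
        total = 0.0
        for i in range(n):
            # image anchor i: negatives are the texts of all other pairs
            neg = [math.exp((sim[i][j] - sim[i][i]) / tau_img[i]) for j in range(n) if j != i]
            total += tau_img[i] * (math.log(sum(neg) / len(neg)) + rho)
            # text anchor i: negatives are the images of all other pairs
            neg = [math.exp((sim[j][i] - sim[i][i]) / tau_txt[i]) for j in range(n) if j != i]
            total += tau_txt[i] * (math.log(sum(neg) / len(neg)) + rho)
        return total / n

    # toy batch of 3 pairs: positives have similarity 1, negatives 0 or 0.5
    well_separated = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    poorly_separated = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
    taus = [0.5, 0.5, 0.5]

    loss_good = individualized_contrastive_loss(well_separated, taus, taus, rho=2.0)
    loss_bad = individualized_contrastive_loss(poorly_separated, taus, taus, rho=2.0)
    print(loss_good, loss_bad)

The loss is lower when negatives are further from the positives, and each anchor's temperature controls how strongly its hardest negatives are weighted.
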
Define the model
------------------------------------------------------------------------------------
.. code:: python

    # The following class includes the image encoder, text encoder and several objectives
    # note: bsz, tau_init, beta_u and use_temp_net are accepted but not used in the code below
    class Model(nn.Module):
        def __init__(self, image_encoder = None, text_encoder = None,
                     embed_dim = 256, init_model = True, bsz = 128,
                     loss_type = 'clip',        # objective type: 'clip' or 'isogclr' (other values raise NotImplementedError)
                     gamma = 0.9,               # the coefficient for moving average estimator
                     temp = 0.01,               # temperature for clip or sogclr
                     rho = 8.0, eta = 0.01, tau_init = 0.01, beta_u = 0.9, # params for isogclr
                     use_temp_net = True):      # True if you want to use temperature network for isogclr
            super().__init__()

            self.temp = temp

            self.visual_encoder = timm.create_model(image_encoder, pretrained=init_model)
            self.visual_encoder.reset_classifier(0)

            self.text_encoder = AutoModel.from_pretrained(text_encoder, local_files_only=False)
            if not init_model:
                self.text_encoder.init_weights()

            self.vision_proj = nn.Linear(self.visual_encoder.num_features, embed_dim)
            self.text_proj = nn.Linear(768, embed_dim)

            self.loss_type = loss_type
            if self.loss_type == 'clip':
                self.criterion = ClipLoss()  # here we employ the implementation from open-clip
                self.logit_scale = nn.Parameter(torch.ones([]) * np.log(1 / temp))
            elif self.loss_type == 'isogclr':
                self.criterion = GCLoss_v2(tau=temp, gamma=gamma, tau_min=0.005, tau_max=0.07,
                                           rho=rho, eta=eta, enable_isogclr=True)
            else:
                raise NotImplementedError

        def forward(self, image, text_ids, text_att_masks, idx, text_idx, epoch):
            image_embeds = self.visual_encoder(image)
            image_embeds = self.vision_proj(image_embeds)
            image_feat = F.normalize(image_embeds, dim=-1)

            text_output = self.text_encoder(text_ids, attention_mask=text_att_masks, output_hidden_states=False)
            text_embeds = self.text_proj(text_output.last_hidden_state[:,0,:])
            text_feat = F.normalize(text_embeds, dim=-1)

            if self.loss_type == 'clip':
                loss = self.criterion(image_feat, text_feat, self.logit_scale.exp())
                info = None
            elif self.loss_type == 'isogclr':
                loss, info = self.criterion(image_feat, text_feat, idx)

            return loss, info

Training function
------------------------------------------------------------------------------------
.. code:: python

    def epoch_train(model, data_loader, optimizer, tokenizer, epoch, max_epoch, warmup_steps, device, scheduler, grad_scaler):
        # train
        model.train()

        print_freq = 50
        step_size = 100
        warmup_iterations = warmup_steps * step_size

        for i,(image, text, idx, text_idx) in enumerate(data_loader):
            optimizer.zero_grad()

            image = image.to(device, non_blocking=True)
            idx = idx.to(device, non_blocking=True)
            text_idx = text_idx.to(device, non_blocking=True)
            text_input = tokenizer(text, padding='max_length', truncation=True, max_length=30, return_tensors="pt").to(device)

            if grad_scaler is None:
                loss, info = model(image, text_input.input_ids, text_input.attention_mask, idx=idx, text_idx=text_idx, epoch=epoch)
                loss.mean().backward()
                optimizer.step()
            else:
                with torch.cuda.amp.autocast():
                    loss, info = model(image, text_input.input_ids, text_input.attention_mask, idx=idx, text_idx=text_idx, epoch=epoch)
                grad_scaler.scale(loss.mean()).backward()
                grad_scaler.step(optimizer)
                grad_scaler.update()

            # learning-rate warmup: one scheduler step every step_size iterations of the first epoch
            if epoch==0 and i%step_size==0 and i<=warmup_iterations:
                scheduler.step(i//step_size)

            if i%print_freq == 0:
                lr = optimizer.param_groups[0]["lr"]
                print("Epoch:", epoch, "iteration:", i, "lr:", lr, "loss:", loss.mean().item())
                if info is not None:
                    print("tau_img: %.4f, tau_txt: %.4f" % (info[0].mean(), info[1].mean()))

Evaluation function
-------------------
.. code:: python

    @torch.no_grad()
    def evaluation(model, data_loader, tokenizer, device):
        # test
        model.eval()

        print('Computing features for evaluation...')

        texts = data_loader.dataset.text
        num_text = len(texts)
        text_bs = 256
        text_embeds = []
        for i in range(0, num_text, text_bs):
            text = texts[i: min(num_text, i+text_bs)]
            text_input = tokenizer(text, padding='max_length', truncation=True, max_length=30, return_tensors="pt").to(device)
            text_output = model.text_encoder(text_input.input_ids, attention_mask=text_input.attention_mask, output_hidden_states=False)
            text_embed = F.normalize(model.text_proj(text_output.last_hidden_state[:,0,:]), dim=-1)
            text_embeds.append(text_embed)
        text_embeds = torch.cat(text_embeds,dim=0)

        image_embeds = []
        for image, img_id in data_loader:
            image = image.to(device)
            image_feat = model.visual_encoder(image)
            image_embed = model.vision_proj(image_feat)
            image_embed = F.normalize(image_embed, dim=-1)
            image_embeds.append(image_embed)
        image_embeds = torch.cat(image_embeds,dim=0)

        sims_matrix = image_embeds @ text_embeds.t()

        # keep only the top-k similarities per query; all other entries stay at -100
        score_matrix_i2t = torch.full((len(data_loader.dataset.image),len(texts)),-100.0).to(device)
        for i,sims in enumerate(sims_matrix):
            topk_sim, topk_idx = sims.topk(k=k_test, dim=0)
            score_matrix_i2t[i, topk_idx] = topk_sim

        sims_matrix = sims_matrix.t()
        score_matrix_t2i = torch.full((len(texts),len(data_loader.dataset.image)),-100.0).to(device)
        for i,sims in enumerate(sims_matrix):
            topk_sim, topk_idx = sims.topk(k=k_test, dim=0)
            score_matrix_t2i[i, topk_idx] = topk_sim

        return score_matrix_i2t.cpu().numpy(), score_matrix_t2i.cpu().numpy()


    @torch.no_grad()
    def itm_eval(scores_i2t, scores_t2i, txt2img, img2txt):
        #Images->Text
        ranks = np.zeros(scores_i2t.shape[0])
        for index,score in enumerate(scores_i2t):
            inds = np.argsort(score)[::-1]
            # Score
            rank = 1e20
            for i in img2txt[index]:
                tmp = np.where(inds == i)[0][0]
                if tmp < rank:
                    rank = tmp
            ranks[index] = rank

        # Compute metrics
        tr1 = 100.0 * len(np.where(ranks < 1)[0]) / len(ranks)
        tr5 = 100.0 * len(np.where(ranks < 5)[0]) / len(ranks)
        tr10 = 100.0 * len(np.where(ranks < 10)[0]) / len(ranks)

        #Text->Images
        ranks = np.zeros(scores_t2i.shape[0])
        for index,score in enumerate(scores_t2i):
            inds = np.argsort(score)[::-1]
            ranks[index] = np.where(inds == txt2img[index])[0][0]

        # Compute metrics
        ir1 = 100.0 * len(np.where(ranks < 1)[0]) / len(ranks)
        ir5 = 100.0 * len(np.where(ranks < 5)[0]) / len(ranks)
        ir10 = 100.0 * len(np.where(ranks < 10)[0]) / len(ranks)

        tr_mean = (tr1 + tr5 + tr10) / 3
        ir_mean = (ir1 + ir5 + ir10) / 3
        r_mean = (tr_mean + ir_mean) / 2

        eval_result = {'txt_r1': tr1,
                       'txt_r5': tr5,
                       'txt_r10': tr10,
                       'txt_r_mean': tr_mean,
                       'img_r1': ir1,
                       'img_r5': ir5,
                       'img_r10': ir10,
                       'img_r_mean': ir_mean,
                       'r_mean': r_mean}

        return eval_result

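
The metrics returned by ``itm_eval`` are Recall@K values: the percentage of queries whose best-ranked correct candidate appears within the top K results. The following stand-alone sketch shows the same computation on a tiny score matrix; the helper ``recall_at_k`` and the toy scores are hypothetical and use plain Python lists instead of NumPy arrays.

.. code:: python

    def recall_at_k(scores, gt, k):
        """scores[q][c]: similarity of query q to candidate c; gt[q]: set of correct candidates."""
        hits = 0
        for q, row in enumerate(scores):
            # candidate indices sorted from most to least similar
            order = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
            # zero-based rank of the best-placed correct candidate
            best_rank = min(order.index(c) for c in gt[q])
            hits += best_rank < k
        return 100.0 * hits / len(scores)

    scores = [
        [0.9, 0.1, 0.3],  # query 0: correct candidate 0 is ranked first
        [0.2, 0.4, 0.8],  # query 1: correct candidate 1 is ranked second
    ]
    gt = [{0}, {1}]

    r_at_1 = recall_at_k(scores, gt, 1)
    r_at_2 = recall_at_k(scores, gt, 2)
    print(r_at_1, r_at_2)  # 50.0 100.0

For image-to-text retrieval ``gt[q]`` holds all captions of image ``q`` (as in ``img2txt``), while for text-to-image retrieval it holds a single image (as in ``txt2img``).
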
Create datasets and dataloaders
------------------------------------------------------------------------------------
.. code:: python
# set up the transformation, datasets and dataloaders
train_transform = transforms.Compose([
transforms.RandomResizedCrop(image_res, scale=(0.5, 1.0), interpolation=Image.BICUBIC),
transforms.RandomHorizontalFlip(),
transforms.RandAugment(),
transforms.ToTensor(),
transforms.Normalize((0.48145466, 0.4578275, 0.40821073), (0.26862954, 0.26130258, 0.27577711)),
])
test_transform = transforms.Compose([
transforms.Resize((image_res, image_res), interpolation=Image.BICUBIC),
transforms.ToTensor(),
transforms.Normalize((0.48145466, 0.4578275, 0.40821073), (0.26862954, 0.26130258, 0.27577711)),
])
train_dataset = train_set([train_file], train_transform, data_path)
val_coco_dataset = eval_set(val_coco_file, test_transform, coco_image_root)
test_coco_dataset = eval_set(test_coco_file, test_transform, coco_image_root)
print("len of train_dataset:", len(train_dataset))
print("len of coco val/test:", len(val_coco_dataset), len(test_coco_dataset))
train_loader = DataLoader(train_dataset, batch_size=batch_size_train * n_gpus, num_workers=16, pin_memory=True,
shuffle=True, drop_last=True, prefetch_factor=4)
val_loader = DataLoader(val_coco_dataset, batch_size=batch_size_test, num_workers=16, pin_memory=True,
shuffle=False, drop_last=False, prefetch_factor=12)
test_loader = DataLoader(test_coco_dataset, batch_size=batch_size_test, num_workers=16, pin_memory=True,
shuffle=False, drop_last=False, prefetch_factor=12)
.. parsed-literal::
len of train_dataset: 300000
len of coco val/test: 5000 5000
Launch training and evaluation for CLIP
---------------------------------------
.. code:: python
# create the model
tokenizer = AutoTokenizer.from_pretrained(text_encoder, local_files_only=False)
model = Model(image_encoder=image_encoder, text_encoder=text_encoder, embed_dim=embed_dim,
init_model=True, bsz=batch_size_train, loss_type='clip',
gamma=gamma, temp=temp, rho=rho, eta=eta, tau_init=tau_init, beta_u=beta_u)
model = model.cuda()
.. code:: python

    if n_gpus > 1:
        print("Using", n_gpus, "GPUs")
        model = nn.DataParallel(model)

.. code:: python

    # set up the optimizer and objective function
    optimizer = create_optimizer(model, opt, weight_decay)
    lr_scheduler = create_scheduler(optimizer)

    if use_amp:
        grad_scaler = torch.cuda.amp.GradScaler()
    else:
        grad_scaler = None

    # training loop
    for epoch in range(0, epochs):
        train_stats = epoch_train(model, train_loader, optimizer, tokenizer, epoch, epochs,
                                  warmup_epochs, torch.device('cuda'), lr_scheduler, grad_scaler)

        # evaluate the model on ms-coco data
        # with multiple GPUs the model is wrapped in nn.DataParallel, so unwrap it first
        eval_model = model.module if isinstance(model, nn.DataParallel) else model
        score_val_i2t_coco, score_val_t2i_coco = evaluation(eval_model, val_loader, tokenizer, torch.device('cuda'))
        score_test_i2t_coco, score_test_t2i_coco = evaluation(eval_model, test_loader, tokenizer, torch.device('cuda'))

        print("Epoch:", epoch)
        val_result_coco = itm_eval(score_val_i2t_coco, score_val_t2i_coco, val_loader.dataset.txt2img, val_loader.dataset.img2txt)
        print("coco val:", val_result_coco)
        test_result_coco = itm_eval(score_test_i2t_coco, score_test_t2i_coco, test_loader.dataset.txt2img, test_loader.dataset.img2txt)
        print("coco test:", test_result_coco)

        # cosine decay: the scheduler index continues after the warmup steps
        lr_scheduler.step(epoch+warmup_epochs+1)

.. parsed-literal::
Epoch: 0 iteration: 0 lr: 1e-05 loss: 11.74642562866211
Epoch: 0 iteration: 50 lr: 1e-05 loss: 7.507866859436035
Epoch: 0 iteration: 100 lr: 2.45e-05 loss: 5.759531497955322
Epoch: 0 iteration: 150 lr: 2.45e-05 loss: 4.457749843597412
Epoch: 0 iteration: 200 lr: 3.899999999999999e-05 loss: 3.844197988510132
Epoch: 0 iteration: 250 lr: 3.899999999999999e-05 loss: 3.469355583190918
Computing features for evaluation...
Computing features for evaluation...
Epoch: 0
coco val: {'txt_r1': 3.86, 'txt_r5': 12.8, 'txt_r10': 19.8, 'txt_r_mean': 12.153333333333334, 'img_r1': 1.8872451019592162, 'img_r5': 7.0171931227509, 'img_r10': 12.043182726909237, 'img_r_mean': 6.982540317206451, 'r_mean': 9.567936825269893}
coco test: {'txt_r1': 3.6, 'txt_r5': 12.38, 'txt_r10': 18.84, 'txt_r_mean': 11.606666666666667, 'img_r1': 1.8032786885245902, 'img_r5': 7.005197920831668, 'img_r10': 11.943222710915634, 'img_r_mean': 6.917233106757298, 'r_mean': 9.261949886711982}
Epoch: 1 iteration: 0 lr: 0.0002992056748283996 loss: 3.105051279067993
Epoch: 1 iteration: 50 lr: 0.0002992056748283996 loss: 2.4103074073791504
Epoch: 1 iteration: 100 lr: 0.0002992056748283996 loss: 2.2818379402160645
Epoch: 1 iteration: 150 lr: 0.0002992056748283996 loss: 2.118741989135742
Epoch: 1 iteration: 200 lr: 0.0002992056748283996 loss: 1.9152384996414185
Epoch: 1 iteration: 250 lr: 0.0002992056748283996 loss: 1.8800408840179443
Computing features for evaluation...
Computing features for evaluation...
Epoch: 1
coco val: {'txt_r1': 15.04, 'txt_r5': 33.92, 'txt_r10': 45.58, 'txt_r_mean': 31.513333333333332, 'img_r1': 8.06077568972411, 'img_r5': 22.718912435025988, 'img_r10': 33.10275889644142, 'img_r_mean': 21.29414900706384, 'r_mean': 26.403741170198586}
coco test: {'txt_r1': 14.64, 'txt_r5': 34.1, 'txt_r10': 45.68, 'txt_r_mean': 31.473333333333333, 'img_r1': 7.804878048780488, 'img_r5': 22.82686925229908, 'img_r10': 33.88644542183127, 'img_r_mean': 21.50606424097028, 'r_mean': 26.489698787151806}
Epoch: 2 iteration: 0 lr: 0.0002968314021064018 loss: 1.5531284809112549
Epoch: 2 iteration: 50 lr: 0.0002968314021064018 loss: 1.5267637968063354
Epoch: 2 iteration: 100 lr: 0.0002968314021064018 loss: 1.4859260320663452
Epoch: 2 iteration: 150 lr: 0.0002968314021064018 loss: 1.552567958831787
Epoch: 2 iteration: 200 lr: 0.0002968314021064018 loss: 1.4763367176055908
Epoch: 2 iteration: 250 lr: 0.0002968314021064018 loss: 1.501932978630066
Computing features for evaluation...
Computing features for evaluation...
Epoch: 2
coco val: {'txt_r1': 15.7, 'txt_r5': 36.7, 'txt_r10': 48.82, 'txt_r_mean': 33.74, 'img_r1': 10.163934426229508, 'img_r5': 26.62934826069572, 'img_r10': 37.99280287884846, 'img_r_mean': 24.92869518859123, 'r_mean': 29.334347594295615}
coco test: {'txt_r1': 15.74, 'txt_r5': 36.68, 'txt_r10': 48.24, 'txt_r_mean': 33.553333333333335, 'img_r1': 9.75609756097561, 'img_r5': 26.965213914434226, 'img_r10': 38.74050379848061, 'img_r_mean': 25.153938424630145, 'r_mean': 29.35363587898174}
Epoch: 3 iteration: 0 lr: 0.00029290319486279724 loss: 1.2079691886901855
Epoch: 3 iteration: 50 lr: 0.00029290319486279724 loss: 1.2061635255813599
Epoch: 3 iteration: 100 lr: 0.00029290319486279724 loss: 1.181814432144165
Epoch: 3 iteration: 150 lr: 0.00029290319486279724 loss: 1.235809564590454
Epoch: 3 iteration: 200 lr: 0.00029290319486279724 loss: 1.2041468620300293
Epoch: 3 iteration: 250 lr: 0.00029290319486279724 loss: 1.2037649154663086
Computing features for evaluation...
Computing features for evaluation...
Epoch: 3
coco val: {'txt_r1': 16.04, 'txt_r5': 36.82, 'txt_r10': 49.36, 'txt_r_mean': 34.07333333333333, 'img_r1': 10.44782087165134, 'img_r5': 27.457017193122752, 'img_r10': 38.48460615753699, 'img_r_mean': 25.463148074103696, 'r_mean': 29.76824070371851}
coco test: {'txt_r1': 15.7, 'txt_r5': 36.5, 'txt_r10': 48.58, 'txt_r_mean': 33.593333333333334, 'img_r1': 10.343862455017993, 'img_r5': 27.988804478208717, 'img_r10': 39.36825269892043, 'img_r_mean': 25.900306544049045, 'r_mean': 29.74681993869119}
Epoch: 4 iteration: 0 lr: 0.00028746409135817707 loss: 1.0229318141937256
Epoch: 4 iteration: 50 lr: 0.00028746409135817707 loss: 0.8746964931488037
Epoch: 4 iteration: 100 lr: 0.00028746409135817707 loss: 1.064015507698059
Epoch: 4 iteration: 150 lr: 0.00028746409135817707 loss: 1.0923449993133545
Epoch: 4 iteration: 200 lr: 0.00028746409135817707 loss: 0.979778528213501
Epoch: 4 iteration: 250 lr: 0.00028746409135817707 loss: 1.0288567543029785
Computing features for evaluation...
Computing features for evaluation...
Epoch: 4
coco val: {'txt_r1': 16.2, 'txt_r5': 37.3, 'txt_r10': 49.6, 'txt_r_mean': 34.36666666666667, 'img_r1': 11.407437025189925, 'img_r5': 28.64454218312675, 'img_r10': 40.199920031987205, 'img_r_mean': 26.750633080101295, 'r_mean': 30.55864987338398}
coco test: {'txt_r1': 15.46, 'txt_r5': 37.26, 'txt_r10': 48.44, 'txt_r_mean': 33.72, 'img_r1': 10.71171531387445, 'img_r5': 28.95641743302679, 'img_r10': 40.863654538184726, 'img_r_mean': 26.843929095028653, 'r_mean': 30.281964547514328}
Epoch: 5 iteration: 0 lr: 0.0002805736835487436 loss: 0.748623251914978
Epoch: 5 iteration: 50 lr: 0.0002805736835487436 loss: 0.8048175573348999
Epoch: 5 iteration: 100 lr: 0.0002805736835487436 loss: 0.8324432969093323
Epoch: 5 iteration: 150 lr: 0.0002805736835487436 loss: 0.8187351822853088
Epoch: 5 iteration: 200 lr: 0.0002805736835487436 loss: 0.8561583757400513
Epoch: 5 iteration: 250 lr: 0.0002805736835487436 loss: 0.7616273164749146
Computing features for evaluation...
Computing features for evaluation...
Epoch: 5
coco val: {'txt_r1': 15.6, 'txt_r5': 37.12, 'txt_r10': 49.98, 'txt_r_mean': 34.23333333333333, 'img_r1': 11.463414634146341, 'img_r5': 29.52019192323071, 'img_r10': 41.35545781687325, 'img_r_mean': 27.446354791416766, 'r_mean': 30.839844062375047}
coco test: {'txt_r1': 14.72, 'txt_r5': 35.68, 'txt_r10': 48.34, 'txt_r_mean': 32.913333333333334, 'img_r1': 11.523390643742504, 'img_r5': 30.143942423030786, 'img_r10': 41.67932826869252, 'img_r_mean': 27.78222044515527, 'r_mean': 30.3477768892443}
Epoch: 6 iteration: 0 lr: 0.0002723074641843674 loss: 0.5856387615203857
Epoch: 6 iteration: 50 lr: 0.0002723074641843674 loss: 0.7076289057731628
Epoch: 6 iteration: 100 lr: 0.0002723074641843674 loss: 0.6565060615539551
Epoch: 6 iteration: 150 lr: 0.0002723074641843674 loss: 0.6765242218971252
Epoch: 6 iteration: 200 lr: 0.0002723074641843674 loss: 0.7100015878677368
Epoch: 6 iteration: 250 lr: 0.0002723074641843674 loss: 0.6650581955909729
Computing features for evaluation...
Computing features for evaluation...
Epoch: 6
coco val: {'txt_r1': 15.54, 'txt_r5': 37.94, 'txt_r10': 50.16, 'txt_r_mean': 34.54666666666666, 'img_r1': 11.243502598960417, 'img_r5': 29.432227109156337, 'img_r10': 41.04758096761295, 'img_r_mean': 27.24110355857657, 'r_mean': 30.893885112621614}
coco test: {'txt_r1': 15.78, 'txt_r5': 36.78, 'txt_r10': 49.24, 'txt_r_mean': 33.93333333333334, 'img_r1': 11.379448220711716, 'img_r5': 29.956017592962816, 'img_r10': 41.45541783286685, 'img_r_mean': 27.59696121551379, 'r_mean': 30.765147274423562}
Epoch: 7 iteration: 0 lr: 0.00026275599969422214 loss: 0.5822378396987915
Epoch: 7 iteration: 50 lr: 0.00026275599969422214 loss: 0.5452847480773926
Epoch: 7 iteration: 100 lr: 0.00026275599969422214 loss: 0.5890320539474487
Epoch: 7 iteration: 150 lr: 0.00026275599969422214 loss: 0.558639645576477
Epoch: 7 iteration: 200 lr: 0.00026275599969422214 loss: 0.6335784196853638
Epoch: 7 iteration: 250 lr: 0.00026275599969422214 loss: 0.6401098370552063
Computing features for evaluation...
Computing features for evaluation...
Epoch: 7
coco val: {'txt_r1': 16.54, 'txt_r5': 38.52, 'txt_r10': 51.04, 'txt_r_mean': 35.36666666666667, 'img_r1': 11.795281887245102, 'img_r5': 29.89204318272691, 'img_r10': 41.58736505397841, 'img_r_mean': 27.758230041316807, 'r_mean': 31.562448353991737}
coco test: {'txt_r1': 16.28, 'txt_r5': 37.32, 'txt_r10': 49.64, 'txt_r_mean': 34.413333333333334, 'img_r1': 11.47141143542583, 'img_r5': 30.275889644142342, 'img_r10': 42.147141143542584, 'img_r_mean': 27.964814074370253, 'r_mean': 31.189073703851793}
Epoch: 8 iteration: 0 lr: 0.0002520239379220344 loss: 0.5210278034210205
Epoch: 8 iteration: 50 lr: 0.0002520239379220344 loss: 0.4082544445991516
Epoch: 8 iteration: 100 lr: 0.0002520239379220344 loss: 0.4823477864265442
Epoch: 8 iteration: 150 lr: 0.0002520239379220344 loss: 0.49092692136764526
Epoch: 8 iteration: 200 lr: 0.0002520239379220344 loss: 0.5032364130020142
Epoch: 8 iteration: 250 lr: 0.0002520239379220344 loss: 0.4627079963684082
Computing features for evaluation...
Computing features for evaluation...
Epoch: 8
coco val: {'txt_r1': 15.3, 'txt_r5': 36.62, 'txt_r10': 49.02, 'txt_r_mean': 33.64666666666667, 'img_r1': 11.615353858456617, 'img_r5': 29.78808476609356, 'img_r10': 41.175529788084766, 'img_r_mean': 27.526322804211645, 'r_mean': 30.586494735439157}
coco test: {'txt_r1': 14.7, 'txt_r5': 34.02, 'txt_r10': 47.42, 'txt_r_mean': 32.04666666666667, 'img_r1': 11.411435425829668, 'img_r5': 29.76809276289484, 'img_r10': 41.44342263094762, 'img_r_mean': 27.540983606557376, 'r_mean': 29.79382513661202}
Epoch: 9 iteration: 0 lr: 0.00024022886158240857 loss: 0.34958702325820923
Epoch: 9 iteration: 50 lr: 0.00024022886158240857 loss: 0.4485335350036621
Epoch: 9 iteration: 100 lr: 0.00024022886158240857 loss: 0.41256430745124817
Epoch: 9 iteration: 150 lr: 0.00024022886158240857 loss: 0.3847663998603821
Epoch: 9 iteration: 200 lr: 0.00024022886158240857 loss: 0.434209942817688
Epoch: 9 iteration: 250 lr: 0.00024022886158240857 loss: 0.4179908037185669
Computing features for evaluation...
Computing features for evaluation...
Epoch: 9
coco val: {'txt_r1': 15.56, 'txt_r5': 37.96, 'txt_r10': 50.06, 'txt_r_mean': 34.52666666666667, 'img_r1': 11.611355457816874, 'img_r5': 29.848060775689724, 'img_r10': 41.5953618552579, 'img_r_mean': 27.684926029588166, 'r_mean': 31.10579634812742}
coco test: {'txt_r1': 15.74, 'txt_r5': 36.64, 'txt_r10': 48.5, 'txt_r_mean': 33.626666666666665, 'img_r1': 11.80327868852459, 'img_r5': 29.772091163534586, 'img_r10': 41.583366653338665, 'img_r_mean': 27.719578835132612, 'r_mean': 30.673122750899637}
Epoch: 10 iteration: 0 lr: 0.00022749999999999997 loss: 0.33992326259613037
Epoch: 10 iteration: 50 lr: 0.00022749999999999997 loss: 0.3966507911682129
Epoch: 10 iteration: 100 lr: 0.00022749999999999997 loss: 0.3801310360431671
Epoch: 10 iteration: 150 lr: 0.00022749999999999997 loss: 0.342434823513031
Epoch: 10 iteration: 200 lr: 0.00022749999999999997 loss: 0.3833215832710266
Epoch: 10 iteration: 250 lr: 0.00022749999999999997 loss: 0.43105077743530273
Computing features for evaluation...
Computing features for evaluation...
Epoch: 10
coco val: {'txt_r1': 16.44, 'txt_r5': 39.08, 'txt_r10': 51.28, 'txt_r_mean': 35.6, 'img_r1': 11.915233906437425, 'img_r5': 30.091963214714113, 'img_r10': 41.84326269492203, 'img_r_mean': 27.950153272024522, 'r_mean': 31.775076636012262}
coco test: {'txt_r1': 15.32, 'txt_r5': 37.7, 'txt_r10': 50.18, 'txt_r_mean': 34.4, 'img_r1': 11.859256297481007, 'img_r5': 30.403838464614154, 'img_r10': 41.911235505797684, 'img_r_mean': 28.058110089297617, 'r_mean': 31.22905504464881}
Epoch: 11 iteration: 0 lr: 0.00021397681324599103 loss: 0.31117188930511475
Epoch: 11 iteration: 50 lr: 0.00021397681324599103 loss: 0.33558982610702515
Epoch: 11 iteration: 100 lr: 0.00021397681324599103 loss: 0.36867523193359375
Epoch: 11 iteration: 150 lr: 0.00021397681324599103 loss: 0.28263527154922485
Epoch: 11 iteration: 200 lr: 0.00021397681324599103 loss: 0.3501768112182617
Epoch: 11 iteration: 250 lr: 0.00021397681324599103 loss: 0.36479008197784424
Computing features for evaluation...
Computing features for evaluation...
Epoch: 11
coco val: {'txt_r1': 15.28, 'txt_r5': 37.84, 'txt_r10': 49.2, 'txt_r_mean': 34.10666666666667, 'img_r1': 11.979208316673331, 'img_r5': 30.23190723710516, 'img_r10': 42.05517792882847, 'img_r_mean': 28.08876449420232, 'r_mean': 31.097715580434496}
coco test: {'txt_r1': 15.22, 'txt_r5': 36.18, 'txt_r10': 48.02, 'txt_r_mean': 33.14, 'img_r1': 11.955217912834867, 'img_r5': 30.979608156737307, 'img_r10': 42.60295881647341, 'img_r_mean': 28.512594962015196, 'r_mean': 30.8262974810076}
Epoch: 12 iteration: 0 lr: 0.00019980746418436736 loss: 0.27429062128067017
Epoch: 12 iteration: 50 lr: 0.00019980746418436736 loss: 0.3097416162490845
Epoch: 12 iteration: 100 lr: 0.00019980746418436736 loss: 0.30445027351379395
Epoch: 12 iteration: 150 lr: 0.00019980746418436736 loss: 0.3258894681930542
Epoch: 12 iteration: 200 lr: 0.00019980746418436736 loss: 0.27619031071662903
Epoch: 12 iteration: 250 lr: 0.00019980746418436736 loss: 0.30364763736724854
Computing features for evaluation...
Computing features for evaluation...
Epoch: 12
coco val: {'txt_r1': 16.56, 'txt_r5': 38.14, 'txt_r10': 50.62, 'txt_r_mean': 35.10666666666666, 'img_r1': 12.35905637744902, 'img_r5': 31.139544182327068, 'img_r10': 42.51899240303879, 'img_r_mean': 28.672530987604958, 'r_mean': 31.88959882713581}
coco test: {'txt_r1': 15.18, 'txt_r5': 36.3, 'txt_r10': 49.56, 'txt_r_mean': 33.68, 'img_r1': 12.295081967213115, 'img_r5': 31.211515393842465, 'img_r10': 42.998800479808075, 'img_r_mean': 28.835132613621216, 'r_mean': 31.25756630681061}
Epoch: 13 iteration: 0 lr: 0.00018514719516857505 loss: 0.2100810557603836
Epoch: 13 iteration: 50 lr: 0.00018514719516857505 loss: 0.2885628938674927
Epoch: 13 iteration: 100 lr: 0.00018514719516857505 loss: 0.2615102529525757
Epoch: 13 iteration: 150 lr: 0.00018514719516857505 loss: 0.30048686265945435
Epoch: 13 iteration: 200 lr: 0.00018514719516857505 loss: 0.30662938952445984
Epoch: 13 iteration: 250 lr: 0.00018514719516857505 loss: 0.3095318377017975
Computing features for evaluation...
Computing features for evaluation...
Epoch: 13
coco val: {'txt_r1': 16.22, 'txt_r5': 37.66, 'txt_r10': 50.22, 'txt_r_mean': 34.699999999999996, 'img_r1': 11.511395441823272, 'img_r5': 29.488204718112755, 'img_r10': 40.95961615353858, 'img_r_mean': 27.3197387711582, 'r_mean': 31.009869385579098}
coco test: {'txt_r1': 16.1, 'txt_r5': 37.5, 'txt_r10': 49.88, 'txt_r_mean': 34.49333333333333, 'img_r1': 11.923230707716913, 'img_r5': 30.22391043582567, 'img_r10': 41.63934426229508, 'img_r_mean': 27.928828468612554, 'r_mean': 31.211080900972945}
Epoch: 14 iteration: 0 lr: 0.00017015662717380974 loss: 0.22490891814231873
Epoch: 14 iteration: 50 lr: 0.00017015662717380974 loss: 0.24104690551757812
Epoch: 14 iteration: 100 lr: 0.00017015662717380974 loss: 0.27677229046821594
Epoch: 14 iteration: 150 lr: 0.00017015662717380974 loss: 0.25092434883117676
Epoch: 14 iteration: 200 lr: 0.00017015662717380974 loss: 0.23248010873794556
Epoch: 14 iteration: 250 lr: 0.00017015662717380974 loss: 0.2669617235660553
Computing features for evaluation...
Computing features for evaluation...
Epoch: 14
coco val: {'txt_r1': 15.82, 'txt_r5': 36.82, 'txt_r10': 49.34, 'txt_r_mean': 33.99333333333333, 'img_r1': 12.047181127548981, 'img_r5': 30.635745701719312, 'img_r10': 42.6109556177529, 'img_r_mean': 28.431294149007062, 'r_mean': 31.2123137411702}
coco test: {'txt_r1': 14.78, 'txt_r5': 35.76, 'txt_r10': 48.42, 'txt_r_mean': 32.98666666666667, 'img_r1': 12.243102758896441, 'img_r5': 30.695721711315475, 'img_r10': 42.36705317872851, 'img_r_mean': 28.435292549646807, 'r_mean': 30.71097960815674}
Epoch: 15 iteration: 0 lr: 0.000155 loss: 0.1818775236606598
Epoch: 15 iteration: 50 lr: 0.000155 loss: 0.21323110163211823
Epoch: 15 iteration: 100 lr: 0.000155 loss: 0.2310401201248169
Epoch: 15 iteration: 150 lr: 0.000155 loss: 0.2086959332227707
Epoch: 15 iteration: 200 lr: 0.000155 loss: 0.22357095777988434
Epoch: 15 iteration: 250 lr: 0.000155 loss: 0.24121759831905365
Computing features for evaluation...
Computing features for evaluation...
Epoch: 15
coco val: {'txt_r1': 16.8, 'txt_r5': 38.16, 'txt_r10': 51.14, 'txt_r_mean': 35.36666666666667, 'img_r1': 11.935225909636145, 'img_r5': 30.103958416633347, 'img_r10': 42.33106757297081, 'img_r_mean': 28.123417299746766, 'r_mean': 31.74504198320672}
coco test: {'txt_r1': 15.2, 'txt_r5': 37.24, 'txt_r10': 50.12, 'txt_r_mean': 34.18666666666667, 'img_r1': 11.611355457816874, 'img_r5': 30.403838464614154, 'img_r10': 42.12714914034386, 'img_r_mean': 28.047447687591628, 'r_mean': 31.117057177129148}
Epoch: 16 iteration: 0 lr: 0.00013984337282619026 loss: 0.20621338486671448
Epoch: 16 iteration: 50 lr: 0.00013984337282619026 loss: 0.20322853326797485
Epoch: 16 iteration: 100 lr: 0.00013984337282619026 loss: 0.2034672498703003
Epoch: 16 iteration: 150 lr: 0.00013984337282619026 loss: 0.2079382836818695
Epoch: 16 iteration: 200 lr: 0.00013984337282619026 loss: 0.21095183491706848
Epoch: 16 iteration: 250 lr: 0.00013984337282619026 loss: 0.20369692146778107
Computing features for evaluation...
Computing features for evaluation...
Epoch: 16
coco val: {'txt_r1': 16.8, 'txt_r5': 37.92, 'txt_r10': 50.68, 'txt_r_mean': 35.13333333333333, 'img_r1': 12.231107556977209, 'img_r5': 30.69972011195522, 'img_r10': 42.54698120751699, 'img_r_mean': 28.492602958816473, 'r_mean': 31.8129681460749}
coco test: {'txt_r1': 15.04, 'txt_r5': 36.64, 'txt_r10': 49.7, 'txt_r_mean': 33.79333333333333, 'img_r1': 12.29908036785286, 'img_r5': 30.86765293882447, 'img_r10': 42.5109956017593, 'img_r_mean': 28.559242969478873, 'r_mean': 31.1762881514061}
Epoch: 17 iteration: 0 lr: 0.00012485280483142487 loss: 0.16787829995155334
Epoch: 17 iteration: 50 lr: 0.00012485280483142487 loss: 0.16973815858364105
Epoch: 17 iteration: 100 lr: 0.00012485280483142487 loss: 0.17559704184532166
Epoch: 17 iteration: 150 lr: 0.00012485280483142487 loss: 0.19280369579792023
Epoch: 17 iteration: 200 lr: 0.00012485280483142487 loss: 0.18810811638832092
Epoch: 17 iteration: 250 lr: 0.00012485280483142487 loss: 0.1578725427389145
Computing features for evaluation...
Computing features for evaluation...
Epoch: 17
coco val: {'txt_r1': 17.0, 'txt_r5': 39.04, 'txt_r10': 51.44, 'txt_r_mean': 35.82666666666666, 'img_r1': 12.343062774890043, 'img_r5': 30.735705717712914, 'img_r10': 42.03118752499, 'img_r_mean': 28.369985339197655, 'r_mean': 32.098326002932154}
coco test: {'txt_r1': 15.72, 'txt_r5': 36.94, 'txt_r10': 50.26, 'txt_r_mean': 34.306666666666665, 'img_r1': 12.02718912435026, 'img_r5': 30.567772890843663, 'img_r10': 42.139144342263094, 'img_r_mean': 28.24470211915234, 'r_mean': 31.275684392909504}
Epoch: 18 iteration: 0 lr: 0.00011019253581563262 loss: 0.17534607648849487
Epoch: 18 iteration: 50 lr: 0.00011019253581563262 loss: 0.19806219637393951
Epoch: 18 iteration: 100 lr: 0.00011019253581563262 loss: 0.16321659088134766
Epoch: 18 iteration: 150 lr: 0.00011019253581563262 loss: 0.15023337304592133
Epoch: 18 iteration: 200 lr: 0.00011019253581563262 loss: 0.14350810647010803
Epoch: 18 iteration: 250 lr: 0.00011019253581563262 loss: 0.1909620463848114
Computing features for evaluation...
Computing features for evaluation...
Epoch: 18
coco val: {'txt_r1': 15.72, 'txt_r5': 37.14, 'txt_r10': 50.2, 'txt_r_mean': 34.35333333333333, 'img_r1': 12.00719712115154, 'img_r5': 30.091963214714113, 'img_r10': 41.819272291083564, 'img_r_mean': 27.972810875649742, 'r_mean': 31.163072104491537}
coco test: {'txt_r1': 14.98, 'txt_r5': 35.98, 'txt_r10': 48.94, 'txt_r_mean': 33.3, 'img_r1': 11.603358656537385, 'img_r5': 30.307876849260296, 'img_r10': 41.92323070771691, 'img_r_mean': 27.94482207117153, 'r_mean': 30.622411035585763}
Epoch: 19 iteration: 0 lr: 9.602318675400897e-05 loss: 0.17000208795070648
Epoch: 19 iteration: 50 lr: 9.602318675400897e-05 loss: 0.14290763437747955
Epoch: 19 iteration: 100 lr: 9.602318675400897e-05 loss: 0.1349085569381714
Epoch: 19 iteration: 150 lr: 9.602318675400897e-05 loss: 0.15767492353916168
Epoch: 19 iteration: 200 lr: 9.602318675400897e-05 loss: 0.15036305785179138
Epoch: 19 iteration: 250 lr: 9.602318675400897e-05 loss: 0.17334865033626556
Computing features for evaluation...
Computing features for evaluation...
Epoch: 19
coco val: {'txt_r1': 16.64, 'txt_r5': 38.02, 'txt_r10': 50.84, 'txt_r_mean': 35.166666666666664, 'img_r1': 12.287085165933627, 'img_r5': 30.635745701719312, 'img_r10': 42.091163534586165, 'img_r_mean': 28.3379981340797, 'r_mean': 31.752332400373184}
coco test: {'txt_r1': 15.64, 'txt_r5': 37.7, 'txt_r10': 49.62, 'txt_r_mean': 34.32, 'img_r1': 12.059176329468213, 'img_r5': 30.743702518992404, 'img_r10': 42.21111555377849, 'img_r_mean': 28.3379981340797, 'r_mean': 31.328999067039852}
Epoch: 20 iteration: 0 lr: 8.250000000000001e-05 loss: 0.14072063565254211
Epoch: 20 iteration: 50 lr: 8.250000000000001e-05 loss: 0.12933437526226044
Epoch: 20 iteration: 100 lr: 8.250000000000001e-05 loss: 0.20693959295749664
Epoch: 20 iteration: 150 lr: 8.250000000000001e-05 loss: 0.15231087803840637
Epoch: 20 iteration: 200 lr: 8.250000000000001e-05 loss: 0.15985363721847534
Epoch: 20 iteration: 250 lr: 8.250000000000001e-05 loss: 0.14119693636894226
Computing features for evaluation...
Computing features for evaluation...
Epoch: 20
coco val: {'txt_r1': 16.54, 'txt_r5': 39.04, 'txt_r10': 50.56, 'txt_r_mean': 35.38, 'img_r1': 12.774890043982406, 'img_r5': 31.20751699320272, 'img_r10': 43.06277489004398, 'img_r_mean': 29.015060642409704, 'r_mean': 32.19753032120485}
coco test: {'txt_r1': 15.62, 'txt_r5': 38.28, 'txt_r10': 51.2, 'txt_r_mean': 35.03333333333333, 'img_r1': 12.263094762095163, 'img_r5': 31.47141143542583, 'img_r10': 43.0187924830068, 'img_r_mean': 28.9177662268426, 'r_mean': 31.975549780087967}
Epoch: 21 iteration: 0 lr: 6.97711384175914e-05 loss: 0.12423430383205414
Epoch: 21 iteration: 50 lr: 6.97711384175914e-05 loss: 0.13421592116355896
Epoch: 21 iteration: 100 lr: 6.97711384175914e-05 loss: 0.09904897212982178
Epoch: 21 iteration: 150 lr: 6.97711384175914e-05 loss: 0.11255185306072235
Epoch: 21 iteration: 200 lr: 6.97711384175914e-05 loss: 0.14298436045646667
Epoch: 21 iteration: 250 lr: 6.97711384175914e-05 loss: 0.13077646493911743
Computing features for evaluation...
Computing features for evaluation...
Epoch: 21
coco val: {'txt_r1': 17.24, 'txt_r5': 38.74, 'txt_r10': 50.88, 'txt_r_mean': 35.620000000000005, 'img_r1': 12.558976409436225, 'img_r5': 31.331467413034787, 'img_r10': 42.630947620951616, 'img_r_mean': 28.84046381447421, 'r_mean': 32.230231907237105}
coco test: {'txt_r1': 15.84, 'txt_r5': 38.24, 'txt_r10': 51.44, 'txt_r_mean': 35.17333333333333, 'img_r1': 12.526989204318273, 'img_r5': 31.36745301879248, 'img_r10': 42.998800479808075, 'img_r_mean': 28.964414234306275, 'r_mean': 32.068873783819804}
Epoch: 22 iteration: 0 lr: 5.797606207796559e-05 loss: 0.09781420230865479
Epoch: 22 iteration: 50 lr: 5.797606207796559e-05 loss: 0.10436877608299255
Epoch: 22 iteration: 100 lr: 5.797606207796559e-05 loss: 0.09954556077718735
Epoch: 22 iteration: 150 lr: 5.797606207796559e-05 loss: 0.10239797830581665
Epoch: 22 iteration: 200 lr: 5.797606207796559e-05 loss: 0.15317881107330322
Epoch: 22 iteration: 250 lr: 5.797606207796559e-05 loss: 0.13270767033100128
Computing features for evaluation...
Computing features for evaluation...
Epoch: 22
coco val: {'txt_r1': 17.24, 'txt_r5': 38.46, 'txt_r10': 52.06, 'txt_r_mean': 35.92, 'img_r1': 12.782886845261896, 'img_r5': 31.6953218712515, 'img_r10': 43.02678928428629, 'img_r_mean': 29.168332666933225, 'r_mean': 32.54416633346661}
coco test: {'txt_r1': 15.6, 'txt_r5': 39.1, 'txt_r10': 50.8, 'txt_r_mean': 35.166666666666664, 'img_r1': 12.566973210715714, 'img_r5': 31.62734906037585, 'img_r10': 43.278688524590166, 'img_r_mean': 29.157670265227242, 'r_mean': 32.162168465946955}
Epoch: 23 iteration: 0 lr: 4.724400030577786e-05 loss: 0.09777984768152237
Epoch: 23 iteration: 50 lr: 4.724400030577786e-05 loss: 0.12258177995681763
Epoch: 23 iteration: 100 lr: 4.724400030577786e-05 loss: 0.1060154139995575
Epoch: 23 iteration: 150 lr: 4.724400030577786e-05 loss: 0.13091956079006195
Epoch: 23 iteration: 200 lr: 4.724400030577786e-05 loss: 0.10514585673809052
Epoch: 23 iteration: 250 lr: 4.724400030577786e-05 loss: 0.12769201397895813
Computing features for evaluation...
Computing features for evaluation...
Epoch: 23
coco val: {'txt_r1': 16.48, 'txt_r5': 38.66, 'txt_r10': 51.64, 'txt_r_mean': 35.593333333333334, 'img_r1': 12.686925229908036, 'img_r5': 31.591363454618154, 'img_r10': 43.114754098360656, 'img_r_mean': 29.131014260962285, 'r_mean': 32.36217379714781}
coco test: {'txt_r1': 15.62, 'txt_r5': 38.06, 'txt_r10': 51.1, 'txt_r_mean': 34.92666666666667, 'img_r1': 12.538984406237505, 'img_r5': 31.74330267892843, 'img_r10': 43.442622950819676, 'img_r_mean': 29.241636678661866, 'r_mean': 32.08415167266427}
Epoch: 24 iteration: 0 lr: 3.769253581563263e-05 loss: 0.08650655299425125
Epoch: 24 iteration: 50 lr: 3.769253581563263e-05 loss: 0.10609667003154755
Epoch: 24 iteration: 100 lr: 3.769253581563263e-05 loss: 0.10544316470623016
Epoch: 24 iteration: 150 lr: 3.769253581563263e-05 loss: 0.08425739407539368
Epoch: 24 iteration: 200 lr: 3.769253581563263e-05 loss: 0.11596322059631348
Epoch: 24 iteration: 250 lr: 3.769253581563263e-05 loss: 0.12456141412258148
Computing features for evaluation...
Computing features for evaluation...
Epoch: 24
coco val: {'txt_r1': 16.9, 'txt_r5': 39.34, 'txt_r10': 52.24, 'txt_r_mean': 36.160000000000004, 'img_r1': 12.730907636945222, 'img_r5': 31.463414634146343, 'img_r10': 43.2906837265094, 'img_r_mean': 29.161668665866987, 'r_mean': 32.6608343329335}
coco test: {'txt_r1': 16.0, 'txt_r5': 38.64, 'txt_r10': 51.36, 'txt_r_mean': 35.333333333333336, 'img_r1': 12.566973210715714, 'img_r5': 31.815273890443823, 'img_r10': 43.59456217512995, 'img_r_mean': 29.325603092096497, 'r_mean': 32.329468212714914}
Epoch: 25 iteration: 0 lr: 2.9426316451256386e-05 loss: 0.11349457502365112
Epoch: 25 iteration: 50 lr: 2.9426316451256386e-05 loss: 0.08233440667390823
Epoch: 25 iteration: 100 lr: 2.9426316451256386e-05 loss: 0.09436212480068207
Epoch: 25 iteration: 150 lr: 2.9426316451256386e-05 loss: 0.0920330286026001
Epoch: 25 iteration: 200 lr: 2.9426316451256386e-05 loss: 0.08613620698451996
Epoch: 25 iteration: 250 lr: 2.9426316451256386e-05 loss: 0.0929696261882782
Computing features for evaluation...
Computing features for evaluation...
Epoch: 25
coco val: {'txt_r1': 17.26, 'txt_r5': 39.76, 'txt_r10': 52.34, 'txt_r_mean': 36.45333333333333, 'img_r1': 12.69092363054778, 'img_r5': 31.43142742902839, 'img_r10': 43.04278288684526, 'img_r_mean': 29.055044648807144, 'r_mean': 32.75418899107024}
coco test: {'txt_r1': 15.84, 'txt_r5': 39.3, 'txt_r10': 51.54, 'txt_r_mean': 35.56, 'img_r1': 12.670931627349061, 'img_r5': 31.54338264694122, 'img_r10': 43.006797281087564, 'img_r_mean': 29.073703851792615, 'r_mean': 32.31685192589631}
Epoch: 26 iteration: 0 lr: 2.2535908641822855e-05 loss: 0.07479941099882126
Epoch: 26 iteration: 50 lr: 2.2535908641822855e-05 loss: 0.08746127784252167
Epoch: 26 iteration: 100 lr: 2.2535908641822855e-05 loss: 0.10455113649368286
Epoch: 26 iteration: 150 lr: 2.2535908641822855e-05 loss: 0.09784542769193649
Epoch: 26 iteration: 200 lr: 2.2535908641822855e-05 loss: 0.06572966277599335
Epoch: 26 iteration: 250 lr: 2.2535908641822855e-05 loss: 0.09240047633647919
Computing features for evaluation...
Computing features for evaluation...
Epoch: 26
coco val: {'txt_r1': 16.88, 'txt_r5': 39.52, 'txt_r10': 52.32, 'txt_r_mean': 36.24, 'img_r1': 12.794882047181128, 'img_r5': 31.70731707317073, 'img_r10': 43.29468212714914, 'img_r_mean': 29.265627082500334, 'r_mean': 32.75281354125017}
coco test: {'txt_r1': 15.76, 'txt_r5': 38.96, 'txt_r10': 51.74, 'txt_r_mean': 35.48666666666667, 'img_r1': 12.794882047181128, 'img_r5': 31.851259496201518, 'img_r10': 43.63454618152739, 'img_r_mean': 29.426895908303347, 'r_mean': 32.45678128748501}
Epoch: 27 iteration: 0 lr: 1.7096805137202738e-05 loss: 0.07049550861120224
Epoch: 27 iteration: 50 lr: 1.7096805137202738e-05 loss: 0.08527995645999908
Epoch: 27 iteration: 100 lr: 1.7096805137202738e-05 loss: 0.07916025817394257
Epoch: 27 iteration: 150 lr: 1.7096805137202738e-05 loss: 0.0926615446805954
Epoch: 27 iteration: 200 lr: 1.7096805137202738e-05 loss: 0.062070801854133606
Epoch: 27 iteration: 250 lr: 1.7096805137202738e-05 loss: 0.06778311729431152
Computing features for evaluation...
Computing features for evaluation...
Epoch: 27
coco val: {'txt_r1': 16.62, 'txt_r5': 39.08, 'txt_r10': 51.5, 'txt_r_mean': 35.733333333333334, 'img_r1': 12.854858056777289, 'img_r5': 31.679328268692522, 'img_r10': 43.238704518192726, 'img_r_mean': 29.257630281220845, 'r_mean': 32.49548180727709}
coco test: {'txt_r1': 15.62, 'txt_r5': 38.3, 'txt_r10': 51.6, 'txt_r_mean': 35.17333333333333, 'img_r1': 12.870851659336266, 'img_r5': 31.835265893642543, 'img_r10': 43.57856857257097, 'img_r_mean': 29.428228708516595, 'r_mean': 32.300781020924966}
Epoch: 28 iteration: 0 lr: 1.3168597893598175e-05 loss: 0.08952151238918304
Epoch: 28 iteration: 50 lr: 1.3168597893598175e-05 loss: 0.08497560024261475
Epoch: 28 iteration: 100 lr: 1.3168597893598175e-05 loss: 0.09802306443452835
Epoch: 28 iteration: 150 lr: 1.3168597893598175e-05 loss: 0.10137701034545898
Epoch: 28 iteration: 200 lr: 1.3168597893598175e-05 loss: 0.08434905111789703
Epoch: 28 iteration: 250 lr: 1.3168597893598175e-05 loss: 0.07585834711790085
Computing features for evaluation...
Computing features for evaluation...
Epoch: 28
coco val: {'txt_r1': 17.42, 'txt_r5': 39.7, 'txt_r10': 52.34, 'txt_r_mean': 36.48666666666667, 'img_r1': 13.066773290683727, 'img_r5': 31.999200319872052, 'img_r10': 43.47061175529788, 'img_r_mean': 29.51219512195122, 'r_mean': 32.999430894308944}
coco test: {'txt_r1': 16.02, 'txt_r5': 39.16, 'txt_r10': 52.3, 'txt_r_mean': 35.82666666666666, 'img_r1': 12.938824470211916, 'img_r5': 32.11515393842463, 'img_r10': 43.874450219912035, 'img_r_mean': 29.64280954284953, 'r_mean': 32.734738104758094}
Epoch: 29 iteration: 0 lr: 1.0794325171600358e-05 loss: 0.0904349684715271
Epoch: 29 iteration: 50 lr: 1.0794325171600358e-05 loss: 0.0633661150932312
Epoch: 29 iteration: 100 lr: 1.0794325171600358e-05 loss: 0.06782661378383636
Epoch: 29 iteration: 150 lr: 1.0794325171600358e-05 loss: 0.0833449587225914
Epoch: 29 iteration: 200 lr: 1.0794325171600358e-05 loss: 0.09229975193738937
Epoch: 29 iteration: 250 lr: 1.0794325171600358e-05 loss: 0.08226582407951355
Computing features for evaluation...
Computing features for evaluation...
Epoch: 29
coco val: {'txt_r1': 16.82, 'txt_r5': 39.46, 'txt_r10': 52.16, 'txt_r_mean': 36.14666666666667, 'img_r1': 12.922830867652939, 'img_r5': 31.747301079568174, 'img_r10': 43.45061975209916, 'img_r_mean': 29.373583899773422, 'r_mean': 32.76012528322005}
coco test: {'txt_r1': 15.9, 'txt_r5': 38.66, 'txt_r10': 51.54, 'txt_r_mean': 35.36666666666667, 'img_r1': 12.802878848460615, 'img_r5': 31.847261095561777, 'img_r10': 43.514594162335065, 'img_r_mean': 29.388244702119152, 'r_mean': 32.37745568439291}
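
Each ``coco val`` and ``coco test`` line in the log above is a Python dictionary literal. Judging from the logged numbers, ``txt_r_mean`` and ``img_r_mean`` are the averages of the three Recall@K values for each direction, and ``r_mean`` is the average of those two. The sketch below checks this on the final ``coco test`` line and shows one way to recover such dictionaries from saved log text. The helpers ``parse_results`` and ``recall_means`` are hypothetical names introduced here for illustration; they are not part of LibAUC.

.. code:: python

    import ast

    def parse_results(log_text, split="coco test"):
        """Collect the metric dictionaries logged for one split, in log order."""
        prefix = split + ": "
        return [ast.literal_eval(line.strip()[len(prefix):])
                for line in log_text.splitlines()
                if line.strip().startswith(prefix)]

    def recall_means(result):
        """Recompute the aggregate fields from the individual Recall@K values."""
        txt = (result["txt_r1"] + result["txt_r5"] + result["txt_r10"]) / 3
        img = (result["img_r1"] + result["img_r5"] + result["img_r10"]) / 3
        return txt, img, (txt + img) / 2

    # final "coco test" line of the log above (epoch 29), copied verbatim
    log_text = (
        "coco test: {'txt_r1': 15.9, 'txt_r5': 38.66, 'txt_r10': 51.54, "
        "'txt_r_mean': 35.36666666666667, 'img_r1': 12.802878848460615, "
        "'img_r5': 31.847261095561777, 'img_r10': 43.514594162335065, "
        "'img_r_mean': 29.388244702119152, 'r_mean': 32.37745568439291}"
    )

    result = parse_results(log_text)[0]
    txt_mean, img_mean, overall = recall_means(result)
    print(txt_mean, img_mean, overall)

Applied to a complete log, ``max(parse_results(text, "coco val"), key=lambda r: r["r_mean"])`` would return the validation dictionary with the highest mean recall, which is one simple way to choose a checkpoint.
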
Launch training and evaluation for iSogCLR
------------------------------------------
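
The ``lr`` values printed in the training logs of this tutorial are consistent with a brief warm-up during epoch 0 followed by a cosine decay. The sketch below reproduces the values logged from epoch 1 onward. The peak rate ``3e-4``, the floor ``1e-5``, and the horizon of 30 epochs are assumptions inferred from the logged numbers rather than read from the scheduler configuration, and ``cosine_lr`` is a hypothetical helper, not the function returned by ``create_scheduler``.

.. code:: python

    import math

    def cosine_lr(epoch, lr_max=3e-4, lr_min=1e-5, total_epochs=30):
        """Cosine-annealed learning rate at a given epoch (assumed hyperparameters)."""
        cosine = 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs))
        return lr_min + (lr_max - lr_min) * cosine

    # compare with the "lr:" column of the logs for the same epochs
    for epoch in (1, 10, 15, 20, 29):
        print(epoch, cosine_lr(epoch))

Under these assumptions the computed rates agree with the logged ones up to floating-point rounding, for example ``0.000155`` at epoch 15 and ``8.25e-05`` at epoch 20.
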
.. code:: python

    # create the tokenizer and the model, then move the model to the GPU
    # (the optional nn.DataParallel wrapping happens in the next cell)
    tokenizer = AutoTokenizer.from_pretrained(text_encoder, local_files_only=False)
    model = Model(image_encoder=image_encoder, text_encoder=text_encoder, embed_dim=embed_dim,
                  init_model=True, bsz=batch_size_train, loss_type='isogclr',
                  gamma=gamma, temp=temp, rho=rho, eta=eta, tau_init=tau_init, beta_u=beta_u)
    model = model.cuda()

.. code:: python

    if n_gpus > 1:
        print("Using", n_gpus, "GPUs")
        model = nn.DataParallel(model)

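
``nn.DataParallel`` keeps the original network in its ``module`` attribute, so code that needs the underlying ``Model`` object has to handle both the wrapped and the unwrapped case. A minimal sketch of one way to do this explicitly is shown below. ``unwrap_model`` is a hypothetical helper, not a LibAUC or PyTorch function, and the two toy classes are plain-Python stand-ins that mimic only the ``module`` attribute so the example runs without a GPU.

.. code:: python

    def unwrap_model(model):
        """Return the wrapped network if `model` exposes one via `.module`."""
        # nn.DataParallel stores the original network in `module`;
        # an unwrapped model is returned unchanged.
        return model.module if hasattr(model, "module") else model


    # Stand-ins for illustration only (assumptions, not tutorial classes).
    class ToyModel:
        pass

    class ToyWrapper:
        """Mimics only the `.module` attribute of nn.DataParallel."""
        def __init__(self, module):
            self.module = module

    net = ToyModel()
    print(unwrap_model(ToyWrapper(net)) is net)  # wrapped case
    print(unwrap_model(net) is net)              # unwrapped case

With such a helper, an evaluation call could be written once as ``evaluation(unwrap_model(model), val_loader, tokenizer, torch.device('cuda'))``. Note that this attribute check would also unwrap a model that happens to define its own ``module`` attribute; testing ``isinstance(model, nn.DataParallel)`` is a stricter alternative.
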
.. code:: python

    # set up the optimizer, the learning-rate scheduler, and the optional AMP gradient scaler
    optimizer = create_optimizer(model, opt, weight_decay)
    lr_scheduler = create_scheduler(optimizer)

    if use_amp:
        grad_scaler = torch.cuda.amp.GradScaler()
    else:
        grad_scaler = None

    # training loop
    for epoch in range(0, epochs):
        train_stats = epoch_train(model, train_loader, optimizer, tokenizer, epoch, epochs,
                                  warmup_epochs, torch.device('cuda'), lr_scheduler, grad_scaler)

        # evaluate the model on the MS-COCO data
        try:
            # multi-GPU case: the model is wrapped in nn.DataParallel, so evaluate model.module
            score_val_i2t_coco, score_val_t2i_coco = evaluation(model.module, val_loader, tokenizer, torch.device('cuda'))
            score_test_i2t_coco, score_test_t2i_coco = evaluation(model.module, test_loader, tokenizer, torch.device('cuda'))
        except AttributeError:
            # single-GPU case: the unwrapped model has no .module attribute
            score_val_i2t_coco, score_val_t2i_coco = evaluation(model, val_loader, tokenizer, torch.device('cuda'))
            score_test_i2t_coco, score_test_t2i_coco = evaluation(model, test_loader, tokenizer, torch.device('cuda'))

        print("Epoch:", epoch)
        val_result_coco = itm_eval(score_val_i2t_coco, score_val_t2i_coco, val_loader.dataset.txt2img, val_loader.dataset.img2txt)
        print("coco val:", val_result_coco)
        test_result_coco = itm_eval(score_test_i2t_coco, score_test_t2i_coco, test_loader.dataset.txt2img, test_loader.dataset.img2txt)
        print("coco test:", test_result_coco)

        # advance the learning-rate schedule once per epoch
        lr_scheduler.step(epoch+warmup_epochs+1)

.. parsed-literal::
Epoch: 0 iteration: 0 lr: 1e-05 loss: 24.701007843017578
tau_img: 0.0100, tau_txt: 0.0100
Epoch: 0 iteration: 50 lr: 1e-05 loss: 10.574981689453125
tau_img: 0.0100, tau_txt: 0.0100
Epoch: 0 iteration: 100 lr: 2.45e-05 loss: 4.697925567626953
tau_img: 0.0100, tau_txt: 0.0100
Epoch: 0 iteration: 150 lr: 2.45e-05 loss: 1.9576847553253174
tau_img: 0.0100, tau_txt: 0.0100
Epoch: 0 iteration: 200 lr: 3.899999999999999e-05 loss: 1.0460829734802246
tau_img: 0.0100, tau_txt: 0.0100
Epoch: 0 iteration: 250 lr: 3.899999999999999e-05 loss: 0.5043810606002808
tau_img: 0.0100, tau_txt: 0.0100
Computing features for evaluation...
Computing features for evaluation...
Epoch: 0
coco val: {'txt_r1': 4.1, 'txt_r5': 13.8, 'txt_r10': 21.34, 'txt_r_mean': 13.079999999999998, 'img_r1': 2.0591763294682126, 'img_r5': 7.860855657736905, 'img_r10': 13.13874450219912, 'img_r_mean': 7.686258829801413, 'r_mean': 10.383129414900706}
coco test: {'txt_r1': 4.2, 'txt_r5': 12.7, 'txt_r10': 20.2, 'txt_r_mean': 12.366666666666665, 'img_r1': 1.9832067173130747, 'img_r5': 7.493002798880448, 'img_r10': 12.950819672131148, 'img_r_mean': 7.4756763961082235, 'r_mean': 9.921171531387444}
Epoch: 1 iteration: 0 lr: 0.0002992056748283996 loss: 1.3195196390151978
tau_img: 0.0094, tau_txt: 0.0095
Epoch: 1 iteration: 50 lr: 0.0002992056748283996 loss: 0.075884610414505
tau_img: 0.0094, tau_txt: 0.0095
Epoch: 1 iteration: 100 lr: 0.0002992056748283996 loss: 0.3162369430065155
tau_img: 0.0094, tau_txt: 0.0095
Epoch: 1 iteration: 150 lr: 0.0002992056748283996 loss: 0.1882624328136444
tau_img: 0.0094, tau_txt: 0.0095
Epoch: 1 iteration: 200 lr: 0.0002992056748283996 loss: -0.10296255350112915
tau_img: 0.0094, tau_txt: 0.0095
Epoch: 1 iteration: 250 lr: 0.0002992056748283996 loss: 0.15444990992546082
tau_img: 0.0094, tau_txt: 0.0095
Computing features for evaluation...
Computing features for evaluation...
Epoch: 1
coco val: {'txt_r1': 12.22, 'txt_r5': 28.74, 'txt_r10': 40.32, 'txt_r_mean': 27.093333333333334, 'img_r1': 5.881647341063575, 'img_r5': 18.10075969612155, 'img_r10': 27.608956417433028, 'img_r_mean': 17.197121151539385, 'r_mean': 22.14522724243636}
coco test: {'txt_r1': 11.34, 'txt_r5': 29.4, 'txt_r10': 40.32, 'txt_r_mean': 27.02, 'img_r1': 5.593762495001999, 'img_r5': 18.376649340263896, 'img_r10': 27.984806077568972, 'img_r_mean': 17.318405970944955, 'r_mean': 22.169202985472477}
Epoch: 2 iteration: 0 lr: 0.0002968314021064018 loss: -0.0604383647441864
tau_img: 0.0088, tau_txt: 0.0088
Epoch: 2 iteration: 50 lr: 0.0002968314021064018 loss: 0.23243539035320282
tau_img: 0.0088, tau_txt: 0.0088
Epoch: 2 iteration: 100 lr: 0.0002968314021064018 loss: 0.04821205139160156
tau_img: 0.0088, tau_txt: 0.0088
Epoch: 2 iteration: 150 lr: 0.0002968314021064018 loss: 0.21965868771076202
tau_img: 0.0088, tau_txt: 0.0088
Epoch: 2 iteration: 200 lr: 0.0002968314021064018 loss: 0.05134771019220352
tau_img: 0.0088, tau_txt: 0.0088
Epoch: 2 iteration: 250 lr: 0.0002968314021064018 loss: 0.1536252200603485
tau_img: 0.0088, tau_txt: 0.0088
Computing features for evaluation...
Computing features for evaluation...
Epoch: 2
coco val: {'txt_r1': 14.64, 'txt_r5': 35.0, 'txt_r10': 46.5, 'txt_r_mean': 32.04666666666667, 'img_r1': 7.97281087564974, 'img_r5': 22.898840463814473, 'img_r10': 33.77049180327869, 'img_r_mean': 21.547381047580966, 'r_mean': 26.79702385712382}
coco test: {'txt_r1': 15.14, 'txt_r5': 34.42, 'txt_r10': 46.54, 'txt_r_mean': 32.03333333333333, 'img_r1': 8.388644542183126, 'img_r5': 23.594562175129948, 'img_r10': 34.406237504998, 'img_r_mean': 22.12981474077036, 'r_mean': 27.081574037051844}
Epoch: 3 iteration: 0 lr: 0.00029290319486279724 loss: -0.29481595754623413
tau_img: 0.0083, tau_txt: 0.0081
Epoch: 3 iteration: 50 lr: 0.00029290319486279724 loss: 0.06638230383396149
tau_img: 0.0083, tau_txt: 0.0081
Epoch: 3 iteration: 100 lr: 0.00029290319486279724 loss: 0.03567551076412201
tau_img: 0.0083, tau_txt: 0.0082
Epoch: 3 iteration: 150 lr: 0.00029290319486279724 loss: 0.05767179653048515
tau_img: 0.0083, tau_txt: 0.0081
Epoch: 3 iteration: 200 lr: 0.00029290319486279724 loss: 0.056682661175727844
tau_img: 0.0083, tau_txt: 0.0082
Epoch: 3 iteration: 250 lr: 0.00029290319486279724 loss: 0.28257113695144653
tau_img: 0.0083, tau_txt: 0.0082
Computing features for evaluation...
Computing features for evaluation...
Epoch: 3
coco val: {'txt_r1': 15.9, 'txt_r5': 37.2, 'txt_r10': 49.18, 'txt_r_mean': 34.093333333333334, 'img_r1': 9.70811675329868, 'img_r5': 26.3734506197521, 'img_r10': 37.31707317073171, 'img_r_mean': 24.466213514594163, 'r_mean': 29.279773423963746}
coco test: {'txt_r1': 15.52, 'txt_r5': 37.28, 'txt_r10': 48.94, 'txt_r_mean': 33.913333333333334, 'img_r1': 9.660135945621752, 'img_r5': 26.66533386645342, 'img_r10': 37.49300279888045, 'img_r_mean': 24.606157536985204, 'r_mean': 29.259745435159267}
Epoch: 4 iteration: 0 lr: 0.00028746409135817707 loss: -0.2583860158920288
tau_img: 0.0079, tau_txt: 0.0077
Epoch: 4 iteration: 50 lr: 0.00028746409135817707 loss: 0.04029808193445206
tau_img: 0.0079, tau_txt: 0.0076
Epoch: 4 iteration: 100 lr: 0.00028746409135817707 loss: 0.11739009618759155
tau_img: 0.0079, tau_txt: 0.0076
Epoch: 4 iteration: 150 lr: 0.00028746409135817707 loss: 0.32731348276138306
tau_img: 0.0079, tau_txt: 0.0076
Epoch: 4 iteration: 200 lr: 0.00028746409135817707 loss: -0.00629810243844986
tau_img: 0.0079, tau_txt: 0.0076
Epoch: 4 iteration: 250 lr: 0.00028746409135817707 loss: 0.15173837542533875
tau_img: 0.0079, tau_txt: 0.0076
Computing features for evaluation...
Computing features for evaluation...
Epoch: 4
coco val: {'txt_r1': 17.16, 'txt_r5': 38.44, 'txt_r10': 50.34, 'txt_r_mean': 35.31333333333333, 'img_r1': 10.903638544582167, 'img_r5': 27.86485405837665, 'img_r10': 39.40423830467813, 'img_r_mean': 26.057576969212317, 'r_mean': 30.685455151272826}
coco test: {'txt_r1': 17.0, 'txt_r5': 37.84, 'txt_r10': 50.16, 'txt_r_mean': 35.0, 'img_r1': 10.415833666533386, 'img_r5': 28.58856457417033, 'img_r10': 40.26789284286286, 'img_r_mean': 26.424097027855524, 'r_mean': 30.712048513927762}
Epoch: 5 iteration: 0 lr: 0.0002805736835487436 loss: -0.4848897457122803
tau_img: 0.0075, tau_txt: 0.0072
Epoch: 5 iteration: 50 lr: 0.0002805736835487436 loss: 0.06531377136707306
tau_img: 0.0075, tau_txt: 0.0072
Epoch: 5 iteration: 100 lr: 0.0002805736835487436 loss: 0.09321524202823639
tau_img: 0.0075, tau_txt: 0.0072
Epoch: 5 iteration: 150 lr: 0.0002805736835487436 loss: 0.218039870262146
tau_img: 0.0075, tau_txt: 0.0073
Epoch: 5 iteration: 200 lr: 0.0002805736835487436 loss: 0.1558637171983719
tau_img: 0.0075, tau_txt: 0.0072
Epoch: 5 iteration: 250 lr: 0.0002805736835487436 loss: -0.09588228911161423
tau_img: 0.0075, tau_txt: 0.0072
Computing features for evaluation...
Computing features for evaluation...
Epoch: 5
coco val: {'txt_r1': 18.54, 'txt_r5': 40.0, 'txt_r10': 51.6, 'txt_r_mean': 36.71333333333333, 'img_r1': 11.015593762495001, 'img_r5': 28.984406237505, 'img_r10': 40.42782886845262, 'img_r_mean': 26.809276289484206, 'r_mean': 31.76130481140877}
coco test: {'txt_r1': 16.56, 'txt_r5': 38.8, 'txt_r10': 51.22, 'txt_r_mean': 35.526666666666664, 'img_r1': 11.107556977209116, 'img_r5': 29.072371051579367, 'img_r10': 40.77169132347061, 'img_r_mean': 26.983873117419694, 'r_mean': 31.25526989204318}
Epoch: 6 iteration: 0 lr: 0.0002723074641843674 loss: -0.5769622325897217
tau_img: 0.0072, tau_txt: 0.0069
Epoch: 6 iteration: 50 lr: 0.0002723074641843674 loss: 0.37227633595466614
tau_img: 0.0072, tau_txt: 0.0069
Epoch: 6 iteration: 100 lr: 0.0002723074641843674 loss: 0.06294765323400497
tau_img: 0.0072, tau_txt: 0.0069
Epoch: 6 iteration: 150 lr: 0.0002723074641843674 loss: -0.028086403384804726
tau_img: 0.0072, tau_txt: 0.0069
Epoch: 6 iteration: 200 lr: 0.0002723074641843674 loss: 0.08182275295257568
tau_img: 0.0072, tau_txt: 0.0069
Epoch: 6 iteration: 250 lr: 0.0002723074641843674 loss: 0.16375750303268433
tau_img: 0.0072, tau_txt: 0.0069
Computing features for evaluation...
Computing features for evaluation...
Epoch: 6
coco val: {'txt_r1': 18.02, 'txt_r5': 40.82, 'txt_r10': 53.12, 'txt_r_mean': 37.32, 'img_r1': 11.431427429028389, 'img_r5': 29.748100759696122, 'img_r10': 41.47940823670532, 'img_r_mean': 27.55297880847661, 'r_mean': 32.4364894042383}
coco test: {'txt_r1': 17.68, 'txt_r5': 40.18, 'txt_r10': 52.56, 'txt_r_mean': 36.806666666666665, 'img_r1': 11.75529788084766, 'img_r5': 30.151939224310276, 'img_r10': 41.89924030387845, 'img_r_mean': 27.935492469678792, 'r_mean': 32.37107956817273}
Epoch: 7 iteration: 0 lr: 0.00026275599969422214 loss: -0.4518427550792694
tau_img: 0.0070, tau_txt: 0.0067
Epoch: 7 iteration: 50 lr: 0.00026275599969422214 loss: 0.2819710075855255
tau_img: 0.0070, tau_txt: 0.0067
Epoch: 7 iteration: 100 lr: 0.00026275599969422214 loss: 0.05290326103568077
tau_img: 0.0070, tau_txt: 0.0067
Epoch: 7 iteration: 150 lr: 0.00026275599969422214 loss: -0.008920110762119293
tau_img: 0.0070, tau_txt: 0.0067
Epoch: 7 iteration: 200 lr: 0.00026275599969422214 loss: 0.2930781841278076
tau_img: 0.0070, tau_txt: 0.0067
Epoch: 7 iteration: 250 lr: 0.00026275599969422214 loss: 0.14736725389957428
tau_img: 0.0070, tau_txt: 0.0067
Computing features for evaluation...
Computing features for evaluation...
Epoch: 7
coco val: {'txt_r1': 17.88, 'txt_r5': 40.54, 'txt_r10': 52.78, 'txt_r_mean': 37.06666666666667, 'img_r1': 11.571371451419433, 'img_r5': 30.023990403838464, 'img_r10': 41.543382646941225, 'img_r_mean': 27.71291483406638, 'r_mean': 32.38979075036652}
coco test: {'txt_r1': 18.14, 'txt_r5': 39.58, 'txt_r10': 51.58, 'txt_r_mean': 36.43333333333333, 'img_r1': 12.167133146741303, 'img_r5': 30.851659336265495, 'img_r10': 42.4390243902439, 'img_r_mean': 28.485938957750232, 'r_mean': 32.45963614554178}
Epoch: 8 iteration: 0 lr: 0.0002520239379220344 loss: -0.36706972122192383
tau_img: 0.0068, tau_txt: 0.0065
Epoch: 8 iteration: 50 lr: 0.0002520239379220344 loss: -0.229108527302742
tau_img: 0.0068, tau_txt: 0.0065
Epoch: 8 iteration: 100 lr: 0.0002520239379220344 loss: 0.31043940782546997
tau_img: 0.0068, tau_txt: 0.0065
Epoch: 8 iteration: 150 lr: 0.0002520239379220344 loss: 0.00404047966003418
tau_img: 0.0069, tau_txt: 0.0066
Epoch: 8 iteration: 200 lr: 0.0002520239379220344 loss: -0.24809685349464417
tau_img: 0.0069, tau_txt: 0.0066
Epoch: 8 iteration: 250 lr: 0.0002520239379220344 loss: -0.2770186960697174
tau_img: 0.0068, tau_txt: 0.0065
Computing features for evaluation...
Computing features for evaluation...
Epoch: 8
coco val: {'txt_r1': 16.92, 'txt_r5': 38.66, 'txt_r10': 51.2, 'txt_r_mean': 35.593333333333334, 'img_r1': 11.38344662135146, 'img_r5': 29.760095961615352, 'img_r10': 41.63934426229508, 'img_r_mean': 27.5942956150873, 'r_mean': 31.59381447421032}
coco test: {'txt_r1': 17.36, 'txt_r5': 38.22, 'txt_r10': 50.44, 'txt_r_mean': 35.339999999999996, 'img_r1': 11.82327069172331, 'img_r5': 30.3718512594962, 'img_r10': 41.74330267892843, 'img_r_mean': 27.979474876715983, 'r_mean': 31.65973743835799}
Epoch: 9 iteration: 0 lr: 0.00024022886158240857 loss: -0.7354167699813843
tau_img: 0.0067, tau_txt: 0.0064
Epoch: 9 iteration: 50 lr: 0.00024022886158240857 loss: -0.14618906378746033
tau_img: 0.0067, tau_txt: 0.0064
Epoch: 9 iteration: 100 lr: 0.00024022886158240857 loss: 0.12334905564785004
tau_img: 0.0066, tau_txt: 0.0064
Epoch: 9 iteration: 150 lr: 0.00024022886158240857 loss: -0.45143190026283264
tau_img: 0.0067, tau_txt: 0.0065
Epoch: 9 iteration: 200 lr: 0.00024022886158240857 loss: 0.06901969015598297
tau_img: 0.0067, tau_txt: 0.0065
Epoch: 9 iteration: 250 lr: 0.00024022886158240857 loss: 0.02915862947702408
tau_img: 0.0067, tau_txt: 0.0064
Computing features for evaluation...
Computing features for evaluation...
Epoch: 9
coco val: {'txt_r1': 17.24, 'txt_r5': 39.68, 'txt_r10': 52.52, 'txt_r_mean': 36.48, 'img_r1': 11.943222710915634, 'img_r5': 30.279888044782087, 'img_r10': 42.059176329468215, 'img_r_mean': 28.094095695055312, 'r_mean': 32.28704784752765}
coco test: {'txt_r1': 17.64, 'txt_r5': 39.44, 'txt_r10': 50.9, 'txt_r_mean': 35.99333333333333, 'img_r1': 11.975209916033586, 'img_r5': 30.463814474210317, 'img_r10': 41.97920831667333, 'img_r_mean': 28.13941090230574, 'r_mean': 32.06637211781954}
Epoch: 10 iteration: 0 lr: 0.00022749999999999997 loss: -0.9465005993843079
tau_img: 0.0066, tau_txt: 0.0064
Epoch: 10 iteration: 50 lr: 0.00022749999999999997 loss: -0.1919674426317215
tau_img: 0.0066, tau_txt: 0.0064
Epoch: 10 iteration: 100 lr: 0.00022749999999999997 loss: 0.0656488761305809
tau_img: 0.0066, tau_txt: 0.0063
Epoch: 10 iteration: 150 lr: 0.00022749999999999997 loss: 0.15473569929599762
tau_img: 0.0066, tau_txt: 0.0063
Epoch: 10 iteration: 200 lr: 0.00022749999999999997 loss: 0.048671215772628784
tau_img: 0.0066, tau_txt: 0.0064
Epoch: 10 iteration: 250 lr: 0.00022749999999999997 loss: 0.05919775739312172
tau_img: 0.0066, tau_txt: 0.0063
Computing features for evaluation...
Computing features for evaluation...
Epoch: 10
coco val: {'txt_r1': 17.54, 'txt_r5': 39.96, 'txt_r10': 52.46, 'txt_r_mean': 36.653333333333336, 'img_r1': 12.039184326269492, 'img_r5': 30.89564174330268, 'img_r10': 42.55897640943623, 'img_r_mean': 28.497934159669466, 'r_mean': 32.5756337465014}
coco test: {'txt_r1': 17.24, 'txt_r5': 38.94, 'txt_r10': 51.24, 'txt_r_mean': 35.806666666666665, 'img_r1': 12.191123550579768, 'img_r5': 30.947620951619353, 'img_r10': 42.958816473410636, 'img_r_mean': 28.69918699186992, 'r_mean': 32.25292682926829}
Epoch: 11 iteration: 0 lr: 0.00021397681324599103 loss: -0.8527200222015381
tau_img: 0.0066, tau_txt: 0.0064
Epoch: 11 iteration: 50 lr: 0.00021397681324599103 loss: -0.310724675655365
tau_img: 0.0066, tau_txt: 0.0064
Epoch: 11 iteration: 100 lr: 0.00021397681324599103 loss: -0.18071337044239044
tau_img: 0.0066, tau_txt: 0.0064
Epoch: 11 iteration: 150 lr: 0.00021397681324599103 loss: -0.15896828472614288
tau_img: 0.0067, tau_txt: 0.0064
Epoch: 11 iteration: 200 lr: 0.00021397681324599103 loss: 0.125459223985672
tau_img: 0.0066, tau_txt: 0.0064
Epoch: 11 iteration: 250 lr: 0.00021397681324599103 loss: 0.005948394536972046
tau_img: 0.0066, tau_txt: 0.0064
Computing features for evaluation...
Computing features for evaluation...
Epoch: 11
coco val: {'txt_r1': 18.22, 'txt_r5': 40.72, 'txt_r10': 53.08, 'txt_r_mean': 37.339999999999996, 'img_r1': 12.367053178728508, 'img_r5': 31.231507397041185, 'img_r10': 42.890843662534984, 'img_r_mean': 28.829801412768223, 'r_mean': 33.08490070638411}
coco test: {'txt_r1': 19.12, 'txt_r5': 40.38, 'txt_r10': 52.52, 'txt_r_mean': 37.34, 'img_r1': 12.29908036785286, 'img_r5': 31.215513794482206, 'img_r10': 43.082766893242706, 'img_r_mean': 28.865787018525925, 'r_mean': 33.10289350926296}
Epoch: 12 iteration: 0 lr: 0.00019980746418436736 loss: -0.8759943246841431
tau_img: 0.0066, tau_txt: 0.0064
Epoch: 12 iteration: 50 lr: 0.00019980746418436736 loss: -0.6733912229537964
tau_img: 0.0067, tau_txt: 0.0064
Epoch: 12 iteration: 100 lr: 0.00019980746418436736 loss: 0.007951691746711731
tau_img: 0.0066, tau_txt: 0.0064
Epoch: 12 iteration: 150 lr: 0.00019980746418436736 loss: -0.27293896675109863
tau_img: 0.0066, tau_txt: 0.0064
Epoch: 12 iteration: 200 lr: 0.00019980746418436736 loss: -0.604184627532959
tau_img: 0.0067, tau_txt: 0.0065
Epoch: 12 iteration: 250 lr: 0.00019980746418436736 loss: -0.08432623744010925
tau_img: 0.0066, tau_txt: 0.0064
Computing features for evaluation...
Computing features for evaluation...
Epoch: 12
coco val: {'txt_r1': 18.26, 'txt_r5': 40.38, 'txt_r10': 53.12, 'txt_r_mean': 37.25333333333333, 'img_r1': 12.522990803678528, 'img_r5': 31.70731707317073, 'img_r10': 43.122750899640145, 'img_r_mean': 29.117686258829803, 'r_mean': 33.18550979608157}
coco test: {'txt_r1': 17.34, 'txt_r5': 39.08, 'txt_r10': 52.32, 'txt_r_mean': 36.24666666666667, 'img_r1': 12.798880447820872, 'img_r5': 31.759296281487405, 'img_r10': 43.05477808876449, 'img_r_mean': 29.204318272690927, 'r_mean': 32.7254924696788}
Epoch: 13 iteration: 0 lr: 0.00018514719516857505 loss: -1.3101189136505127
tau_img: 0.0069, tau_txt: 0.0066
Epoch: 13 iteration: 50 lr: 0.00018514719516857505 loss: -0.5373433828353882
tau_img: 0.0068, tau_txt: 0.0065
Epoch: 13 iteration: 100 lr: 0.00018514719516857505 loss: -0.2286771833896637
tau_img: 0.0068, tau_txt: 0.0065
Epoch: 13 iteration: 150 lr: 0.00018514719516857505 loss: -0.17678964138031006
tau_img: 0.0067, tau_txt: 0.0064
Epoch: 13 iteration: 200 lr: 0.00018514719516857505 loss: -0.24495404958724976
tau_img: 0.0068, tau_txt: 0.0066
Epoch: 13 iteration: 250 lr: 0.00018514719516857505 loss: -0.5934573411941528
tau_img: 0.0068, tau_txt: 0.0066
Computing features for evaluation...
Computing features for evaluation...
Epoch: 13
coco val: {'txt_r1': 19.1, 'txt_r5': 40.84, 'txt_r10': 53.04, 'txt_r_mean': 37.660000000000004, 'img_r1': 12.538984406237505, 'img_r5': 31.36345461815274, 'img_r10': 42.94282287085166, 'img_r_mean': 28.9484206317473, 'r_mean': 33.304210315873654}
coco test: {'txt_r1': 18.26, 'txt_r5': 40.74, 'txt_r10': 53.12, 'txt_r_mean': 37.373333333333335, 'img_r1': 12.810875649740105, 'img_r5': 31.955217912834865, 'img_r10': 43.398640543782484, 'img_r_mean': 29.388244702119152, 'r_mean': 33.380789017726244}
Epoch: 14 iteration: 0 lr: 0.00017015662717380974 loss: -1.136932611465454
tau_img: 0.0069, tau_txt: 0.0067
Epoch: 14 iteration: 50 lr: 0.00017015662717380974 loss: -1.2352209091186523
tau_img: 0.0071, tau_txt: 0.0068
Epoch: 14 iteration: 100 lr: 0.00017015662717380974 loss: -0.3656700551509857
tau_img: 0.0069, tau_txt: 0.0067
Epoch: 14 iteration: 150 lr: 0.00017015662717380974 loss: -0.7482412457466125
tau_img: 0.0068, tau_txt: 0.0066
Epoch: 14 iteration: 200 lr: 0.00017015662717380974 loss: -0.6269024014472961
tau_img: 0.0070, tau_txt: 0.0068
Epoch: 14 iteration: 250 lr: 0.00017015662717380974 loss: -0.8550422191619873
tau_img: 0.0070, tau_txt: 0.0067
Computing features for evaluation...
Computing features for evaluation...
Epoch: 14
coco val: {'txt_r1': 18.52, 'txt_r5': 39.8, 'txt_r10': 52.78, 'txt_r_mean': 37.03333333333333, 'img_r1': 12.758896441423431, 'img_r5': 32.279088364654136, 'img_r10': 44.17433026789284, 'img_r_mean': 29.73743835799014, 'r_mean': 33.38538584566174}
coco test: {'txt_r1': 17.52, 'txt_r5': 39.96, 'txt_r10': 51.94, 'txt_r_mean': 36.473333333333336, 'img_r1': 12.902838864454218, 'img_r5': 31.887245101959216, 'img_r10': 43.750499800079965, 'img_r_mean': 29.513527922164467, 'r_mean': 32.9934306277489}
Epoch: 15 iteration: 0 lr: 0.000155 loss: -1.8559613227844238
tau_img: 0.0072, tau_txt: 0.0069
Epoch: 15 iteration: 50 lr: 0.000155 loss: -1.2427170276641846
tau_img: 0.0073, tau_txt: 0.0070
Epoch: 15 iteration: 100 lr: 0.000155 loss: -1.1395246982574463
tau_img: 0.0072, tau_txt: 0.0070
Epoch: 15 iteration: 150 lr: 0.000155 loss: -1.4752817153930664
tau_img: 0.0072, tau_txt: 0.0069
Epoch: 15 iteration: 200 lr: 0.000155 loss: -1.8828952312469482
tau_img: 0.0072, tau_txt: 0.0070
Epoch: 15 iteration: 250 lr: 0.000155 loss: -1.181127905845642
tau_img: 0.0072, tau_txt: 0.0070
Computing features for evaluation...
Computing features for evaluation...
Epoch: 15
coco val: {'txt_r1': 19.9, 'txt_r5': 43.36, 'txt_r10': 55.22, 'txt_r_mean': 39.49333333333333, 'img_r1': 13.478608556577369, 'img_r5': 32.810875649740105, 'img_r10': 44.40223910435826, 'img_r_mean': 30.230574436891914, 'r_mean': 34.86195388511263}
coco test: {'txt_r1': 19.58, 'txt_r5': 43.1, 'txt_r10': 54.84, 'txt_r_mean': 39.17333333333334, 'img_r1': 13.642542982806876, 'img_r5': 33.218712514994, 'img_r10': 44.718112754898044, 'img_r_mean': 30.526456084232976, 'r_mean': 34.849894708783154}
Epoch: 16 iteration: 0 lr: 0.00013984337282619026 loss: -2.054107189178467
tau_img: 0.0073, tau_txt: 0.0072
Epoch: 16 iteration: 50 lr: 0.00013984337282619026 loss: -1.3603992462158203
tau_img: 0.0073, tau_txt: 0.0071
Epoch: 16 iteration: 100 lr: 0.00013984337282619026 loss: -1.8992851972579956
tau_img: 0.0074, tau_txt: 0.0071
Epoch: 16 iteration: 150 lr: 0.00013984337282619026 loss: -1.8692710399627686
tau_img: 0.0074, tau_txt: 0.0072
Epoch: 16 iteration: 200 lr: 0.00013984337282619026 loss: -1.7104038000106812
tau_img: 0.0075, tau_txt: 0.0072
Epoch: 16 iteration: 250 lr: 0.00013984337282619026 loss: -1.380126953125
tau_img: 0.0073, tau_txt: 0.0071
Computing features for evaluation...
Computing features for evaluation...
Epoch: 16
coco val: {'txt_r1': 20.58, 'txt_r5': 43.24, 'txt_r10': 55.3, 'txt_r_mean': 39.70666666666667, 'img_r1': 13.15873650539784, 'img_r5': 32.99480207916833, 'img_r10': 44.586165533786485, 'img_r_mean': 30.24656803945088, 'r_mean': 34.97661735305878}
coco test: {'txt_r1': 19.36, 'txt_r5': 42.4, 'txt_r10': 54.48, 'txt_r_mean': 38.74666666666666, 'img_r1': 13.666533386645343, 'img_r5': 33.2546981207517, 'img_r10': 44.65413834466214, 'img_r_mean': 30.525123284019724, 'r_mean': 34.63589497534319}
Epoch: 17 iteration: 0 lr: 0.00012485280483142487 loss: -2.5637669563293457
tau_img: 0.0075, tau_txt: 0.0073
Epoch: 17 iteration: 50 lr: 0.00012485280483142487 loss: -2.191415309906006
tau_img: 0.0078, tau_txt: 0.0075
Epoch: 17 iteration: 100 lr: 0.00012485280483142487 loss: -2.321763515472412
tau_img: 0.0077, tau_txt: 0.0074
Epoch: 17 iteration: 150 lr: 0.00012485280483142487 loss: -1.8449326753616333
tau_img: 0.0075, tau_txt: 0.0073
Epoch: 17 iteration: 200 lr: 0.00012485280483142487 loss: -2.31805157661438
tau_img: 0.0077, tau_txt: 0.0075
Epoch: 17 iteration: 250 lr: 0.00012485280483142487 loss: -2.372451066970825
tau_img: 0.0075, tau_txt: 0.0073
Computing features for evaluation...
Computing features for evaluation...
Epoch: 17
coco val: {'txt_r1': 19.98, 'txt_r5': 42.34, 'txt_r10': 54.76, 'txt_r_mean': 39.02666666666667, 'img_r1': 13.554578168732506, 'img_r5': 33.04278288684526, 'img_r10': 44.470211915233904, 'img_r_mean': 30.35585765693722, 'r_mean': 34.691262161801944}
coco test: {'txt_r1': 19.42, 'txt_r5': 42.38, 'txt_r10': 54.96, 'txt_r_mean': 38.92, 'img_r1': 13.838464614154338, 'img_r5': 33.334666133546584, 'img_r10': 44.92602958816473, 'img_r_mean': 30.69972011195522, 'r_mean': 34.80986005597761}
Epoch: 18 iteration: 0 lr: 0.00011019253581563262 loss: -4.260552406311035
tau_img: 0.0081, tau_txt: 0.0079
Epoch: 18 iteration: 50 lr: 0.00011019253581563262 loss: -2.9299917221069336
tau_img: 0.0081, tau_txt: 0.0078
Epoch: 18 iteration: 100 lr: 0.00011019253581563262 loss: -3.3400635719299316
tau_img: 0.0080, tau_txt: 0.0077
Epoch: 18 iteration: 150 lr: 0.00011019253581563262 loss: -3.453747510910034
tau_img: 0.0079, tau_txt: 0.0077
Epoch: 18 iteration: 200 lr: 0.00011019253581563262 loss: -3.1733462810516357
tau_img: 0.0081, tau_txt: 0.0078
Epoch: 18 iteration: 250 lr: 0.00011019253581563262 loss: -2.6329762935638428
tau_img: 0.0079, tau_txt: 0.0076
Computing features for evaluation...
Computing features for evaluation...
Epoch: 18
coco val: {'txt_r1': 20.76, 'txt_r5': 43.36, 'txt_r10': 55.6, 'txt_r_mean': 39.906666666666666, 'img_r1': 14.226309476209517, 'img_r5': 33.80647740903638, 'img_r10': 45.27788884446222, 'img_r_mean': 31.103558576569373, 'r_mean': 35.50511262161802}
coco test: {'txt_r1': 20.6, 'txt_r5': 43.26, 'txt_r10': 55.16, 'txt_r_mean': 39.67333333333333, 'img_r1': 14.406237504998002, 'img_r5': 34.25029988004798, 'img_r10': 45.76169532187125, 'img_r_mean': 31.47274423563908, 'r_mean': 35.57303878448621}
Epoch: 19 iteration: 0 lr: 9.602318675400897e-05 loss: -4.915426254272461
tau_img: 0.0085, tau_txt: 0.0082
Epoch: 19 iteration: 50 lr: 9.602318675400897e-05 loss: -3.8118224143981934
tau_img: 0.0083, tau_txt: 0.0082
Epoch: 19 iteration: 100 lr: 9.602318675400897e-05 loss: -3.6978960037231445
tau_img: 0.0083, tau_txt: 0.0080
Epoch: 19 iteration: 150 lr: 9.602318675400897e-05 loss: -3.7106001377105713
tau_img: 0.0082, tau_txt: 0.0080
Epoch: 19 iteration: 200 lr: 9.602318675400897e-05 loss: -4.195495128631592
tau_img: 0.0083, tau_txt: 0.0080
Epoch: 19 iteration: 250 lr: 9.602318675400897e-05 loss: -4.262701034545898
tau_img: 0.0083, tau_txt: 0.0081
Computing features for evaluation...
Computing features for evaluation...
Epoch: 19
coco val: {'txt_r1': 19.98, 'txt_r5': 43.22, 'txt_r10': 55.22, 'txt_r_mean': 39.473333333333336, 'img_r1': 14.058376649340264, 'img_r5': 33.310675729708116, 'img_r10': 44.96601359456218, 'img_r_mean': 30.778355324536857, 'r_mean': 35.125844328935095}
coco test: {'txt_r1': 19.94, 'txt_r5': 43.16, 'txt_r10': 55.72, 'txt_r_mean': 39.60666666666666, 'img_r1': 13.94642143142743, 'img_r5': 33.65053978408636, 'img_r10': 45.33386645341863, 'img_r_mean': 30.976942556310807, 'r_mean': 35.291804611488736}
Epoch: 20 iteration: 0 lr: 8.250000000000001e-05 loss: -4.490512371063232
tau_img: 0.0085, tau_txt: 0.0084
Epoch: 20 iteration: 50 lr: 8.250000000000001e-05 loss: -5.540229320526123
tau_img: 0.0088, tau_txt: 0.0085
Epoch: 20 iteration: 100 lr: 8.250000000000001e-05 loss: -5.427042484283447
tau_img: 0.0088, tau_txt: 0.0085
Epoch: 20 iteration: 150 lr: 8.250000000000001e-05 loss: -5.009304046630859
tau_img: 0.0087, tau_txt: 0.0085
Epoch: 20 iteration: 200 lr: 8.250000000000001e-05 loss: -5.154559135437012
tau_img: 0.0088, tau_txt: 0.0084
Epoch: 20 iteration: 250 lr: 8.250000000000001e-05 loss: -5.245851993560791
tau_img: 0.0087, tau_txt: 0.0085
Computing features for evaluation...
Computing features for evaluation...
Epoch: 20
coco val: {'txt_r1': 21.16, 'txt_r5': 43.64, 'txt_r10': 55.96, 'txt_r_mean': 40.25333333333333, 'img_r1': 13.914434226309476, 'img_r5': 33.954418232706914, 'img_r10': 45.64574170331867, 'img_r_mean': 31.171531387445018, 'r_mean': 35.71243236038917}
coco test: {'txt_r1': 20.46, 'txt_r5': 43.9, 'txt_r10': 55.6, 'txt_r_mean': 39.98666666666667, 'img_r1': 14.166333466613354, 'img_r5': 34.44622151139544, 'img_r10': 45.7936825269892, 'img_r_mean': 31.46874583499933, 'r_mean': 35.727706250833}
Epoch: 21 iteration: 0 lr: 6.97711384175914e-05 loss: -6.665648460388184
tau_img: 0.0093, tau_txt: 0.0089
Epoch: 21 iteration: 50 lr: 6.97711384175914e-05 loss: -5.873527526855469
tau_img: 0.0089, tau_txt: 0.0088
Epoch: 21 iteration: 100 lr: 6.97711384175914e-05 loss: -6.627588272094727
tau_img: 0.0091, tau_txt: 0.0090
Epoch: 21 iteration: 150 lr: 6.97711384175914e-05 loss: -6.532419204711914
tau_img: 0.0093, tau_txt: 0.0091
Epoch: 21 iteration: 200 lr: 6.97711384175914e-05 loss: -6.612300395965576
tau_img: 0.0092, tau_txt: 0.0090
Epoch: 21 iteration: 250 lr: 6.97711384175914e-05 loss: -5.026062965393066
tau_img: 0.0088, tau_txt: 0.0085
Computing features for evaluation...
Computing features for evaluation...
Epoch: 21
coco val: {'txt_r1': 21.2, 'txt_r5': 42.88, 'txt_r10': 55.18, 'txt_r_mean': 39.75333333333333, 'img_r1': 13.858456617353058, 'img_r5': 33.5265893642543, 'img_r10': 45.12994802079168, 'img_r_mean': 30.838331334133013, 'r_mean': 35.29583233373317}
coco test: {'txt_r1': 19.56, 'txt_r5': 42.92, 'txt_r10': 54.92, 'txt_r_mean': 39.13333333333333, 'img_r1': 14.082367053178729, 'img_r5': 33.506597361055576, 'img_r10': 45.16993202718913, 'img_r_mean': 30.919632147141144, 'r_mean': 35.02648274023724}
Epoch: 22 iteration: 0 lr: 5.797606207796559e-05 loss: -7.0506157875061035
tau_img: 0.0095, tau_txt: 0.0091
Epoch: 22 iteration: 50 lr: 5.797606207796559e-05 loss: -7.07581901550293
tau_img: 0.0093, tau_txt: 0.0090
Epoch: 22 iteration: 100 lr: 5.797606207796559e-05 loss: -7.153095245361328
tau_img: 0.0096, tau_txt: 0.0093
Epoch: 22 iteration: 150 lr: 5.797606207796559e-05 loss: -7.888920307159424
tau_img: 0.0096, tau_txt: 0.0094
Epoch: 22 iteration: 200 lr: 5.797606207796559e-05 loss: -6.130715847015381
tau_img: 0.0092, tau_txt: 0.0090
Epoch: 22 iteration: 250 lr: 5.797606207796559e-05 loss: -6.484936714172363
tau_img: 0.0093, tau_txt: 0.0089
Computing features for evaluation...
Computing features for evaluation...
Epoch: 22
coco val: {'txt_r1': 20.78, 'txt_r5': 43.78, 'txt_r10': 55.14, 'txt_r_mean': 39.9, 'img_r1': 14.338264694122351, 'img_r5': 34.038384646141544, 'img_r10': 45.71771291483407, 'img_r_mean': 31.36478741836599, 'r_mean': 35.63239370918299}
coco test: {'txt_r1': 20.3, 'txt_r5': 42.74, 'txt_r10': 55.12, 'txt_r_mean': 39.38666666666666, 'img_r1': 14.326269492203119, 'img_r5': 34.3062774890044, 'img_r10': 45.649740103958415, 'img_r_mean': 31.427429028388644, 'r_mean': 35.40704784752765}
Epoch: 23 iteration: 0 lr: 4.724400030577786e-05 loss: -9.242505073547363
tau_img: 0.0100, tau_txt: 0.0099
Epoch: 23 iteration: 50 lr: 4.724400030577786e-05 loss: -8.627782821655273
tau_img: 0.0097, tau_txt: 0.0094
Epoch: 23 iteration: 100 lr: 4.724400030577786e-05 loss: -8.229507446289062
tau_img: 0.0098, tau_txt: 0.0095
Epoch: 23 iteration: 150 lr: 4.724400030577786e-05 loss: -8.095161437988281
tau_img: 0.0101, tau_txt: 0.0099
Epoch: 23 iteration: 200 lr: 4.724400030577786e-05 loss: -7.361606597900391
tau_img: 0.0099, tau_txt: 0.0096
Epoch: 23 iteration: 250 lr: 4.724400030577786e-05 loss: -8.183349609375
tau_img: 0.0096, tau_txt: 0.0095
Computing features for evaluation...
Computing features for evaluation...
Epoch: 23
coco val: {'txt_r1': 20.56, 'txt_r5': 43.82, 'txt_r10': 55.32, 'txt_r_mean': 39.9, 'img_r1': 14.066373450619752, 'img_r5': 33.7984806077569, 'img_r10': 45.86965213914434, 'img_r_mean': 31.244835399173667, 'r_mean': 35.572417699586836}
coco test: {'txt_r1': 19.68, 'txt_r5': 43.02, 'txt_r10': 54.9, 'txt_r_mean': 39.199999999999996, 'img_r1': 14.374250299880048, 'img_r5': 34.16233506597361, 'img_r10': 45.71771291483407, 'img_r_mean': 31.418099426895907, 'r_mean': 35.30904971344795}
Epoch: 24 iteration: 0 lr: 3.769253581563263e-05 loss: -10.245454788208008
tau_img: 0.0102, tau_txt: 0.0099
Epoch: 24 iteration: 50 lr: 3.769253581563263e-05 loss: -9.013447761535645
tau_img: 0.0102, tau_txt: 0.0100
Epoch: 24 iteration: 100 lr: 3.769253581563263e-05 loss: -10.611595153808594
tau_img: 0.0104, tau_txt: 0.0101
Epoch: 24 iteration: 150 lr: 3.769253581563263e-05 loss: -8.743675231933594
tau_img: 0.0102, tau_txt: 0.0102
Epoch: 24 iteration: 200 lr: 3.769253581563263e-05 loss: -8.715897560119629
tau_img: 0.0102, tau_txt: 0.0099
Epoch: 24 iteration: 250 lr: 3.769253581563263e-05 loss: -10.123720169067383
tau_img: 0.0102, tau_txt: 0.0101
Computing features for evaluation...
Computing features for evaluation...
Epoch: 24
coco val: {'txt_r1': 20.3, 'txt_r5': 43.44, 'txt_r10': 54.48, 'txt_r_mean': 39.406666666666666, 'img_r1': 14.110355857656938, 'img_r5': 33.662534986005596, 'img_r10': 45.59776089564174, 'img_r_mean': 31.123550579768093, 'r_mean': 35.26510862321738}
coco test: {'txt_r1': 19.24, 'txt_r5': 42.34, 'txt_r10': 55.02, 'txt_r_mean': 38.86666666666667, 'img_r1': 14.466213514594163, 'img_r5': 33.75449820071971, 'img_r10': 45.529788084766096, 'img_r_mean': 31.250166600026656, 'r_mean': 35.05841663334666}
Epoch: 25 iteration: 0 lr: 2.9426316451256386e-05 loss: -11.852662086486816
tau_img: 0.0108, tau_txt: 0.0105
Epoch: 25 iteration: 50 lr: 2.9426316451256386e-05 loss: -11.105792045593262
tau_img: 0.0108, tau_txt: 0.0105
Epoch: 25 iteration: 100 lr: 2.9426316451256386e-05 loss: -9.328715324401855
tau_img: 0.0103, tau_txt: 0.0100
Epoch: 25 iteration: 150 lr: 2.9426316451256386e-05 loss: -10.47180461883545
tau_img: 0.0105, tau_txt: 0.0101
Epoch: 25 iteration: 200 lr: 2.9426316451256386e-05 loss: -9.260772705078125
tau_img: 0.0104, tau_txt: 0.0103
Epoch: 25 iteration: 250 lr: 2.9426316451256386e-05 loss: -10.207618713378906
tau_img: 0.0103, tau_txt: 0.0102
Computing features for evaluation...
Computing features for evaluation...
Epoch: 25
coco val: {'txt_r1': 20.44, 'txt_r5': 43.78, 'txt_r10': 55.58, 'txt_r_mean': 39.93333333333333, 'img_r1': 14.146341463414634, 'img_r5': 33.81447421031587, 'img_r10': 46.00559776089564, 'img_r_mean': 31.32213781154205, 'r_mean': 35.627735572437686}
coco test: {'txt_r1': 19.66, 'txt_r5': 42.9, 'txt_r10': 55.24, 'txt_r_mean': 39.26666666666667, 'img_r1': 14.47421031587365, 'img_r5': 34.27828868452619, 'img_r10': 45.725709716113556, 'img_r_mean': 31.492736238837796, 'r_mean': 35.379701452752236}
Epoch: 26 iteration: 0 lr: 2.2535908641822855e-05 loss: -10.570426940917969
tau_img: 0.0106, tau_txt: 0.0105
Epoch: 26 iteration: 50 lr: 2.2535908641822855e-05 loss: -11.204402923583984
tau_img: 0.0110, tau_txt: 0.0107
Epoch: 26 iteration: 100 lr: 2.2535908641822855e-05 loss: -12.513148307800293
tau_img: 0.0110, tau_txt: 0.0108
Epoch: 26 iteration: 150 lr: 2.2535908641822855e-05 loss: -11.783784866333008
tau_img: 0.0110, tau_txt: 0.0108
Epoch: 26 iteration: 200 lr: 2.2535908641822855e-05 loss: -11.702966690063477
tau_img: 0.0111, tau_txt: 0.0107
Epoch: 26 iteration: 250 lr: 2.2535908641822855e-05 loss: -11.340032577514648
tau_img: 0.0111, tau_txt: 0.0110
Computing features for evaluation...
Computing features for evaluation...
Epoch: 26
coco val: {'txt_r1': 20.7, 'txt_r5': 43.74, 'txt_r10': 55.58, 'txt_r_mean': 40.00666666666667, 'img_r1': 14.134346261495402, 'img_r5': 33.78248700519792, 'img_r10': 45.657736905237904, 'img_r_mean': 31.19152339064374, 'r_mean': 35.599095028655206}
coco test: {'txt_r1': 19.48, 'txt_r5': 42.86, 'txt_r10': 55.32, 'txt_r_mean': 39.22, 'img_r1': 14.29828068772491, 'img_r5': 33.98640543782487, 'img_r10': 45.55377848860456, 'img_r_mean': 31.27948820471811, 'r_mean': 35.24974410235905}
Epoch: 27 iteration: 0 lr: 1.7096805137202738e-05 loss: -12.180134773254395
tau_img: 0.0114, tau_txt: 0.0113
Epoch: 27 iteration: 50 lr: 1.7096805137202738e-05 loss: -12.57005500793457
tau_img: 0.0112, tau_txt: 0.0110
Epoch: 27 iteration: 100 lr: 1.7096805137202738e-05 loss: -12.195676803588867
tau_img: 0.0115, tau_txt: 0.0113
Epoch: 27 iteration: 150 lr: 1.7096805137202738e-05 loss: -13.575706481933594
tau_img: 0.0116, tau_txt: 0.0113
Epoch: 27 iteration: 200 lr: 1.7096805137202738e-05 loss: -14.225406646728516
tau_img: 0.0115, tau_txt: 0.0113
Epoch: 27 iteration: 250 lr: 1.7096805137202738e-05 loss: -11.519415855407715
tau_img: 0.0113, tau_txt: 0.0111
Computing features for evaluation...
Computing features for evaluation...
Epoch: 27
coco val: {'txt_r1': 20.72, 'txt_r5': 44.04, 'txt_r10': 55.4, 'txt_r_mean': 40.053333333333335, 'img_r1': 14.186325469812076, 'img_r5': 33.71451419432227, 'img_r10': 45.63374650139944, 'img_r_mean': 31.178195388511266, 'r_mean': 35.6157643609223}
coco test: {'txt_r1': 19.42, 'txt_r5': 42.88, 'txt_r10': 55.08, 'txt_r_mean': 39.126666666666665, 'img_r1': 14.50219912035186, 'img_r5': 33.982407037185126, 'img_r10': 45.569772091163536, 'img_r_mean': 31.351459416233507, 'r_mean': 35.23906304145009}
Epoch: 28 iteration: 0 lr: 1.3168597893598175e-05 loss: -14.22984790802002
tau_img: 0.0116, tau_txt: 0.0115
Epoch: 28 iteration: 50 lr: 1.3168597893598175e-05 loss: -12.658186912536621
tau_img: 0.0117, tau_txt: 0.0115
Epoch: 28 iteration: 100 lr: 1.3168597893598175e-05 loss: -14.149580001831055
tau_img: 0.0117, tau_txt: 0.0114
Epoch: 28 iteration: 150 lr: 1.3168597893598175e-05 loss: -14.180305480957031
tau_img: 0.0119, tau_txt: 0.0115
Epoch: 28 iteration: 200 lr: 1.3168597893598175e-05 loss: -14.528634071350098
tau_img: 0.0121, tau_txt: 0.0118
Epoch: 28 iteration: 250 lr: 1.3168597893598175e-05 loss: -14.142889022827148
tau_img: 0.0120, tau_txt: 0.0116
Computing features for evaluation...
Computing features for evaluation...
Epoch: 28
coco val: {'txt_r1': 20.56, 'txt_r5': 43.92, 'txt_r10': 55.18, 'txt_r_mean': 39.88666666666666, 'img_r1': 14.378248700519793, 'img_r5': 33.990403838464616, 'img_r10': 45.81767293082767, 'img_r_mean': 31.39544182327069, 'r_mean': 35.64105424496868}
coco test: {'txt_r1': 19.56, 'txt_r5': 42.92, 'txt_r10': 55.0, 'txt_r_mean': 39.160000000000004, 'img_r1': 14.550179928028788, 'img_r5': 34.11435425829668, 'img_r10': 45.765693722510996, 'img_r_mean': 31.476742636278818, 'r_mean': 35.31837131813941}
Epoch: 29 iteration: 0 lr: 1.0794325171600358e-05 loss: -14.580052375793457
tau_img: 0.0120, tau_txt: 0.0117
Epoch: 29 iteration: 50 lr: 1.0794325171600358e-05 loss: -14.782979965209961
tau_img: 0.0124, tau_txt: 0.0122
Epoch: 29 iteration: 100 lr: 1.0794325171600358e-05 loss: -13.903106689453125
tau_img: 0.0121, tau_txt: 0.0118
Epoch: 29 iteration: 150 lr: 1.0794325171600358e-05 loss: -15.160087585449219
tau_img: 0.0125, tau_txt: 0.0121
Epoch: 29 iteration: 200 lr: 1.0794325171600358e-05 loss: -14.430315017700195
tau_img: 0.0118, tau_txt: 0.0117
Epoch: 29 iteration: 250 lr: 1.0794325171600358e-05 loss: -14.369138717651367
tau_img: 0.0120, tau_txt: 0.0118
Computing features for evaluation...
Computing features for evaluation...
Epoch: 29
coco val: {'txt_r1': 20.42, 'txt_r5': 43.82, 'txt_r10': 55.34, 'txt_r_mean': 39.86000000000001, 'img_r1': 14.234306277489004, 'img_r5': 33.750499800079965, 'img_r10': 45.48180727708917, 'img_r_mean': 31.155537784886047, 'r_mean': 35.507768892443025}
coco test: {'txt_r1': 19.4, 'txt_r5': 42.76, 'txt_r10': 55.08, 'txt_r_mean': 39.08, 'img_r1': 14.434226309476209, 'img_r5': 33.8984406237505, 'img_r10': 45.577768892443025, 'img_r_mean': 31.303478608556578, 'r_mean': 35.191739304278286}
Visualization
-------------
Here we plot the mean validation recall (``r_mean`` on the COCO
validation split) after each epoch for CLIP and iSogCLR.
.. code:: python
clip_recall_vals = [9.56793, 26.4037, 29.3343, 29.7682, 30.5586, 30.8398, 30.8938, 31.5624, 30.5864, 31.1057, 31.775, 31.0977, 31.8895, 31.0098, 31.2123, 31.745, 31.8129, 32.0983, 31.163, 31.7523, 32.1975, 32.2302, 32.5441, 32.3621, 32.6608, 32.7541, 32.7528, 32.4954, 32.9994, 32.7601]
isogclr_recall_vals = [10.3831, 22.1452, 26.797, 29.2797, 30.6854, 31.7613, 32.4364, 32.3897, 31.5938, 32.287, 32.5756, 33.0849, 33.1855, 33.3042, 33.3853, 34.8619, 34.9766, 34.6912, 35.5051, 35.1258, 35.7124, 35.2958, 35.6323, 35.5724, 35.2651, 35.6277, 35.599, 35.6157, 35.641, 35.5077]
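For the epochs shown in the log above, the entries of ``isogclr_recall_vals`` appear to be the ``r_mean`` field of each ``coco val`` line, truncated to four decimals. The following is a minimal sketch of how such a list could be collected from a saved log instead of being copied by hand. It assumes the log is available as plain text in the format printed above; ``extract_val_r_mean`` is a hypothetical helper and not part of LibAUC, and ``sample_log`` is an abbreviated excerpt of the epoch 9 and epoch 10 output.

```python
import ast


def extract_val_r_mean(log_text):
    """Collect the 'r_mean' entry from every 'coco val: {...}' line (hypothetical helper)."""
    values = []
    prefix = "coco val:"
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            # The metrics are printed as a Python dict literal, so literal_eval can parse them safely.
            metrics = ast.literal_eval(line[len(prefix):].strip())
            values.append(metrics["r_mean"])
    return values


# Abbreviated excerpt of the log above (only the *_mean fields are kept).
sample_log = """Epoch: 9
coco val: {'txt_r_mean': 36.48, 'img_r_mean': 28.094095695055312, 'r_mean': 32.28704784752765}
coco test: {'txt_r_mean': 35.99333333333333, 'img_r_mean': 28.13941090230574, 'r_mean': 32.06637211781954}
Epoch: 10
coco val: {'txt_r_mean': 36.653333333333336, 'img_r_mean': 28.497934159669466, 'r_mean': 32.5756337465014}
"""

vals = extract_val_r_mean(sample_log)
print(vals)
```

Only the ``coco val`` lines are matched, so the ``coco test`` results are skipped and the test split is not used for plotting or model selection.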
.. code:: python
import matplotlib.pyplot as plt
import numpy as np
# The training log counts epochs from 0; the x-axis here counts them from 1 to 30.
epochs = np.arange(1, 31)
# One curve per method: dotted blue for CLIP, solid orange for iSogCLR.
plt.plot(epochs, clip_recall_vals, label='CLIP', ls=':', marker='+', color='blue')
plt.plot(epochs, isogclr_recall_vals, label='iSogCLR', marker='*', color='orange')
plt.ylabel('Mean Validation Recall', fontsize=18)
plt.xlabel('Epoch', fontsize=18)
plt.title('CLIP vs. iSogCLR', fontsize=20)
plt.legend(fontsize=20)
plt.show()
.. image:: ./imgs/Bimodal_iSogCLR_Tutorial.png
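To complement the plot, the two curves can also be summarized numerically. The sketch below is only an illustration and uses plain Python: it reports the best epoch of each method and the gap at the last epoch, using the same 1-based epoch numbering as the x-axis of the plot. ``summarize`` is a hypothetical helper introduced for this example.

```python
clip_recall_vals = [9.56793, 26.4037, 29.3343, 29.7682, 30.5586, 30.8398, 30.8938, 31.5624, 30.5864, 31.1057, 31.775, 31.0977, 31.8895, 31.0098, 31.2123, 31.745, 31.8129, 32.0983, 31.163, 31.7523, 32.1975, 32.2302, 32.5441, 32.3621, 32.6608, 32.7541, 32.7528, 32.4954, 32.9994, 32.7601]
isogclr_recall_vals = [10.3831, 22.1452, 26.797, 29.2797, 30.6854, 31.7613, 32.4364, 32.3897, 31.5938, 32.287, 32.5756, 33.0849, 33.1855, 33.3042, 33.3853, 34.8619, 34.9766, 34.6912, 35.5051, 35.1258, 35.7124, 35.2958, 35.6323, 35.5724, 35.2651, 35.6277, 35.599, 35.6157, 35.641, 35.5077]


def summarize(vals):
    """Return (best_epoch, best_value, final_value), with epochs counted from 1 (hypothetical helper)."""
    best_index = max(range(len(vals)), key=lambda i: vals[i])
    return best_index + 1, vals[best_index], vals[-1]


clip_best_epoch, clip_best, clip_final = summarize(clip_recall_vals)
isogclr_best_epoch, isogclr_best, isogclr_final = summarize(isogclr_recall_vals)

print(f"CLIP:    best {clip_best:.4f} at epoch {clip_best_epoch}, final {clip_final:.4f}")
print(f"iSogCLR: best {isogclr_best:.4f} at epoch {isogclr_best_epoch}, final {isogclr_final:.4f}")
print(f"Final-epoch gap (iSogCLR - CLIP): {isogclr_final - clip_final:.4f}")
```

Because the training log numbers epochs from 0, an epoch reported here corresponds to the log entry with index one lower.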