Cosine Annealing Learning Rate (CosineAnnealingLR)


CosineAnnealingLR

If you only want to know what it does, it is very simple: one formula.

$$new_{lr} = \eta_{min} + 0.5 \times (initial_{lr} - \eta_{min}) \times \left(1 + \cos\left(\frac{epoch}{T_{max}}\pi\right)\right)$$

Here $new_{lr}$ is the new learning rate, $initial_{lr}$ is the initial learning rate, $\eta_{min}$ is the minimum learning rate, and $T_{max}$ is half of the cosine period, i.e. the number of epochs it takes for the learning rate to decay from $initial_{lr}$ down to $\eta_{min}$.

(Figure: learning rate curve produced by CosineAnnealingLR)
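As a quick sanity check, the formula can be evaluated by hand and compared with what the scheduler actually sets. Below is a minimal sketch (assuming a single dummy parameter, initial_lr=0.1, eta_min=0, T_max=20; these values are chosen only for illustration); the two columns should agree up to floating-point rounding.

import math
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR

initial_lr, eta_min, T_max = 0.1, 0.0, 20
param = torch.nn.Parameter(torch.zeros(1))          # dummy parameter, just to build an optimizer
optimizer = torch.optim.SGD([param], lr=initial_lr)
scheduler = CosineAnnealingLR(optimizer, T_max=T_max, eta_min=eta_min)

for epoch in range(T_max + 1):
    # value predicted by the closed-form formula for the current epoch
    expected = eta_min + 0.5 * (initial_lr - eta_min) * (1 + math.cos(epoch / T_max * math.pi))
    # value the scheduler has actually written into the optimizer
    actual = optimizer.param_groups[0]['lr']
    print("epoch %2d  formula %.6f  scheduler %.6f" % (epoch, expected, actual))
    optimizer.step()
    scheduler.step()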

CosineAnnealingWarmRestarts

The formula is the same as for CosineAnnealingLR, but the schedule restarts once the learning rate reaches its minimum, and each restart multiplies the cycle length by T_mult. With T_0=10 and T_mult=2, as in the example below, the cycle lengths are 10, 20, 40, ... epochs, so restarts occur at epochs 10, 30, and 70 within a 100-epoch run.

import torch
from torchvision.models import AlexNet
from torch.optim.lr_scheduler import CosineAnnealingLR, CosineAnnealingWarmRestarts
import matplotlib.pyplot as plt
# from lrs_scheduler import WarmRestart, warm_restart  # optional third-party helpers, only needed for the commented-out lines below

model = AlexNet(num_classes=2)
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
# scheduler = CosineAnnealingLR(optimizer,T_max=20)
scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=10, T_mult=2)
# scheduler = WarmRestart(optimizer)
plt.figure()
x = list(range(100))
y = []
for epoch in range(1, 101):
    optimizer.zero_grad()
    optimizer.step()
    lr = optimizer.param_groups[0]['lr']  # learning rate used in this epoch
    print("epoch %d: lr = %f" % (epoch, lr))
    y.append(lr)
    scheduler.step()
    # warm_restart(scheduler)

# plot how the learning rate changes
plt.plot(x, y)
plt.xlabel("epoch")
plt.ylabel("lr")
plt.title("Learning rate curve over epochs")
plt.show()

(Figure: learning rate curve of CosineAnnealingWarmRestarts with T_0=10, T_mult=2)
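CosineAnnealingWarmRestarts can also be stepped per batch by passing a fractional epoch to step(), which lets the cosine cycle advance smoothly within an epoch. Below is a minimal sketch; the tiny linear model, random dataset, and hyperparameters are assumptions made purely so the example runs on its own.

import torch
from torch import nn
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts
from torch.utils.data import DataLoader, TensorDataset

# toy setup: a small linear classifier on random data (illustration only)
model = nn.Linear(8, 2)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=10, T_mult=2)

dataset = TensorDataset(torch.randn(64, 8), torch.randint(0, 2, (64,)))
train_loader = DataLoader(dataset, batch_size=16)
iters = len(train_loader)  # batches per epoch

for epoch in range(30):
    for i, (inputs, targets) in enumerate(train_loader):
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
        # fractional epoch: the restart schedule is driven per batch, not per epoch
        scheduler.step(epoch + i / iters)
    print("epoch %d: lr = %f" % (epoch, optimizer.param_groups[0]['lr']))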