1. What happens if we initialize the weights to zero? Does the algorithm still work?
```python
# Original Gaussian initialization, kept for comparison:
# w = torch.normal(0, 0.01, size=(2, 1), requires_grad=True)
w = torch.zeros(size=(2, 1), requires_grad=True)  # all-zero weights instead
b = torch.zeros(1, requires_grad=True)
```
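For reference, the logs below come from the minibatch SGD training loop of this section; a sketch, assuming `data_iter`, `sgd`, `batch_size`, `features`, `labels`, and the hyperparameters `lr`, `num_epochs`, `net`, `loss` are defined as in the chapter:

```python
for epoch in range(num_epochs):
    for X, y in data_iter(batch_size, features, labels):
        l = loss(net(X, w, b), y)    # minibatch loss
        l.sum().backward()           # accumulate gradients on w and b
        sgd([w, b], lr, batch_size)  # minibatch SGD parameter update
    with torch.no_grad():
        train_l = loss(net(features, w, b), labels)
        print(f'epoch {epoch + 1}, loss {float(train_l.mean()):f}')
```

Running it with the all-zero initialization gives: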
```
epoch 1, loss 0.063419
epoch 2, loss 0.000330
epoch 3, loss 0.000051
epoch 4, loss 0.000049
epoch 5, loss 0.000049
epoch 6, loss 0.000049
epoch 7, loss 0.000049
epoch 8, loss 0.000049
epoch 9, loss 0.000049
epoch 10, loss 0.000049
```
```python
print(f'estimation error of w: {true_w - w.reshape(true_w.shape)}')
print(f'estimation error of b: {true_b - b}')
```
```
estimation error of w: tensor([-0.0007, 0.0002], grad_fn=<SubBackward0>)
estimation error of b: tensor([0.0001], grad_fn=<RsubBackward1>)
```
Judging from the results, the algorithm still works. A single linear layer has no hidden units, so there is no symmetry to break: at w = 0, b = 0 the gradient of the squared loss with respect to w is -Xᵀy/n, which is nonzero whenever the targets are nonzero, so SGD moves the parameters away from zero on the very first update. (Zero initialization is only a problem for multilayer networks, where identical weights would receive identical gradients.)
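To see this concretely, a minimal sketch with hypothetical toy values (the `X` and `y` below are made up) checking that the gradient at the zero initialization is nonzero:

```python
import torch

# Toy data, hypothetical values: at w = 0, b = 0 the squared-loss gradient
# w.r.t. w equals -X^T y / n, which is nonzero whenever y is nonzero.
X = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
y = torch.tensor([[5.0], [6.0]])
w = torch.zeros(2, 1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
l = ((X @ w + b - y) ** 2 / 2).mean()
l.backward()
print(w.grad)  # nonzero -> zero init is not a stationary point of the loss
```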
6. Experiment with different learning rates and observe how quickly the value of the loss function drops.
With `lr = 0.03`:

```python
lr = 0.03
num_epochs = 10
net = linreg
loss = squared_loss
```
```
epoch 1, loss 2.273896
epoch 2, loss 0.328764
epoch 3, loss 0.047866
epoch 4, loss 0.007026
epoch 5, loss 0.001069
epoch 6, loss 0.000199
epoch 7, loss 0.000072
epoch 8, loss 0.000053
epoch 9, loss 0.000050
epoch 10, loss 0.000050
estimation error of w: tensor([ 0.0006, -0.0003], grad_fn=<SubBackward0>)
estimation error of b: tensor([3.3855e-05], grad_fn=<RsubBackward1>)
```
With `lr = 0.003`, convergence is much slower, so more epochs are needed:

```python
lr = 0.003
num_epochs = 100
net = linreg
loss = squared_loss
```

```
epoch 1, loss 14.191692
epoch 2, loss 11.482763
epoch 3, loss 9.291309
epoch 4, loss 7.518343
epoch 5, loss 6.083946
epoch 6, loss 4.923447
epoch 7, loss 3.984452
epoch 8, loss 3.224700
epoch 9, loss 2.609887
epoch 10, loss 2.112417
epoch 11, loss 1.709823
epoch 12, loss 1.384031
epoch 13, loss 1.120365
epoch 14, loss 0.906977
epoch 15, loss 0.734251
epoch 16, loss 0.594457
epoch 17, loss 0.481308
epoch 18, loss 0.389713
epoch 19, loss 0.315562
epoch 20, loss 0.255540
epoch 21, loss 0.206945
epoch 22, loss 0.167602
epoch 23, loss 0.135748
epoch 24, loss 0.109956
epoch 25, loss 0.089070
epoch 26, loss 0.072159
epoch 27, loss 0.058462
epoch 28, loss 0.047371
epoch 29, loss 0.038388
epoch 30, loss 0.031112
epoch 31, loss 0.025219
epoch 32, loss 0.020444
epoch 33, loss 0.016577
epoch 34, loss 0.013444
epoch 35, loss 0.010906
epoch 36, loss 0.008848
epoch 37, loss 0.007182
epoch 38, loss 0.005831
epoch 39, loss 0.004737
epoch 40, loss 0.003850
epoch 41, loss 0.003131
epoch 42, loss 0.002549
epoch 43, loss 0.002076
epoch 44, loss 0.001694
epoch 45, loss 0.001383
epoch 46, loss 0.001131
epoch 47, loss 0.000927
epoch 48, loss 0.000762
epoch 49, loss 0.000628
epoch 50, loss 0.000519
epoch 51, loss 0.000431
epoch 52, loss 0.000359
epoch 53, loss 0.000301
epoch 54, loss 0.000254
epoch 55, loss 0.000216
epoch 56, loss 0.000185
epoch 57, loss 0.000159
epoch 58, loss 0.000139
epoch 59, loss 0.000122
epoch 60, loss 0.000109
epoch 61, loss 0.000098
epoch 62, loss 0.000089
epoch 63, loss 0.000082
epoch 64, loss 0.000076
epoch 65, loss 0.000071
epoch 66, loss 0.000068
epoch 67, loss 0.000064
epoch 68, loss 0.000062
epoch 69, loss 0.000060
epoch 70, loss 0.000058
epoch 71, loss 0.000057
epoch 72, loss 0.000056
epoch 73, loss 0.000055
epoch 74, loss 0.000054
epoch 75, loss 0.000053
epoch 76, loss 0.000053
epoch 77, loss 0.000053
epoch 78, loss 0.000052
epoch 79, loss 0.000052
epoch 80, loss 0.000052
epoch 81, loss 0.000052
epoch 82, loss 0.000052
epoch 83, loss 0.000051
...
epoch 100, loss 0.000051
estimation error of w: tensor([-3.1710e-05,  9.0599e-06], grad_fn=<SubBackward0>)
estimation error of b: tensor([-8.9169e-05], grad_fn=<RsubBackward1>)
```
Comparing the two runs:

| learning rate | epochs to converge (loss ≈ 5e-5) |
|---|---|
| 0.03 | 9 |
| 0.003 | 83 |
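The pattern is roughly linear: cutting the learning rate by 10× stretches convergence out by about 10× here, while a rate that is too large can make the loss diverge instead. A minimal sketch to automate the comparison, assuming `data_iter`, `linreg`, `squared_loss`, `sgd`, `features`, and `labels` from this section are in scope (`epochs_to_converge` and the tolerance `tol` are hypothetical):

```python
import torch

def epochs_to_converge(lr, num_epochs, features, labels, batch_size=10, tol=1e-4):
    # Hypothetical helper: trains from a fresh Gaussian init and returns the
    # first epoch whose full-data loss falls below tol (None if it never does).
    w = torch.normal(0, 0.01, size=(2, 1), requires_grad=True)
    b = torch.zeros(1, requires_grad=True)
    for epoch in range(num_epochs):
        for X, y in data_iter(batch_size, features, labels):
            l = squared_loss(linreg(X, w, b), y)
            l.sum().backward()
            sgd([w, b], lr, batch_size)
        with torch.no_grad():
            train_l = squared_loss(linreg(features, w, b), labels).mean()
        if float(train_l) < tol:
            return epoch + 1
    return None

for lr in (0.03, 0.003):
    print(f'lr={lr}: loss < 1e-4 after {epochs_to_converge(lr, 100, features, labels)} epochs')
```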