Differential Learning Rates in PyTorch
Using different learning rates for different layers of a neural network.
PyTorch optimizers can be configured with different learning rates for different layers of a model.
In the PyTorch documentation we find that optimizer options such as the learning rate can be set on a per-parameter-group basis, where each group can hold the parameters of one or more layers. The example from the documentation is
from torch import optim

# `model` is assumed to have `base` and `classifier` submodules.
optimizer = optim.SGD([
    {'params': model.base.parameters()},  # uses the default lr of 1e-2
    {'params': model.classifier.parameters(), 'lr': 1e-3},  # overrides lr
], lr=1e-2, momentum=0.9)
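As a minimal, self-contained sketch (the toy Net with its base and classifier submodules is made up for illustration), we can verify that each parameter group carries its own learning rate by inspecting optimizer.param_groups:

from torch import nn, optim

# Hypothetical toy model: a feature-extracting `base` and a `classifier` head.
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.base = nn.Linear(10, 5)
        self.classifier = nn.Linear(5, 2)

    def forward(self, x):
        return self.classifier(self.base(x))

model = Net()
optimizer = optim.SGD([
    {'params': model.base.parameters()},                    # default lr
    {'params': model.classifier.parameters(), 'lr': 1e-3},  # group-specific lr
], lr=1e-2, momentum=0.9)

# Each parameter group keeps its own learning rate.
for group in optimizer.param_groups:
    print(group['lr'])  # prints 0.01, then 0.001

This pattern is commonly used for fine-tuning, where pretrained layers are given a smaller learning rate than freshly initialized ones.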