PyTorch: Initialize Parameters

#Python #PyTorch

We can set the parameters of a model by looping over model.named_parameters(). Some of the initialization methods below are taken from Lippe [1].

To set the weights based on the input and output dimensions of each layer (normalized initialization; see the note Initialize Artificial Neural Networks),

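import math

for name, param in model.named_parameters():
    if name.endswith(".bias"):
        # biases start at zero
        param.data.fill_(0)
    else:
        # assumes every non-bias parameter is a 2-D weight of shape (out, in)
        bound = math.sqrt(6)/math.sqrt(param.shape[0]+param.shape[1])
        param.data.uniform_(-bound, bound)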

or set the weights based on the input size of each layer only,

for name, param in model.named_parameters():
    if name.endswith(".bias"):
        param.data.fill_(0)
    else:
        param.data.normal_(std=1.0/math.sqrt(param.shape[1]))

or draw all parameters from a normal distribution with a fixed standard deviation,

some_std = 0.1
for name, param in model.named_parameters():
    param.data.normal_(std=some_std)

or set them all to a constant, if we really want to,

some_value = 0.1
for name, param in model.named_parameters():
    param.data.fill_(some_value)
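The loops above can be tried on a concrete model. The following is a minimal sketch, assuming a hypothetical toy nn.Sequential of linear layers (the layer sizes and the helper names init_normalized and init_fan_in are made up for illustration). It wraps the first two loops in functions and checks that the resulting weights respect the expected bound and standard deviation.

```python
import math

import torch
from torch import nn


def init_normalized(model: nn.Module) -> None:
    """Uniform init with bound sqrt(6) / sqrt(fan_in + fan_out); biases set to zero."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name.endswith(".bias"):
                param.fill_(0)
            else:
                # nn.Linear stores its weight as (out_features, in_features)
                fan_out, fan_in = param.shape
                bound = math.sqrt(6) / math.sqrt(fan_in + fan_out)
                param.uniform_(-bound, bound)


def init_fan_in(model: nn.Module) -> None:
    """Normal init with std 1 / sqrt(fan_in); biases set to zero."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name.endswith(".bias"):
                param.fill_(0)
            else:
                param.normal_(std=1.0 / math.sqrt(param.shape[1]))


torch.manual_seed(0)
# Hypothetical toy model; the layer sizes are arbitrary.
model = nn.Sequential(nn.Linear(1024, 512), nn.Tanh(), nn.Linear(512, 10))

init_normalized(model)
normalized_bound = math.sqrt(6) / math.sqrt(1024 + 512)
max_abs_weight = model[0].weight.abs().max().item()

init_fan_in(model)
expected_std = 1.0 / math.sqrt(1024)
empirical_std = model[0].weight.std().item()

print(f"max |w| = {max_abs_weight:.4f} (bound {normalized_bound:.4f})")
print(f"std(w)  = {empirical_std:.4f} (expected {expected_std:.4f})")
```

The sketch uses torch.no_grad() with in-place methods instead of param.data; both avoid recording the initialization in the autograd graph, and either should work for this purpose.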

The factor multiplying $1/D$, with $D$ being the dimension of the input, depends on the activation function. torch.nn.init.calculate_gain returns the recommended gain for a given nonlinearity; the gain multiplies the standard deviation $1/\sqrt{D}$.
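As a rough illustration of how the gain enters, the sketch below looks up a few gains and uses the ReLU gain to scale the input-size-based standard deviation. The single nn.Linear layer and its sizes are assumptions made for the example, not part of the original note.

```python
import math

import torch
from torch import nn

# Recommended gains for a few nonlinearities.
relu_gain = nn.init.calculate_gain("relu")      # sqrt(2)
tanh_gain = nn.init.calculate_gain("tanh")      # 5/3
linear_gain = nn.init.calculate_gain("linear")  # 1

torch.manual_seed(0)
# Hypothetical layer that would be followed by a ReLU; sizes are arbitrary.
layer = nn.Linear(1024, 512)

with torch.no_grad():
    fan_in = layer.weight.shape[1]
    # The gain scales the standard deviation, so the variance becomes gain**2 / D.
    target_std = relu_gain / math.sqrt(fan_in)
    layer.weight.normal_(std=target_std)
    layer.bias.fill_(0)

empirical_std = layer.weight.std().item()
print(f"relu gain = {relu_gain:.4f}, target std = {target_std:.4f}, empirical std = {empirical_std:.4f}")
```

With the ReLU gain, this is the same standard deviation that torch.nn.init.kaiming_normal_ uses with mode="fan_in" and nonlinearity="relu", so the built-in initializer is a reasonable alternative to the manual loop.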


  1. Lippe P. Tutorial 3: Activation Functions — UvA DL Notebooks v1.1 documentation. In: UvA Deep Learning Tutorials [Internet]. [cited 23 Sep 2021]. Available: https://uvadlc-notebooks.readthedocs.io


Lei Ma (2021). 'PyTorch: Initialize Parameters', Datumorphism, 01 April. Available at: https://datumorphism.leima.is/til/machine-learning/pytorch/pytorch-initial-params/.
