PyTorch: Initialize Parameters

We can set the parameters of a model by looping over model.named_parameters(). The initialization methods below are taken from Lippe¹.

We can set the weights based on the input and output dimensions of each layer (normalized initialization, see [[Initialize Artificial Neural Networks]]):

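import math

for name, param in model.named_parameters():
    if name.endswith(".bias"):
        param.data.fill_(0)  # biases start at zero
    else:
        # for a linear layer, the weight shape is (output dim, input dim)
        bound = math.sqrt(6)/math.sqrt(param.shape[0]+param.shape[1])
        param.data.uniform_(-bound, bound)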

Alternatively, set the weights based only on the input size of each layer:

for name, param in model.named_parameters():
    if name.endswith(".bias"):
        param.data.fill_(0)
    else:
        param.data.normal_(std=1.0/math.sqrt(param.shape[1]))
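The $1/\sqrt{D}$ scaling can be motivated by a short variance calculation. This is only a sketch under simplifying assumptions that the note itself does not state: the inputs $x_j$ and weights $w_{ij}$ are independent and zero-mean, all $x_j$ share the same variance, the bias is zero, and the activation function is ignored.

```latex
\begin{align}
y_i &= \sum_{j=1}^{D} w_{ij} x_j \\
\operatorname{Var}(y_i) &= \sum_{j=1}^{D} \operatorname{Var}(w_{ij}) \operatorname{Var}(x_j)
  = D \, \sigma_w^2 \, \operatorname{Var}(x) \\
\sigma_w = \frac{1}{\sqrt{D}} \quad &\Rightarrow \quad \operatorname{Var}(y_i) = \operatorname{Var}(x)
\end{align}
```

Under these assumptions, a standard deviation of $1/\sqrt{D}$ keeps the variance of the layer output equal to the variance of its input, so the signal neither grows nor shrinks from layer to layer.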

We can also draw all parameters from a fixed normal distribution:

some_std = 0.1
for name, param in model.named_parameters():
    param.data.normal_(std=some_std)

Or set them all to a constant, if we really want to:

some_value = 0.1
for name, param in model.named_parameters():
    param.data.fill_(some_value)
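The loops above assume that a `model` already exists. As a minimal runnable sketch, the normalized initialization can be wrapped in a helper and applied to a toy network. The helper name `init_normalized` and the two-layer `nn.Sequential` model are hypothetical, chosen only for illustration.

```python
import math

import torch
import torch.nn as nn


def init_normalized(model: nn.Module) -> None:
    """Zero the biases and draw the weights from U(-bound, bound)."""
    for name, param in model.named_parameters():
        if name.endswith(".bias"):
            param.data.fill_(0)
        else:
            # For nn.Linear, param.shape is (out_features, in_features).
            bound = math.sqrt(6) / math.sqrt(param.shape[0] + param.shape[1])
            param.data.uniform_(-bound, bound)


# Hypothetical toy model, used only to exercise the helper.
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(8, 16), nn.Tanh(), nn.Linear(16, 4))
init_normalized(model)

# Check the first layer against the bound computed by hand.
first_weight = model[0].weight
first_bound = math.sqrt(6) / math.sqrt(16 + 8)
print(first_weight.abs().max().item() <= first_bound + 1e-6)
```

Note that the `else` branch indexes `param.shape[1]`, so the sketch only works as written when every non-bias parameter is at least two-dimensional, as is the case for linear layers. For a single weight tensor, the built-in `torch.nn.init.xavier_uniform_` applies the same bound with its default gain of 1.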

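The appropriate scaling depends on the activation function: the variance $1/D$, with $D$ being the dimension of the input, is multiplied by an activation-specific factor. torch.nn.init.calculate_gain returns the recommended gain for a given nonlinearity; the gain multiplies the standard deviation $1/\sqrt{D}$.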
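As a rough illustration of how the gain enters, the sketch below multiplies the $1/\sqrt{D}$ standard deviation by the value returned from `torch.nn.init.calculate_gain`. The toy model and the choice of `"relu"` are assumptions made for this example; the nonlinearity name should match the activation actually used in the network.

```python
import math

import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model with ReLU activations.
model = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 10))

# Recommended gain for ReLU is sqrt(2).
gain = nn.init.calculate_gain("relu")

for name, param in model.named_parameters():
    if name.endswith(".bias"):
        param.data.fill_(0)
    else:
        fan_in = param.shape[1]  # input dimension D of the layer
        param.data.normal_(std=gain / math.sqrt(fan_in))

# Empirical standard deviation of the first layer's weights.
first_std = model[0].weight.std().item()
print(gain, first_std)
```

With a gain of $\sqrt{2}$ the weight variance becomes $2/D$, which matches what `torch.nn.init.kaiming_normal_` produces with its default arguments.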

¹ Lippe P. Tutorial 3: Activation Functions — UvA DL Notebooks v1.1 documentation. In: UvA Deep Learning Tutorials [Internet]. [cited 23 Sep 2021]. Available: https://uvadlc-notebooks.readthedocs.io

Planted: 2021-01-01 by L Ma;
