infoGAN

In a GAN, the latent space input is usually random noise, e.g., Gaussian noise. The objective of [[GAN]] is a very generic one: it says nothing about how exactly the latent space is used. This is not desirable in many problems where we would like the latent space to be interpretable. InfoGAN adds a constraint to the objective that enforces interpretability of the latent space [1].

Constraint

The constraint InfoGAN proposed is based on the [[Mutual Information]] $I(X;Y) = \mathbb E_{P_{XY}} \ln \frac{P_{XY}}{P_X P_Y}$ between the latent code and the generated data, which vanishes when the two are independent. The objective becomes

$$ \underset{{\color{red}G}}{\operatorname{min}} \underset{{\color{green}D}}{\operatorname{max}} V_I ({\color{green}D}, {\color{red}G}) = V({\color{green}D}, {\color{red}G}) - \lambda I(c; {\color{red}G}(z,c)), $$

where

  • $c$ is the latent code,
  • $z$ is the random noise input,
  • $V({\color{green}D}, {\color{red}G})$ is the objective of GAN,
  • $I(c; {\color{red}G}(z,c))$ is the mutual information between the input latent code and generated data.

Through the multiplier $\lambda$, the model is penalized if the generated data loses information about the latent code $c$; maximizing $I(c; {\color{red}G}(z,c))$ forces the generator to actually make use of $c$, which is what makes the latent code interpretable.
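The mutual information term cannot be computed directly because it requires the intractable posterior $P(c \mid x)$. The original InfoGAN paper bounds it from below with an auxiliary distribution $Q(c \mid x)$, in practice a small head sharing layers with the discriminator,

$$ I(c; {\color{red}G}(z,c)) \ge \mathbb E_{c \sim P(c),\, x \sim {\color{red}G}(z,c)} \left[ \ln Q(c \mid x) \right] + H(c), $$

and it is this lower bound that is maximized during training. The extra loss in the training steps below corresponds to the $-\ln Q(c \mid x)$ term, split into a discrete part and a continuous part of the code.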

Training

The training steps are almost the same as for a vanilla [[GAN]], but with one extra loss calculated for each mini-batch.

  1. Train $\color{red}G$ using the adversarial loss $\operatorname{MSE}(v', v)$, where $v$ is the validity target and $v'$ the discriminator output on generated samples;
  2. Train $\color{green}D$ using the adversarial loss $\operatorname{MSE}(v', v)$ on real and generated samples;
  3. Apply the mutual information constraint (see the sketch after this list):
    1. Sample noise, a discrete latent code $l$, and a continuous latent code $c$ for the mini-batch;
    2. Train on the information loss $\lambda_{l} H(l', l) + \lambda_c \operatorname{MSE}(c', c)$, where $H(l', l)$ is the cross entropy between the predicted and sampled discrete code and $\operatorname{MSE}(c', c)$ is the reconstruction error of the continuous code.
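A minimal PyTorch sketch of step 3, assuming a generator `G(z, l_onehot, c)` and an auxiliary head `Q` that returns class logits and a continuous-code estimate; the names `G`, `Q`, `lambda_l`, `lambda_c` and the dimensions are placeholders, not from the original text:

```python
import torch
import torch.nn.functional as F

def info_loss(G, Q, batch_size, n_classes=10, code_dim=2, noise_dim=62,
              lambda_l=1.0, lambda_c=0.1):
    """Variational mutual-information loss for one mini-batch (sketch)."""
    # Sample the generator inputs: noise z, discrete code l, continuous code c.
    z = torch.randn(batch_size, noise_dim)
    l = torch.randint(0, n_classes, (batch_size,))      # ground-truth discrete code
    l_onehot = F.one_hot(l, n_classes).float()
    c = torch.rand(batch_size, code_dim) * 2 - 1        # continuous code in [-1, 1]

    # Generate samples and let the auxiliary head Q predict the codes back.
    x = G(z, l_onehot, c)
    l_logits, c_pred = Q(x)

    # Cross entropy for the discrete code, MSE for the continuous code.
    return lambda_l * F.cross_entropy(l_logits, l) + lambda_c * F.mse_loss(c_pred, c)
```

Minimizing this loss updates both the generator and the $Q$ head, which is how the lower bound on $I(c; {\color{red}G}(z,c))$ is maximized in practice.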

Code

eriklindernoren/PyTorch-GAN: https://github.com/eriklindernoren/PyTorch-GAN
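Once trained, interpretability can be checked by sweeping one part of the latent code while keeping everything else fixed. A sketch, reusing the hypothetical `G` and the assumed dimensions from the training sketch above:

```python
import torch
import torch.nn.functional as F

# Hold the noise fixed and sweep the discrete code: with a trained InfoGAN on MNIST,
# each value of the discrete code typically maps to a different digit class.
z = torch.randn(1, 62).repeat(10, 1)                 # same noise for every row
l_onehot = F.one_hot(torch.arange(10), 10).float()   # discrete code 0..9
c = torch.zeros(10, 2)                               # continuous code held at 0
samples = G(z, l_onehot, c)                          # one image per discrete code value
```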

