TY - GEN
T1 - How Does Lipschitz Regularization Influence GAN Training?
AU - Qin, Yipeng
AU - Mitra, Niloy
AU - Wonka, Peter
N1 - KAUST Repository Item: Exported on 2020-11-05
Acknowledged KAUST grant number(s): CRG2018-3730
Acknowledgements: This work was supported in part by the KAUST Office of Sponsored Research (OSR) under Award No. OSR-CRG2018-3730.
PY - 2020/10/9
Y1 - 2020/10/9
AB - Despite the success of Lipschitz regularization in stabilizing GAN training, the exact reason for its effectiveness remains poorly understood. The direct effect of K-Lipschitz regularization is to restrict the L2-norm of the neural network gradient to be smaller than a threshold K (e.g., K = 1) such that ||∇f|| ≤ K. In this work, we uncover an even more important effect of Lipschitz regularization by examining its impact on the loss function: it degenerates GAN loss functions to almost linear ones by restricting their domain and interval of attainable gradient values. Our analysis shows that loss functions are only successful if they are degenerated to almost linear ones. We also show that loss functions perform poorly if they are not degenerated, and that a wide range of functions can be used as the loss function as long as they are sufficiently degenerated by regularization. In essence, Lipschitz regularization ensures that all loss functions effectively work in the same way. Empirically, we verify our proposition on the MNIST, CIFAR10, and CelebA datasets.
UR - http://hdl.handle.net/10754/665819
UR - http://link.springer.com/10.1007/978-3-030-58517-4_19
UR - http://www.scopus.com/inward/record.url?scp=85092937698&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-58517-4_19
DO - 10.1007/978-3-030-58517-4_19
M3 - Conference contribution
SN - 9783030585167
SP - 310
EP - 326
BT - Computer Vision – ECCV 2020
PB - Springer International Publishing
ER -