EpiGRAF: Rethinking training of 3D GANs

Ivan Skorokhodov, Sergey Tulyakov, Yiqun Wang, Peter Wonka

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

40 Scopus citations

Abstract

A recent trend in generative modeling is building 3D-aware generators from 2D image collections. To induce the 3D bias, such models typically rely on volumetric rendering, which is expensive to employ at high resolutions. Over the past months, more than ten works have addressed this scaling issue by training a separate 2D decoder to upsample a low-resolution image (or a feature tensor) produced from a pure 3D generator. But this solution comes at a cost: not only does it break multi-view consistency (i.e., shape and texture change when the camera moves), but it also learns geometry in low fidelity. In this work, we show that obtaining a high-resolution 3D generator with SotA image quality is possible by following a completely different route of simply training the model patch-wise. We revisit and improve this optimization scheme in two ways. First, we design a location- and scale-aware discriminator to work on patches of different proportions and spatial positions. Second, we modify the patch sampling strategy based on an annealed beta distribution to stabilize training and accelerate the convergence. The resulting model, named EpiGRAF, is an efficient, high-resolution, pure 3D generator, and we test it on four datasets (two introduced in this work) at 256² and 512² resolutions. It obtains state-of-the-art image quality, high-fidelity geometry and trains ≈2.5× faster than the upsampler-based counterparts.
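The patch sampling idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact parameterization, so everything below is an assumption made for illustration: the function names (`annealed_alpha`, `sample_patch`), the linear annealing schedule, the default values, and the choice to draw the patch scale from Beta(alpha, 1) with alpha annealed toward 1. Under these assumptions, early training favors large patches and late training samples scales roughly uniformly. A location- and scale-aware discriminator would then receive the sampled scale and offsets alongside the patch.

```python
import random


def annealed_alpha(step, total_steps, alpha_start=8.0, alpha_end=1.0):
    """Linearly anneal the Beta shape parameter (hypothetical schedule)."""
    t = min(max(step / total_steps, 0.0), 1.0)  # training progress clamped to [0, 1]
    return alpha_start + t * (alpha_end - alpha_start)


def sample_patch(step, total_steps, s_min=0.25, rng=random):
    """Sample (scale, x_offset, y_offset) in normalized image coordinates.

    scale is the patch side length as a fraction of the full image side;
    the offsets are chosen so that the patch stays inside the image.
    """
    alpha = annealed_alpha(step, total_steps)
    # Beta(alpha, 1) puts its mass near 1 for large alpha and is uniform
    # for alpha == 1 (an illustrative choice, not taken from the paper).
    u = rng.betavariate(alpha, 1.0)
    scale = s_min + (1.0 - s_min) * u
    x_offset = rng.uniform(0.0, 1.0 - scale)
    y_offset = rng.uniform(0.0, 1.0 - scale)
    return scale, x_offset, y_offset


if __name__ == "__main__":
    rng = random.Random(0)
    for step in (0, 5000, 10000):
        print(step, sample_patch(step, total_steps=10000, rng=rng))
```

Because every patch is rendered at the same pixel resolution regardless of its scale, a scheme like this keeps the cost of volumetric rendering fixed per training step while still exposing the model to both global structure and fine detail.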
Original language: English (US)
Title of host publication: 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Publisher: Neural Information Processing Systems Foundation
ISBN (Print): 9781713871088
State: Published - Jan 1 2022
