PrimDiffusion: Volumetric Primitives Diffusion for 3D Human Generation
NeurIPS 2023

Abstract

PrimDiffusion performs the diffusion and denoising process on a set of volumetric primitives, which compactly represent 3D humans. This generative modeling enables explicit pose, view, and shape control, with the capability of modeling off-body topology in well-defined depth. Moreover, our method generalizes to novel poses without post-processing and enables downstream human-centric tasks such as 3D texture transfer.

Framework

We represent the 3D human as K primitives learned from multi-view images. Each primitive Vk has independent kinematic parameters {Tk, Rk, sk} (translation, rotation, and per-axis scales, respectively) and radiance parameters {ck, σk} (color and density). At each time step t, we diffuse the clean primitives V0 with noise ϵ sampled according to a fixed noise schedule. The resulting noisy primitives Vt are fed to the denoiser gΦ(·), which learns to predict the denoised volumetric primitives.
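The sketch below illustrates this forward-diffusion step on a flattened primitive tensor. It is a minimal illustration, not the official implementation: the number of primitives K, the per-primitive payload layout, the linear DDPM-style noise schedule, the stubbed denoiser, and the x0-prediction loss are all assumptions made for exposition.

import torch

K = 1024                        # assumed number of primitives
D = 3 + 3 + 3 + 4 * 4 * 4 * 4   # assumed payload per primitive: T_k, R_k (axis-angle), s_k,
                                # plus a small RGBA voxel grid for (c_k, sigma_k)

# V0: clean primitives regressed from multi-view images, shape (K, D)
V0 = torch.randn(K, D)

# Fixed noise schedule (linear betas are an assumption, as in DDPM)
T_steps = 1000
betas = torch.linspace(1e-4, 2e-2, T_steps)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

def diffuse(V0, t):
    """q(V_t | V_0): corrupt the clean primitives with Gaussian noise."""
    eps = torch.randn_like(V0)
    a_bar = alphas_bar[t]
    Vt = torch.sqrt(a_bar) * V0 + torch.sqrt(1.0 - a_bar) * eps
    return Vt, eps

t = torch.randint(0, T_steps, ()).item()
Vt, eps = diffuse(V0, t)

# g_phi would be a denoiser network taking (Vt, t) and predicting the clean
# primitives (or, equivalently, the noise); stubbed here as a placeholder.
g_phi = lambda Vt, t: Vt
V0_pred = g_phi(Vt, t)
loss = torch.mean((V0_pred - V0) ** 2)  # simple x0-prediction objective (assumption)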

To avoid per-subject optimization, we propose an encoder-only network capable of learning primitives from multi-view images across identities. The encoder consists of a motion branch and an appearance branch, which are fused by the proposed cross-modal attention layer to obtain the kinematic and radiance information of the primitives.
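As an illustration of how the two branches might be fused, the sketch below uses a standard multi-head cross-attention layer in which per-primitive motion tokens attend to appearance features. The feature dimensions, the query/key roles, and the output head sizes are assumptions for exposition rather than the paper's exact architecture.

import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        # Motion tokens attend to appearance features (an assumed query/key
        # assignment; the actual design may differ).
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.head_kinematic = nn.Linear(dim, 3 + 3 + 3)      # T_k, R_k, s_k (assumed sizes)
        self.head_radiance = nn.Linear(dim, 4 * 4 * 4 * 4)   # small RGBA grid per primitive (assumed)

    def forward(self, motion_feat, appearance_feat):
        # motion_feat:     (B, K, dim)  one token per primitive from the motion branch
        # appearance_feat: (B, N, dim)  tokens from the appearance branch (e.g. image features)
        fused, _ = self.attn(query=motion_feat, key=appearance_feat, value=appearance_feat)
        fused = self.norm(fused + motion_feat)
        return self.head_kinematic(fused), self.head_radiance(fused)

# Usage with random features: outputs have shapes (B, K, 9) and (B, K, 256).
B, K, N, dim = 2, 1024, 196, 256
motion = torch.randn(B, K, dim)
appearance = torch.randn(B, N, dim)
kin, rad = CrossModalFusion(dim)(motion, appearance)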

Visualization of the Denoising Process

We visualize the denoising process of primitives and corresponding 360-degree novel views.

Video

Citation

Acknowledgements

PrimDiffusion is implemented on top of the DVA and Latent-Diffusion codebases. The training data are rendered via the XRFeitoria toolchain.
The website template is borrowed from Mip-NeRF.