News

[03/2024] Two papers accepted to CVPR 2024, with URHand selected as an Oral presentation.

[01/2024] One paper accepted to IJCV.

[09/2023] One paper accepted to NeurIPS 2023.

[09/2023] One paper accepted to IEEE TPAMI.

[07/2023] Two papers accepted to ICCV 2023.

[04/2023] Selected as a finalist for the 2023 Meta Research PhD Fellowship!

[02/2023] One paper accepted to CVPR 2023 as Highlight.

[01/2023] One paper accepted to ICLR 2023 as Spotlight.

[08/2022] One paper accepted to TOG (Proc. SIGGRAPH Asia 2022).

[07/2022] One paper accepted to ECCV 2022.

[08/2021] Joined MMLab@NTU!

[07/2021] One paper accepted to ICCV 2021 as Oral presentation.

Publications

* denotes equal contribution


ORAL   CVPR - 2024
URHand: Universal Relightable Hands
Project Page Paper Video
URHand (a.k.a. Your Hand) is a high-fidelity Universal prior for Relightable Hands built upon light-stage data. It generalizes to novel viewpoints, poses, identities, and illuminations, enabling quick personalization from a phone scan. We propose a spatially varying linear lighting model for scalable relighting training.
CVPR - 2024
CityDreamer: Compositional Generative Model of Unbounded 3D Cities
Project Page Paper Video Code
CityDreamer learns to generate unbounded 3D cities from Google Earth imagery and OpenStreetMap data.
NeurIPS - 2023
PrimDiffusion: Volumetric Primitives Diffusion for 3D Human Generation
Project Page Paper Video Code
PrimDiffusion performs the diffusion and denoising process on a set of volumetric primitives that compactly represent 3D humans. This generative framework offers explicit pose, view, and shape control, models off-body topology with well-defined depth, and enables downstream tasks such as 3D texture transfer and inpainting.
TPAMI - 2023
SceneDreamer: Unbounded 3D Scene Generation from 2D Image Collections
Zhaoxi Chen, Guangcong Wang, Ziwei Liu
Project Page Paper Video Code
SceneDreamer learns to generate unbounded 3D scenes from in-the-wild 2D image collections. Our method synthesizes diverse landscapes across different styles, with 3D consistency, well-defined depth, and free camera trajectories.
ICCV - 2023
SparseNeRF: Distilling Depth Ranking for Few-shot Novel View Synthesis
Project Page Paper Video Code
SparseNeRF is a simple yet effective few-shot NeRF that distills robust local depth-ranking priors from inaccurate real-world depth observations.
ICCV - 2023
SynBody: Synthetic Dataset with Layered Human Models for 3D Human Perception and Modeling
Project Page Paper Video
SynBody is a large-scale synthetic dataset with a massive number of subjects and high-quality annotations. It supports various research topics, including 3D human perception, reconstruction, and generation.
HIGHLIGHT   CVPR - 2023
F2-NeRF: Fast Neural Radiance Field Training with Free Camera Trajectories
Project Page Paper Code
F2-NeRF enables novel view synthesis from arbitrary input camera trajectories and takes only a few minutes to train.
SPOTLIGHT   ICLR - 2023
EVA3D: Compositional 3D Human Generation from 2D Image Collections
Project Page Paper Video Code
EVA3D is a high-quality unconditional 3D human generative model that only requires 2D image collections for training.
SIGGRAPH Asia (TOG) - 2022
Text2Light: Zero-Shot Text-Driven HDR Panorama Generation
Zhaoxi Chen, Guangcong Wang, Ziwei Liu
Project Page Paper Video Code
Text2Light generates HDR panoramas at 4K+ resolution from free-form text, without training on text-image pairs. The high-quality generated HDR panoramas can be directly applied to downstream tasks, e.g., lighting 3D scenes and immersive virtual reality.
ECCV - 2022
Relighting4D: Neural Relightable Human from Videos
Zhaoxi Chen, Ziwei Liu
Project Page Paper Code
Relighting4D takes only videos as input and decomposes them into geometry and reflectance in a self-supervised manner, enabling relighting of dynamic humans from free viewpoints with a physically based renderer.
ORAL   ICCV - 2021
Adaptive Focus for Efficient Video Recognition
Paper Code
In this paper, we explore spatial redundancy in video recognition with the aim of improving computational efficiency. Extensive experiments on five benchmark datasets, i.e., ActivityNet, FCVID, Mini-Kinetics, and Something-Something V1 & V2, demonstrate that our method is significantly more efficient than competitive baselines.
AISTATS - 2021
Understanding Robustness in Teacher-Student Setting: A New Perspective
Zhuolin Yang*, Zhaoxi Chen, Tiffany Cai, Xinyun Chen, Bo Li, Yuandong Tian*
Paper
For low-rank input data, we show that student specialization still happens within the input subspace, but teacher and student nodes can differ wildly outside the data subspace, which we conjecture leads to adversarial examples.

Experiences

Aug. 2021 - present

MMLab@NTU, Nanyang Technological University
Ph.D. student supervised by Prof. Ziwei Liu

Dec. 2020 - Jul. 2021

BNRist, Tsinghua University
Undergraduate thesis supervised by Prof. Gao Huang

May 2020 - Oct. 2020

Secure Learning Lab, UIUC
Visiting Research Intern advised by Prof. Bo Li

Jun. 2020 - Sep. 2020

ModLab, University of Pennsylvania
Intern advised by Prof. Mark Yim

Services

Conference Reviewer

  • CVPR 2023, 2024
  • ICCV 2023
  • SIGGRAPH 2023, 2024
  • NeurIPS 2023
  • ICLR 2024
  • ICML 2024
  • ECCV 2024

Journal Reviewer

  • TOG
  • IJCV
  • TVCG
  • Neurocomputing