The Ultimate Guide to 3D Model and Scene Generation Papers (Feb 2023)
This article was written in collaboration with Matt White, a generative A.I. expert, researcher, and educator at Berkeley Synthetic and UC Berkeley.
Following the interest in my January article, The Ultimate Glossary of 3D Asset and Scene Generation Models (Jan. 2023), Matt and I decided to publish this follow-up. Here you'll find the usual suspects like GET3D and Point-E, but you'll also be surprised by new additions like Adan, Atlas, ECON, and Score Jacobian Chaining that were not in the former list.
Significant advancements have recently been made in generative A.I., making its applications highly practical and producing a lot of media buzz. In generative text, we have seen the power of ChatGPT (a fine-tuned model built on GPT-3), and in generative images, platforms and models like DALL-E 2, Midjourney, and Stable Diffusion. The 3D asset and scene generation area has been slower to develop because 3D generation presents significant challenges due to its multidimensional nature.
As Matt explained to me earlier, only a few 3D shape datasets are publicly available without special licensing or risk of copyright infringement. The available samples are also not diverse enough: they do not represent the wide range of styles we see in the 2D images used to train generative image models.
3D reconstruction from 2D images is computationally expensive. NeRF has become the de facto algorithm for representing volumetric and material properties (the volume is the 3D shape itself, and the texture applied to the 3D model provides its aesthetic properties). Improvements to NeRF, like NVIDIA's Instant-NGP, have shown substantial performance gains, but reconstruction can still take hours, if not days, to render complex 3D objects and scenes at high resolution.
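To give a flavor of why this is expensive: at NeRF's core is a volume-rendering integral that must be evaluated sample by sample along every camera ray. Below is a minimal NumPy sketch of that compositing step. A toy analytic sphere stands in for NeRF's learned MLP, and all names and constants are illustrative, not taken from any NeRF implementation.

```python
import numpy as np

# Toy radiance field standing in for NeRF's learned MLP: a unit sphere
# at the origin with uniform density and a fixed red color.
def radiance_field(points):
    inside = np.linalg.norm(points, axis=-1) < 1.0
    density = np.where(inside, 5.0, 0.0)                        # sigma at each sample
    color = np.where(inside[..., None], [0.9, 0.2, 0.2], 0.0)   # RGB at each sample
    return density, color

# NeRF-style volume rendering: march samples along one ray, then
# composite colors weighted by transmittance T and per-sample opacity.
def render_ray(origin, direction, near=0.0, far=4.0, n_samples=64):
    t = np.linspace(near, far, n_samples)
    points = origin + t[:, None] * direction
    sigma, rgb = radiance_field(points)
    delta = np.diff(t, append=far)                  # spacing between samples
    alpha = 1.0 - np.exp(-sigma * delta)            # opacity of each sample
    T = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))  # light surviving so far
    weights = T * alpha
    return (weights[:, None] * rgb).sum(axis=0)     # composited pixel color

# One ray aimed straight through the sphere comes back mostly red.
pixel = render_ray(np.array([0.0, 0.0, -3.0]), np.array([0.0, 0.0, 1.0]))
```

Multiply this per-ray loop by every pixel of every training image, plus an MLP evaluation at every sample point, and the cost of fitting a real scene becomes clear; Instant-NGP's hash encoding attacks exactly this bottleneck.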
With the rapid pace of innovation in the space of generative 3D, we expect to see breakthroughs and the emergence of new commercially viable platforms hitting the market in 2023.
Papers with Code
GET3D by NVIDIA: A Generative Model of High Quality 3D Textured Shapes Learned from Images
Point-E by OpenAI
CLIP-Mesh
Score Jacobian Chaining
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NVIDIA: Instant Neural Graphics Primitives with a Multiresolution Hash Encoding
PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization
Diffusion Probabilistic Models for 3D Point Cloud Generation
InfiniteNature-Zero
Atlas: End-to-End 3D Scene Reconstruction from Posed Images
Stable-Dreamfusion
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
Online Real-Time Volumetric NeRF+SLAM
GeoCode: Interpretable Shape Programs
ECON: Explicit Clothed humans Obtained from Normals
NeROIC: Neural Object Capture and Rendering from Online Image Collections
Papers only
Dream Fields
Dream Fusion
Novel View Synthesis with Diffusion Models
Portrait Neural Radiance Fields from a Single Image
InstantAvatar: Avatars in 60 Seconds
Rodin: 3D Avatars Using Diffusion
Human Motion Diffusion Model
3D HumanGAN
ClimateNeRF
Generating Holistic 3D Human Motion from Speech
RANA: Neural Avatar by NVIDIA
One-shot Implicit Animatable Avatars with Model-based Priors
3D Designer: Towards Photorealistic 3D Object Generation and Editing with Text-guided Diffusion Models
SPIn-NeRF: Multiview Segmentation and Perceptual Inpainting with Neural Radiance Fields
Magic3D: High-Resolution Text-to-3D Content Creation
https://deepimagination.cc/Magic3D