We introduce Uncommon Objects in 3D (uCO3D), a new object-centric dataset for 3D deep learning and 3D generative AI. uCO3D is the largest publicly available collection of high-resolution videos of objects with 3D annotations that ensures full 360-degree coverage. uCO3D is significantly more diverse than MVImgNet and CO3Dv2, covering more than 1,000 object categories. It is also of higher quality, thanks to extensive quality checks of both the collected videos and the 3D annotations. Like comparable datasets, uCO3D provides annotations for 3D camera poses, depth maps, and sparse point clouds. In addition, each object comes with a caption and a 3D Gaussian Splat reconstruction. We train several large 3D models on MVImgNet, CO3Dv2, and uCO3D, and obtain superior results with the latter, showing that uCO3D is better suited for learning applications.
Each scene in uCO3D is reconstructed using 3D Gaussian Splatting. Below, we provide an interactive viewer of selected reconstructions. A scene caption, generated by a VLM, is shown below each reconstruction.
Each scene is also reconstructed with VGGSfM, yielding a sparse point cloud with per-frame camera annotations, as well as a dense point cloud.
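To illustrate how per-frame camera annotations pair with a sparse point cloud, here is a minimal sketch that projects world-space points into one frame with a pinhole model. The array shapes and the world-to-camera convention below are illustrative assumptions, not the dataset's exact API or format.

```python
import numpy as np

def project_points(points_world, R, t, K):
    """Project Nx3 world points into pixels using extrinsics (R, t) and intrinsics K.

    Assumed (hypothetical) convention: x_cam = R @ x_world + t, with K a 3x3
    pinhole intrinsics matrix. Real annotation files may use a different layout.
    """
    # World -> camera coordinates.
    cam = points_world @ R.T + t
    # Keep only points in front of the camera.
    cam = cam[cam[:, 2] > 0]
    # Perspective division, then apply focal lengths and principal point.
    uv = cam[:, :2] / cam[:, 2:3]
    return uv @ K[:2, :2].T + K[:2, 2]

# Toy data: three points seen by an identity camera with focal length 100
# and principal point (64, 64).
pts = np.array([[0.0, 0.0, 2.0], [0.5, 0.0, 2.0], [0.0, -0.5, 1.0]])
R = np.eye(3)
t = np.zeros(3)
K = np.array([[100.0, 0.0, 64.0], [0.0, 100.0, 64.0], [0.0, 0.0, 1.0]])
print(project_points(pts, R, t, K))  # one (u, v) pixel coordinate per point
```

The same projection, applied per frame, is what lets the sparse SfM points be overlaid on each video frame for the quality checks mentioned above.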
Collected videos

uCO3D can be used to train text-to-3D GenAI models. Below, we show samples from the first Instant3D-like model trained on real-life data.
@inproceedings{liu24uco3d,
  author    = {Liu, Xingchen and Tayal, Piyush and Wang, Jianyuan and Zarzar, Jesus and Monnier, Tom and Tertikas, Konstantinos and Duan, Jiali and Toisoul, Antoine and Zhang, Jason Y. and Neverova, Natalia and Vedaldi, Andrea and Shapovalov, Roman and Novotny, David},
  booktitle = {arXiv},
  title     = {UnCommon Objects in 3D},
  year      = {2024},
}
The website template was borrowed from Michael Gharbi, Ref-NeRF, ReconFusion, and CAT3D.