Omnidata: A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets from 3D Scans

Eftekhar, Ainaz; Sax, Alexander; Bachmann, Roman; Malik, Jitendra; Zamir, Amir

arXiv:2110.04994 (cs)

[Submitted on 11 Oct 2021]

Abstract: This paper introduces a pipeline to parametrically sample and render multi-task vision datasets from comprehensive 3D scans from the real world. Changing the sampling parameters allows one to "steer" the generated datasets to emphasize specific information. In addition to enabling interesting lines of research, we show the tooling and generated data suffice to train robust vision models.
Common architectures trained on a generated starter dataset reached state-of-the-art performance on multiple common vision tasks and benchmarks, despite having seen no benchmark or non-pipeline data. The depth estimation network outperforms MiDaS and the surface normal estimation network is the first to achieve human-level performance for in-the-wild surface normal estimation -- at least according to one metric on the OASIS benchmark.
The Dockerized pipeline with CLI, the (mostly Python) code, PyTorch dataloaders for the generated data, the generated starter dataset, download scripts, and other utilities are available through our project website, this https URL.

Comments:	ICCV 2021: See project website this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Robotics (cs.RO)
Cite as:	arXiv:2110.04994 [cs.CV]
	(or arXiv:2110.04994v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2110.04994

Submission history

From: Alexander Sax
[v1] Mon, 11 Oct 2021 04:21:46 UTC (33,600 KB)
