Details
Original language | English |
---|---|
Pages (from-to) | 6306-6325 |
Number of pages | 20 |
Journal | IEEE Transactions on Pattern Analysis and Machine Intelligence |
Volume | 46 |
Issue number | 9 |
Early online date | 19 Mar 2024 |
Publication status | Published - Sept 2024 |
Abstract
Humans perceive and construct the world as an arrangement of simple parametric models. In particular, we can often describe man-made environments using volumetric primitives such as cuboids or cylinders. Inferring these primitives is important for attaining high-level, abstract scene descriptions. Previous approaches for primitive-based abstraction estimate shape parameters directly and are only able to reproduce simple objects. In contrast, we propose a robust estimator for primitive fitting, which meaningfully abstracts complex real-world environments using cuboids. A RANSAC estimator guided by a neural network fits these primitives to a depth map. We condition the network on previously detected parts of the scene, parsing it one-by-one. To obtain cuboids from single RGB images, we additionally optimise a depth estimation CNN end-to-end. Naively minimising point-to-primitive distances leads to large or spurious cuboids occluding parts of the scene. We thus propose an improved occlusion-aware distance metric correctly handling opaque scenes. Furthermore, we present a neural network based cuboid solver which provides more parsimonious scene abstractions while also reducing inference time. The proposed algorithm does not require labour-intensive labels, such as cuboid annotations, for training. Results on the NYU Depth v2 dataset demonstrate that the proposed algorithm successfully abstracts cluttered real-world 3D scene layouts.
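As an illustration of the naive point-to-primitive distance that the abstract contrasts with its occlusion-aware metric, the following is a minimal sketch and not the authors' implementation. It assumes an axis-aligned cuboid given by a centre and half-sizes (a real cuboid would also carry a rotation), and the names `cuboid_surface_distance` and `count_inliers` are hypothetical. A RANSAC-style estimator could use such an inlier count to score cuboid hypotheses against points back-projected from a depth map.

```python
import math

def cuboid_surface_distance(point, center, half_sizes):
    """Unsigned distance from a 3D point to the surface of an axis-aligned cuboid.

    Assumption: the cuboid is axis-aligned; rotation is omitted for brevity.
    """
    # Per-axis offset from the cuboid faces: positive outside, negative inside.
    q = [abs(p - c) - h for p, c, h in zip(point, center, half_sizes)]
    # Distance contributed by the axes on which the point lies outside the cuboid.
    outside = math.sqrt(sum(max(v, 0.0) ** 2 for v in q))
    # If the point is inside, this is minus the distance to the nearest face; else 0.
    inside = min(max(q), 0.0)
    return outside - inside

def count_inliers(points, center, half_sizes, threshold):
    """Count the points whose distance to the cuboid surface is at most threshold."""
    return sum(
        1 for p in points
        if cuboid_surface_distance(p, center, half_sizes) <= threshold
    )

if __name__ == "__main__":
    unit_cube = ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))  # centre, half-sizes
    samples = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
    print(count_inliers(samples, *unit_cube, threshold=0.1))
```

This distance ignores visibility: a large cuboid can score well even when it would hide points that the camera actually observed, which is the failure mode the abstract attributes to naive minimisation.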
Keywords
- cuboid fitting, Estimation, Image reconstruction, minimal solver, multi-model fitting, Scene abstraction, Shape, shape decomposition, Solid modeling, Surface reconstruction, Three-dimensional displays, Training
ASJC Scopus subject areas
- Computer Science(all)
  - Software
  - Computer Vision and Pattern Recognition
  - Computational Theory and Mathematics
  - Artificial Intelligence
- Mathematics(all)
  - Applied Mathematics
Cite this
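Kluger, F., Brachmann, E., Yang, M. Y., & Rosenhahn, B. Robust Shape Fitting for 3D Scene Abstraction. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 46, No. 9, 09.2024, p. 6306-6325.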
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - Robust Shape Fitting for 3D Scene Abstraction
AU - Kluger, Florian
AU - Brachmann, Eric
AU - Yang, Michael Ying
AU - Rosenhahn, Bodo
N1 - Publisher Copyright: © 1979-2012 IEEE.
PY - 2024/9
Y1 - 2024/9
KW - cuboid fitting
KW - Estimation
KW - Image reconstruction
KW - minimal solver
KW - multi-model fitting
KW - Scene abstraction
KW - Shape
KW - shape decomposition
KW - Solid modeling
KW - Surface reconstruction
KW - Three-dimensional displays
KW - Training
UR - http://www.scopus.com/inward/record.url?scp=85188527432&partnerID=8YFLogxK
U2 - 10.48550/arXiv.2403.10452
DO - 10.48550/arXiv.2403.10452
M3 - Article
AN - SCOPUS:85188527432
VL - 46
SP - 6306
EP - 6325
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
SN - 0162-8828
IS - 9
ER -