Robust Shape Fitting for 3D Scene Abstraction

Publication: Contribution to journal › Article › Research › Peer-reviewed

Authors

  • Florian Kluger
  • Eric Brachmann
  • Michael Ying Yang
  • Bodo Rosenhahn

External organisations

  • Niantic
  • University of Bath

Details

Original language: English
Pages (from-to): 6306-6325
Number of pages: 20
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume: 46
Issue number: 9
Early online date: 19 March 2024
Publication status: Published - Sept. 2024

Abstract

Humans perceive and construct the world as an arrangement of simple parametric models. In particular, we can often describe man-made environments using volumetric primitives such as cuboids or cylinders. Inferring these primitives is important for attaining high-level, abstract scene descriptions. Previous approaches for primitive-based abstraction estimate shape parameters directly and are only able to reproduce simple objects. In contrast, we propose a robust estimator for primitive fitting, which meaningfully abstracts complex real-world environments using cuboids. A RANSAC estimator guided by a neural network fits these primitives to a depth map. We condition the network on previously detected parts of the scene, parsing it one-by-one. To obtain cuboids from single RGB images, we additionally optimise a depth estimation CNN end-to-end. Naively minimising point-to-primitive distances leads to large or spurious cuboids occluding parts of the scene. We thus propose an improved occlusion-aware distance metric correctly handling opaque scenes. Furthermore, we present a neural network based cuboid solver which provides more parsimonious scene abstractions while also reducing inference time. The proposed algorithm does not require labour-intensive labels, such as cuboid annotations, for training. Results on the NYU Depth v2 dataset demonstrate that the proposed algorithm successfully abstracts cluttered real-world 3D scene layouts.
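
The abstract mentions scoring cuboids by point-to-primitive distances. As a rough illustration only, the sketch below computes the unsigned distance from 3D points to the surface of a cuboid and counts inliers, as a RANSAC-style scorer might. It is not the authors' occlusion-aware metric, their neural-guided sampling, or their cuboid solver. The axis-aligned parametrisation (centre plus half-sizes) and the function names `cuboid_surface_distance` and `count_inliers` are assumptions made for this example.

```python
import numpy as np


def cuboid_surface_distance(points, center, half_sizes):
    """Unsigned distance from each 3D point to the surface of an
    axis-aligned cuboid (assumed parametrisation: centre + half-sizes)."""
    # Per-axis offset from the nearest pair of faces; negative means
    # the point lies between those two faces.
    q = np.abs(points - center) - half_sizes
    # Distance for points outside the cuboid (zero if inside).
    outside = np.linalg.norm(np.maximum(q, 0.0), axis=1)
    # Negative depth for points inside the cuboid (zero if outside).
    inside = np.minimum(np.max(q, axis=1), 0.0)
    return np.abs(outside + inside)


def count_inliers(points, center, half_sizes, threshold):
    """Number of points within `threshold` of the cuboid surface."""
    d = cuboid_surface_distance(points, center, half_sizes)
    return int(np.sum(d <= threshold))


# Toy usage: a unit-half-size cuboid at the origin and four points.
points = np.array([
    [1.0, 0.0, 0.0],   # on a face
    [2.0, 0.0, 0.0],   # 1 unit outside a face
    [0.0, 0.0, 0.0],   # at the centre, 1 unit from every face
    [2.0, 2.0, 0.0],   # sqrt(2) from an edge
])
center = np.zeros(3)
half_sizes = np.ones(3)
distances = cuboid_surface_distance(points, center, half_sizes)
inliers = count_inliers(points, center, half_sizes, threshold=0.1)
```

A plain distance like this treats a point behind a cuboid the same as a point in front of it. That is the naive scoring the abstract says can favour large or spurious cuboids that occlude parts of the scene, and which the paper's occlusion-aware metric is meant to address.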

Cite

Robust Shape Fitting for 3D Scene Abstraction. / Kluger, Florian; Brachmann, Eric; Yang, Michael Ying et al.
In: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 46, No. 9, 09.2024, pp. 6306-6325.

Kluger F, Brachmann E, Yang MY, Rosenhahn B. Robust Shape Fitting for 3D Scene Abstraction. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2024 Sep;46(9):6306-6325. Epub 2024 Mar 19. doi: 10.48550/arXiv.2403.10452, 10.1109/TPAMI.2024.3379014
Kluger, Florian ; Brachmann, Eric ; Yang, Michael Ying et al. / Robust Shape Fitting for 3D Scene Abstraction. In: IEEE Transactions on Pattern Analysis and Machine Intelligence. 2024 ; Vol. 46, No. 9. pp. 6306-6325.
@article{efa0adbf6c6548cda932c632436b0a2d,
title = "Robust Shape Fitting for 3D Scene Abstraction",
keywords = "cuboid fitting, Estimation, Image reconstruction, minimal solver, multi-model fitting, Scene abstraction, Shape, shape decomposition, Solid modeling, Surface reconstruction, Three-dimensional displays, Training",
author = "Florian Kluger and Eric Brachmann and Yang, {Michael Ying} and Bodo Rosenhahn",
note = "Publisher Copyright: {\textcopyright} 1979-2012 IEEE.",
year = "2024",
month = sep,
doi = "10.48550/arXiv.2403.10452",
language = "English",
volume = "46",
pages = "6306--6325",
journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence",
issn = "0162-8828",
publisher = "IEEE Computer Society",
number = "9",

}

TY - JOUR

T1 - Robust Shape Fitting for 3D Scene Abstraction

AU - Kluger, Florian

AU - Brachmann, Eric

AU - Yang, Michael Ying

AU - Rosenhahn, Bodo

N1 - Publisher Copyright: © 1979-2012 IEEE.

PY - 2024/9

Y1 - 2024/9

KW - cuboid fitting

KW - Estimation

KW - Image reconstruction

KW - minimal solver

KW - multi-model fitting

KW - Scene abstraction

KW - Shape

KW - shape decomposition

KW - Solid modeling

KW - Surface reconstruction

KW - Three-dimensional displays

KW - Training

UR - http://www.scopus.com/inward/record.url?scp=85188527432&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2403.10452

DO - 10.48550/arXiv.2403.10452

M3 - Article

AN - SCOPUS:85188527432

VL - 46

SP - 6306

EP - 6325

JO - IEEE Transactions on Pattern Analysis and Machine Intelligence

JF - IEEE Transactions on Pattern Analysis and Machine Intelligence

SN - 0162-8828

IS - 9

ER -
