Monocular Pose and Shape Reconstruction of Vehicles in UAV imagery using a Multi-task CNN

Publication: Contribution to journal, article, research, peer-reviewed

Authors

S. El Amrani Abouelassad, M. Mehltretter, F. Rottensteiner

Details

Original language: English
Pages (from - to): 499-516
Number of pages: 18
Journal: PFG - Journal of Photogrammetry, Remote Sensing and Geoinformation Science
Volume: 92
Issue number: 5
Early online date: 16 Sept. 2024
Publication status: Published - Oct. 2024

Abstract

Estimating the pose and shape of vehicles from aerial images is an important, yet challenging task. While there are many existing approaches that use stereo images from street-level perspectives to reconstruct objects in 3D, the majority of aerial configurations used for purposes like traffic surveillance are limited to monocular images. Addressing this challenge, a Convolutional Neural Network-based method is presented in this paper, which jointly performs detection, pose, type and 3D shape estimation for vehicles observed in monocular UAV imagery. For this purpose, a robust 3D object model is used following the concept of an Active Shape Model. In addition, different variants of loss functions for learning 3D shape estimation are presented, focusing on the height component, which is particularly challenging to estimate from monocular near-nadir images. We also introduce a UAV-based dataset to evaluate our model in addition to an augmented version of the publicly available Hessigheim benchmark dataset. Our method yields promising results in pose and shape estimation: utilising images with a ground sampling distance (GSD) of 3 cm, it achieves median errors of up to 4 cm in position and 3° in orientation. Additionally, it achieves root mean square (RMS) errors of cm in planimetry and cm in height for keypoints defining the car shape.
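The error measures named in the abstract (median position and orientation errors, and RMS errors of shape keypoints split into planimetry and height) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the input layout (lists of coordinate tuples in a common metric frame) and the choice to compute the planimetric RMS from the 2D point distance are assumptions made for illustration only.

```python
import math
from statistics import median


def angular_difference_deg(a, b):
    """Smallest absolute difference between two angles in degrees, in [0, 180]."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)


def median_position_error(pred_xy, true_xy):
    """Median Euclidean distance between predicted and reference 2D positions."""
    return median(math.dist(p, t) for p, t in zip(pred_xy, true_xy))


def median_orientation_error(pred_deg, true_deg):
    """Median absolute orientation difference in degrees, handling wrap-around."""
    return median(angular_difference_deg(p, t) for p, t in zip(pred_deg, true_deg))


def rms_keypoint_errors(pred_xyz, true_xyz):
    """Return (planimetric RMS, height RMS) over matched 3D keypoints.

    Assumption: the planimetric error of a keypoint is its 2D distance in X/Y;
    the height error is its difference in Z.
    """
    n = len(pred_xyz)
    plan_sq = sum((p[0] - t[0]) ** 2 + (p[1] - t[1]) ** 2
                  for p, t in zip(pred_xyz, true_xyz))
    height_sq = sum((p[2] - t[2]) ** 2 for p, t in zip(pred_xyz, true_xyz))
    return math.sqrt(plan_sq / n), math.sqrt(height_sq / n)
```

Splitting the keypoint error into a planimetric and a height component mirrors the abstract's point that height is the harder quantity to recover from monocular near-nadir images, so reporting it separately keeps that weakness visible.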

ASJC Scopus subject areas

Cite this

Monocular Pose and Shape Reconstruction of Vehicles in UAV imagery using a Multi-task CNN. / El Amrani Abouelassad, S.; Mehltretter, M.; Rottensteiner, F.
In: PFG - Journal of Photogrammetry, Remote Sensing and Geoinformation Science, Vol. 92, No. 5, 10.2024, pp. 499-516.

El Amrani Abouelassad, S, Mehltretter, M & Rottensteiner, F 2024, 'Monocular Pose and Shape Reconstruction of Vehicles in UAV imagery using a Multi-task CNN', PFG - Journal of Photogrammetry, Remote Sensing and Geoinformation Science, vol. 92, no. 5, pp. 499-516. https://doi.org/10.1007/s41064-024-00311-0
El Amrani Abouelassad, S., Mehltretter, M., & Rottensteiner, F. (2024). Monocular Pose and Shape Reconstruction of Vehicles in UAV imagery using a Multi-task CNN. PFG - Journal of Photogrammetry, Remote Sensing and Geoinformation Science, 92(5), 499-516. https://doi.org/10.1007/s41064-024-00311-0
El Amrani Abouelassad S, Mehltretter M, Rottensteiner F. Monocular Pose and Shape Reconstruction of Vehicles in UAV imagery using a Multi-task CNN. PFG - Journal of Photogrammetry, Remote Sensing and Geoinformation Science. 2024 Oct;92(5):499-516. Epub 2024 Sep 16. doi: 10.1007/s41064-024-00311-0
El Amrani Abouelassad, S. ; Mehltretter, M. ; Rottensteiner, F. / Monocular Pose and Shape Reconstruction of Vehicles in UAV imagery using a Multi-task CNN. In: PFG - Journal of Photogrammetry, Remote Sensing and Geoinformation Science. 2024 ; Vol. 92, No. 5. pp. 499-516.
@article{7fd7714d18b148b38200326cc8a3c536,
title = "Monocular Pose and Shape Reconstruction of Vehicles in UAV imagery using a Multi-task CNN",
abstract = "Estimating the pose and shape of vehicles from aerial images is an important, yet challenging task. While there are many existing approaches that use stereo images from street-level perspectives to reconstruct objects in 3D, the majority of aerial configurations used for purposes like traffic surveillance are limited to monocular images. Addressing this challenge, a Convolutional Neural Network-based method is presented in this paper, which jointly performs detection, pose, type and 3D shape estimation for vehicles observed in monocular UAV imagery. For this purpose, a robust 3D object model is used following the concept of an Active Shape Model. In addition, different variants of loss functions for learning 3D shape estimation are presented, focusing on the height component, which is particularly challenging to estimate from monocular near-nadir images. We also introduce a UAV-based dataset to evaluate our model in addition to an augmented version of the publicly available Hessigheim benchmark dataset. Our method yields promising results in pose and shape estimation: utilising images with a ground sampling distance (GSD) of 3 cm, it achieves median errors of up to 4 cm in position and 3° in orientation. Additionally, it achieves root mean square (RMS) errors of cm in planimetry and cm in height for keypoints defining the car shape.",
keywords = "Autonomous driving, Object detection, Object reconstruction, Pose estimation, Shape estimation",
author = "{El Amrani Abouelassad}, S. and M. Mehltretter and F. Rottensteiner",
note = "Publisher Copyright: {\textcopyright} The Author(s) 2024.",
year = "2024",
month = oct,
doi = "10.1007/s41064-024-00311-0",
language = "English",
volume = "92",
pages = "499--516",
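number = "5",
journal = "PFG - Journal of Photogrammetry, Remote Sensing and Geoinformation Science",
issn = "2512-2789",
}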

TY - JOUR
T1 - Monocular Pose and Shape Reconstruction of Vehicles in UAV imagery using a Multi-task CNN
AU - El Amrani Abouelassad, S.
AU - Mehltretter, M.
AU - Rottensteiner, F.
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024/10
Y1 - 2024/10
N2 - Estimating the pose and shape of vehicles from aerial images is an important, yet challenging task. While there are many existing approaches that use stereo images from street-level perspectives to reconstruct objects in 3D, the majority of aerial configurations used for purposes like traffic surveillance are limited to monocular images. Addressing this challenge, a Convolutional Neural Network-based method is presented in this paper, which jointly performs detection, pose, type and 3D shape estimation for vehicles observed in monocular UAV imagery. For this purpose, a robust 3D object model is used following the concept of an Active Shape Model. In addition, different variants of loss functions for learning 3D shape estimation are presented, focusing on the height component, which is particularly challenging to estimate from monocular near-nadir images. We also introduce a UAV-based dataset to evaluate our model in addition to an augmented version of the publicly available Hessigheim benchmark dataset. Our method yields promising results in pose and shape estimation: utilising images with a ground sampling distance (GSD) of 3 cm, it achieves median errors of up to 4 cm in position and 3° in orientation. Additionally, it achieves root mean square (RMS) errors of cm in planimetry and cm in height for keypoints defining the car shape.

KW - Autonomous driving
KW - Object detection
KW - Object reconstruction
KW - Pose estimation
KW - Shape estimation
UR - http://www.scopus.com/inward/record.url?scp=85204012518&partnerID=8YFLogxK
U2 - 10.1007/s41064-024-00311-0
DO - 10.1007/s41064-024-00311-0
M3 - Article
AN - SCOPUS:85204012518
VL - 92
SP - 499
EP - 516
JO - PFG - Journal of Photogrammetry, Remote Sensing and Geoinformation Science
JF - PFG - Journal of Photogrammetry, Remote Sensing and Geoinformation Science
SN - 2512-2789
IS - 5
ER -
