Recovering accurate 3D human pose in the wild using IMUs and a moving camera

Publication: Contribution to book/report/anthology/conference proceedings › Conference paper › Research › Peer-reviewed

Authors

  • Timo von Marcard
  • Roberto Henschel
  • Michael J. Black
  • Bodo Rosenhahn
  • Gerard Pons-Moll

External organisations

  • Max-Planck-Institut für Intelligente Systeme (Stuttgart)
  • Max-Planck-Institut für Informatik

Details

Original language: English
Title of host publication: Computer Vision
Subtitle: ECCV 2018 - 15th European Conference, 2018, Proceedings
Publisher: Springer Verlag
Pages: 614-631
Number of pages: 18
ISBN (Print): 9783030012489
Publication status: Published - 6 Oct. 2018
Event: 15th European Conference on Computer Vision, ECCV 2018 - Munich, Germany
Duration: 8 Sept. 2018 to 14 Sept. 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11214 LNCS
ISSN (Print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

In this work, we propose a method that combines a single hand-held camera and a set of Inertial Measurement Units (IMUs) attached to the body limbs to estimate accurate 3D poses in the wild. This poses many new challenges: the moving camera, heading drift, cluttered background, occlusions and many people visible in the video. We associate 2D pose detections in each image to the corresponding IMU-equipped persons by solving a novel graph-based optimization problem that forces 3D to 2D coherency within a frame and across long range frames. Given associations, we jointly optimize the pose of a statistical body model, the camera pose and heading drift using a continuous optimization framework. We validated our method on the TotalCapture dataset, which provides video and IMU synchronized with ground truth. We obtain an accuracy of 26 mm, which makes it accurate enough to serve as a benchmark for image-based 3D pose estimation in the wild. Using our method, we recorded 3D Poses in the Wild (3DPW), a new dataset consisting of more than 51,000 frames with accurate 3D pose in challenging sequences, including walking in the city, going upstairs, having coffee or taking the bus. We make the reconstructed 3D poses, video, IMU and 3D models available for research purposes at http://virtualhumans.mpi-inf.mpg.de/3DPW.
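
To make the association step in the abstract more concrete, the following is a minimal sketch of the single-frame part of the idea only: each IMU-equipped person's 3D joints are projected into the image, and every person is matched to the 2D detection with the lowest total reprojection cost. It is not the paper's graph-based optimization, which also enforces coherency across long range frames. The pinhole camera model, the brute-force search over assignments, and all names (`project`, `reprojection_cost`, `associate`, `focal`, `center`) are assumptions made for illustration.

```python
from itertools import permutations
from math import hypot


def project(joints_3d, focal, center):
    """Pinhole projection of camera-frame 3D joints (x, y, z) to pixels.

    Assumption: joints are already expressed in the camera frame with z > 0.
    """
    cx, cy = center
    return [(focal * x / z + cx, focal * y / z + cy) for x, y, z in joints_3d]


def reprojection_cost(joints_3d, detection_2d, focal, center):
    """Mean pixel distance between projected 3D joints and one 2D detection."""
    projected = project(joints_3d, focal, center)
    dists = [
        hypot(u - du, v - dv)
        for (u, v), (du, dv) in zip(projected, detection_2d)
    ]
    return sum(dists) / len(dists)


def associate(persons_3d, detections_2d, focal, center):
    """Return, for each person, the index of its assigned 2D detection.

    Brute-force search over one-to-one assignments (hypothetical stand-in
    for a real solver); requires at least as many detections as persons.
    """
    costs = [
        [reprojection_cost(p, d, focal, center) for d in detections_2d]
        for p in persons_3d
    ]
    best = min(
        permutations(range(len(detections_2d)), len(persons_3d)),
        key=lambda perm: sum(costs[i][j] for i, j in enumerate(perm)),
    )
    return list(best)


# Toy example: two tracked persons with two joints each, three detections
# (the third detection belongs to a bystander without IMUs).
persons = [
    [(0.0, 0.0, 2.0), (0.0, 0.5, 2.0)],
    [(1.0, 0.0, 2.0), (1.0, 0.5, 2.0)],
]
detections = [
    [(1002.0, 498.0), (999.0, 751.0)],
    [(501.0, 502.0), (498.0, 749.0)],
    [(100.0, 100.0), (120.0, 300.0)],
]
assignment = associate(persons, detections, focal=1000.0, center=(500.0, 500.0))
print(assignment)
```

The exhaustive search is only practical for a handful of people per frame; it is used here to keep the sketch dependency-free, and a dedicated assignment solver would be the natural replacement at scale.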

Cite this

Recovering accurate 3D human pose in the wild using IMUs and a moving camera. / von Marcard, Timo; Henschel, Roberto; Black, Michael J. et al.
Computer Vision: ECCV 2018 - 15th European Conference, 2018, Proceedings. Springer Verlag, 2018. pp. 614-631 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11214 LNCS).

von Marcard, T, Henschel, R, Black, MJ, Rosenhahn, B & Pons-Moll, G 2018, Recovering accurate 3D human pose in the wild using IMUs and a moving camera. in Computer Vision: ECCV 2018 - 15th European Conference, 2018, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11214 LNCS, Springer Verlag, pp. 614-631, 15th European Conference on Computer Vision, ECCV 2018, Munich, Germany, 8 Sept. 2018. https://doi.org/10.1007/978-3-030-01249-6_37
von Marcard, T., Henschel, R., Black, M. J., Rosenhahn, B., & Pons-Moll, G. (2018). Recovering accurate 3D human pose in the wild using IMUs and a moving camera. In Computer Vision: ECCV 2018 - 15th European Conference, 2018, Proceedings (pp. 614-631). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11214 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-01249-6_37
von Marcard T, Henschel R, Black MJ, Rosenhahn B, Pons-Moll G. Recovering accurate 3D human pose in the wild using IMUs and a moving camera. in Computer Vision: ECCV 2018 - 15th European Conference, 2018, Proceedings. Springer Verlag. 2018. p. 614-631. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-030-01249-6_37
von Marcard, Timo ; Henschel, Roberto ; Black, Michael J. et al. / Recovering accurate 3D human pose in the wild using IMUs and a moving camera. Computer Vision: ECCV 2018 - 15th European Conference, 2018, Proceedings. Springer Verlag, 2018. pp. 614-631 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{f2c6186d6e0640ef98f35c054eb9c6cc,
title = "Recovering accurate 3D human pose in the wild using IMUs and a moving camera",
abstract = "In this work, we propose a method that combines a single hand-held camera and a set of Inertial Measurement Units (IMUs) attached to the body limbs to estimate accurate 3D poses in the wild. This poses many new challenges: the moving camera, heading drift, cluttered background, occlusions and many people visible in the video. We associate 2D pose detections in each image to the corresponding IMU-equipped persons by solving a novel graph-based optimization problem that forces 3D to 2D coherency within a frame and across long range frames. Given associations, we jointly optimize the pose of a statistical body model, the camera pose and heading drift using a continuous optimization framework. We validated our method on the TotalCapture dataset, which provides video and IMU synchronized with ground truth. We obtain an accuracy of 26 mm, which makes it accurate enough to serve as a benchmark for image-based 3D pose estimation in the wild. Using our method, we recorded 3D Poses in the Wild (3DPW), a new dataset consisting of more than 51,000 frames with accurate 3D pose in challenging sequences, including walking in the city, going upstairs, having coffee or taking the bus. We make the reconstructed 3D poses, video, IMU and 3D models available for research purposes at http://virtualhumans.mpi-inf.mpg.de/3DPW.",
keywords = "2D to 3D, 3D pose dataset, Human pose, IMUs, People tracking, Sensor fusion, Video",
author = "{von Marcard}, Timo and Roberto Henschel and Black, {Michael J.} and Bodo Rosenhahn and Gerard Pons-Moll",
year = "2018",
month = oct,
day = "6",
doi = "10.1007/978-3-030-01249-6_37",
language = "English",
isbn = "9783030012489",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "614--631",
booktitle = "Computer Vision",
address = "Germany",
note = "15th European Conference on Computer Vision, ECCV 2018 ; Conference date: 08-09-2018 Through 14-09-2018",

}

TY - GEN

T1 - Recovering accurate 3D human pose in the wild using IMUs and a moving camera

AU - von Marcard, Timo

AU - Henschel, Roberto

AU - Black, Michael J.

AU - Rosenhahn, Bodo

AU - Pons-Moll, Gerard

PY - 2018/10/6

Y1 - 2018/10/6

N2 - In this work, we propose a method that combines a single hand-held camera and a set of Inertial Measurement Units (IMUs) attached to the body limbs to estimate accurate 3D poses in the wild. This poses many new challenges: the moving camera, heading drift, cluttered background, occlusions and many people visible in the video. We associate 2D pose detections in each image to the corresponding IMU-equipped persons by solving a novel graph-based optimization problem that forces 3D to 2D coherency within a frame and across long range frames. Given associations, we jointly optimize the pose of a statistical body model, the camera pose and heading drift using a continuous optimization framework. We validated our method on the TotalCapture dataset, which provides video and IMU synchronized with ground truth. We obtain an accuracy of 26 mm, which makes it accurate enough to serve as a benchmark for image-based 3D pose estimation in the wild. Using our method, we recorded 3D Poses in the Wild (3DPW), a new dataset consisting of more than 51,000 frames with accurate 3D pose in challenging sequences, including walking in the city, going upstairs, having coffee or taking the bus. We make the reconstructed 3D poses, video, IMU and 3D models available for research purposes at http://virtualhumans.mpi-inf.mpg.de/3DPW.

KW - 2D to 3D

KW - 3D pose dataset

KW - Human pose

KW - IMUs

KW - People tracking

KW - Sensor fusion

KW - Video

UR - http://www.scopus.com/inward/record.url?scp=85055090644&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-01249-6_37

DO - 10.1007/978-3-030-01249-6_37

M3 - Conference contribution

AN - SCOPUS:85055090644

SN - 9783030012489

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 614

EP - 631

BT - Computer Vision

PB - Springer Verlag

T2 - 15th European Conference on Computer Vision, ECCV 2018

Y2 - 8 September 2018 through 14 September 2018

ER -
