Recovering accurate 3D human pose in the wild using IMUs and a moving camera

Research output: Chapter in book/report/conference proceeding > Conference contribution > Research > peer review

Authors

  • Timo von Marcard
  • Roberto Henschel
  • Michael J. Black
  • Bodo Rosenhahn
  • Gerard Pons-Moll

Research Organisations

External Research Organisations

  • Max Planck Institute for Intelligent Systems
  • Max Planck Institute for Informatics

Details

Original language: English
Title of host publication: Computer Vision
Subtitle of host publication: ECCV 2018 - 15th European Conference, 2018, Proceedings
Publisher: Springer Verlag
Pages: 614-631
Number of pages: 18
ISBN (print): 9783030012489
Publication status: Published - 6 Oct 2018
Event: 15th European Conference on Computer Vision, ECCV 2018 - Munich, Germany
Duration: 8 Sept 2018 - 14 Sept 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11214 LNCS
ISSN (Print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

In this work, we propose a method that combines a single hand-held camera and a set of Inertial Measurement Units (IMUs) attached to the body limbs to estimate accurate 3D poses in the wild. This poses many new challenges: the moving camera, heading drift, cluttered background, occlusions and many people visible in the video. We associate 2D pose detections in each image to the corresponding IMU-equipped persons by solving a novel graph-based optimization problem that forces 3D-to-2D coherency within a frame and across long-range frames. Given associations, we jointly optimize the pose of a statistical body model, the camera pose and heading drift using a continuous optimization framework. We validated our method on the TotalCapture dataset, which provides video and IMU data synchronized with ground truth. We obtain an accuracy of 26 mm, which makes the method accurate enough to serve as a benchmark for image-based 3D pose estimation in the wild. Using our method, we recorded 3D Poses in the Wild (3DPW), a new dataset consisting of more than 51,000 frames with accurate 3D pose in challenging sequences, including walking in the city, going upstairs, having coffee or taking the bus. We make the reconstructed 3D poses, video, IMU and 3D models available for research purposes at http://virtualhumans.mpi-inf.mpg.de/3DPW.

Keywords

    2D to 3D, 3D pose dataset, Human pose, IMUs, People tracking, Sensor fusion, Video

Cite this

Recovering accurate 3D human pose in the wild using IMUs and a moving camera. / von Marcard, Timo; Henschel, Roberto; Black, Michael J. et al.
Computer Vision: ECCV 2018 - 15th European Conference, 2018, Proceedings. Springer Verlag, 2018. p. 614-631 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11214 LNCS).

von Marcard, T, Henschel, R, Black, MJ, Rosenhahn, B & Pons-Moll, G 2018, Recovering accurate 3D human pose in the wild using IMUs and a moving camera. in Computer Vision: ECCV 2018 - 15th European Conference, 2018, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11214 LNCS, Springer Verlag, pp. 614-631, 15th European Conference on Computer Vision, ECCV 2018, Munich, Germany, 8 Sept 2018. https://doi.org/10.1007/978-3-030-01249-6_37
von Marcard, T., Henschel, R., Black, M. J., Rosenhahn, B., & Pons-Moll, G. (2018). Recovering accurate 3D human pose in the wild using IMUs and a moving camera. In Computer Vision: ECCV 2018 - 15th European Conference, 2018, Proceedings (pp. 614-631). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11214 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-01249-6_37
von Marcard T, Henschel R, Black MJ, Rosenhahn B, Pons-Moll G. Recovering accurate 3D human pose in the wild using IMUs and a moving camera. In Computer Vision: ECCV 2018 - 15th European Conference, 2018, Proceedings. Springer Verlag. 2018. p. 614-631. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-030-01249-6_37
von Marcard, Timo ; Henschel, Roberto ; Black, Michael J. et al. / Recovering accurate 3D human pose in the wild using IMUs and a moving camera. Computer Vision: ECCV 2018 - 15th European Conference, 2018, Proceedings. Springer Verlag, 2018. pp. 614-631 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{f2c6186d6e0640ef98f35c054eb9c6cc,
title = "Recovering accurate 3D human pose in the wild using IMUs and a moving camera",
abstract = "In this work, we propose a method that combines a single hand-held camera and a set of Inertial Measurement Units (IMUs) attached to the body limbs to estimate accurate 3D poses in the wild. This poses many new challenges: the moving camera, heading drift, cluttered background, occlusions and many people visible in the video. We associate 2D pose detections in each image to the corresponding IMU-equipped persons by solving a novel graph-based optimization problem that forces 3D-to-2D coherency within a frame and across long-range frames. Given associations, we jointly optimize the pose of a statistical body model, the camera pose and heading drift using a continuous optimization framework. We validated our method on the TotalCapture dataset, which provides video and IMU data synchronized with ground truth. We obtain an accuracy of 26 mm, which makes the method accurate enough to serve as a benchmark for image-based 3D pose estimation in the wild. Using our method, we recorded 3D Poses in the Wild (3DPW), a new dataset consisting of more than 51,000 frames with accurate 3D pose in challenging sequences, including walking in the city, going upstairs, having coffee or taking the bus. We make the reconstructed 3D poses, video, IMU and 3D models available for research purposes at http://virtualhumans.mpi-inf.mpg.de/3DPW.",
keywords = "2D to 3D, 3D pose dataset, Human pose, IMUs, People tracking, Sensor fusion, Video",
author = "{von Marcard}, Timo and Roberto Henschel and Black, {Michael J.} and Bodo Rosenhahn and Gerard Pons-Moll",
year = "2018",
month = oct,
day = "6",
doi = "10.1007/978-3-030-01249-6_37",
language = "English",
isbn = "9783030012489",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "614--631",
booktitle = "Computer Vision",
address = "Germany",
note = "15th European Conference on Computer Vision, ECCV 2018 ; Conference date: 08-09-2018 Through 14-09-2018",

}

TY - GEN

T1 - Recovering accurate 3D human pose in the wild using IMUs and a moving camera

AU - von Marcard, Timo

AU - Henschel, Roberto

AU - Black, Michael J.

AU - Rosenhahn, Bodo

AU - Pons-Moll, Gerard

PY - 2018/10/6

Y1 - 2018/10/6

AB - In this work, we propose a method that combines a single hand-held camera and a set of Inertial Measurement Units (IMUs) attached to the body limbs to estimate accurate 3D poses in the wild. This poses many new challenges: the moving camera, heading drift, cluttered background, occlusions and many people visible in the video. We associate 2D pose detections in each image to the corresponding IMU-equipped persons by solving a novel graph-based optimization problem that forces 3D-to-2D coherency within a frame and across long-range frames. Given associations, we jointly optimize the pose of a statistical body model, the camera pose and heading drift using a continuous optimization framework. We validated our method on the TotalCapture dataset, which provides video and IMU data synchronized with ground truth. We obtain an accuracy of 26 mm, which makes the method accurate enough to serve as a benchmark for image-based 3D pose estimation in the wild. Using our method, we recorded 3D Poses in the Wild (3DPW), a new dataset consisting of more than 51,000 frames with accurate 3D pose in challenging sequences, including walking in the city, going upstairs, having coffee or taking the bus. We make the reconstructed 3D poses, video, IMU and 3D models available for research purposes at http://virtualhumans.mpi-inf.mpg.de/3DPW.

KW - 2D to 3D

KW - 3D pose dataset

KW - Human pose

KW - IMUs

KW - People tracking

KW - Sensor fusion

KW - Video

UR - http://www.scopus.com/inward/record.url?scp=85055090644&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-01249-6_37

DO - 10.1007/978-3-030-01249-6_37

M3 - Conference contribution

AN - SCOPUS:85055090644

SN - 9783030012489

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 614

EP - 631

BT - Computer Vision

PB - Springer Verlag

T2 - 15th European Conference on Computer Vision, ECCV 2018

Y2 - 8 September 2018 through 14 September 2018

ER -
