Recovering accurate 3D human pose in the wild using IMUs and a moving camera

Timo von Marcard; Roberto Henschel; Michael J. Black; Bodo Rosenhahn; Gerard Pons-Moll

doi:10.1007/978-3-030-01249-6_37

Details

Original language	English
Title of host publication	Computer Vision
Subtitle of host publication	ECCV 2018 - 15th European Conference, 2018, Proceedings
Publisher	Springer Verlag
Pages	614-631
Number of pages	18
ISBN (print)	9783030012489
Publication status	Published - 6 Oct 2018
Event	15th European Conference on Computer Vision, ECCV 2018 - Munich, Germany
Duration	8 Sept 2018 → 14 Sept 2018

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	11214 LNCS
ISSN (Print)	0302-9743
ISSN (electronic)	1611-3349

Abstract

In this work, we propose a method that combines a single hand-held camera and a set of Inertial Measurement Units (IMUs) attached to the body limbs to estimate accurate 3D poses in the wild. This poses many new challenges: the moving camera, heading drift, cluttered background, occlusions and many people visible in the video. We associate 2D pose detections in each image to the corresponding IMU-equipped persons by solving a novel graph-based optimization problem that forces 3D to 2D coherency within a frame and across long-range frames. Given associations, we jointly optimize the pose of a statistical body model, the camera pose and heading drift using a continuous optimization framework. We validated our method on the TotalCapture dataset, which provides video and IMU synchronized with ground truth. We obtain an accuracy of 26 mm, which makes it accurate enough to serve as a benchmark for image-based 3D pose estimation in the wild. Using our method, we recorded 3D Poses in the Wild (3DPW), a new dataset consisting of more than 51,000 frames with accurate 3D pose in challenging sequences, including walking in the city, going upstairs, having coffee or taking the bus. We make the reconstructed 3D poses, video, IMU and 3D models available for research purposes at http://virtualhumans.mpi-inf.mpg.de/3DPW.
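
The association step described in the abstract can be illustrated with a much simpler stand-in. The sketch below is not the authors' graph-based formulation: it handles a single frame only, assumes a pinhole camera with known intrinsics, assumes the 3D joints are already expressed in the camera frame, and picks the assignment by brute force over permutations. The names `project`, `associate`, `focal` and `center` are hypothetical and chosen for illustration.

```python
from itertools import permutations

import numpy as np


def project(points_3d, focal, center):
    """Pinhole projection of (J, 3) camera-frame joints to (J, 2) pixel coordinates."""
    xy = points_3d[:, :2] / points_3d[:, 2:3]
    return focal * xy + center


def associate(poses_3d, detections_2d, focal, center):
    """Assign one 2D detection to each tracked person.

    Cost is the mean 2D distance between projected 3D joints and detected joints.
    Assumes at least as many detections as people; brute force, so only
    suitable for a handful of candidates.
    """
    cost = np.array([
        [np.linalg.norm(project(pose, focal, center) - det, axis=1).mean()
         for det in detections_2d]
        for pose in poses_3d
    ])
    n_people, n_dets = cost.shape
    best = min(
        permutations(range(n_dets), n_people),
        key=lambda perm: sum(cost[i, j] for i, j in enumerate(perm)),
    )
    return list(best), cost


# Toy data: two IMU-tracked people with three joints each (metres, camera frame).
focal, center = 1000.0, np.array([640.0, 360.0])
person_a = np.array([[0.0, 0.0, 4.0], [0.0, -0.5, 4.0], [0.2, 0.5, 4.0]])
person_b = person_a + np.array([1.0, 0.0, 1.0])

# Three detections: person_b (slightly noisy), an unrelated bystander, person_a.
detections = [
    project(person_b, focal, center) + 2.0,
    np.full((3, 2), 100.0),
    project(person_a, focal, center) - 1.5,
]

assignment, _ = associate([person_a, person_b], detections, focal, center)
print(assignment)  # [2, 0]
```

Here person A is matched to detection 2 and person B to detection 0, while the bystander detection is left unassigned. The method in the paper additionally enforces consistency across long-range frames, which a per-frame assignment like this one does not capture.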

Keywords

2D to 3D, 3D pose dataset, Human pose, IMUs, People tracking, Sensor fusion, Video

ASJC Scopus subject areas

Mathematics (all)
Theoretical Computer Science
Computer Science (all)
General Computer Science

Cite this

Recovering accurate 3D human pose in the wild using IMUs and a moving camera. / von Marcard, Timo; Henschel, Roberto; Black, Michael J. et al.
Computer Vision: ECCV 2018 - 15th European Conference, 2018, Proceedings. Springer Verlag, 2018. p. 614-631 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11214 LNCS).

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

von Marcard, T, Henschel, R, Black, MJ, Rosenhahn, B & Pons-Moll, G 2018, Recovering accurate 3D human pose in the wild using IMUs and a moving camera. in Computer Vision: ECCV 2018 - 15th European Conference, 2018, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11214 LNCS, Springer Verlag, pp. 614-631, 15th European Conference on Computer Vision, ECCV 2018, Munich, Germany, 8 Sept 2018. https://doi.org/10.1007/978-3-030-01249-6_37

von Marcard, T., Henschel, R., Black, M. J., Rosenhahn, B., & Pons-Moll, G. (2018). Recovering accurate 3D human pose in the wild using IMUs and a moving camera. In Computer Vision: ECCV 2018 - 15th European Conference, 2018, Proceedings (pp. 614-631). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11214 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-01249-6_37

von Marcard T, Henschel R, Black MJ, Rosenhahn B, Pons-Moll G. Recovering accurate 3D human pose in the wild using IMUs and a moving camera. In Computer Vision: ECCV 2018 - 15th European Conference, 2018, Proceedings. Springer Verlag. 2018. p. 614-631. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-030-01249-6_37

von Marcard, Timo ; Henschel, Roberto ; Black, Michael J. et al. / Recovering accurate 3D human pose in the wild using IMUs and a moving camera. Computer Vision: ECCV 2018 - 15th European Conference, 2018, Proceedings. Springer Verlag, 2018. pp. 614-631 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).


@inproceedings{f2c6186d6e0640ef98f35c054eb9c6cc,
title = "Recovering accurate 3D human pose in the wild using IMUs and a moving camera",
abstract = "In this work, we propose a method that combines a single hand-held camera and a set of Inertial Measurement Units (IMUs) attached to the body limbs to estimate accurate 3D poses in the wild. This poses many new challenges: the moving camera, heading drift, cluttered background, occlusions and many people visible in the video. We associate 2D pose detections in each image to the corresponding IMU-equipped persons by solving a novel graph-based optimization problem that forces 3D to 2D coherency within a frame and across long-range frames. Given associations, we jointly optimize the pose of a statistical body model, the camera pose and heading drift using a continuous optimization framework. We validated our method on the TotalCapture dataset, which provides video and IMU synchronized with ground truth. We obtain an accuracy of 26 mm, which makes it accurate enough to serve as a benchmark for image-based 3D pose estimation in the wild. Using our method, we recorded 3D Poses in the Wild (3DPW), a new dataset consisting of more than 51,000 frames with accurate 3D pose in challenging sequences, including walking in the city, going upstairs, having coffee or taking the bus. We make the reconstructed 3D poses, video, IMU and 3D models available for research purposes at http://virtualhumans.mpi-inf.mpg.de/3DPW.",
keywords = "2D to 3D, 3D pose dataset, Human pose, IMUs, People tracking, Sensor fusion, Video",
author = "{von Marcard}, Timo and Roberto Henschel and Black, {Michael J.} and Bodo Rosenhahn and Gerard Pons-Moll",
year = "2018",
month = oct,
day = "6",
doi = "10.1007/978-3-030-01249-6_37",
language = "English",
isbn = "9783030012489",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "614--631",
booktitle = "Computer Vision",
address = "Germany",
note = "15th European Conference on Computer Vision, ECCV 2018 ; Conference date: 08-09-2018 Through 14-09-2018",
}


TY - GEN
T1 - Recovering accurate 3D human pose in the wild using IMUs and a moving camera
AU - von Marcard, Timo
AU - Henschel, Roberto
AU - Black, Michael J.
AU - Rosenhahn, Bodo
AU - Pons-Moll, Gerard
PY - 2018/10/6
Y1 - 2018/10/6
N2 - In this work, we propose a method that combines a single hand-held camera and a set of Inertial Measurement Units (IMUs) attached to the body limbs to estimate accurate 3D poses in the wild. This poses many new challenges: the moving camera, heading drift, cluttered background, occlusions and many people visible in the video. We associate 2D pose detections in each image to the corresponding IMU-equipped persons by solving a novel graph-based optimization problem that forces 3D to 2D coherency within a frame and across long-range frames. Given associations, we jointly optimize the pose of a statistical body model, the camera pose and heading drift using a continuous optimization framework. We validated our method on the TotalCapture dataset, which provides video and IMU synchronized with ground truth. We obtain an accuracy of 26 mm, which makes it accurate enough to serve as a benchmark for image-based 3D pose estimation in the wild. Using our method, we recorded 3D Poses in the Wild (3DPW), a new dataset consisting of more than 51,000 frames with accurate 3D pose in challenging sequences, including walking in the city, going upstairs, having coffee or taking the bus. We make the reconstructed 3D poses, video, IMU and 3D models available for research purposes at http://virtualhumans.mpi-inf.mpg.de/3DPW.
AB - In this work, we propose a method that combines a single hand-held camera and a set of Inertial Measurement Units (IMUs) attached to the body limbs to estimate accurate 3D poses in the wild. This poses many new challenges: the moving camera, heading drift, cluttered background, occlusions and many people visible in the video. We associate 2D pose detections in each image to the corresponding IMU-equipped persons by solving a novel graph-based optimization problem that forces 3D to 2D coherency within a frame and across long-range frames. Given associations, we jointly optimize the pose of a statistical body model, the camera pose and heading drift using a continuous optimization framework. We validated our method on the TotalCapture dataset, which provides video and IMU synchronized with ground truth. We obtain an accuracy of 26 mm, which makes it accurate enough to serve as a benchmark for image-based 3D pose estimation in the wild. Using our method, we recorded 3D Poses in the Wild (3DPW), a new dataset consisting of more than 51,000 frames with accurate 3D pose in challenging sequences, including walking in the city, going upstairs, having coffee or taking the bus. We make the reconstructed 3D poses, video, IMU and 3D models available for research purposes at http://virtualhumans.mpi-inf.mpg.de/3DPW.
KW - 2D to 3D
KW - 3D pose dataset
KW - Human pose
KW - IMUs
KW - People tracking
KW - Sensor fusion
KW - Video
UR - http://www.scopus.com/inward/record.url?scp=85055090644&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-01249-6_37
DO - 10.1007/978-3-030-01249-6_37
M3 - Conference contribution
AN - SCOPUS:85055090644
SN - 9783030012489
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 614
EP - 631
BT - Computer Vision
PB - Springer Verlag
T2 - 15th European Conference on Computer Vision, ECCV 2018
Y2 - 8 September 2018 through 14 September 2018
ER -

