RepNet: Weakly supervised training of an adversarial reprojection network for 3D human pose estimation

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review


Details

Original language: English
Title of host publication: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Pages: 7774-7783
Number of pages: 10
ISBN (electronic): 9781728132938
Publication status: Published - 2019
Event: 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 - Long Beach, United States
Duration: 16 Jun 2019 - 20 Jun 2019

Publication series

Name: IEEE Conference on Computer Vision and Pattern Recognition
ISSN (Print): 1063-6919
ISSN (electronic): 2575-7075

Abstract

This paper addresses the problem of 3D human pose estimation from single images. While for a long time human skeletons were parameterized and fitted to the observation by satisfying a reprojection error, nowadays researchers directly use neural networks to infer the 3D pose from the observations. However, most of these approaches ignore the fact that a reprojection constraint has to be satisfied and are sensitive to overfitting. We tackle the overfitting problem by ignoring 2D to 3D correspondences. This efficiently avoids a simple memorization of the training data and allows for a weakly supervised training. One part of the proposed reprojection network (RepNet) learns a mapping from a distribution of 2D poses to a distribution of 3D poses using an adversarial training approach. Another part of the network estimates the camera. This allows for the definition of a network layer that performs the reprojection of the estimated 3D pose back to 2D which results in a reprojection loss function. Our experiments show that RepNet generalizes well to unknown data and outperforms state-of-the-art methods when applied to unseen data. Moreover, our implementation runs in real-time on a standard desktop PC.
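The core idea described in the abstract — estimating a camera alongside the 3D pose and defining a layer that reprojects the estimated 3D pose back to 2D — can be illustrated with a minimal sketch. This is not the authors' implementation; the weak-perspective camera layout, joint count, and function names below are assumptions chosen for illustration.

```python
import numpy as np

def reprojection_layer(pose_3d, camera):
    """Project an estimated 3D pose back to 2D with an estimated camera.

    pose_3d : (3, n) array of 3D joint positions (hypothetical layout,
              columns are joints).
    camera  : (2, 3) weak-perspective camera matrix, standing in for the
              output of the network's camera-estimation branch.
    Returns a (2, n) array of reprojected 2D joints.
    """
    return camera @ pose_3d

def reprojection_loss(pose_2d_observed, pose_3d_estimated, camera):
    """Sum of per-joint Euclidean distances between the observed 2D joints
    and the reprojection of the estimated 3D pose."""
    reproj = reprojection_layer(pose_3d_estimated, camera)
    return float(np.linalg.norm(pose_2d_observed - reproj, axis=0).sum())

# Toy example: if the reprojection of the estimated 3D pose matches the
# 2D observation exactly, the loss is zero.
pose_3d = np.array([[0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0],
                    [2.0, 2.0, 2.0]])   # 3 joints; rows are x, y, z
camera = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])    # orthographic projection onto x-y
pose_2d = camera @ pose_3d
print(reprojection_loss(pose_2d, pose_3d, camera))  # 0.0
```

Because this loss compares only 2D observations with reprojected 2D joints, it can be minimized without paired 2D-to-3D ground truth; in the paper the plausibility of the 3D pose itself is enforced separately by the adversarial critic.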

Keywords

    3D from Single Image, And Body Pose, Face, Gesture


Cite this

RepNet: Weakly supervised training of an adversarial reprojection network for 3D human pose estimation. / Wandt, Bastian; Rosenhahn, Bodo.
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2019. p. 7774-7783 8953653 (IEEE Conference on Computer Vision and Pattern Recognition).

Wandt, B & Rosenhahn, B 2019, RepNet: Weakly supervised training of an adversarial reprojection network for 3D human pose estimation. in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition., 8953653, IEEE Conference on Computer Vision and Pattern Recognition, pp. 7774-7783, 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, United States, 16 Jun 2019. https://doi.org/10.1109/CVPR.2019.00797
Wandt, B., & Rosenhahn, B. (2019). RepNet: Weakly supervised training of an adversarial reprojection network for 3D human pose estimation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 7774-7783). Article 8953653 (IEEE Conference on Computer Vision and Pattern Recognition). https://doi.org/10.1109/CVPR.2019.00797
Wandt B, Rosenhahn B. RepNet: Weakly supervised training of an adversarial reprojection network for 3D human pose estimation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2019. p. 7774-7783. 8953653. (IEEE Conference on Computer Vision and Pattern Recognition). doi: 10.1109/CVPR.2019.00797
Wandt, Bastian ; Rosenhahn, Bodo. / RepNet : Weakly supervised training of an adversarial reprojection network for 3D human pose estimation. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2019. pp. 7774-7783 (IEEE Conference on Computer Vision and Pattern Recognition).
BibTeX
@inproceedings{d2bba294f44e4a82b641e83b55f59edd,
title = "RepNet: Weakly supervised training of an adversarial reprojection network for 3D human pose estimation",
abstract = "This paper addresses the problem of 3D human pose estimation from single images. While for a long time human skeletons were parameterized and fitted to the observation by satisfying a reprojection error, nowadays researchers directly use neural networks to infer the 3D pose from the observations. However, most of these approaches ignore the fact that a reprojection constraint has to be satisfied and are sensitive to overfitting. We tackle the overfitting problem by ignoring 2D to 3D correspondences. This efficiently avoids a simple memorization of the training data and allows for a weakly supervised training. One part of the proposed reprojection network (RepNet) learns a mapping from a distribution of 2D poses to a distribution of 3D poses using an adversarial training approach. Another part of the network estimates the camera. This allows for the definition of a network layer that performs the reprojection of the estimated 3D pose back to 2D which results in a reprojection loss function. Our experiments show that RepNet generalizes well to unknown data and outperforms state-of-the-art methods when applied to unseen data. Moreover, our implementation runs in real-time on a standard desktop PC.",
keywords = "3D from Single Image, And Body Pose, Face, Gesture",
author = "Bastian Wandt and Bodo Rosenhahn",
year = "2019",
doi = "10.1109/CVPR.2019.00797",
language = "English",
isbn = "978-1-7281-3294-5",
series = "IEEE Conference on Computer Vision and Pattern Recognition",
pages = "7774--7783",
booktitle = "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition",
note = "32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 ; Conference date: 16-06-2019 Through 20-06-2019",
}

RIS

TY - GEN

T1 - RepNet: Weakly supervised training of an adversarial reprojection network for 3D human pose estimation

T2 - 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019

AU - Wandt, Bastian

AU - Rosenhahn, Bodo

PY - 2019

Y1 - 2019

N2 - This paper addresses the problem of 3D human pose estimation from single images. While for a long time human skeletons were parameterized and fitted to the observation by satisfying a reprojection error, nowadays researchers directly use neural networks to infer the 3D pose from the observations. However, most of these approaches ignore the fact that a reprojection constraint has to be satisfied and are sensitive to overfitting. We tackle the overfitting problem by ignoring 2D to 3D correspondences. This efficiently avoids a simple memorization of the training data and allows for a weakly supervised training. One part of the proposed reprojection network (RepNet) learns a mapping from a distribution of 2D poses to a distribution of 3D poses using an adversarial training approach. Another part of the network estimates the camera. This allows for the definition of a network layer that performs the reprojection of the estimated 3D pose back to 2D which results in a reprojection loss function. Our experiments show that RepNet generalizes well to unknown data and outperforms state-of-the-art methods when applied to unseen data. Moreover, our implementation runs in real-time on a standard desktop PC.

AB - This paper addresses the problem of 3D human pose estimation from single images. While for a long time human skeletons were parameterized and fitted to the observation by satisfying a reprojection error, nowadays researchers directly use neural networks to infer the 3D pose from the observations. However, most of these approaches ignore the fact that a reprojection constraint has to be satisfied and are sensitive to overfitting. We tackle the overfitting problem by ignoring 2D to 3D correspondences. This efficiently avoids a simple memorization of the training data and allows for a weakly supervised training. One part of the proposed reprojection network (RepNet) learns a mapping from a distribution of 2D poses to a distribution of 3D poses using an adversarial training approach. Another part of the network estimates the camera. This allows for the definition of a network layer that performs the reprojection of the estimated 3D pose back to 2D which results in a reprojection loss function. Our experiments show that RepNet generalizes well to unknown data and outperforms state-of-the-art methods when applied to unseen data. Moreover, our implementation runs in real-time on a standard desktop PC.

KW - 3D from Single Image

KW - And Body Pose

KW - Face

KW - Gesture

UR - http://www.scopus.com/inward/record.url?scp=85074369170&partnerID=8YFLogxK

U2 - 10.1109/CVPR.2019.00797

DO - 10.1109/CVPR.2019.00797

M3 - Conference contribution

AN - SCOPUS:85074369170

SN - 978-1-7281-3294-5

T3 - IEEE Conference on Computer Vision and Pattern Recognition

SP - 7774

EP - 7783

BT - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

Y2 - 16 June 2019 through 20 June 2019

ER -
