
RepNet: Weakly supervised training of an adversarial reprojection network for 3D human pose estimation

Publication: Contribution in book/report/anthology/conference proceedings › Conference paper › Research › Peer-reviewed

Authorship

Details

Original language: English
Title of host publication: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Pages: 7774-7783
Number of pages: 10
ISBN (electronic): 9781728132938
Publication status: Published - 2019
Event: 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 - Long Beach, United States
Duration: 16 June 2019 - 20 June 2019

Publication series

Name: IEEE Conference on Computer Vision and Pattern Recognition
ISSN (print): 1063-6919
ISSN (electronic): 2575-7075

Abstract

This paper addresses the problem of 3D human pose estimation from single images. While for a long time human skeletons were parameterized and fitted to the observation by satisfying a reprojection error, nowadays researchers directly use neural networks to infer the 3D pose from the observations. However, most of these approaches ignore the fact that a reprojection constraint has to be satisfied and are sensitive to overfitting. We tackle the overfitting problem by ignoring 2D to 3D correspondences. This efficiently avoids a simple memorization of the training data and allows for a weakly supervised training. One part of the proposed reprojection network (RepNet) learns a mapping from a distribution of 2D poses to a distribution of 3D poses using an adversarial training approach. Another part of the network estimates the camera. This allows for the definition of a network layer that performs the reprojection of the estimated 3D pose back to 2D which results in a reprojection loss function. Our experiments show that RepNet generalizes well to unknown data and outperforms state-of-the-art methods when applied to unseen data. Moreover, our implementation runs in real-time on a standard desktop PC.
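The core idea in the abstract, a network layer that reprojects the estimated 3D pose back to 2D with an estimated camera so the difference to the observed 2D pose can serve as a loss, can be illustrated with a minimal sketch. This is not the authors' implementation; it assumes a weak-perspective 2×3 camera matrix and an L1 error, and the joint count and data are made up for illustration.

```python
import numpy as np

def reprojection_loss(pose_3d, pose_2d, cam):
    """Reproject an estimated 3D pose to 2D and compare with the observation.

    pose_3d: (3, J) estimated 3D joint positions
    pose_2d: (2, J) observed 2D joint positions
    cam:     (2, 3) estimated camera matrix (weak-perspective assumption)
    """
    reprojected = cam @ pose_3d                  # the "reprojection layer": 2D = K * 3D
    return np.abs(pose_2d - reprojected).sum()   # L1 reprojection error

# Toy usage with a hypothetical 4-joint skeleton.
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4))                  # fabricated 3D pose
K = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])                  # orthographic camera as a special case
W = K @ X                                        # 2D observation consistent with X
print(reprojection_loss(X, W, K))                # → 0.0 (pose and camera explain the observation)
```

When the estimated pose and camera are consistent with the 2D input the loss vanishes; any deviation produces a positive penalty, which is what lets the network train without paired 2D-3D correspondences.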

ASJC Scopus subject areas

Cite

RepNet: Weakly supervised training of an adversarial reprojection network for 3D human pose estimation. / Wandt, Bastian; Rosenhahn, Bodo.
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2019. pp. 7774-7783, Article 8953653 (IEEE Conference on Computer Vision and Pattern Recognition).


Wandt, B & Rosenhahn, B 2019, RepNet: Weakly supervised training of an adversarial reprojection network for 3D human pose estimation. in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition., 8953653, IEEE Conference on Computer Vision and Pattern Recognition, pp. 7774-7783, 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, United States, 16 June 2019. https://doi.org/10.1109/CVPR.2019.00797
Wandt, B., & Rosenhahn, B. (2019). RepNet: Weakly supervised training of an adversarial reprojection network for 3D human pose estimation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 7774-7783). Article 8953653 (IEEE Conference on Computer Vision and Pattern Recognition). https://doi.org/10.1109/CVPR.2019.00797
Wandt B, Rosenhahn B. RepNet: Weakly supervised training of an adversarial reprojection network for 3D human pose estimation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2019. pp. 7774-7783. 8953653. (IEEE Conference on Computer Vision and Pattern Recognition). doi: 10.1109/CVPR.2019.00797
Wandt, Bastian ; Rosenhahn, Bodo. / RepNet : Weakly supervised training of an adversarial reprojection network for 3D human pose estimation. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2019. pp. 7774-7783 (IEEE Conference on Computer Vision and Pattern Recognition).
BibTeX
@inproceedings{d2bba294f44e4a82b641e83b55f59edd,
title = "RepNet: Weakly supervised training of an adversarial reprojection network for 3D human pose estimation",
abstract = "This paper addresses the problem of 3D human pose estimation from single images. While for a long time human skeletons were parameterized and fitted to the observation by satisfying a reprojection error, nowadays researchers directly use neural networks to infer the 3D pose from the observations. However, most of these approaches ignore the fact that a reprojection constraint has to be satisfied and are sensitive to overfitting. We tackle the overfitting problem by ignoring 2D to 3D correspondences. This efficiently avoids a simple memorization of the training data and allows for a weakly supervised training. One part of the proposed reprojection network (RepNet) learns a mapping from a distribution of 2D poses to a distribution of 3D poses using an adversarial training approach. Another part of the network estimates the camera. This allows for the definition of a network layer that performs the reprojection of the estimated 3D pose back to 2D which results in a reprojection loss function. Our experiments show that RepNet generalizes well to unknown data and outperforms state-of-the-art methods when applied to unseen data. Moreover, our implementation runs in real-time on a standard desktop PC.",
keywords = "3D from Single Image, And Body Pose, Face, Gesture",
author = "Bastian Wandt and Bodo Rosenhahn",
year = "2019",
doi = "10.1109/CVPR.2019.00797",
language = "English",
isbn = "978-1-7281-3294-5",
series = "IEEE Conference on Computer Vision and Pattern Recognition",
pages = "7774--7783",
booktitle = "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition",
note = "32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 ; Conference date: 16-06-2019 Through 20-06-2019",

}

RIS

TY - GEN

T1 - RepNet

T2 - 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019

AU - Wandt, Bastian

AU - Rosenhahn, Bodo

PY - 2019

Y1 - 2019

N2 - This paper addresses the problem of 3D human pose estimation from single images. While for a long time human skeletons were parameterized and fitted to the observation by satisfying a reprojection error, nowadays researchers directly use neural networks to infer the 3D pose from the observations. However, most of these approaches ignore the fact that a reprojection constraint has to be satisfied and are sensitive to overfitting. We tackle the overfitting problem by ignoring 2D to 3D correspondences. This efficiently avoids a simple memorization of the training data and allows for a weakly supervised training. One part of the proposed reprojection network (RepNet) learns a mapping from a distribution of 2D poses to a distribution of 3D poses using an adversarial training approach. Another part of the network estimates the camera. This allows for the definition of a network layer that performs the reprojection of the estimated 3D pose back to 2D which results in a reprojection loss function. Our experiments show that RepNet generalizes well to unknown data and outperforms state-of-the-art methods when applied to unseen data. Moreover, our implementation runs in real-time on a standard desktop PC.

AB - This paper addresses the problem of 3D human pose estimation from single images. While for a long time human skeletons were parameterized and fitted to the observation by satisfying a reprojection error, nowadays researchers directly use neural networks to infer the 3D pose from the observations. However, most of these approaches ignore the fact that a reprojection constraint has to be satisfied and are sensitive to overfitting. We tackle the overfitting problem by ignoring 2D to 3D correspondences. This efficiently avoids a simple memorization of the training data and allows for a weakly supervised training. One part of the proposed reprojection network (RepNet) learns a mapping from a distribution of 2D poses to a distribution of 3D poses using an adversarial training approach. Another part of the network estimates the camera. This allows for the definition of a network layer that performs the reprojection of the estimated 3D pose back to 2D which results in a reprojection loss function. Our experiments show that RepNet generalizes well to unknown data and outperforms state-of-the-art methods when applied to unseen data. Moreover, our implementation runs in real-time on a standard desktop PC.

KW - 3D from Single Image

KW - And Body Pose

KW - Face

KW - Gesture

UR - http://www.scopus.com/inward/record.url?scp=85074369170&partnerID=8YFLogxK

U2 - 10.1109/CVPR.2019.00797

DO - 10.1109/CVPR.2019.00797

M3 - Conference contribution

AN - SCOPUS:85074369170

SN - 978-1-7281-3294-5

T3 - IEEE Conference on Computer Vision and Pattern Recognition

SP - 7774

EP - 7783

BT - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

Y2 - 16 June 2019 through 20 June 2019

ER -
