Details
Original language | English |
---|---|
Title of host publication | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition |
Pages | 7774-7783 |
Number of pages | 10 |
ISBN (electronic) | 9781728132938 |
Publication status | Published - 2019 |
Event | 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 - Long Beach, United States Duration: 16 June 2019 → 20 June 2019 |
Publication series
Name | IEEE Conference on Computer Vision and Pattern Recognition |
---|---|
ISSN (print) | 1063-6919 |
ISSN (electronic) | 2575-7075 |
Abstract
This paper addresses the problem of 3D human pose estimation from single images. While human skeletons were long parameterized and fitted to the observation by minimizing a reprojection error, researchers nowadays use neural networks to infer the 3D pose directly from the observations. However, most of these approaches ignore the fact that a reprojection constraint has to be satisfied, and they are sensitive to overfitting. We tackle the overfitting problem by ignoring 2D-to-3D correspondences. This efficiently avoids simple memorization of the training data and allows for weakly supervised training. One part of the proposed reprojection network (RepNet) learns a mapping from a distribution of 2D poses to a distribution of 3D poses using an adversarial training approach. Another part of the network estimates the camera. This allows for the definition of a network layer that performs the reprojection of the estimated 3D pose back to 2D, which results in a reprojection loss function. Our experiments show that RepNet generalizes well to unseen data and outperforms state-of-the-art methods on it. Moreover, our implementation runs in real time on a standard desktop PC.
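The reprojection step described above can be sketched in a few lines. This is an illustrative reimplementation, not the authors' exact formulation: it assumes the estimated camera is represented as a 2×3 projection matrix and that the reprojection error is measured as a Frobenius norm between the reprojected and observed 2D joints.

```python
import numpy as np

def reprojection_loss(pose_3d, camera, pose_2d):
    """Project an estimated 3D pose back to 2D and compare with the observation.

    pose_3d: (3, J) array of estimated 3D joint positions
    camera:  (2, 3) camera matrix estimated by the network (assumed form)
    pose_2d: (2, J) array of observed 2D joint positions
    """
    reprojected = camera @ pose_3d  # (2, J) reprojection of the 3D pose
    return float(np.linalg.norm(reprojected - pose_2d))  # reprojection error

# Toy example with 3 joints and an orthographic camera that keeps x and y.
pose_3d = np.array([[0.0, 1.0, 2.0],
                    [0.0, 1.0, 0.5],
                    [1.0, 1.0, 1.0]])
camera = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
pose_2d = camera @ pose_3d  # a perfectly consistent observation
print(reprojection_loss(pose_3d, camera, pose_2d))  # → 0.0
```

In the network this term is differentiable in both the 3D pose and the camera parameters, so it can serve as a training loss even when no ground-truth 3D pose is available for an image.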
ASJC Scopus subject areas
- Computer Science (all)
- Software
- Computer Science (all)
- Computer Vision and Pattern Recognition
Cite
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2019. pp. 7774-7783 8953653 (IEEE Conference on Computer Vision and Pattern Recognition).
Publication: Contribution to book/report/anthology/conference proceedings › Article in conference proceedings › Research › Peer review
TY - GEN
T1 - RepNet
T2 - 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
AU - Wandt, Bastian
AU - Rosenhahn, Bodo
PY - 2019
Y1 - 2019
N2 - This paper addresses the problem of 3D human pose estimation from single images. While for a long time human skeletons were parameterized and fitted to the observation by satisfying a reprojection error, nowadays researchers directly use neural networks to infer the 3D pose from the observations. However, most of these approaches ignore the fact that a reprojection constraint has to be satisfied and are sensitive to overfitting. We tackle the overfitting problem by ignoring 2D to 3D correspondences. This efficiently avoids a simple memorization of the training data and allows for a weakly supervised training. One part of the proposed reprojection network (RepNet) learns a mapping from a distribution of 2D poses to a distribution of 3D poses using an adversarial training approach. Another part of the network estimates the camera. This allows for the definition of a network layer that performs the reprojection of the estimated 3D pose back to 2D which results in a reprojection loss function. Our experiments show that RepNet generalizes well to unknown data and outperforms state-of-the-art methods when applied to unseen data. Moreover, our implementation runs in real-time on a standard desktop PC.
KW - 3D from Single Image
KW - And Body Pose
KW - Face
KW - Gesture
UR - http://www.scopus.com/inward/record.url?scp=85074369170&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2019.00797
DO - 10.1109/CVPR.2019.00797
M3 - Conference contribution
AN - SCOPUS:85074369170
SN - 978-1-7281-3294-5
T3 - IEEE Conference on Computer Vision and Pattern Recognition
SP - 7774
EP - 7783
BT - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Y2 - 16 June 2019 through 20 June 2019
ER -