Canonpose: Self-supervised monocular 3D human pose estimation in the wild

Publication: Contribution to book/report/anthology/conference proceedings; conference paper; research; peer-reviewed

Authors

  • Bastian Wandt
  • Marco Rudolph
  • Petrissa Zell
  • Helge Rhodin
  • Bodo Rosenhahn

External organisations

  • University of British Columbia

Details

Original language: English
Title of host publication: Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021
Publisher: IEEE Computer Society
Pages: 13289-13299
Number of pages: 11
ISBN (electronic): 9781665445092
ISBN (print): 978-1-6654-4510-8
Publication status: Published - 2021
Event: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021 - Virtual, Online, United States
Duration: 20 June 2021 to 25 June 2021

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (print): 1063-6919
ISSN (electronic): 2575-7075

Abstract

Human pose estimation from single images is a challenging problem in computer vision that requires large amounts of labeled training data to be solved accurately. Unfortunately, for many human activities (e.g. outdoor sports) such training data does not exist and is hard or even impossible to acquire with traditional motion capture systems. We propose a self-supervised approach that learns a single image 3D pose estimator from unlabeled multi-view data. To this end, we exploit multi-view consistency constraints to disentangle the observed 2D pose into the underlying 3D pose and camera rotation. In contrast to most existing methods, we do not require calibrated cameras and can therefore learn from moving cameras. Nevertheless, in the case of a static camera setup, we present an optional extension to include constant relative camera rotations over multiple views into our framework. Key to the success are new, unbiased reconstruction objectives that mix information across views and training samples. The proposed approach is evaluated on two benchmark datasets (Human3.6M and MPII-INF-3DHP) and on the in-the-wild SkiPose dataset.
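The idea of mixing information across views can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract gives no network, loss, or normalisation details, so the orthographic projection, the unit-norm scaling, the L1 error, and every function name below (`rot_y`, `project`, `normalize`, `mixed_reprojection_loss`) are assumptions made only for illustration. The sketch takes a canonical 3D pose estimated from one view, rotates it with the camera rotation estimated from another view, and compares the projection with the 2D pose observed in that other view.

```python
import numpy as np


def rot_y(angle):
    """Rotation matrix about the vertical (y) axis; stands in for a predicted camera rotation."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def project(pose_3d, rotation):
    """Rotate a canonical pose of shape (J, 3) into a camera frame and drop depth (assumed orthographic)."""
    return (pose_3d @ rotation.T)[:, :2]


def normalize(pose_2d):
    """Center a 2D pose and scale it to unit Frobenius norm (assumed normalisation)."""
    centered = pose_2d - pose_2d.mean(axis=0, keepdims=True)
    return centered / np.linalg.norm(centered)


def mixed_reprojection_loss(poses_3d, rotations, poses_2d):
    """Mean L1 error over all pairs (3D pose from view a, rotation from view b)."""
    n_views = len(poses_2d)
    total = 0.0
    for a in range(n_views):
        for b in range(n_views):
            reprojected = normalize(project(poses_3d[a], rotations[b]))
            total += np.abs(reprojected - normalize(poses_2d[b])).mean()
    return total / n_views**2


# Synthetic example: one 17-joint pose seen from three rotated cameras.
rng = np.random.default_rng(0)
true_pose = rng.normal(size=(17, 3))
rotations = [rot_y(0.0), rot_y(0.8), rot_y(-1.3)]
poses_2d = [project(true_pose, R) for R in rotations]

# Identical canonical poses in every view reproduce all 2D observations.
consistent_loss = mixed_reprojection_loss([true_pose] * 3, rotations, poses_2d)

# A pose estimate that disagrees in one view is penalised in every view it is mixed into.
noisy_pose = true_pose + rng.normal(scale=0.5, size=true_pose.shape)
perturbed_loss = mixed_reprojection_loss(
    [noisy_pose, true_pose, true_pose], rotations, poses_2d
)

print(consistent_loss, perturbed_loss)
```

In this toy setting the loss vanishes only when the per-view 3D estimates agree with one another, which is the kind of multi-view consistency the abstract describes; no camera calibration enters the computation beyond the estimated rotations.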

Cite this

Canonpose: Self-supervised monocular 3D human pose estimation in the wild. / Wandt, Bastian; Rudolph, Marco; Zell, Petrissa et al.
Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021. IEEE Computer Society, 2021. pp. 13289-13299 (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition).

Wandt, B, Rudolph, M, Zell, P, Rhodin, H & Rosenhahn, B 2021, Canonpose: Self-supervised monocular 3D human pose estimation in the wild. in Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE Computer Society, pp. 13289-13299, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021, Virtual, Online, Tennessee, United States, 20 June 2021. https://doi.org/10.48550/arXiv.2011.14679, https://doi.org/10.1109/CVPR46437.2021.01309
Wandt, B., Rudolph, M., Zell, P., Rhodin, H., & Rosenhahn, B. (2021). Canonpose: Self-supervised monocular 3D human pose estimation in the wild. In Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021 (pp. 13289-13299). (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition). IEEE Computer Society. https://doi.org/10.48550/arXiv.2011.14679, https://doi.org/10.1109/CVPR46437.2021.01309
Wandt B, Rudolph M, Zell P, Rhodin H, Rosenhahn B. Canonpose: Self-supervised monocular 3D human pose estimation in the wild. in Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021. IEEE Computer Society. 2021. pp. 13289-13299. (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition). doi: 10.48550/arXiv.2011.14679, 10.1109/CVPR46437.2021.01309
Wandt, Bastian ; Rudolph, Marco ; Zell, Petrissa et al. / Canonpose : Self-supervised monocular 3D human pose estimation in the wild. Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021. IEEE Computer Society, 2021. pp. 13289-13299 (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition).
@inproceedings{60f801b995454997bade4ee9a1f9379c,
title = "Canonpose: Self-supervised monocular 3D human pose estimation in the wild",
abstract = "Human pose estimation from single images is a challenging problem in computer vision that requires large amounts of labeled training data to be solved accurately. Unfortunately, for many human activities (e.g. outdoor sports) such training data does not exist and is hard or even impossible to acquire with traditional motion capture systems. We propose a self-supervised approach that learns a single image 3D pose estimator from unlabeled multi-view data. To this end, we exploit multi-view consistency constraints to disentangle the observed 2D pose into the underlying 3D pose and camera rotation. In contrast to most existing methods, we do not require calibrated cameras and can therefore learn from moving cameras. Nevertheless, in the case of a static camera setup, we present an optional extension to include constant relative camera rotations over multiple views into our framework. Key to the success are new, unbiased reconstruction objectives that mix information across views and training samples. The proposed approach is evaluated on two benchmark datasets (Human3.6M and MPII-INF-3DHP) and on the in-the-wild SkiPose dataset.",
author = "Bastian Wandt and Marco Rudolph and Petrissa Zell and Helge Rhodin and Bodo Rosenhahn",
note = "Funding Information: This work was partially supported by the Federal Ministry of Education and Research (BMBF), Germany under the project LeibnizKILabor (grant no. 01DD20003) and by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany{\textquoteright}s Excellence Strategy within the Cluster of Excellence PhoenixD (EXC 2122). ; 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021 ; Conference date: 20-06-2021 Through 25-06-2021",
year = "2021",
doi = "10.48550/arXiv.2011.14679",
language = "English",
isbn = "978-1-6654-4510-8",
series = "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition",
publisher = "IEEE Computer Society",
pages = "13289--13299",
booktitle = "Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021",
address = "United States",

}

TY - GEN
T1 - Canonpose: Self-supervised monocular 3D human pose estimation in the wild
T2 - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021
AU - Wandt, Bastian
AU - Rudolph, Marco
AU - Zell, Petrissa
AU - Rhodin, Helge
AU - Rosenhahn, Bodo
N1 - Funding Information: This work was partially supported by the Federal Ministry of Education and Research (BMBF), Germany under the project LeibnizKILabor (grant no. 01DD20003) and by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy within the Cluster of Excellence PhoenixD (EXC 2122).
PY - 2021
Y1 - 2021
AB - Human pose estimation from single images is a challenging problem in computer vision that requires large amounts of labeled training data to be solved accurately. Unfortunately, for many human activities (e.g. outdoor sports) such training data does not exist and is hard or even impossible to acquire with traditional motion capture systems. We propose a self-supervised approach that learns a single image 3D pose estimator from unlabeled multi-view data. To this end, we exploit multi-view consistency constraints to disentangle the observed 2D pose into the underlying 3D pose and camera rotation. In contrast to most existing methods, we do not require calibrated cameras and can therefore learn from moving cameras. Nevertheless, in the case of a static camera setup, we present an optional extension to include constant relative camera rotations over multiple views into our framework. Key to the success are new, unbiased reconstruction objectives that mix information across views and training samples. The proposed approach is evaluated on two benchmark datasets (Human3.6M and MPII-INF-3DHP) and on the in-the-wild SkiPose dataset.
UR - http://www.scopus.com/inward/record.url?scp=85108568342&partnerID=8YFLogxK
U2 - 10.48550/arXiv.2011.14679
DO - 10.48550/arXiv.2011.14679
M3 - Conference contribution
AN - SCOPUS:85108568342
SN - 978-1-6654-4510-8
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 13289
EP - 13299
BT - Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021
PB - IEEE Computer Society
Y2 - 20 June 2021 through 25 June 2021
ER -