Details
Original language | English |
---|---|
Title of host publication | 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021) |
Editors | Vitomir Struc, Marija Ivanovska
Pages | 1-8
Number of pages | 8
ISBN (electronic) | 978-1-6654-3176-7
Publication status | Published - 2021
Publication series
Name | Proceedings - 2021 16th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2021 |
---|---|
Abstract
Human head pose estimation from images plays a vital role in applications like driver assistance systems and human behavior analysis. Head pose estimation networks are typically trained in a supervised manner. Unfortunately, manual/sensor-based annotations of head poses are prone to errors. A solution is supervised training on synthetic training data generated from 3D face models, which can provide an infinite amount of perfect labels. However, computer-generated face images only provide an approximation of real-world images, which results in a domain gap between the training and application domains. To date, domain adaptation is rarely addressed in current work on head pose estimation. In this work, we propose relative pose consistency, a semi-supervised learning strategy for head pose estimation based on consistency regularization. It allows simultaneous learning on labeled synthetic data and unlabeled real-world data to overcome the domain gap, while keeping the advantages of synthetic data. Consistency regularization enforces consistent network predictions under random image augmentations. We address pose-preserving and pose-altering augmentations. Naturally, pose-altering augmentations cannot be used on unlabeled data. We therefore propose a strategy to exploit the relative pose introduced by pose-altering augmentations between augmented image pairs. This allows the network to benefit from relative pose labels during training on the unlabeled, real-world images. We evaluate our approach on a widely used benchmark (Biwi Kinect Head Pose) and outperform the domain-adaptation state of the art (SOTA). We are the first to present a consistency regularization framework for head pose estimation. Our experiments show that our approach improves head pose estimation accuracy for real-world images despite using only labels from synthetic images.
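The abstract describes the mechanism only in words; the following is a minimal, illustrative sketch of the relative-pose-consistency idea, assuming a PyTorch-style setup. `PoseNet`, `relative_pose_consistency_loss`, the choice of in-plane rotation as the pose-altering augmentation, the roll/sign convention, and the L1 losses are all assumptions made for illustration, not the authors' architecture or training code.

```python
# Hedged sketch of relative pose consistency on unlabeled data.
# Assumption: an in-plane image rotation by `a` degrees changes only the
# apparent roll by `a` (the exact convention in the paper may differ).
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.transforms.functional as TF

class PoseNet(nn.Module):
    """Toy head-pose regressor: image -> (yaw, pitch, roll) in degrees."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 3)

    def forward(self, x):
        return self.head(self.backbone(x))

def relative_pose_consistency_loss(model, real_images, max_roll=30.0):
    """Unlabeled real images: apply a pose-altering augmentation with a known
    effect (in-plane rotation) and require the *difference* between the two
    predictions to match that known relative pose."""
    angles = (torch.rand(real_images.size(0)) * 2 - 1) * max_roll  # degrees
    rotated = torch.stack(
        [TF.rotate(img, float(a)) for img, a in zip(real_images, angles)]
    )
    pred_orig = model(real_images)
    pred_rot = model(rotated)
    rel_pred = pred_rot - pred_orig
    # Target relative pose: yaw/pitch unchanged, roll shifted by the angle.
    rel_target = torch.stack(
        [torch.zeros_like(angles), torch.zeros_like(angles), angles], dim=1
    )
    return F.l1_loss(rel_pred, rel_target)

# Combined objective: supervised loss on labeled synthetic data plus the
# relative-pose-consistency term on unlabeled real data (dummy tensors here).
model = PoseNet()
syn_images, syn_poses = torch.randn(8, 3, 64, 64), torch.randn(8, 3) * 20
real_images = torch.randn(8, 3, 64, 64)
loss = F.l1_loss(model(syn_images), syn_poses) \
     + relative_pose_consistency_loss(model, real_images)
loss.backward()
print(float(loss))
```

The point of the sketch is only that the difference between two predictions on an augmented image pair can be supervised even though the absolute pose of the unlabeled image is unknown; the actual augmentations and their mapping to relative pose labels follow the paper.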
ASJC Scopus subject areas
- Mathematics (all)
- Control and Optimization
- Computer Science (all)
- Computer Vision and Pattern Recognition
Cite this
- Standard
- Harvard
- APA
- Vancouver
- BibTeX
- RIS
16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021). Ed. / Vitomir Struc; Marija Ivanovska. 2021. pp. 1-8 (Proceedings - 2021 16th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2021).
Publication: Contribution to book/report/anthology/conference proceedings › Conference contribution › Research › Peer-reviewed
TY - GEN
T1 - Relative Pose Consistency for Semi-Supervised Head Pose Estimation
AU - Kuhnke, Felix
AU - Ihler, Sontje
AU - Ostermann, Jörn
PY - 2021
Y1 - 2021
N2 - Human head pose estimation from images plays a vital role in applications like driver assistance systems and human behavior analysis. Head pose estimation networks are typically trained in a supervised manner. Unfortunately, manual/sensor-based annotations of head poses are prone to errors. A solution is supervised training on synthetic training data generated from 3D face models which can provide an infinite amount of perfect labels. However, computer generated face images only provide an approximation of real-world images which results in a domain gap between training and application domain. To date, domain adaptation is rarely addressed in current work on head pose estimation. In this work we propose relative pose consistency, a semi-supervised learning strategy for head pose estimation based on consistency regularization. It allows simultaneous learning on labeled synthetic data and unlabeled real-world data to overcome the domain gap, while keeping the advantages of synthetic data. Consistency regularization enforces consistent network predictions under random image augmentations. We address pose-preserving and pose-altering augmentations. Naturally, pose-altering augmentations cannot be used on unlabeled data. We therefore propose a strategy to exploit the relative pose introduced by pose-altering augmentations between augmented image pairs. This allows the network to benefit from relative pose labels during training on the unlabeled, real-world images. We evaluate our approach on a widely used benchmark (Biwi Kinect Head Pose) and outperform domain-adaptation SOTA. We are the first to present a consistency regularization framework for head pose estimation. Our experiments show that our approach improves head pose estimation accuracy for real-world images despite using only labels from synthetic images.
AB - Human head pose estimation from images plays a vital role in applications like driver assistance systems and human behavior analysis. Head pose estimation networks are typically trained in a supervised manner. Unfortunately, manual/sensor-based annotations of head poses are prone to errors. A solution is supervised training on synthetic training data generated from 3D face models which can provide an infinite amount of perfect labels. However, computer generated face images only provide an approximation of real-world images which results in a domain gap between training and application domain. To date, domain adaptation is rarely addressed in current work on head pose estimation. In this work we propose relative pose consistency, a semi-supervised learning strategy for head pose estimation based on consistency regularization. It allows simultaneous learning on labeled synthetic data and unlabeled real-world data to overcome the domain gap, while keeping the advantages of synthetic data. Consistency regularization enforces consistent network predictions under random image augmentations. We address pose-preserving and pose-altering augmentations. Naturally, pose-altering augmentations cannot be used on unlabeled data. We therefore propose a strategy to exploit the relative pose introduced by pose-altering augmentations between augmented image pairs. This allows the network to benefit from relative pose labels during training on the unlabeled, real-world images. We evaluate our approach on a widely used benchmark (Biwi Kinect Head Pose) and outperform domain-adaptation SOTA. We are the first to present a consistency regularization framework for head pose estimation. Our experiments show that our approach improves head pose estimation accuracy for real-world images despite using only labels from synthetic images.
UR - http://www.scopus.com/inward/record.url?scp=85125094373&partnerID=8YFLogxK
U2 - 10.1109/FG52635.2021.9666992
DO - 10.1109/FG52635.2021.9666992
M3 - Conference contribution
SN - 978-1-6654-3177-4
T3 - Proceedings - 2021 16th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2021
SP - 1
EP - 8
BT - 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021)
A2 - Struc, Vitomir
A2 - Ivanovska, Marija
ER -