Towards Manifold Learning of Image-Based Motion Models for Oscillating Vocal Folds

Publication: Working paper/Preprint (Preprint)

Authors

  • Sontje Ihler
  • Max-Heinrich Laves
  • Tobias Ortmaier

Details

Original language: English
Number of pages: 4
Publication status: Published electronically (E-pub) - 2019

Abstract

Our vision is a motion model of the oscillating vocal folds that can prospectively be used for motion prediction and anomaly detection in laryngeal laser surgery during phonation. In this work, we propose to learn motion concepts and global motion correlations of the vocal folds and surrounding tissue from endoscopic images using manifold learning based on a variational autoencoder.
Our experiments show that the basic concepts (e.g. distance) are encoded in the latent representation of our learned manifold. It is also possible to distinguish between a relaxed and a contracted larynx (during phonation). It is further possible to identify the stages of phonation based on the latent embedding. The sequence of the latent variables seems structured and presumably suited for prediction tasks. Anomalies in the input data are clearly visible in the latent embedding as they are not within the subspace of the motion manifold. The motion model represents a strong prior belief about vocal fold motion.
The proposed method seems to be a promising approach for generating motion and oscillation models of the vocal folds. It seems feasible for future motion prediction and anomaly detection. A more in-depth assessment with extension to higher-level models is planned.

Cite

Towards Manifold Learning of Image-Based Motion Models for Oscillating Vocal Folds. / Ihler, Sontje; Laves, Max-Heinrich; Ortmaier, Tobias.
2019.

Ihler, Sontje ; Laves, Max-Heinrich ; Ortmaier, Tobias. / Towards Manifold Learning of Image-Based Motion Models for Oscillating Vocal Folds. 2019.
@techreport{7cd4e05abba5423993ab9e9830426929,
title = "Towards Manifold Learning of Image-Based Motion Models for Oscillating Vocal Folds",
abstract = "Our vision is a motion model of the oscillating vocal folds that can prospectively be used for motion prediction and anomaly detection in laryngeal laser surgery during phonation. In this work, we propose to learn motion concepts and global motion correlations of the vocal folds and surrounding tissue from endoscopic images using manifold learning based on a variational autoencoder. Our experiments show that the basic concepts (e.g. distance) are encoded in the latent representation of our learned manifold. It is also possible to distinguish between a relaxed and a contracted larynx (during phonation). It is further possible to identify the stages of phonation based on the latent embedding. The sequence of the latent variables seems structured and presumably suited for prediction tasks. Anomalies in the input data are clearly visible in the latent embedding as they are not within the subspace of the motion manifold. The motion model represents a strong prior belief about vocal fold motion. The proposed method seems to be a promising approach for generating motion and oscillation models of the vocal folds. It seems feasible for future motion prediction and anomaly detection. A more in-depth assessment with extension to higher-level models is planned.",
author = "Sontje Ihler and Max-Heinrich Laves and Tobias Ortmaier",
year = "2019",
language = "English",
type = "WorkingPaper",

}

TY - UNPB

T1 - Towards Manifold Learning of Image-Based Motion Models for Oscillating Vocal Folds

AU - Ihler, Sontje

AU - Laves, Max-Heinrich

AU - Ortmaier, Tobias

PY - 2019

Y1 - 2019

N2 - Our vision is a motion model of the oscillating vocal folds that can prospectively be used for motion prediction and anomaly detection in laryngeal laser surgery during phonation. In this work, we propose to learn motion concepts and global motion correlations of the vocal folds and surrounding tissue from endoscopic images using manifold learning based on a variational autoencoder. Our experiments show that the basic concepts (e.g. distance) are encoded in the latent representation of our learned manifold. It is also possible to distinguish between a relaxed and a contracted larynx (during phonation). It is further possible to identify the stages of phonation based on the latent embedding. The sequence of the latent variables seems structured and presumably suited for prediction tasks. Anomalies in the input data are clearly visible in the latent embedding as they are not within the subspace of the motion manifold. The motion model represents a strong prior belief about vocal fold motion. The proposed method seems to be a promising approach for generating motion and oscillation models of the vocal folds. It seems feasible for future motion prediction and anomaly detection. A more in-depth assessment with extension to higher-level models is planned.

M3 - Preprint

BT - Towards Manifold Learning of Image-Based Motion Models for Oscillating Vocal Folds

ER -