Towards Manifold Learning of Image-Based Motion Models for Oscillating Vocal Folds

Research output: Working paper/Preprint

Authors

Sontje Ihler, Max-Heinrich Laves, Tobias Ortmaier

Research Organisations

Details

Original language: Undefined/Unknown
Number of pages: 4
Publication status: E-pub ahead of print - 2019

Abstract

Our vision is a motion model of the oscillating vocal folds that can prospectively be used for motion prediction and anomaly detection in laryngeal laser surgery during phonation. In this work, we propose to learn motion concepts and global motion correlations of the vocal folds and surrounding tissue from endoscopic images using manifold learning based on a variational autoencoder.
Our experiments show that basic concepts (e.g., distance) are encoded in the latent representation of the learned manifold. It is also possible to distinguish between a relaxed and a contracted larynx (during phonation), and to identify the stages of phonation from the latent embedding. The sequence of latent variables appears structured and thus presumably suited for prediction tasks. Anomalies in the input data are clearly visible in the latent embedding, as they do not lie within the subspace of the motion manifold. The motion model therefore represents a strong prior belief about vocal fold motion.
The proposed method appears to be a promising approach for generating motion and oscillation models of the vocal folds and seems feasible for future motion prediction and anomaly detection. A more in-depth assessment, with an extension to higher-level models, is planned.
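The abstract describes the pipeline only at a high level. As a rough, hypothetical illustration of such an approach, the Python/PyTorch sketch below trains a small convolutional variational autoencoder on image frames and scores anomalies by how poorly a frame reconstructs and how far its encoding lies from the latent prior. The architecture, 64x64 grayscale input size, latent dimension, and loss weighting are illustrative assumptions and are not taken from the preprint.

# Minimal sketch (not from the preprint): a convolutional VAE that embeds
# endoscopic-style frames into a low-dimensional latent manifold, plus an
# anomaly score for frames that fall off that manifold.
import torch
import torch.nn as nn
import torch.nn.functional as F


class VocalFoldVAE(nn.Module):
    def __init__(self, latent_dim: int = 8):
        super().__init__()
        # Encoder: 1x64x64 -> 16x32x32 -> 32x16x16 -> flattened features
        self.enc = nn.Sequential(
            nn.Conv2d(1, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.fc_mu = nn.Linear(32 * 16 * 16, latent_dim)
        self.fc_logvar = nn.Linear(32 * 16 * 16, latent_dim)
        # Decoder: latent -> 32x16x16 -> 1x64x64
        self.fc_dec = nn.Linear(latent_dim, 32 * 16 * 16)
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def encode(self, x):
        h = self.enc(x)
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        x_hat = self.dec(self.fc_dec(z).view(-1, 32, 16, 16))
        return x_hat, mu, logvar


def vae_loss(x_hat, x, mu, logvar, beta: float = 1.0):
    # Reconstruction term plus KL divergence to the unit Gaussian prior.
    recon = F.binary_cross_entropy(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl


def anomaly_score(model: VocalFoldVAE, frame: torch.Tensor) -> float:
    # Frames that do not lie on the learned motion manifold reconstruct poorly
    # and/or map far from the latent prior; either signal can flag anomalies.
    model.eval()
    with torch.no_grad():
        x_hat, mu, logvar = model(frame)
        recon = F.binary_cross_entropy(x_hat, frame, reduction="sum")
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return (recon + kl).item()


if __name__ == "__main__":
    model = VocalFoldVAE()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    frames = torch.rand(16, 1, 64, 64)  # stand-in for preprocessed endoscopic frames
    for _ in range(5):                  # illustrative training loop
        x_hat, mu, logvar = model(frames)
        loss = vae_loss(x_hat, frames, mu, logvar)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print("anomaly score of first frame:", anomaly_score(model, frames[:1]))

In practice, the inputs would be registered laryngoscopy frames rather than random tensors, and the latent trajectory over a phonation cycle could serve as input to a separate prediction model, in line with the prediction and anomaly-detection goals stated above.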

Cite this

Towards Manifold Learning of Image-Based Motion Models for Oscillating Vocal Folds. / Ihler, Sontje; Laves, Max-Heinrich; Ortmaier, Tobias.
2019.

Research output: Working paper/Preprint

Download
@techreport{7cd4e05abba5423993ab9e9830426929,
title = "Towards Manifold Learning of Image-Based Motion Models for Oscillating Vocal Folds",
abstract = "Our vision is a motion model of the oscillating vocal fodls that can prospectively be used for motion prediction and anomaly detection in laryngeal laser surgery during phonation. In this work we propose to learn motion concepts and global motion correlations of the vocal folds and surrounding tissue from endoscopic images using manifold learning based on a variational autoencoder.Our experiments show that the basic concepts (e.g. distance) are encoded in the latent representation of our learned manifold. It is also possible to distinguish between a relaxed and a contracted larynx (during phonation). It is further possible to identify the stages of phonation based on the latent embedding. The sequence of the latent variables seems structured and presumably suited for prediction tasks. Anomalies in the input data are clearly visible in the latent embedding as they are not within the subspace of the motion manifold. The motion model represents a strong prior belief about vocal fold motion.The proposed method seems to be a promising approach in generating motion and oscillation models of the vocal folds. It seems feasible for future motion prediction and anomaly detection. A more in-depth assessment with extension to higher-level models is planned.",
author = "Sontje Ihler and Max-Heinrich Laves and Tobias Ortmaier",
year = "2019",
language = "Undefined/Unknown",
type = "WorkingPaper",

}
