A variational autoencoder trained with priors from canonical pathways increases the interpretability of transcriptome data

Bin Liu; Bodo Rosenhahn; Thomas Illig; David S. DeLuca

doi:10.1371/journal.pcbi.1011198

Details

Original language	English
Article number	e1011198
Number of pages	22
Journal	PLoS Computational Biology
Volume	20
Issue number	7
Publication status	Published - 3 July 2024

Abstract

Interpreting transcriptome data is an important yet challenging aspect of bioinformatic analysis. While gene set enrichment analysis is a standard tool for interpreting regulatory changes, we utilize deep learning techniques, specifically autoencoder architectures, to learn latent variables that drive transcriptome signals. We investigate whether simple autoencoders, variational autoencoders (VAEs), and beta-weighted VAEs are capable of learning reduced representations of transcriptomes that retain critical biological information. We propose a novel VAE that utilizes priors from biological data to direct the network to learn a representation of the transcriptome that is based on understandable biological concepts. After benchmarking five different autoencoder architectures, we found that each succeeded in reducing the transcriptomes to 50 latent dimensions, which captured enough variation for accurate reconstruction. The simple, fully connected autoencoder performs best across the benchmarks but lacks directly interpretable latent dimensions. The beta-weighted, prior-informed VAE implementation is able to solve the benchmarking tasks and provides semantically accurate latent features equating to biological pathways. This study opens a new direction for differential pathway analysis in transcriptomics with increased transparency and interpretability.
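For readers unfamiliar with the beta-weighted VAE mentioned in the abstract, the following is a minimal sketch of the standard beta-VAE objective: a reconstruction error plus a KL divergence term scaled by beta. It assumes a squared-error reconstruction term and a standard normal prior. The abstract does not state the authors' loss, network layout, or how pathway priors enter the model, so the function name `beta_vae_loss`, the toy inputs, and the choice of beta are illustrative assumptions, not the paper's implementation.

```python
import math


def beta_vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Generic beta-VAE objective (an assumption, not the paper's exact loss).

    x, x_hat : lists of samples, each a list of expression values
    mu, log_var : lists of samples, each a list of latent means / log-variances
    Returns (total, reconstruction, kl), each averaged over samples.
    """
    n = len(x)
    # Squared reconstruction error, summed over genes, averaged over samples.
    recon = sum(
        sum((a - b) ** 2 for a, b in zip(xi, xhi))
        for xi, xhi in zip(x, x_hat)
    ) / n
    # Closed-form KL(N(mu, sigma^2) || N(0, I)), summed over latent dimensions.
    kl = sum(
        -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mi, lvi))
        for mi, lvi in zip(mu, log_var)
    ) / n
    # beta > 1 weights the KL term more heavily than in a plain VAE.
    return recon + beta * kl, recon, kl


# Toy input: 2 hypothetical samples with 2 genes and 50 latent dimensions
# (50 matches the latent size reported in the abstract).
x = [[1.0, 2.0], [0.0, 1.0]]
x_hat = [[1.0, 1.0], [0.0, 1.0]]
mu = [[0.0] * 50 for _ in range(2)]
log_var = [[0.0] * 50 for _ in range(2)]
total, recon, kl = beta_vae_loss(x, x_hat, mu, log_var, beta=4.0)
```

With beta set to 1 this reduces to the ordinary VAE objective. A prior-informed variant would change what the latent dimensions are tied to, which this sketch does not attempt to model.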

ASJC Scopus subject areas

Agricultural and Biological Sciences (all)
Ecology, Evolution, Behavior and Systematics
Mathematics (all)
Modelling and Simulation
Environmental Science (all)
Ecology
Biochemistry, Genetics and Molecular Biology (all)
Molecular Biology
Biochemistry, Genetics and Molecular Biology (all)
Genetics
Neuroscience (all)
Cellular and Molecular Neuroscience
Computer Science (all)
Computational Theory and Mathematics

Cite this

A variational autoencoder trained with priors from canonical pathways increases the interpretability of transcriptome data. / Liu, Bin; Rosenhahn, Bodo; Illig, Thomas et al.
In: PLoS Computational Biology, Vol. 20, No. 7, e1011198, 03.07.2024.

Research output: Contribution to journal › Article › Research › Peer review

Liu, B, Rosenhahn, B, Illig, T & DeLuca, DS 2024, 'A variational autoencoder trained with priors from canonical pathways increases the interpretability of transcriptome data', PLoS Computational Biology, vol. 20, no. 7, e1011198. https://doi.org/10.1371/journal.pcbi.1011198

Liu, B., Rosenhahn, B., Illig, T., & DeLuca, D. S. (2024). A variational autoencoder trained with priors from canonical pathways increases the interpretability of transcriptome data. PLoS Computational Biology, 20(7), Article e1011198. https://doi.org/10.1371/journal.pcbi.1011198

Liu B, Rosenhahn B, Illig T, DeLuca DS. A variational autoencoder trained with priors from canonical pathways increases the interpretability of transcriptome data. PLoS Computational Biology. 2024 Jul 3;20(7):e1011198. doi: 10.1371/journal.pcbi.1011198

Liu, Bin ; Rosenhahn, Bodo ; Illig, Thomas et al. / A variational autoencoder trained with priors from canonical pathways increases the interpretability of transcriptome data. In: PLoS Computational Biology. 2024 ; Vol. 20, No. 7.

Download

@article{c273f1c7b31441ff9fe57c8fdb6cf384,

title = "A variational autoencoder trained with priors from canonical pathways increases the interpretability of transcriptome data",

abstract = "Interpreting transcriptome data is an important yet challenging aspect of bioinformatic analysis. While gene set enrichment analysis is a standard tool for interpreting regulatory changes, we utilize deep learning techniques, specifically autoencoder architectures, to learn latent variables that drive transcriptome signals. We investigate whether simple, variational autoencoder (VAE), and beta-weighted VAE are capable of learning reduced representations of transcriptomes that retain critical biological information. We propose a novel VAE that utilizes priors from biological data to direct the network to learn a representation of the transcriptome that is based on understandable biological concepts. After benchmarking five different autoencoder architectures, we found that each succeeded in reducing the transcriptomes to 50 latent dimensions, which captured enough variation for accurate reconstruction. The simple, fully connected autoencoder, performs best across the benchmarks, but lacks the characteristic of having directly interpretable latent dimensions. The betaweighted, prior-informed VAE implementation is able to solve the benchmarking tasks, and provide semantically accurate latent features equating to biological pathways. This study opens a new direction for differential pathway analysis in transcriptomics with increased transparency and interpretability.",

author = "Bin Liu and Bodo Rosenhahn and Thomas Illig and DeLuca, {David S.}",

note = "Publisher Copyright: {\textcopyright} 2024 Liu et al.",

year = "2024",

month = jul,

day = "3",

doi = "10.1371/journal.pcbi.1011198",

language = "English",

volume = "20",

journal = "PLoS Computational Biology",

issn = "1553-734X",

publisher = "Public Library of Science",

number = "7",

}

Download

TY - JOUR

T1 - A variational autoencoder trained with priors from canonical pathways increases the interpretability of transcriptome data

AU - Liu, Bin

AU - Rosenhahn, Bodo

AU - Illig, Thomas

AU - DeLuca, David S.

PY - 2024/7/3

Y1 - 2024/7/3

N2 - Interpreting transcriptome data is an important yet challenging aspect of bioinformatic analysis. While gene set enrichment analysis is a standard tool for interpreting regulatory changes, we utilize deep learning techniques, specifically autoencoder architectures, to learn latent variables that drive transcriptome signals. We investigate whether simple, variational autoencoder (VAE), and beta-weighted VAE are capable of learning reduced representations of transcriptomes that retain critical biological information. We propose a novel VAE that utilizes priors from biological data to direct the network to learn a representation of the transcriptome that is based on understandable biological concepts. After benchmarking five different autoencoder architectures, we found that each succeeded in reducing the transcriptomes to 50 latent dimensions, which captured enough variation for accurate reconstruction. The simple, fully connected autoencoder, performs best across the benchmarks, but lacks the characteristic of having directly interpretable latent dimensions. The betaweighted, prior-informed VAE implementation is able to solve the benchmarking tasks, and provide semantically accurate latent features equating to biological pathways. This study opens a new direction for differential pathway analysis in transcriptomics with increased transparency and interpretability.

UR - http://www.scopus.com/inward/record.url?scp=85197657503&partnerID=8YFLogxK

U2 - 10.1371/journal.pcbi.1011198

DO - 10.1371/journal.pcbi.1011198

M3 - Article

C2 - 38959284

AN - SCOPUS:85197657503

VL - 20

JO - PLoS Computational Biology

JF - PLoS Computational Biology

SN - 1553-734X

IS - 7

M1 - e1011198

ER -
