Details
Field | Value |
---|---|
Original language | English |
Article number | e1011198 |
Number of pages | 22 |
Journal | PLoS Computational Biology |
Volume | 20 |
Issue number | 7 |
Publication status | Published - 3 Jul 2024 |
Abstract
Interpreting transcriptome data is an important yet challenging aspect of bioinformatic analysis. While gene set enrichment analysis is a standard tool for interpreting regulatory changes, we utilize deep learning techniques, specifically autoencoder architectures, to learn latent variables that drive transcriptome signals. We investigate whether simple autoencoders, variational autoencoders (VAEs), and beta-weighted VAEs are capable of learning reduced representations of transcriptomes that retain critical biological information. We propose a novel VAE that utilizes priors from biological data to direct the network to learn a representation of the transcriptome that is based on understandable biological concepts. After benchmarking five different autoencoder architectures, we found that each succeeded in reducing the transcriptomes to 50 latent dimensions, which captured enough variation for accurate reconstruction. The simple, fully connected autoencoder performs best across the benchmarks, but its latent dimensions are not directly interpretable. The beta-weighted, prior-informed VAE implementation solves the benchmarking tasks and provides semantically accurate latent features equating to biological pathways. This study opens a new direction for differential pathway analysis in transcriptomics with increased transparency and interpretability.
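For readers unfamiliar with the term, a beta-weighted VAE is commonly trained by minimizing a reconstruction error plus a KL-divergence penalty scaled by a factor beta. The following is a minimal sketch of that generic objective, not the authors' implementation: it assumes a squared-error reconstruction term and a standard normal prior, whereas the paper describes priors derived from biological data. The function name `beta_vae_loss` and its parameters are hypothetical.

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, log_var, beta=1.0):
    """Generic beta-weighted VAE loss, averaged over samples.

    Assumptions (not taken from the paper): squared-error reconstruction
    and a standard normal prior N(0, I) on the latent variables.
    x, x_recon: arrays of shape (n_samples, n_genes)
    mu, log_var: arrays of shape (n_samples, n_latent)
    """
    # Reconstruction term: squared error summed over genes, per sample.
    recon = np.sum((x - x_recon) ** 2, axis=1)
    # Closed-form KL divergence between N(mu, diag(exp(log_var))) and N(0, I).
    kl = -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1)
    # beta > 1 weights the KL penalty more heavily than in a plain VAE.
    return float(np.mean(recon + beta * kl))
```

With `beta=1.0` this reduces to the usual VAE objective under the stated assumptions; setting `mu` and `log_var` to zero makes the KL term vanish, leaving only the reconstruction error.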
ASJC Scopus subject areas
- Agricultural and Biological Sciences(all)
- Ecology, Evolution, Behavior and Systematics
- Mathematics(all)
- Modelling and Simulation
- Environmental Science(all)
- Ecology
- Biochemistry, Genetics and Molecular Biology(all)
- Molecular Biology
- Biochemistry, Genetics and Molecular Biology(all)
- Genetics
- Neuroscience(all)
- Cellular and Molecular Neuroscience
- Computer Science(all)
- Computational Theory and Mathematics
Cite this
In: PLoS Computational Biology, Vol. 20, No. 7, e1011198, 03.07.2024.
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - A variational autoencoder trained with priors from canonical pathways increases the interpretability of transcriptome data
AU - Liu, Bin
AU - Rosenhahn, Bodo
AU - Illig, Thomas
AU - DeLuca, David S.
N1 - Publisher Copyright: © 2024 Liu et al.
PY - 2024/7/3
Y1 - 2024/7/3
UR - http://www.scopus.com/inward/record.url?scp=85197657503&partnerID=8YFLogxK
U2 - 10.1371/journal.pcbi.1011198
DO - 10.1371/journal.pcbi.1011198
M3 - Article
C2 - 38959284
AN - SCOPUS:85197657503
VL - 20
JO - PLoS Computational Biology
JF - PLoS Computational Biology
SN - 1553-734X
IS - 7
M1 - e1011198
ER -