Details
Original language | English |
---|---|
Article number | 16259 |
Journal | Scientific Reports |
Volume | 12 |
Publication status | Published - 28 Sept 2022 |
Abstract
MicroRNAs (miRNAs) are a highly conserved class of non-coding RNAs that play an important role in many diseases. Identifying miRNA-disease associations can pave the way for better clinical diagnosis and for finding potential drug targets. We propose a biologically motivated, data-driven approach to miRNA-disease association prediction, which overcomes the data scarcity problem by exploiting information from multiple data sources. The key idea is to enrich the existing miRNA/disease-protein-coding gene (PCG) associations via a message passing framework, followed by the use of disease ontology information for further feature filtering. The enriched and filtered PCG associations are then used to construct the interconnected miRNA-PCG-disease network on which a structural deep network embedding (SDNE) model is trained. Finally, the pre-trained embeddings and the biologically relevant features from the miRNA family and disease semantic similarity are concatenated to form the pair input representations for a Random Forest classifier, which predicts the miRNA-disease association probabilities. We present large-scale comparative experiments, ablation, and case studies to showcase our approach’s superiority. In addition, we make the model prediction results for 1618 miRNAs and 3679 diseases, along with all related information, publicly available at http://software.mpm.leibniz-ai-lab.de/ to foster assessments and future adoption.
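The abstract does not spell out the message passing rule or the feature layout, so the following is only a toy sketch of the two ideas it names: enriching miRNA/disease-PCG associations by passing messages between genes, and concatenating per-entity vectors into a pair representation. Everything specific here is an assumption made for illustration and is not taken from the paper: the gene-gene neighbour map, the `decay` factor, the max-aggregation rule, the function names `enrich_associations` and `pair_features`, and the identifiers `miR-A`, `G1`, `G2`, `G3`. The SDNE embedding step and the Random Forest classifier are omitted.

```python
def enrich_associations(assoc, gene_neighbors, decay=0.5):
    """One round of message passing over a hypothetical gene-gene graph.

    assoc:          dict mapping an entity (miRNA or disease) to a dict
                    {gene: weight} of known PCG associations.
    gene_neighbors: dict mapping a gene to an iterable of neighbouring genes.
    decay:          assumed damping factor applied to every message.

    Each associated gene sends `decay * weight` to its neighbours; an entity
    keeps the larger of its existing weight and the incoming message.
    """
    enriched = {}
    for entity, genes in assoc.items():
        scores = dict(genes)  # start from the original associations
        for gene, weight in genes.items():
            for neighbor in gene_neighbors.get(gene, ()):
                message = decay * weight
                if message > scores.get(neighbor, 0.0):
                    scores[neighbor] = message
        enriched[entity] = scores
    return enriched


def pair_features(mirna_vec, disease_vec, extra_features):
    """Concatenate per-entity vectors and extra features into one pair vector."""
    return list(mirna_vec) + list(disease_vec) + list(extra_features)


# Toy data (hypothetical identifiers).
mirna_pcg = {"miR-A": {"G1": 1.0}}
gene_graph = {"G1": {"G2", "G3"}, "G2": {"G1"}}

enriched = enrich_associations(mirna_pcg, gene_graph)
pair = pair_features([0.1, 0.2], [0.3], [1.0])
```

In this toy run, `miR-A` keeps its original association with `G1` and gains damped associations with `G2` and `G3`. In a full pipeline, the concatenated pair vectors would be the rows of the feature matrix passed to a classifier.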
Cite this
In: Scientific Reports, Vol. 12, 16259, 28.09.2022.
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - A message passing framework with multiple data integration for miRNA-disease association prediction
AU - Dong, Thi Ngan
AU - Schrader, Johanna
AU - Mücke, Stefanie
AU - Khosla, Megha
N1 - Funding Information: Open Access funding enabled and organized by Projekt DEAL. The work is partially supported by Volkswagenstiftung and the Ministry for Science and Culture of Lower Saxony, Germany (MWK: Ministerium für Wissenschaft und Kultur) under the PRESENt (Grant No. 11-76251-99-3/19 (ZN3434)) and the “Understanding Cochlear Implant Outcome Variability using Big Data and Machine Learning Approaches” (Grant No. ZN3429) projects; and the Federal Ministry of Education and Research (BMBF: Bundesministerium für Bildung und Forschung), Germany, under the LeibnizKILabor (Grant No. 01DD20003), and the NUKLEUS (Grant No. 01KX2021) projects.
PY - 2022/9/28
Y1 - 2022/9/28
N2 - MicroRNAs (miRNAs) are a highly conserved class of non-coding RNAs that play an important role in many diseases. Identifying miRNA-disease associations can pave the way for better clinical diagnosis and for finding potential drug targets. We propose a biologically motivated, data-driven approach to miRNA-disease association prediction, which overcomes the data scarcity problem by exploiting information from multiple data sources. The key idea is to enrich the existing miRNA/disease-protein-coding gene (PCG) associations via a message passing framework, followed by the use of disease ontology information for further feature filtering. The enriched and filtered PCG associations are then used to construct the interconnected miRNA-PCG-disease network on which a structural deep network embedding (SDNE) model is trained. Finally, the pre-trained embeddings and the biologically relevant features from the miRNA family and disease semantic similarity are concatenated to form the pair input representations for a Random Forest classifier, which predicts the miRNA-disease association probabilities. We present large-scale comparative experiments, ablation, and case studies to showcase our approach’s superiority. In addition, we make the model prediction results for 1618 miRNAs and 3679 diseases, along with all related information, publicly available at http://software.mpm.leibniz-ai-lab.de/ to foster assessments and future adoption.
UR - http://www.scopus.com/inward/record.url?scp=85138958542&partnerID=8YFLogxK
U2 - 10.1038/s41598-022-20529-5
DO - 10.1038/s41598-022-20529-5
M3 - Article
C2 - 36171337
AN - SCOPUS:85138958542
VL - 12
JO - Scientific Reports
JF - Scientific Reports
SN - 2045-2322
M1 - 16259
ER -