A message passing framework with multiple data integration for miRNA-disease association prediction

Research output: Contribution to journal, Article, Research, peer review

Authors

  • Thi Ngan Dong
  • Johanna Schrader
  • Stefanie Mücke
  • Megha Khosla

External Research Organisations

  • Hannover Medical School (MHH)
  • Delft University of Technology

Details

Original language: English
Article number: 16259
Journal: Scientific reports
Volume: 12
Publication status: Published - 28 Sept 2022

Abstract

Micro RNA or miRNA is a highly conserved class of non-coding RNA that plays an important role in many diseases. Identifying miRNA-disease associations can pave the way for better clinical diagnosis and finding potential drug targets. We propose a biologically-motivated data-driven approach for the miRNA-disease association prediction, which overcomes the data scarcity problem by exploiting information from multiple data sources. The key idea is to enrich the existing miRNA/disease-protein-coding gene (PCG) associations via a message passing framework, followed by the use of disease ontology information for further feature filtering. The enriched and filtered PCG associations are then used to construct the inter-connected miRNA-PCG-disease network to train a structural deep network embedding (SDNE) model. Finally, the pre-trained embeddings and the biologically relevant features from the miRNA family and disease semantic similarity are concatenated to form the pair input representations to a Random Forest classifier whose task is to predict the miRNA-disease association probabilities. We present large-scale comparative experiments, ablation, and case studies to showcase our approach’s superiority. Besides, we make the model prediction results for 1618 miRNAs and 3679 diseases, along with all related information, publicly available at http://software.mpm.leibniz-ai-lab.de/ to foster assessments and future adoption.
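As a rough illustration of two steps the abstract outlines, the sketch below shows (1) one round of message passing that enriches each node's set of associated PCGs with those of its neighbours, and (2) concatenation of miRNA-side, disease-side, and extra feature vectors into a pair representation. Everything in it is an assumption made for illustration: the toy identifiers, the neighbour graph, the single-hop union rule, and the function names `enrich_pcg_associations` and `pair_representation` are hypothetical and not taken from the paper. The ontology-based filtering, the SDNE embedding step, and the Random Forest classifier are omitted.

```python
from typing import Dict, List, Sequence, Set


def enrich_pcg_associations(
    pcg_sets: Dict[str, Set[str]],
    neighbours: Dict[str, List[str]],
) -> Dict[str, Set[str]]:
    """One hypothetical round of message passing.

    Each node receives the original PCG sets of its neighbours and
    unions them with its own set (an assumed aggregation rule).
    """
    enriched: Dict[str, Set[str]] = {}
    for node, own in pcg_sets.items():
        received = set(own)
        for other in neighbours.get(node, []):
            # Messages are read from the un-enriched sets, so the update
            # is synchronous and covers exactly one hop.
            received |= pcg_sets.get(other, set())
        enriched[node] = received
    return enriched


def pair_representation(
    mirna_vec: Sequence[float],
    disease_vec: Sequence[float],
    extra_features: Sequence[float],
) -> List[float]:
    """Concatenate the vectors describing one miRNA-disease pair."""
    return list(mirna_vec) + list(disease_vec) + list(extra_features)


# Toy data (hypothetical identifiers, not real miRNAs or genes).
mirna_pcgs = {"mir-a": {"GENE1"}, "mir-b": {"GENE2", "GENE3"}, "mir-c": set()}
mirna_neighbours = {"mir-a": ["mir-b"], "mir-c": ["mir-a"]}

enriched = enrich_pcg_associations(mirna_pcgs, mirna_neighbours)
pair = pair_representation([0.1, 0.2], [0.3], [1.0])
```

In this toy setting `mir-c` starts with no PCG associations and gains one from its neighbour, which is the kind of data-scarcity relief the abstract attributes to enrichment. A vector like `pair` would then serve as one input row for a downstream classifier.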

Cite this

A message passing framework with multiple data integration for miRNA-disease association prediction. / Dong, Thi Ngan; Schrader, Johanna; Mücke, Stefanie et al.
In: Scientific reports, Vol. 12, 16259, 28.09.2022.

Dong TN, Schrader J, Mücke S, Khosla M. A message passing framework with multiple data integration for miRNA-disease association prediction. Scientific reports. 2022 Sept 28;12:16259. doi: 10.1038/s41598-022-20529-5, 10.15488/13129
Dong, Thi Ngan ; Schrader, Johanna ; Mücke, Stefanie et al. / A message passing framework with multiple data integration for miRNA-disease association prediction. In: Scientific reports. 2022 ; Vol. 12.
@article{81b0bcc671034f5480c6e3cbbe4a9789,
title = "A message passing framework with multiple data integration for miRNA-disease association prediction",
abstract = "Micro RNA or miRNA is a highly conserved class of non-coding RNA that plays an important role in many diseases. Identifying miRNA-disease associations can pave the way for better clinical diagnosis and finding potential drug targets. We propose a biologically-motivated data-driven approach for the miRNA-disease association prediction, which overcomes the data scarcity problem by exploiting information from multiple data sources. The key idea is to enrich the existing miRNA/disease-protein-coding gene (PCG) associations via a message passing framework, followed by the use of disease ontology information for further feature filtering. The enriched and filtered PCG associations are then used to construct the inter-connected miRNA-PCG-disease network to train a structural deep network embedding (SDNE) model. Finally, the pre-trained embeddings and the biologically relevant features from the miRNA family and disease semantic similarity are concatenated to form the pair input representations to a Random Forest classifier whose task is to predict the miRNA-disease association probabilities. We present large-scale comparative experiments, ablation, and case studies to showcase our approach{\textquoteright}s superiority. Besides, we make the model prediction results for 1618 miRNAs and 3679 diseases, along with all related information, publicly available at http://software.mpm.leibniz-ai-lab.de/ to foster assessments and future adoption.",
author = "Dong, {Thi Ngan} and Johanna Schrader and Stefanie M{\"u}cke and Megha Khosla",
note = "Funding Information: Open Access funding enabled and organized by Projekt DEAL. The work is partially supported by Volkswagenstiftung and the Ministry for Science and Culture of Lower Saxony, Germany (MWK: Ministerium f{\"u}r Wissenschaft und Kultur) under the PRESENt (Grant No. 11-76251-99-3/19 (ZN3434)) and the “Understanding Cochlear Implant Outcome Variability using Big Data and Machine Learning Approaches” (Grant No. ZN3429) projects; and the Federal Ministry of Education and Research (BMBF: Bundesministerium f{\"u}r Bildung und Forschung), Germany, under the LeibnizKILabor (Grant No. 01DD20003), and the NUKLEUS (Grant No. 01KX2021) projects. ",
year = "2022",
month = sep,
day = "28",
doi = "10.1038/s41598-022-20529-5",
language = "English",
volume = "12",
journal = "Scientific reports",
issn = "2045-2322",
publisher = "Nature Publishing Group",
}

TY  - JOUR
T1  - A message passing framework with multiple data integration for miRNA-disease association prediction
AU  - Dong, Thi Ngan
AU  - Schrader, Johanna
AU  - Mücke, Stefanie
AU  - Khosla, Megha
N1  - Funding Information: Open Access funding enabled and organized by Projekt DEAL. The work is partially supported by Volkswagenstiftung and the Ministry for Science and Culture of Lower Saxony, Germany (MWK: Ministerium für Wissenschaft und Kultur) under the PRESENt (Grant No. 11-76251-99-3/19 (ZN3434)) and the “Understanding Cochlear Implant Outcome Variability using Big Data and Machine Learning Approaches” (Grant No. ZN3429) projects; and the Federal Ministry of Education and Research (BMBF: Bundesministerium für Bildung und Forschung), Germany, under the LeibnizKILabor (Grant No. 01DD20003), and the NUKLEUS (Grant No. 01KX2021) projects.
PY  - 2022/9/28
Y1  - 2022/9/28
AB  - Micro RNA or miRNA is a highly conserved class of non-coding RNA that plays an important role in many diseases. Identifying miRNA-disease associations can pave the way for better clinical diagnosis and finding potential drug targets. We propose a biologically-motivated data-driven approach for the miRNA-disease association prediction, which overcomes the data scarcity problem by exploiting information from multiple data sources. The key idea is to enrich the existing miRNA/disease-protein-coding gene (PCG) associations via a message passing framework, followed by the use of disease ontology information for further feature filtering. The enriched and filtered PCG associations are then used to construct the inter-connected miRNA-PCG-disease network to train a structural deep network embedding (SDNE) model. Finally, the pre-trained embeddings and the biologically relevant features from the miRNA family and disease semantic similarity are concatenated to form the pair input representations to a Random Forest classifier whose task is to predict the miRNA-disease association probabilities. We present large-scale comparative experiments, ablation, and case studies to showcase our approach’s superiority. Besides, we make the model prediction results for 1618 miRNAs and 3679 diseases, along with all related information, publicly available at http://software.mpm.leibniz-ai-lab.de/ to foster assessments and future adoption.
UR  - http://www.scopus.com/inward/record.url?scp=85138958542&partnerID=8YFLogxK
U2  - 10.1038/s41598-022-20529-5
DO  - 10.1038/s41598-022-20529-5
M3  - Article
C2  - 36171337
AN  - SCOPUS:85138958542
VL  - 12
JO  - Scientific reports
JF  - Scientific reports
SN  - 2045-2322
M1  - 16259
ER  -