Semantic data integration techniques for transforming big biomedical data into actionable knowledge

Publication: Contribution to book/report/anthology/conference proceedings · Conference paper · Research · Peer-reviewed

Authors

  • Maria Esther Vidal
  • Samaneh Jozashoori

Organisational units

External organisations

  • Technische Informationsbibliothek (TIB) Leibniz-Informationszentrum Technik und Naturwissenschaften und Universitätsbibliothek

Details

Original language: English
Title of host publication: 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems, CBMS 2019
Subtitle: Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 563-566
Number of pages: 4
ISBN (electronic): 978-1-7281-2286-1
ISBN (print): 978-1-7281-2287-8
Publication status: Published - June 2019
Event: 32nd IEEE International Symposium on Computer-Based Medical Systems, CBMS 2019 - Cordoba, Spain
Duration: 5 June 2019 to 7 June 2019

Publication series

Name: Proceedings - IEEE Symposium on Computer-Based Medical Systems
Volume: 2019-June
ISSN (print): 1063-7125
ISSN (electronic): 2372-9198

Abstract

FAIR principles and Open Data initiatives have motivated the publication of large volumes of data. In the biomedical domain specifically, the size of the data has increased exponentially in the last decade, and with advances in the technologies to collect and generate data, an even faster growth rate is expected in the coming years. The available data collections are characterized by the dominant dimensions of big data, i.e., they are not only large in volume but can also be heterogeneous and present quality issues. These data complexity problems affect typical data management tasks, in particular the task of integrating big biomedical data sources. We tackle the problem of big data integration and present a knowledge-driven framework able to extract and integrate data collected from structured and unstructured data sources. The proposed framework relies on Natural Language Processing techniques to extract knowledge from unstructured data and short text. Furthermore, ontologies and controlled vocabularies, e.g., UMLS, are used to annotate the extracted entities and relations with terms from the ontology or controlled vocabulary. The annotated data is integrated into a knowledge graph. A unified schema is used to describe the meaning of the integrated data as well as the main properties and relations. As a proof of concept, we show the results of applying the proposed framework to integrate clinical records from lung cancer patients with data extracted from open data sources such as DrugBank and PubMed. The resulting knowledge graph enables the discovery of interactions between drugs in the treatments prescribed to lung cancer patients.
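
The pipeline outlined in the abstract (annotate extracted mentions with vocabulary terms, integrate them into a knowledge graph, then look for drug interactions within a treatment) can be illustrated with a toy sketch. This is not the authors' implementation: the vocabulary, the `EX:` identifiers, the predicate names `treatedWith` and `interactsWith`, and all function names are hypothetical placeholders, and simple token matching stands in for real Natural Language Processing and UMLS annotation.

```python
from itertools import combinations

# Hypothetical controlled vocabulary: surface form -> concept identifier.
# The identifiers are placeholders, not real UMLS concept identifiers.
VOCABULARY = {
    "cisplatin": "EX:0001",
    "warfarin": "EX:0002",
    "aspirin": "EX:0003",
}

# Hypothetical interaction facts, standing in for data from an open source.
KNOWN_INTERACTIONS = {frozenset({"EX:0002", "EX:0003"})}


def annotate(text):
    """Return concept identifiers for vocabulary terms mentioned in a short text."""
    tokens = {token.strip(".,;:()").lower() for token in text.split()}
    return sorted(VOCABULARY[t] for t in tokens if t in VOCABULARY)


def build_graph(records):
    """Integrate annotated records into a set of (subject, predicate, object) triples."""
    triples = set()
    for patient_id, note in records.items():
        for concept in annotate(note):
            triples.add((patient_id, "treatedWith", concept))
    for pair in KNOWN_INTERACTIONS:
        a, b = sorted(pair)  # store each interaction once, in sorted order
        triples.add((a, "interactsWith", b))
    return triples


def interactions_for(graph, patient_id):
    """List interacting drug pairs within one patient's treatment."""
    drugs = sorted(o for s, p, o in graph if s == patient_id and p == "treatedWith")
    return [
        (a, b)
        for a, b in combinations(drugs, 2)
        if (a, "interactsWith", b) in graph
    ]
```

Calling `build_graph` on a dictionary that maps patient identifiers to short clinical notes, and then `interactions_for` on one patient, returns the interacting pairs found in that patient's treatment. A set of triples is used here only because it keeps the example self-contained; a real system would more likely use an RDF store and a shared schema.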

Cite

Semantic data integration techniques for transforming big biomedical data into actionable knowledge. / Vidal, Maria Esther; Jozashoori, Samaneh.
2019 IEEE 32nd International Symposium on Computer-Based Medical Systems, CBMS 2019: Proceedings. Institute of Electrical and Electronics Engineers Inc., 2019. pp. 563-566 8787394 (Proceedings - IEEE Symposium on Computer-Based Medical Systems; Vol. 2019-June).

Vidal, ME & Jozashoori, S 2019, Semantic data integration techniques for transforming big biomedical data into actionable knowledge. in 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems, CBMS 2019: Proceedings., 8787394, Proceedings - IEEE Symposium on Computer-Based Medical Systems, vol. 2019-June, Institute of Electrical and Electronics Engineers Inc., pp. 563-566, 32nd IEEE International Symposium on Computer-Based Medical Systems, CBMS 2019, Cordoba, Spain, 5 June 2019. https://doi.org/10.1109/CBMS.2019.00116
Vidal, M. E., & Jozashoori, S. (2019). Semantic data integration techniques for transforming big biomedical data into actionable knowledge. In 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems, CBMS 2019: Proceedings (pp. 563-566). Article 8787394 (Proceedings - IEEE Symposium on Computer-Based Medical Systems; Vol. 2019-June). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CBMS.2019.00116
Vidal ME, Jozashoori S. Semantic data integration techniques for transforming big biomedical data into actionable knowledge. In 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems, CBMS 2019: Proceedings. Institute of Electrical and Electronics Engineers Inc. 2019. pp. 563-566. 8787394. (Proceedings - IEEE Symposium on Computer-Based Medical Systems). doi: 10.1109/CBMS.2019.00116
Vidal, Maria Esther ; Jozashoori, Samaneh. / Semantic data integration techniques for transforming big biomedical data into actionable knowledge. 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems, CBMS 2019: Proceedings. Institute of Electrical and Electronics Engineers Inc., 2019. pp. 563-566 (Proceedings - IEEE Symposium on Computer-Based Medical Systems).
@inproceedings{996ae182eac045a9af685702d8353e85,
title = "Semantic data integration techniques for transforming big biomedical data into actionable knowledge",
abstract = "FAIR principles and the Open Data initiatives have motivated the publication of large volumes of data. Specifically, in the biomedical domain, the size of the data has increased exponentially in the last decade, and with the advances in the technologies to collect and generate data, a faster growth rate is expected for the next years. The available collections of data are characterized by the dominant dimensions of big data, i.e., they are not only large in volume, but they can be also heterogeneous and present quality issues. These data complexity problems impact on the typical tasks of data management, and particularly, in the task of integrating big biomedical data sources. We tackle the problem of big data integration and present a knowledge-driven framework able to extract and integrate data collected from structured and unstructured data sources. The proposed framework resorts to Natural Language Processing techniques to extract knowledge from unstructured data and short text. Furthermore, ontologies and controlled vocabularies, e.g., UMLS, are utilized to annotate the extracted entities and relations with terms from the ontology or controlled vocabulary. The annotated data is integrated into a knowledge graph. A unified schema is used to describe the meaning of the integrated data as well as the main properties and relations. As proof of concept, we show the results of applying the proposed framework to integrate clinical records from lung cancer patients with data extracted from open data sources like Drugbank and PubMed. The created knowledge graph enables the discovery of interactions between drugs in the treatments prescribed to lung cancer patients.",
keywords = "Big Data, Biomedical Data, Knowledge Graph, Natural Language Processing, Semantic Data Integration",
author = "Vidal, {Maria Esther} and Jozashoori, Samaneh",
note = "Funding information: This work has been supported by the European Union{\textquoteright}s Horizon 2020 Research and Innovation Program for the project iASiS with grant agreement No 727658.; 32nd IEEE International Symposium on Computer-Based Medical Systems, CBMS 2019 ; Conference date: 05-06-2019 Through 07-06-2019",
year = "2019",
month = jun,
doi = "10.1109/CBMS.2019.00116",
language = "English",
isbn = "978-1-7281-2287-8",
series = "Proceedings - IEEE Symposium on Computer-Based Medical Systems",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "563--566",
booktitle = "2019 IEEE 32nd International Symposium on Computer-Based Medical Systems, CBMS 2019",
address = "United States",

}

TY - GEN

T1 - Semantic data integration techniques for transforming big biomedical data into actionable knowledge

AU - Vidal, Maria Esther

AU - Jozashoori, Samaneh

N1 - Funding information: This work has been supported by the European Union’s Horizon 2020 Research and Innovation Program for the project iASiS with grant agreement No 727658.

PY - 2019/6

Y1 - 2019/6

AB - FAIR principles and the Open Data initiatives have motivated the publication of large volumes of data. Specifically, in the biomedical domain, the size of the data has increased exponentially in the last decade, and with the advances in the technologies to collect and generate data, a faster growth rate is expected for the next years. The available collections of data are characterized by the dominant dimensions of big data, i.e., they are not only large in volume, but they can be also heterogeneous and present quality issues. These data complexity problems impact on the typical tasks of data management, and particularly, in the task of integrating big biomedical data sources. We tackle the problem of big data integration and present a knowledge-driven framework able to extract and integrate data collected from structured and unstructured data sources. The proposed framework resorts to Natural Language Processing techniques to extract knowledge from unstructured data and short text. Furthermore, ontologies and controlled vocabularies, e.g., UMLS, are utilized to annotate the extracted entities and relations with terms from the ontology or controlled vocabulary. The annotated data is integrated into a knowledge graph. A unified schema is used to describe the meaning of the integrated data as well as the main properties and relations. As proof of concept, we show the results of applying the proposed framework to integrate clinical records from lung cancer patients with data extracted from open data sources like Drugbank and PubMed. The created knowledge graph enables the discovery of interactions between drugs in the treatments prescribed to lung cancer patients.

KW - Big Data

KW - Biomedical Data

KW - Knowledge Graph

KW - Natural Language Processing

KW - Semantic Data Integration

UR - http://www.scopus.com/inward/record.url?scp=85070971867&partnerID=8YFLogxK

U2 - 10.1109/CBMS.2019.00116

DO - 10.1109/CBMS.2019.00116

M3 - Conference contribution

AN - SCOPUS:85070971867

SN - 978-1-7281-2287-8

T3 - Proceedings - IEEE Symposium on Computer-Based Medical Systems

SP - 563

EP - 566

BT - 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems, CBMS 2019

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 32nd IEEE International Symposium on Computer-Based Medical Systems, CBMS 2019

Y2 - 5 June 2019 through 7 June 2019

ER -