Details
Original language | English |
---|---|
Title of host publication | 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems, CBMS 2019 |
Subtitle | Proceedings |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 563-566 |
Number of pages | 4 |
ISBN (electronic) | 978-1-7281-2286-1 |
ISBN (print) | 978-1-7281-2287-8 |
Publication status | Published - June 2019 |
Event | 32nd IEEE International Symposium on Computer-Based Medical Systems, CBMS 2019 - Cordoba, Spain. Duration: 5 June 2019 → 7 June 2019 |
Publication series
Name | Proceedings - IEEE Symposium on Computer-Based Medical Systems |
---|---|
Volume | 2019-June |
ISSN (print) | 1063-7125 |
ISSN (electronic) | 2372-9198 |
Abstract
The FAIR principles and Open Data initiatives have motivated the publication of large volumes of data. In the biomedical domain specifically, the size of the data has increased exponentially in the last decade, and with advances in the technologies used to collect and generate data, an even faster growth rate is expected in the coming years. The available data collections are characterized by the dominant dimensions of big data, i.e., they are not only large in volume but can also be heterogeneous and present quality issues. These data complexity problems affect the typical tasks of data management, particularly the task of integrating big biomedical data sources. We tackle the problem of big data integration and present a knowledge-driven framework able to extract and integrate data collected from structured and unstructured data sources. The proposed framework relies on Natural Language Processing techniques to extract knowledge from unstructured data and short text. Furthermore, ontologies and controlled vocabularies, e.g., UMLS, are used to annotate the extracted entities and relations with terms from the ontology or controlled vocabulary. The annotated data is integrated into a knowledge graph. A unified schema describes the meaning of the integrated data as well as its main properties and relations. As a proof of concept, we show the results of applying the proposed framework to integrate clinical records from lung cancer patients with data extracted from open data sources such as DrugBank and PubMed. The resulting knowledge graph enables the discovery of interactions between drugs in the treatments prescribed to lung cancer patients.
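The pipeline outlined in the abstract (annotate extracted mentions with vocabulary terms, integrate them into a knowledge graph, then look for interactions among prescribed drugs) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the vocabulary, concept IDs, drug names, interaction facts, and the function names `annotate`, `build_graph`, and `interacting_pairs` are all hypothetical placeholders, and a simple dictionary lookup stands in for the NLP and UMLS annotation steps.

```python
from itertools import combinations

# Hypothetical vocabulary: surface form -> concept ID.
# These IDs are placeholders, not real UMLS identifiers.
VOCAB = {
    "drug alpha": "C:DRUG_A",
    "alpha-x": "C:DRUG_A",  # synonym that resolves to the same concept
    "drug beta": "C:DRUG_B",
    "lung cancer": "C:DISEASE_X",
}

# Hypothetical interaction facts, standing in for an open data source.
INTERACTIONS = {frozenset({"C:DRUG_A", "C:DRUG_B"})}


def annotate(mention):
    """Map a text mention to a concept ID, or None if it is unknown."""
    return VOCAB.get(mention.strip().lower())


def build_graph(records):
    """Turn (patient, relation, mention) rows into a set of annotated triples."""
    triples = set()
    for patient, relation, mention in records:
        concept = annotate(mention)
        if concept is not None:
            triples.add((patient, relation, concept))
    return triples


def interacting_pairs(triples, patient):
    """Return pairs of a patient's prescribed drugs that have a known interaction."""
    drugs = sorted(o for s, p, o in triples if s == patient and p == "prescribed")
    return [pair for pair in combinations(drugs, 2) if frozenset(pair) in INTERACTIONS]


# Made-up clinical rows; two different surface forms map to the same drug concept.
records = [
    ("patient:1", "diagnosedWith", "Lung cancer"),
    ("patient:1", "prescribed", "Alpha-X"),
    ("patient:1", "prescribed", "Drug Beta"),
    ("patient:2", "prescribed", "drug alpha"),
]
graph = build_graph(records)
print(interacting_pairs(graph, "patient:1"))
```

Mapping every mention to a shared concept ID before building the triples is what lets records that use different names for the same drug be joined against the interaction facts.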
ASJC Scopus subject areas
- General Medicine
- Radiology, Nuclear Medicine and Imaging
- General Computer Science
- Computer Science Applications
Cite this
2019 IEEE 32nd International Symposium on Computer-Based Medical Systems, CBMS 2019: Proceedings. Institute of Electrical and Electronics Engineers Inc., 2019. pp. 563-566 8787394 (Proceedings - IEEE Symposium on Computer-Based Medical Systems; Vol. 2019-June).
Publication: Chapter in book/report/anthology/conference proceedings › Conference contribution › Research › Peer-reviewed
TY - GEN
T1 - Semantic data integration techniques for transforming big biomedical data into actionable knowledge
AU - Vidal, Maria Esther
AU - Jozashoori, Samaneh
N1 - Funding information: This work has been supported by the European Union’s Horizon 2020 Research and Innovation Program for the project iASiS with grant agreement No 727658.
PY - 2019/6
Y1 - 2019/6
AB - The FAIR principles and Open Data initiatives have motivated the publication of large volumes of data. In the biomedical domain specifically, the size of the data has increased exponentially in the last decade, and with advances in the technologies used to collect and generate data, an even faster growth rate is expected in the coming years. The available data collections are characterized by the dominant dimensions of big data, i.e., they are not only large in volume but can also be heterogeneous and present quality issues. These data complexity problems affect the typical tasks of data management, particularly the task of integrating big biomedical data sources. We tackle the problem of big data integration and present a knowledge-driven framework able to extract and integrate data collected from structured and unstructured data sources. The proposed framework relies on Natural Language Processing techniques to extract knowledge from unstructured data and short text. Furthermore, ontologies and controlled vocabularies, e.g., UMLS, are used to annotate the extracted entities and relations with terms from the ontology or controlled vocabulary. The annotated data is integrated into a knowledge graph. A unified schema describes the meaning of the integrated data as well as its main properties and relations. As a proof of concept, we show the results of applying the proposed framework to integrate clinical records from lung cancer patients with data extracted from open data sources such as DrugBank and PubMed. The resulting knowledge graph enables the discovery of interactions between drugs in the treatments prescribed to lung cancer patients.
KW - Big Data
KW - Biomedical Data
KW - Knowledge Graph
KW - Natural Language Processing
KW - Semantic Data Integration
UR - http://www.scopus.com/inward/record.url?scp=85070971867&partnerID=8YFLogxK
U2 - 10.1109/CBMS.2019.00116
DO - 10.1109/CBMS.2019.00116
M3 - Conference contribution
AN - SCOPUS:85070971867
SN - 978-1-7281-2287-8
T3 - Proceedings - IEEE Symposium on Computer-Based Medical Systems
SP - 563
EP - 566
BT - 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems, CBMS 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 32nd IEEE International Symposium on Computer-Based Medical Systems, CBMS 2019
Y2 - 5 June 2019 through 7 June 2019
ER -