SDM-RDFizer: An RML Interpreter for the Efficient Creation of RDF Knowledge Graphs

Publication: Contribution to book/report/anthology/conference proceedings › Conference paper › Research › Peer-reviewed

Authors

  • Enrique Iglesias
  • Samaneh Jozashoori
  • David Chaves-Fraga
  • Diego Collarana
  • Maria Esther Vidal

Organisational units

External organisations

  • Technische Informationsbibliothek (TIB) Leibniz-Informationszentrum Technik und Naturwissenschaften und Universitätsbibliothek
  • Universidad Politécnica de Madrid (UPM)
  • Rheinische Friedrich-Wilhelms-Universität Bonn

Details

Original language: English
Title of host publication: CIKM 2020
Subtitle: Proceedings of the 29th ACM International Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery (ACM)
Pages: 3039-3046
Number of pages: 8
ISBN (electronic): 9781450368599
Publication status: Published - 19 Oct 2020
Event: 29th ACM International Conference on Information and Knowledge Management - online, Virtual, Online, Ireland
Duration: 19 Oct 2020 to 23 Oct 2020

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Abstract

In recent years, the amount of data has increased exponentially, and knowledge graphs have gained attention as data structures to integrate data and knowledge harvested from myriad data sources. However, data complexity issues such as large volume, high duplicate rates, and heterogeneity usually characterize these data sources, so data management tools are required that can address the negative impact of these issues on the knowledge graph creation process. In this paper, we propose SDM-RDFizer, an interpreter of the RDF Mapping Language (RML), to transform raw data in various formats into an RDF knowledge graph. SDM-RDFizer implements novel algorithms to execute the logical operators between mappings in RML, allowing it to scale up to complex scenarios where data is not only broad but also has a high duplication rate. We empirically evaluate the performance of SDM-RDFizer against diverse testbeds with different configurations of data volume, duplicates, and heterogeneity. The observed results indicate that SDM-RDFizer is two orders of magnitude faster than the state of the art, which suggests that SDM-RDFizer is an interoperable and scalable solution for knowledge graph creation. SDM-RDFizer is publicly available as a resource through a GitHub repository and a DOI.
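
To make the task in the abstract concrete, the following is a minimal sketch of turning tabular raw data into RDF triples while dropping duplicates. It is not the SDM-RDFizer algorithm and does not use RML syntax or the tool's API. The sample rows, the `http://example.org/person/` namespace, the choice of predicate, and the names `rows_to_triples` and `escape_literal` are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical sample data with one duplicated row; not taken from the paper.
RAW_CSV = """id,name
1,Alice
2,Bob
1,Alice
"""

# Hypothetical namespace and predicate, chosen only for illustration.
BASE = "http://example.org/person/"
NAME_PREDICATE = "http://xmlns.com/foaf/0.1/name"


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def rows_to_triples(csv_text):
    """Yield one N-Triples line per distinct (subject, predicate, object)."""
    seen = set()  # triples already emitted, used to drop duplicates
    for row in csv.DictReader(io.StringIO(csv_text)):
        triple = (BASE + row["id"], NAME_PREDICATE, row["name"])
        if triple in seen:
            continue  # the same triple came from an earlier row
        seen.add(triple)
        subject, predicate, obj = triple
        yield f'<{subject}> <{predicate}> "{escape_literal(obj)}" .'


triples = list(rows_to_triples(RAW_CSV))
for line in triples:
    print(line)
```

Keeping already-emitted triples in a hash set is only one simple way to avoid writing the same triple twice; the paper itself should be consulted for the data structures and operators that SDM-RDFizer actually uses.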

Cite this

SDM-RDFizer: An RML Interpreter for the Efficient Creation of RDF Knowledge Graphs. / Iglesias, Enrique; Jozashoori, Samaneh; Chaves-Fraga, David et al.
CIKM 2020: Proceedings of the 29th ACM International Conference on Information and Knowledge Management. Association for Computing Machinery (ACM), 2020. pp. 3039-3046 (International Conference on Information and Knowledge Management, Proceedings).

Iglesias, E, Jozashoori, S, Chaves-Fraga, D, Collarana, D & Vidal, ME 2020, SDM-RDFizer: An RML Interpreter for the Efficient Creation of RDF Knowledge Graphs. in CIKM 2020: Proceedings of the 29th ACM International Conference on Information and Knowledge Management. International Conference on Information and Knowledge Management, Proceedings, Association for Computing Machinery (ACM), pp. 3039-3046, 29th ACM International Conference on Information and Knowledge Management, Virtual, Online, Ireland, 19 Oct 2020. https://doi.org/10.48550/arXiv.2008.07176, https://doi.org/10.1145/3340531.3412881
Iglesias, E., Jozashoori, S., Chaves-Fraga, D., Collarana, D., & Vidal, M. E. (2020). SDM-RDFizer: An RML Interpreter for the Efficient Creation of RDF Knowledge Graphs. In CIKM 2020: Proceedings of the 29th ACM International Conference on Information and Knowledge Management (pp. 3039-3046). (International Conference on Information and Knowledge Management, Proceedings). Association for Computing Machinery (ACM). https://doi.org/10.48550/arXiv.2008.07176, https://doi.org/10.1145/3340531.3412881
Iglesias E, Jozashoori S, Chaves-Fraga D, Collarana D, Vidal ME. SDM-RDFizer: An RML Interpreter for the Efficient Creation of RDF Knowledge Graphs. in CIKM 2020: Proceedings of the 29th ACM International Conference on Information and Knowledge Management. Association for Computing Machinery (ACM). 2020. p. 3039-3046. (International Conference on Information and Knowledge Management, Proceedings). doi: 10.48550/arXiv.2008.07176, 10.1145/3340531.3412881
Iglesias, Enrique ; Jozashoori, Samaneh ; Chaves-Fraga, David et al. / SDM-RDFizer : An RML Interpreter for the Efficient Creation of RDF Knowledge Graphs. CIKM 2020: Proceedings of the 29th ACM International Conference on Information and Knowledge Management. Association for Computing Machinery (ACM), 2020. pp. 3039-3046 (International Conference on Information and Knowledge Management, Proceedings).
@inproceedings{eb9e45b6b9c94d1fa3478f451934f80c,
title = "SDM-RDFizer: An RML Interpreter for the Efficient Creation of RDF Knowledge Graphs",
abstract = "In recent years, the amount of data has increased exponentially, and knowledge graphs have gained attention as data structures to integrate data and knowledge harvested from myriad data sources. However, data complexity issues such as large volume, high duplicate rates, and heterogeneity usually characterize these data sources, so data management tools are required that can address the negative impact of these issues on the knowledge graph creation process. In this paper, we propose SDM-RDFizer, an interpreter of the RDF Mapping Language (RML), to transform raw data in various formats into an RDF knowledge graph. SDM-RDFizer implements novel algorithms to execute the logical operators between mappings in RML, allowing it to scale up to complex scenarios where data is not only broad but also has a high duplication rate. We empirically evaluate the performance of SDM-RDFizer against diverse testbeds with different configurations of data volume, duplicates, and heterogeneity. The observed results indicate that SDM-RDFizer is two orders of magnitude faster than the state of the art, which suggests that SDM-RDFizer is an interoperable and scalable solution for knowledge graph creation. SDM-RDFizer is publicly available as a resource through a GitHub repository and a DOI.",
keywords = "knowledge graph, rdf, rml",
author = "Enrique Iglesias and Samaneh Jozashoori and David Chaves-Fraga and Diego Collarana and Vidal, {Maria Esther}",
note = "Funding Information: This work has been partially supported by the EU H2020 RIA funded project iASiS with grant agreement No 727658, by Ministerio de Econom{\'i}a, Industria y Competitividad and EU FEDER funds under the DATOS 4.0: RETOS Y SOLUCIONES - UPM Spanish national project (TIN2016-78011-C4-4-R) and by the FPI grant (BES-2017-082511).; 29th ACM International Conference on Information and Knowledge Management, CIKM 2020 ; Conference date: 19-10-2020 Through 23-10-2020",
year = "2020",
month = oct,
day = "19",
doi = "10.48550/arXiv.2008.07176",
language = "English",
series = "International Conference on Information and Knowledge Management, Proceedings",
publisher = "Association for Computing Machinery (ACM)",
pages = "3039--3046",
booktitle = "CIKM 2020",
address = "United States",

}

TY - GEN

T1 - SDM-RDFizer

T2 - 29th ACM International Conference on Information and Knowledge Management, CIKM 2020

AU - Iglesias, Enrique

AU - Jozashoori, Samaneh

AU - Chaves-Fraga, David

AU - Collarana, Diego

AU - Vidal, Maria Esther

N1 - Funding Information: This work has been partially supported by the EU H2020 RIA funded project iASiS with grant agreement No 727658, by Ministerio de Economía, Industria y Competitividad and EU FEDER funds under the DATOS 4.0: RETOS Y SOLUCIONES - UPM Spanish national project (TIN2016-78011-C4-4-R) and by the FPI grant (BES-2017-082511).

PY - 2020/10/19

Y1 - 2020/10/19

AB - In recent years, the amount of data has increased exponentially, and knowledge graphs have gained attention as data structures to integrate data and knowledge harvested from myriad data sources. However, data complexity issues such as large volume, high duplicate rates, and heterogeneity usually characterize these data sources, so data management tools are required that can address the negative impact of these issues on the knowledge graph creation process. In this paper, we propose SDM-RDFizer, an interpreter of the RDF Mapping Language (RML), to transform raw data in various formats into an RDF knowledge graph. SDM-RDFizer implements novel algorithms to execute the logical operators between mappings in RML, allowing it to scale up to complex scenarios where data is not only broad but also has a high duplication rate. We empirically evaluate the performance of SDM-RDFizer against diverse testbeds with different configurations of data volume, duplicates, and heterogeneity. The observed results indicate that SDM-RDFizer is two orders of magnitude faster than the state of the art, which suggests that SDM-RDFizer is an interoperable and scalable solution for knowledge graph creation. SDM-RDFizer is publicly available as a resource through a GitHub repository and a DOI.

KW - knowledge graph

KW - rdf

KW - rml

UR - http://www.scopus.com/inward/record.url?scp=85095542622&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2008.07176

DO - 10.48550/arXiv.2008.07176

M3 - Conference contribution

AN - SCOPUS:85095542622

T3 - International Conference on Information and Knowledge Management, Proceedings

SP - 3039

EP - 3046

BT - CIKM 2020

PB - Association for Computing Machinery (ACM)

Y2 - 19 October 2020 through 23 October 2020

ER -