Details
Original language | English |
---|---|
Title of host publication | CIKM 2020 |
Subtitle | Proceedings of the 29th ACM International Conference on Information and Knowledge Management |
Publisher | Association for Computing Machinery (ACM) |
Pages | 3039-3046 |
Number of pages | 8 |
ISBN (electronic) | 9781450368599 |
Publication status | Published - 19 Oct 2020 |
Event | 29th ACM International Conference on Information and Knowledge Management - Virtual, Online, Ireland. Duration: 19 Oct 2020 → 23 Oct 2020 |
Publication series
Name | International Conference on Information and Knowledge Management, Proceedings |
---|
Abstract
In recent years, the amount of data has increased exponentially, and knowledge graphs have gained attention as data structures to integrate data and knowledge harvested from myriad data sources. However, data complexity issues like large volume, high duplicate rate, and heterogeneity usually characterize these data sources, so data management tools are required that can address the negative impact of these issues on the knowledge graph creation process. In this paper, we propose SDM-RDFizer, an interpreter of the RDF Mapping Language (RML), to transform raw data in various formats into an RDF knowledge graph. SDM-RDFizer implements novel algorithms to execute the logical operators between mappings in RML, thus allowing it to scale up to complex scenarios where data is not only broad but also has a high duplication rate. We empirically evaluate the performance of SDM-RDFizer against diverse testbeds with diverse configurations of data volume, duplicates, and heterogeneity. The observed results indicate that SDM-RDFizer is two orders of magnitude faster than the state of the art, meaning that SDM-RDFizer is an interoperable and scalable solution for knowledge graph creation. SDM-RDFizer is publicly available as a resource through a GitHub repository and a DOI.
ASJC Scopus subject areas
- Business, Management and Accounting (all)
- General Business, Management and Accounting
- Decision Sciences (all)
- General Decision Sciences
Cite this
CIKM 2020: Proceedings of the 29th ACM International Conference on Information and Knowledge Management. Association for Computing Machinery (ACM), 2020. pp. 3039-3046 (International Conference on Information and Knowledge Management, Proceedings).
Publication: Chapter in book/report/anthology/conference proceedings › Conference contribution › Research › Peer-reviewed
TY - GEN
T1 - SDM-RDFizer
T2 - 29th ACM International Conference on Information and Knowledge Management, CIKM 2020
AU - Iglesias, Enrique
AU - Jozashoori, Samaneh
AU - Chaves-Fraga, David
AU - Collarana, Diego
AU - Vidal, Maria Esther
N1 - Funding Information: This work has been partially supported by the EU H2020 RIA funded project iASiS with grant agreement No 727658, by Ministerio de Economía, Industria y Competitividad and EU FEDER funds under the DATOS 4.0: RETOS Y SOLUCIONES - UPM Spanish national project (TIN2016-78011-C4-4-R) and by the FPI grant (BES-2017-082511).
PY - 2020/10/19
Y1 - 2020/10/19
KW - knowledge graph
KW - rdf
KW - rml
UR - http://www.scopus.com/inward/record.url?scp=85095542622&partnerID=8YFLogxK
U2 - 10.48550/arXiv.2008.07176
DO - 10.48550/arXiv.2008.07176
M3 - Conference contribution
AN - SCOPUS:85095542622
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 3039
EP - 3046
BT - CIKM 2020
PB - Association for Computing Machinery (ACM)
Y2 - 19 October 2020 through 23 October 2020
ER -