Uniform Access to Multiform Data Lakes using Semantic Technologies

Publication: Contribution to book/report/anthology/conference proceedings > Conference contribution > Research > Peer-reviewed

Authors

  • Mohamed Nadjib Mami
  • Damien Graux
  • Simon Scerri
  • Hajira Jabeen
  • Soren Auer
  • Jens Lehmann

External organisations

  • Rheinische Friedrich-Wilhelms-Universität Bonn
  • Trinity College Dublin
  • Technische Informationsbibliothek (TIB) Leibniz-Informationszentrum Technik und Naturwissenschaften und Universitätsbibliothek
  • Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme (IAIS)

Details

Original language: English
Title of host publication: 21st International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2019 - Proceedings
Subtitle: 21st International Conference on Information Integration and Web-Based Applications & Services
Editors: Maria Indrawan-Santiago, Eric Pardede, Ivan Luiz Salvadori, Matthias Steinbauer, Ismail Khalil, Gabriele Anderst-Kotsis
Publisher: Association for Computing Machinery (ACM)
ISBN (electronic): 9781450371797
Publication status: Published - 2 Dec 2019
Published externally: Yes
Event: 21st International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2019 - Munich, Germany
Duration: 2 Dec 2019 to 4 Dec 2019

Publication series

Name: ACM International Conference Proceeding Series

Abstract

Increasing data volumes have extensively increased application possibilities. However, accessing this data in an ad hoc manner remains an unsolved problem due to the diversity of data management approaches, formats and storage frameworks, resulting in the need to effectively access and process distributed heterogeneous data at scale. For years, Semantic Web techniques have addressed data integration challenges with practical knowledge representation models and ontology-based mappings. Leveraging these techniques, we provide a solution enabling uniform access to large, heterogeneous data sources, without enforcing centralization; thus realizing the vision of a Semantic Data Lake. In this paper, we define the core concepts underlying this vision and the architectural requirements that systems implementing it need to fulfill. Squerall, an example of such a system, is an extensible framework built on top of state-of-the-art Big Data technologies. We focus on Squerall's distributed query execution techniques and strategies, empirically evaluating its performance throughout its various sub-phases.

Cite

Uniform Access to Multiform Data Lakes using Semantic Technologies. / Mami, Mohamed Nadjib; Graux, Damien; Scerri, Simon et al.
21st International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2019 - Proceedings: 21st International Conference on Information Integration and Web-Based Applications & Services. Ed. / Maria Indrawan-Santiago; Eric Pardede; Ivan Luiz Salvadori; Matthias Steinbauer; Ismail Khalil; Gabriele Anderst-Kotsis. Association for Computing Machinery (ACM), 2019. (ACM International Conference Proceeding Series).

Mami, MN, Graux, D, Scerri, S, Jabeen, H, Auer, S & Lehmann, J 2019, Uniform Access to Multiform Data Lakes using Semantic Technologies. in M Indrawan-Santiago, E Pardede, IL Salvadori, M Steinbauer, I Khalil & G Anderst-Kotsis (eds.), 21st International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2019 - Proceedings: 21st International Conference on Information Integration and Web-Based Applications & Services. ACM International Conference Proceeding Series, Association for Computing Machinery (ACM), 21st International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2019, Munich, Germany, 2 Dec 2019. https://doi.org/10.1145/3366030.3366054
Mami, M. N., Graux, D., Scerri, S., Jabeen, H., Auer, S., & Lehmann, J. (2019). Uniform Access to Multiform Data Lakes using Semantic Technologies. In M. Indrawan-Santiago, E. Pardede, I. L. Salvadori, M. Steinbauer, I. Khalil, & G. Anderst-Kotsis (Eds.), 21st International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2019 - Proceedings: 21st International Conference on Information Integration and Web-Based Applications & Services (ACM International Conference Proceeding Series). Association for Computing Machinery (ACM). https://doi.org/10.1145/3366030.3366054
Mami MN, Graux D, Scerri S, Jabeen H, Auer S, Lehmann J. Uniform Access to Multiform Data Lakes using Semantic Technologies. In Indrawan-Santiago M, Pardede E, Salvadori IL, Steinbauer M, Khalil I, Anderst-Kotsis G, editors, 21st International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2019 - Proceedings: 21st International Conference on Information Integration and Web-Based Applications & Services. Association for Computing Machinery (ACM). 2019. (ACM International Conference Proceeding Series). doi: 10.1145/3366030.3366054
Mami, Mohamed Nadjib ; Graux, Damien ; Scerri, Simon et al. / Uniform Access to Multiform Data Lakes using Semantic Technologies. 21st International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2019 - Proceedings: 21st International Conference on Information Integration and Web-Based Applications & Services. Ed. / Maria Indrawan-Santiago ; Eric Pardede ; Ivan Luiz Salvadori ; Matthias Steinbauer ; Ismail Khalil ; Gabriele Anderst-Kotsis. Association for Computing Machinery (ACM), 2019. (ACM International Conference Proceeding Series).
@inproceedings{b549a52b41674cefbd9ffaab195a1977,
title = "Uniform Access to Multiform Data Lakes using Semantic Technologies",
abstract = "Increasing data volumes have extensively increased application possibilities. However, accessing this data in an ad hoc manner remains an unsolved problem due to the diversity of data management approaches, formats and storage frameworks, resulting in the need to effectively access and process distributed heterogeneous data at scale. For years, Semantic Web techniques have addressed data integration challenges with practical knowledge representation models and ontology-based mappings. Leveraging these techniques, we provide a solution enabling uniform access to large, heterogeneous data sources, without enforcing centralization; thus realizing the vision of a Semantic Data Lake. In this paper, we define the core concepts underlying this vision and the architectural requirements that systems implementing it need to fulfill. Squerall, an example of such a system, is an extensible framework built on top of state-of-the-art Big Data technologies. We focus on Squerall's distributed query execution techniques and strategies, empirically evaluating its performance throughout its various sub-phases.",
keywords = "Big Data, Data Variety, NoSQL, Semantic Data Lake, SPARQL",
author = "Mami, {Mohamed Nadjib} and Damien Graux and Simon Scerri and Hajira Jabeen and Soren Auer and Jens Lehmann",
note = "Funding Information: This work is partly supported by the EU H2020 projects BETTER (GA 776280) and QualiChain (GA 822404); and by the ADAPT Centre for Digital Content Technology funded under the SFI Research Centres Programme (Grant 13/RC/2106) and co-funded under the European Regional Development Fund.; 21st International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2019 ; Conference date: 02-12-2019 Through 04-12-2019",
year = "2019",
month = dec,
day = "2",
doi = "10.1145/3366030.3366054",
language = "English",
series = "ACM International Conference Proceeding Series",
publisher = "Association for Computing Machinery (ACM)",
editor = "Maria Indrawan-Santiago and Eric Pardede and Salvadori, {Ivan Luiz} and Matthias Steinbauer and Ismail Khalil and Gabriele Anderst-Kotsis",
booktitle = "21st International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2019 - Proceedings",
address = "United States",

}

TY - GEN

T1 - Uniform Access to Multiform Data Lakes using Semantic Technologies

AU - Mami, Mohamed Nadjib

AU - Graux, Damien

AU - Scerri, Simon

AU - Jabeen, Hajira

AU - Auer, Soren

AU - Lehmann, Jens

N1 - Funding Information: This work is partly supported by the EU H2020 projects BETTER (GA 776280) and QualiChain (GA 822404); and by the ADAPT Centre for Digital Content Technology funded under the SFI Research Centres Programme (Grant 13/RC/2106) and co-funded under the European Regional Development Fund.

PY - 2019/12/2

Y1 - 2019/12/2

N2 - Increasing data volumes have extensively increased application possibilities. However, accessing this data in an ad hoc manner remains an unsolved problem due to the diversity of data management approaches, formats and storage frameworks, resulting in the need to effectively access and process distributed heterogeneous data at scale. For years, Semantic Web techniques have addressed data integration challenges with practical knowledge representation models and ontology-based mappings. Leveraging these techniques, we provide a solution enabling uniform access to large, heterogeneous data sources, without enforcing centralization; thus realizing the vision of a Semantic Data Lake. In this paper, we define the core concepts underlying this vision and the architectural requirements that systems implementing it need to fulfill. Squerall, an example of such a system, is an extensible framework built on top of state-of-the-art Big Data technologies. We focus on Squerall's distributed query execution techniques and strategies, empirically evaluating its performance throughout its various sub-phases.

KW - Big Data

KW - Data Variety

KW - NoSQL

KW - Semantic Data Lake

KW - SPARQL

UR - http://www.scopus.com/inward/record.url?scp=85117539584&partnerID=8YFLogxK

U2 - 10.1145/3366030.3366054

DO - 10.1145/3366030.3366054

M3 - Conference contribution

T3 - ACM International Conference Proceeding Series

BT - 21st International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2019 - Proceedings

A2 - Indrawan-Santiago, Maria

A2 - Pardede, Eric

A2 - Salvadori, Ivan Luiz

A2 - Steinbauer, Matthias

A2 - Khalil, Ismail

A2 - Anderst-Kotsis, Gabriele

PB - Association for Computing Machinery (ACM)

T2 - 21st International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2019

Y2 - 2 December 2019 through 4 December 2019

ER -
