Heuristics-based Query Reordering for Federated Queries in SPARQL 1.1 and SPARQL-LD

Research output: Chapter in book/report/conference proceeding > Conference contribution > Research > peer review

Authors

  • Thanos Yannakis
  • Pavlos Fafalios
  • Yannis Tzitzikas

Research Organisations

External Research Organisations

  • University of Crete
  • ICS-FORTH

Details

Original language: English
Title of host publication: GeoLD-QuWeDa 2018 ESWC 2018 Workshops: GeoLD 2018 and QuWeDa 2018
Subtitle of host publication: Proceedings of the 3rd International Workshop on Geospatial Linked Data and the 2nd Workshop on Querying the Web of Data co-located with 15th Extended Semantic Web Conference (ESWC 2018)
Pages: 74-88
Number of pages: 15
Publication status: Published - 2018
Event: 3rd International Workshop on Geospatial Linked Data and the 2nd Workshop on Querying the Web of Data, GeoLD-QuWeDa 2018 - Heraklion, Greece
Duration: 3 Jun 2018 to 4 Jun 2018

Publication series

Name: CEUR Workshop Proceedings
Publisher: CEUR Workshop Proceedings
Volume: 2110
ISSN (Print): 1613-0073

Abstract

The federated query extension of SPARQL 1.1 allows queries to be executed across different SPARQL endpoints. SPARQL-LD is a recent extension of SPARQL 1.1 that enables directly querying any HTTP web source containing RDF data, such as web pages with embedded RDFa, JSON-LD, or Microformats, without requiring the declaration of named graphs. This makes it possible to query a large number of data sources (including SPARQL endpoints, online resources, or even Web APIs returning RDF data) through a single concise query. However, a suboptimal formulation of SPARQL 1.1 and SPARQL-LD queries can lead to a large number of calls to remote resources, which in turn can lead to extremely high query execution times. In this paper, we address this problem and propose a set of query reordering methods that use heuristics to reorder a set of service graph patterns based on their restrictiveness, without requiring the gathering and use of statistics from the remote sources. Such a query optimization approach is widely applicable since it can be applied on top of existing SPARQL 1.1 and SPARQL-LD implementations. Evaluation results show that query reordering can substantially decrease query execution time, and that a method that considers the number and type of unbound variables and joins achieves the optimal query plan in 88% of the cases.
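To make the idea of restrictiveness-based reordering concrete, here is a minimal Python sketch. It is not the paper's method: it assumes a deliberately simplified heuristic (fewer still-unbound variables means more restrictive) and ignores the variable types and join information the abstract mentions. The `ServicePattern` class, the `reorder` function, and the example.org endpoints are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServicePattern:
    """Hypothetical stand-in for one SERVICE graph pattern."""
    endpoint: str
    triples: tuple  # tuple of (subject, predicate, object) strings


def unbound_variables(pattern, bound):
    """Variables (terms starting with '?') in the pattern that are not yet bound."""
    return {
        term
        for triple in pattern.triples
        for term in triple
        if term.startswith("?") and term not in bound
    }


def reorder(patterns):
    """Greedy ordering: repeatedly pick the pattern with the fewest unbound
    variables, then treat its variables as bound for the remaining patterns.
    Assumption: this simple count approximates restrictiveness."""
    remaining = list(patterns)
    bound = set()
    ordered = []
    while remaining:
        best = min(remaining, key=lambda p: len(unbound_variables(p, bound)))
        ordered.append(best)
        remaining.remove(best)
        bound |= unbound_variables(best, bound)
    return ordered


# Three hypothetical service patterns, given in a poor order.
patterns = [
    ServicePattern("http://example.org/a", (("?person", "?p", "?o"),)),
    ServicePattern("http://example.org/c", (("?person", "foaf:name", "?name"),)),
    ServicePattern("http://example.org/b", (("?person", "rdf:type", "foaf:Person"),)),
]
plan = [p.endpoint for p in reorder(patterns)]
print(plan)
```

The sketch needs no statistics from the remote sources, which is the property the abstract emphasises; a realistic implementation would also have to weigh the position and type of each unbound variable and the joins between patterns.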

Keywords

    Linked data, Query reordering, SPARQL 1.1, SPARQL-LD

Cite this

Heuristics-based Query Reordering for Federated Queries in SPARQL 1.1 and SPARQL-LD. / Yannakis, Thanos; Fafalios, Pavlos; Tzitzikas, Yannis.
GeoLD-QuWeDa 2018 ESWC 2018 Workshops: GeoLD 2018 and QuWeDa 2018: Proceedings of the 3rd International Workshop on Geospatial Linked Data and the 2nd Workshop on Querying the Web of Data co-located with 15th Extended Semantic Web Conference (ESWC 2018). 2018. p. 74-88 (CEUR Workshop Proceedings; Vol. 2110).

Yannakis, T., Fafalios, P., & Tzitzikas, Y. (2018). Heuristics-based Query Reordering for Federated Queries in SPARQL 1.1 and SPARQL-LD. In GeoLD-QuWeDa 2018 ESWC 2018 Workshops: GeoLD 2018 and QuWeDa 2018: Proceedings of the 3rd International Workshop on Geospatial Linked Data and the 2nd Workshop on Querying the Web of Data co-located with 15th Extended Semantic Web Conference (ESWC 2018) (pp. 74-88). (CEUR Workshop Proceedings; Vol. 2110). http://ceur-ws.org/Vol-2110/
@inproceedings{7f3afb4a5e4a4febbca15ee8604a1fb5,
title = "Heuristics-based Query Reordering for Federated Queries in SPARQL 1.1 and SPARQL-LD",
keywords = "Linked data, Query reordering, SPARQL 1.1, SPARQL-LD",
author = "Thanos Yannakis and Pavlos Fafalios and Yannis Tzitzikas",
note = "Funding Information: The work was partially funded by the European Commission for the ERC Advanced Grant ALEXANDRIA under grant No. 339233.; 3rd International Workshop on Geospatial Linked Data and the 2nd Workshop on Querying the Web of Data, GeoLD-QuWeDa 2018 ; Conference date: 03-06-2018 Through 04-06-2018",
year = "2018",
language = "English",
series = "CEUR Workshop Proceedings",
publisher = "CEUR Workshop Proceedings",
pages = "74--88",
booktitle = "GeoLD-QuWeDa 2018 ESWC 2018 Workshops: GeoLD 2018 and QuWeDa 2018",

}

TY - GEN

T1 - Heuristics-based Query Reordering for Federated Queries in SPARQL 1.1 and SPARQL-LD

AU - Yannakis, Thanos

AU - Fafalios, Pavlos

AU - Tzitzikas, Yannis

N1 - Funding Information: The work was partially funded by the European Commission for the ERC Advanced Grant ALEXANDRIA under grant No. 339233.

PY - 2018

Y1 - 2018

KW - Linked data

KW - Query reordering

KW - SPARQL 1.1

KW - SPARQL-LD

UR - http://www.scopus.com/inward/record.url?scp=85049106183&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85049106183

T3 - CEUR Workshop Proceedings

SP - 74

EP - 88

BT - GeoLD-QuWeDa 2018 ESWC 2018 Workshops: GeoLD 2018 and QuWeDa 2018

T2 - 3rd International Workshop on Geospatial Linked Data and the 2nd Workshop on Querying the Web of Data, GeoLD-QuWeDa 2018

Y2 - 3 June 2018 through 4 June 2018

ER -