Characterizing RDF Graphs through Graph-based Measures: Framework and Assessment

Research output: Contribution to journal › Article › Research › peer review

Authors

  • Matthäus Zloch
  • Maribel Acosta
  • Daniel Hienert
  • Stefan Conrad
  • Stefan Dietze

External Research Organisations

  • University Hospital Düsseldorf
  • Ruhr-Universität Bochum
  • GESIS - Leibniz Institute for the Social Sciences
  • Karlsruhe Institute of Technology (KIT)

Details

Original language: English
Pages (from-to): 789-812
Number of pages: 24
Journal: Semantic Web
Volume: 12
Issue number: 5
Early online date: 7 Jul 2020
Publication status: Published - 2021
Externally published: Yes

Abstract

The topological structure of RDF graphs inherently differs from other types of graphs, like social graphs, due to the pervasive existence of hierarchical relations (TBox), which complement transversal relations (ABox). Graph measures capture such particularities through descriptive statistics. Besides the classical set of measures established in the field of network analysis, such as size and volume of the graph or the type of degree distribution of its vertices, there has been some effort to define measures that capture some of the aforementioned particularities RDF graphs adhere to. However, some of them are redundant, computationally expensive, and not meaningful enough to describe RDF graphs. In particular, it is not clear which of them are efficient metrics to capture specific distinguishing characteristics of datasets in different knowledge domains (e.g., Cross Domain vs. Linguistics). In this work, we address the problem of identifying a minimal set of measures that is efficient, essential (non-redundant), and meaningful. Based on 54 measures and a sample of 280 graphs of nine knowledge domains from the Linked Open Data Cloud, we identify an essential set of 13 measures, having the capacity to describe graphs concisely. These measures have the capacity to present the topological structures and differences of datasets in established knowledge domains.
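
As a rough illustration of the classical measures the abstract mentions (size, volume, vertex degrees), the sketch below treats each RDF triple as a directed edge from subject to object. It is not the authors' framework: the function name `basic_measures`, the example triples, and the reading of "size" as the number of vertices and "volume" as the number of edges are assumptions made for this example only.

```python
from collections import Counter


def basic_measures(triples):
    """Compute a few classical graph measures from (subject, predicate, object) triples.

    Assumption: every triple is one directed edge subject -> object, and
    repeated triples count as parallel edges.
    """
    out_degree = Counter()
    in_degree = Counter()
    vertices = set()
    for subject, _predicate, obj in triples:
        vertices.update((subject, obj))
        out_degree[subject] += 1
        in_degree[obj] += 1

    n = len(vertices)   # "size": number of vertices (assumed reading)
    m = len(triples)    # "volume": number of edges (assumed reading)
    return {
        "n": n,
        "m": m,
        # Each edge contributes one out-degree and one in-degree.
        "mean_degree": 2 * m / n if n else 0.0,
        "max_in_degree": max(in_degree.values(), default=0),
        "max_out_degree": max(out_degree.values(), default=0),
    }


# Hypothetical toy graph mixing instance-level and schema-level statements.
example_triples = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:Person", "rdfs:subClassOf", "ex:Agent"),
]

if __name__ == "__main__":
    print(basic_measures(example_triples))
```

Even on this toy graph, the class vertex `ex:Person` collects the highest in-degree, which hints at how hierarchical relations can shape the degree distribution.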

Keywords

    graph measures, graph topology, measure assessment, RDF graph, RDF graph profiling

Cite this

Characterizing RDF Graphs through Graph-based Measures: Framework and Assessment. / Zloch, Matthäus; Acosta, Maribel; Hienert, Daniel et al.
In: Semantic Web, Vol. 12, No. 5, 2021, p. 789-812.

Zloch, M, Acosta, M, Hienert, D, Conrad, S & Dietze, S 2021, 'Characterizing RDF Graphs through Graph-based Measures: Framework and Assessment', Semantic Web, vol. 12, no. 5, pp. 789-812. https://doi.org/10.3233/SW-200409
Zloch, M., Acosta, M., Hienert, D., Conrad, S., & Dietze, S. (2021). Characterizing RDF Graphs through Graph-based Measures: Framework and Assessment. Semantic Web, 12(5), 789-812. https://doi.org/10.3233/SW-200409
Zloch M, Acosta M, Hienert D, Conrad S, Dietze S. Characterizing RDF Graphs through Graph-based Measures: Framework and Assessment. Semantic Web. 2021;12(5):789-812. Epub 2020 Jul 7. doi: 10.3233/SW-200409
Zloch, Matthäus ; Acosta, Maribel ; Hienert, Daniel et al. / Characterizing RDF Graphs through Graph-based Measures : Framework and Assessment. In: Semantic Web. 2021 ; Vol. 12, No. 5. pp. 789-812.
@article{1ffcf96f1fdc4f748a9c893a86696194,
title = "Characterizing RDF Graphs through Graph-based Measures: Framework and Assessment",
abstract = "The topological structure of RDF graphs inherently differs from other types of graphs, like social graphs, due to the pervasive existence of hierarchical relations (TBox), which complement transversal relations (ABox). Graph measures capture such particularities through descriptive statistics. Besides the classical set of measures established in the field of network analysis, such as size and volume of the graph or the type of degree distribution of its vertices, there has been some effort to define measures that capture some of the aforementioned particularities RDF graphs adhere to. However, some of them are redundant, computationally expensive, and not meaningful enough to describe RDF graphs. In particular, it is not clear which of them are efficient metrics to capture specific distinguishing characteristics of datasets in different knowledge domains (e.g., Cross Domain vs. Linguistics). In this work, we address the problem of identifying a minimal set of measures that is efficient, essential (non-redundant), and meaningful. Based on 54 measures and a sample of 280 graphs of nine knowledge domains from the Linked Open Data Cloud, we identify an essential set of 13 measures, having the capacity to describe graphs concisely. These measures have the capacity to present the topological structures and differences of datasets in established knowledge domains.",
keywords = "graph measures, graph topology, measure assessment, RDF graph, RDF graph profiling",
author = "Matth{\"a}us Zloch and Maribel Acosta and Daniel Hienert and Stefan Conrad and Stefan Dietze",
note = "Publisher Copyright: {\textcopyright} 2021 - The authors. Published by IOS Press.",
year = "2021",
doi = "10.3233/SW-200409",
language = "English",
volume = "12",
pages = "789--812",
journal = "Semantic Web",
issn = "1570-0844",
publisher = "IOS Press",
number = "5",

}

TY - JOUR

T1 - Characterizing RDF Graphs through Graph-based Measures

T2 - Framework and Assessment

AU - Zloch, Matthäus

AU - Acosta, Maribel

AU - Hienert, Daniel

AU - Conrad, Stefan

AU - Dietze, Stefan

N1 - Publisher Copyright: © 2021 - The authors. Published by IOS Press.

PY - 2021

Y1 - 2021

AB - The topological structure of RDF graphs inherently differs from other types of graphs, like social graphs, due to the pervasive existence of hierarchical relations (TBox), which complement transversal relations (ABox). Graph measures capture such particularities through descriptive statistics. Besides the classical set of measures established in the field of network analysis, such as size and volume of the graph or the type of degree distribution of its vertices, there has been some effort to define measures that capture some of the aforementioned particularities RDF graphs adhere to. However, some of them are redundant, computationally expensive, and not meaningful enough to describe RDF graphs. In particular, it is not clear which of them are efficient metrics to capture specific distinguishing characteristics of datasets in different knowledge domains (e.g., Cross Domain vs. Linguistics). In this work, we address the problem of identifying a minimal set of measures that is efficient, essential (non-redundant), and meaningful. Based on 54 measures and a sample of 280 graphs of nine knowledge domains from the Linked Open Data Cloud, we identify an essential set of 13 measures, having the capacity to describe graphs concisely. These measures have the capacity to present the topological structures and differences of datasets in established knowledge domains.

KW - graph measures

KW - graph topology

KW - measure assessment

KW - RDF graph

KW - RDF graph profiling

UR - http://www.scopus.com/inward/record.url?scp=85113925797&partnerID=8YFLogxK

U2 - 10.3233/SW-200409

DO - 10.3233/SW-200409

M3 - Article

VL - 12

SP - 789

EP - 812

JO - Semantic Web

JF - Semantic Web

SN - 1570-0844

IS - 5

ER -