Characterizing RDF Graphs through Graph-based Measures: Framework and Assessment

Publication: Contribution to journal › Article › Research › Peer-reviewed

Authors

  • Matthäus Zloch
  • Maribel Acosta
  • Daniel Hienert
  • Stefan Conrad
  • Stefan Dietze

External organisations

  • Universitätsklinikum Düsseldorf
  • Ruhr-Universität Bochum
  • GESIS - Leibniz-Institut für Sozialwissenschaften
  • Karlsruher Institut für Technologie (KIT)

Details

Original language: English
Pages (from-to): 789-812
Number of pages: 24
Journal: Semantic Web
Volume: 12
Issue number: 5
Early online date: 7 July 2020
Publication status: Published - 2021
Externally published: Yes

Abstract

The topological structure of RDF graphs inherently differs from other types of graphs, like social graphs, due to the pervasive existence of hierarchical relations (TBox), which complement transversal relations (ABox). Graph measures capture such particularities through descriptive statistics. Besides the classical set of measures established in the field of network analysis, such as size and volume of the graph or the type of degree distribution of its vertices, there has been some effort to define measures that capture some of the aforementioned particularities RDF graphs adhere to. However, some of them are redundant, computationally expensive, and not meaningful enough to describe RDF graphs. In particular, it is not clear which of them are efficient metrics to capture specific distinguishing characteristics of datasets in different knowledge domains (e.g., Cross Domain vs. Linguistics). In this work, we address the problem of identifying a minimal set of measures that is efficient, essential (non-redundant), and meaningful. Based on 54 measures and a sample of 280 graphs of nine knowledge domains from the Linked Open Data Cloud, we identify an essential set of 13 measures, having the capacity to describe graphs concisely. These measures have the capacity to present the topological structures and differences of datasets in established knowledge domains.
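
As a rough illustration of the classical measures the abstract mentions (graph size and vertex degrees), the following Python sketch computes vertex and edge counts and simple degree statistics for a toy set of triples. The triples, the function name `basic_measures`, and the selection of statistics are assumptions made for illustration only; they are not the paper's framework, nor its set of 54 measures.

```python
from collections import Counter

# Hypothetical toy RDF graph as (subject, predicate, object) triples.
triples = [
    ("ex:Alice", "rdf:type", "ex:Person"),
    ("ex:Bob", "rdf:type", "ex:Person"),
    ("ex:Person", "rdfs:subClassOf", "ex:Agent"),
    ("ex:Alice", "ex:knows", "ex:Bob"),
]


def basic_measures(triples):
    """Return a few classical graph measures for a non-empty list of triples.

    Subjects and objects are treated as vertices, and each triple as one
    directed edge from subject to object (an illustrative assumption).
    """
    out_degree = Counter(s for s, _, _ in triples)
    in_degree = Counter(o for _, _, o in triples)
    vertices = set(out_degree) | set(in_degree)
    n, m = len(vertices), len(triples)
    return {
        "n_vertices": n,
        "n_edges": m,
        "max_in_degree": max(in_degree.values()),
        "max_out_degree": max(out_degree.values()),
        # Each edge contributes to one out-degree and one in-degree.
        "mean_degree": 2 * m / n,
    }


if __name__ == "__main__":
    for name, value in basic_measures(triples).items():
        print(f"{name}: {value}")
```

Computing such statistics over many graphs and comparing them pairwise is one simple way to spot measures that carry overlapping information, which is the kind of redundancy the abstract refers to.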

Cite this

Characterizing RDF Graphs through Graph-based Measures: Framework and Assessment. / Zloch, Matthäus; Acosta, Maribel; Hienert, Daniel et al.
In: Semantic Web, Vol. 12, No. 5, 2021, pp. 789-812.

Zloch, M, Acosta, M, Hienert, D, Conrad, S & Dietze, S 2021, 'Characterizing RDF Graphs through Graph-based Measures: Framework and Assessment', Semantic Web, vol. 12, no. 5, pp. 789-812. https://doi.org/10.3233/SW-200409
Zloch, M., Acosta, M., Hienert, D., Conrad, S., & Dietze, S. (2021). Characterizing RDF Graphs through Graph-based Measures: Framework and Assessment. Semantic Web, 12(5), 789-812. https://doi.org/10.3233/SW-200409
Zloch M, Acosta M, Hienert D, Conrad S, Dietze S. Characterizing RDF Graphs through Graph-based Measures: Framework and Assessment. Semantic Web. 2021;12(5):789-812. Epub 2020 Jul 7. doi: 10.3233/SW-200409
Zloch, Matthäus ; Acosta, Maribel ; Hienert, Daniel et al. / Characterizing RDF Graphs through Graph-based Measures : Framework and Assessment. In: Semantic Web. 2021 ; Vol. 12, No. 5. pp. 789-812.
@article{1ffcf96f1fdc4f748a9c893a86696194,
title = "Characterizing RDF Graphs through Graph-based Measures: Framework and Assessment",
abstract = "The topological structure of RDF graphs inherently differs from other types of graphs, like social graphs, due to the pervasive existence of hierarchical relations (TBox), which complement transversal relations (ABox). Graph measures capture such particularities through descriptive statistics. Besides the classical set of measures established in the field of network analysis, such as size and volume of the graph or the type of degree distribution of its vertices, there has been some effort to define measures that capture some of the aforementioned particularities RDF graphs adhere to. However, some of them are redundant, computationally expensive, and not meaningful enough to describe RDF graphs. In particular, it is not clear which of them are efficient metrics to capture specific distinguishing characteristics of datasets in different knowledge domains (e.g., Cross Domain vs. Linguistics). In this work, we address the problem of identifying a minimal set of measures that is efficient, essential (non-redundant), and meaningful. Based on 54 measures and a sample of 280 graphs of nine knowledge domains from the Linked Open Data Cloud, we identify an essential set of 13 measures, having the capacity to describe graphs concisely. These measures have the capacity to present the topological structures and differences of datasets in established knowledge domains.",
keywords = "graph measures, graph topology, measure assessment, RDF graph, RDF graph profiling",
author = "Matth{\"a}us Zloch and Maribel Acosta and Daniel Hienert and Stefan Conrad and Stefan Dietze",
note = "Publisher Copyright: {\textcopyright} 2021 - The authors. Published by IOS Press.",
year = "2021",
doi = "10.3233/SW-200409",
language = "English",
volume = "12",
pages = "789--812",
journal = "Semantic Web",
issn = "1570-0844",
publisher = "IOS Press",
number = "5",

}

TY - JOUR

T1 - Characterizing RDF Graphs through Graph-based Measures

T2 - Framework and Assessment

AU - Zloch, Matthäus

AU - Acosta, Maribel

AU - Hienert, Daniel

AU - Conrad, Stefan

AU - Dietze, Stefan

N1 - Publisher Copyright: © 2021 - The authors. Published by IOS Press.

PY - 2021

Y1 - 2021

AB - The topological structure of RDF graphs inherently differs from other types of graphs, like social graphs, due to the pervasive existence of hierarchical relations (TBox), which complement transversal relations (ABox). Graph measures capture such particularities through descriptive statistics. Besides the classical set of measures established in the field of network analysis, such as size and volume of the graph or the type of degree distribution of its vertices, there has been some effort to define measures that capture some of the aforementioned particularities RDF graphs adhere to. However, some of them are redundant, computationally expensive, and not meaningful enough to describe RDF graphs. In particular, it is not clear which of them are efficient metrics to capture specific distinguishing characteristics of datasets in different knowledge domains (e.g., Cross Domain vs. Linguistics). In this work, we address the problem of identifying a minimal set of measures that is efficient, essential (non-redundant), and meaningful. Based on 54 measures and a sample of 280 graphs of nine knowledge domains from the Linked Open Data Cloud, we identify an essential set of 13 measures, having the capacity to describe graphs concisely. These measures have the capacity to present the topological structures and differences of datasets in established knowledge domains.

KW - graph measures

KW - graph topology

KW - measure assessment

KW - RDF graph

KW - RDF graph profiling

UR - http://www.scopus.com/inward/record.url?scp=85113925797&partnerID=8YFLogxK

U2 - 10.3233/SW-200409

DO - 10.3233/SW-200409

M3 - Article

VL - 12

SP - 789

EP - 812

JO - Semantic Web

JF - Semantic Web

SN - 1570-0844

IS - 5

ER -