Details
Original language | English |
---|---|
Pages (from - to) | 789-812 |
Number of pages | 24 |
Journal | Semantic Web |
Volume | 12 |
Issue number | 5 |
Early online date | 7 July 2020 |
Publication status | Published - 2021 |
Externally published | Yes |
Abstract
ASJC Scopus subject areas
- Computer Science (all)
  - Information Systems
  - Computer Networks and Communications
  - Computer Science Applications
In: Semantic Web, Vol. 12, No. 5, 2021, pp. 789-812.
Publication: Contribution to journal › Article › Research › Peer-reviewed
TY - JOUR
T1 - Characterizing RDF Graphs through Graph-based Measures
T2 - Framework and Assessment
AU - Zloch, Matthäus
AU - Acosta, Maribel
AU - Hienert, Daniel
AU - Conrad, Stefan
AU - Dietze, Stefan
N1 - Publisher Copyright: © 2021 - The authors. Published by IOS Press.
PY - 2021
Y1 - 2021
N2 - The topological structure of RDF graphs inherently differs from other types of graphs, like social graphs, due to the pervasive existence of hierarchical relations (TBox), which complement transversal relations (ABox). Graph measures capture such particularities through descriptive statistics. Besides the classical set of measures established in the field of network analysis, such as size and volume of the graph or the type of degree distribution of its vertices, there has been some effort to define measures that capture some of the aforementioned particularities RDF graphs adhere to. However, some of them are redundant, computationally expensive, and not meaningful enough to describe RDF graphs. In particular, it is not clear which of them are efficient metrics to capture specific distinguishing characteristics of datasets in different knowledge domains (e.g., Cross Domain vs. Linguistics). In this work, we address the problem of identifying a minimal set of measures that is efficient, essential (non-redundant), and meaningful. Based on 54 measures and a sample of 280 graphs of nine knowledge domains from the Linked Open Data Cloud, we identify an essential set of 13 measures, having the capacity to describe graphs concisely. These measures have the capacity to present the topological structures and differences of datasets in established knowledge domains.
KW - graph measures
KW - graph topology
KW - measure assessment
KW - RDF graph
KW - RDF graph profiling
UR - http://www.scopus.com/inward/record.url?scp=85113925797&partnerID=8YFLogxK
U2 - 10.3233/SW-200409
DO - 10.3233/SW-200409
M3 - Article
VL - 12
SP - 789
EP - 812
JO - Semantic Web
JF - Semantic Web
SN - 1570-0844
IS - 5
ER -