Efficient Scalable Temporal Web Graph Store

Research output: Chapter in book/report/conference proceeding; Conference contribution; Research; peer review

Authors

  • Khoi Duy Vo
  • Sergej Zerr
  • Xiaofei Zhu
  • Wolfgang Nejdl

External Research Organisations

  • University of Bonn
  • Chongqing Institute of Technology
  • Robert Bosch GmbH

Details

Original language: English
Title of host publication: 2021 IEEE International Conference on Big Data (Big Data)
Editors: Yixin Chen, Heiko Ludwig, Yicheng Tu, Usama Fayyad, Xingquan Zhu, Xiaohua Tony Hu, Suren Byna, Xiong Liu, Jianping Zhang, Shirui Pan, Vagelis Papalexakis, Jianwu Wang, Alfredo Cuzzocrea, Carlos Ordonez
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 263-273
Number of pages: 11
ISBN (electronic): 9781665439022
ISBN (print): 978-1-6654-4599-3
Publication status: Published - 2021
Event: 2021 IEEE International Conference on Big Data, Big Data 2021 - Virtual, Online, United States
Duration: 15 Dec 2021 to 18 Dec 2021

Publication series

Name: Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021

Abstract

Temporal web graphs have been attracting much attention recently due to their important applications in web search, data mining, and social network analysis. Accumulated over long periods, those graphs have grown gigantic in size and rich in temporal evolution, which poses tough challenges for data storage and management. Though a few temporal graph management systems were previously proposed, none of them can simultaneously satisfy both essential requirements when retrieving on temporal web graphs: very large data scalability and very low querying latency. In this work, we address the above gap in existing works by developing a highly efficient temporal graph management system which is dedicated to web graphs. To this end, we greatly extend the most efficient framework for managing large static web graphs to handle temporal information using the property matrix while preserving most of the outstanding features of the base framework. Ultimately, our proposed system can achieve a nearly instant response for vertex-centric temporal retrieval while still being scalable to huge datasets. Experiments on a real-world dataset with more than 43B nodes and 317B links show that using a small non-dedicated cluster, our system can reach a reduction of data storage space up to 88% of raw data size and reduce the retrieval time by 20%, compared to the baselines. We also demonstrate that our system yields a significant reduction of computational costs for many graph ranking algorithms.

Keywords

    archival search, compression, distributed system, graph index, temporal graph representation

Cite this

Efficient Scalable Temporal Web Graph Store. / Vo, Khoi Duy; Zerr, Sergej; Zhu, Xiaofei et al.
2021 IEEE International Conference on Big Data (Big Data). ed. / Yixin Chen; Heiko Ludwig; Yicheng Tu; Usama Fayyad; Xingquan Zhu; Xiaohua Tony Hu; Suren Byna; Xiong Liu; Jianping Zhang; Shirui Pan; Vagelis Papalexakis; Jianwu Wang; Alfredo Cuzzocrea; Carlos Ordonez. Institute of Electrical and Electronics Engineers Inc., 2021. p. 263-273 (Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021).

Vo, KD, Zerr, S, Zhu, X & Nejdl, W 2021, Efficient Scalable Temporal Web Graph Store. in Y Chen, H Ludwig, Y Tu, U Fayyad, X Zhu, XT Hu, S Byna, X Liu, J Zhang, S Pan, V Papalexakis, J Wang, A Cuzzocrea & C Ordonez (eds), 2021 IEEE International Conference on Big Data (Big Data). Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021, Institute of Electrical and Electronics Engineers Inc., pp. 263-273, 2021 IEEE International Conference on Big Data, Big Data 2021, Virtual, Online, United States, 15 Dec 2021. https://doi.org/10.1109/bigdata52589.2021.9671984
Vo, K. D., Zerr, S., Zhu, X., & Nejdl, W. (2021). Efficient Scalable Temporal Web Graph Store. In Y. Chen, H. Ludwig, Y. Tu, U. Fayyad, X. Zhu, X. T. Hu, S. Byna, X. Liu, J. Zhang, S. Pan, V. Papalexakis, J. Wang, A. Cuzzocrea, & C. Ordonez (Eds.), 2021 IEEE International Conference on Big Data (Big Data) (pp. 263-273). (Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/bigdata52589.2021.9671984
Vo KD, Zerr S, Zhu X, Nejdl W. Efficient Scalable Temporal Web Graph Store. In Chen Y, Ludwig H, Tu Y, Fayyad U, Zhu X, Hu XT, Byna S, Liu X, Zhang J, Pan S, Papalexakis V, Wang J, Cuzzocrea A, Ordonez C, editors, 2021 IEEE International Conference on Big Data (Big Data). Institute of Electrical and Electronics Engineers Inc. 2021. p. 263-273. (Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021). doi: 10.1109/bigdata52589.2021.9671984
Vo, Khoi Duy ; Zerr, Sergej ; Zhu, Xiaofei et al. / Efficient Scalable Temporal Web Graph Store. 2021 IEEE International Conference on Big Data (Big Data). editor / Yixin Chen ; Heiko Ludwig ; Yicheng Tu ; Usama Fayyad ; Xingquan Zhu ; Xiaohua Tony Hu ; Suren Byna ; Xiong Liu ; Jianping Zhang ; Shirui Pan ; Vagelis Papalexakis ; Jianwu Wang ; Alfredo Cuzzocrea ; Carlos Ordonez. Institute of Electrical and Electronics Engineers Inc., 2021. pp. 263-273 (Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021).
@inproceedings{884125f171c542339f7aaeacd97163bc,
title = "Efficient Scalable Temporal Web Graph Store",
abstract = "Temporal web graphs have been attracting much attention recently due to their important applications in web search, data mining, and social network analysis. Accumulated over long periods, those graphs have grown gigantic in size and rich in temporal evolution, which poses tough challenges for data storage and management. Though a few temporal graph management systems were previously proposed, none of them can simultaneously satisfy both essential requirements when retrieving on temporal web graphs: very large data scalability and very low querying latency. In this work, we address the above gap in existing works by developing a highly efficient temporal graph management system which is dedicated to web graphs. To this end, we greatly extend the most efficient framework for managing large static web graphs to handle temporal information using the property matrix while preserving most of the outstanding features of the base framework. Ultimately, our proposed system can achieve a nearly instant response for vertex-centric temporal retrieval while still being scalable to huge datasets. Experiments on a real-world dataset with more than 43B nodes and 317B links show that using a small non-dedicated cluster, our system can reach a reduction of data storage space up to 88% of raw data size and reduce the retrieval time by 20%, compared to the baselines. We also demonstrate that our system yields a significant reduction of computational costs for many graph ranking algorithms.",
keywords = "archival search, compression, distributed system, graph index, temporal graph representation",
author = "Vo, {Khoi Duy} and Sergej Zerr and Xiaofei Zhu and Wolfgang Nejdl",
year = "2021",
doi = "10.1109/bigdata52589.2021.9671984",
language = "English",
isbn = "978-1-6654-4599-3",
series = "Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "263--273",
editor = "Yixin Chen and Heiko Ludwig and Yicheng Tu and Usama Fayyad and Xingquan Zhu and Hu, {Xiaohua Tony} and Suren Byna and Xiong Liu and Jianping Zhang and Shirui Pan and Vagelis Papalexakis and Jianwu Wang and Alfredo Cuzzocrea and Carlos Ordonez",
booktitle = "2021 IEEE International Conference on Big Data (Big Data)",
address = "United States",
note = "2021 IEEE International Conference on Big Data, Big Data 2021 ; Conference date: 15-12-2021 Through 18-12-2021",

}

TY  - GEN
T1  - Efficient Scalable Temporal Web Graph Store
AU  - Vo, Khoi Duy
AU  - Zerr, Sergej
AU  - Zhu, Xiaofei
AU  - Nejdl, Wolfgang
PY  - 2021
Y1  - 2021
N2  - Temporal web graphs have been attracting much attention recently due to their important applications in web search, data mining, and social network analysis. Accumulated over long periods, those graphs have grown gigantic in size and rich in temporal evolution, which poses tough challenges for data storage and management. Though a few temporal graph management systems were previously proposed, none of them can simultaneously satisfy both essential requirements when retrieving on temporal web graphs: very large data scalability and very low querying latency. In this work, we address the above gap in existing works by developing a highly efficient temporal graph management system which is dedicated to web graphs. To this end, we greatly extend the most efficient framework for managing large static web graphs to handle temporal information using the property matrix while preserving most of the outstanding features of the base framework. Ultimately, our proposed system can achieve a nearly instant response for vertex-centric temporal retrieval while still being scalable to huge datasets. Experiments on a real-world dataset with more than 43B nodes and 317B links show that using a small non-dedicated cluster, our system can reach a reduction of data storage space up to 88% of raw data size and reduce the retrieval time by 20%, compared to the baselines. We also demonstrate that our system yields a significant reduction of computational costs for many graph ranking algorithms.
AB  - Temporal web graphs have been attracting much attention recently due to their important applications in web search, data mining, and social network analysis. Accumulated over long periods, those graphs have grown gigantic in size and rich in temporal evolution, which poses tough challenges for data storage and management. Though a few temporal graph management systems were previously proposed, none of them can simultaneously satisfy both essential requirements when retrieving on temporal web graphs: very large data scalability and very low querying latency. In this work, we address the above gap in existing works by developing a highly efficient temporal graph management system which is dedicated to web graphs. To this end, we greatly extend the most efficient framework for managing large static web graphs to handle temporal information using the property matrix while preserving most of the outstanding features of the base framework. Ultimately, our proposed system can achieve a nearly instant response for vertex-centric temporal retrieval while still being scalable to huge datasets. Experiments on a real-world dataset with more than 43B nodes and 317B links show that using a small non-dedicated cluster, our system can reach a reduction of data storage space up to 88% of raw data size and reduce the retrieval time by 20%, compared to the baselines. We also demonstrate that our system yields a significant reduction of computational costs for many graph ranking algorithms.
KW  - archival search
KW  - compression
KW  - distributed system
KW  - graph index
KW  - temporal graph representation
UR  - http://www.scopus.com/inward/record.url?scp=85125298267&partnerID=8YFLogxK
U2  - 10.1109/bigdata52589.2021.9671984
DO  - 10.1109/bigdata52589.2021.9671984
M3  - Conference contribution
AN  - SCOPUS:85125298267
SN  - 978-1-6654-4599-3
T3  - Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021
SP  - 263
EP  - 273
BT  - 2021 IEEE International Conference on Big Data (Big Data)
A2  - Chen, Yixin
A2  - Ludwig, Heiko
A2  - Tu, Yicheng
A2  - Fayyad, Usama
A2  - Zhu, Xingquan
A2  - Hu, Xiaohua Tony
A2  - Byna, Suren
A2  - Liu, Xiong
A2  - Zhang, Jianping
A2  - Pan, Shirui
A2  - Papalexakis, Vagelis
A2  - Wang, Jianwu
A2  - Cuzzocrea, Alfredo
A2  - Ordonez, Carlos
PB  - Institute of Electrical and Electronics Engineers Inc.
T2  - 2021 IEEE International Conference on Big Data, Big Data 2021
Y2  - 15 December 2021 through 18 December 2021
ER  -
