On the Applicability of Delicious for Temporal Search on Web Archives

Helge Holzmann; Wolfgang Nejdl; Avishek Anand

doi:10.1145/2911451.2914724

Details

Original language	English
Title of host publication	SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval
Pages	929-932
Number of pages	4
ISBN (electronic)	9781450342902
Publication status	Published - 7 Jul 2016
Event	39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2016 - Pisa, Italy. Duration: 17 Jul 2016 → 21 Jul 2016

Publication series

Name	SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval

Abstract

Web archives are large longitudinal collections that store webpages from the past, which might be missing on the current live Web. Consequently, temporal search over such collections is essential for finding prominent missing webpages and tasks like historical analysis. However, this has been challenging due to the lack of popularity information and proper ground truth to evaluate temporal retrieval models. In this paper we investigate the applicability of external longitudinal resources to identify important and popular websites in the past and analyze the social bookmarking service Delicious for this purpose. The timestamped bookmarks on Delicious provide explicit cues about popular time periods in the past along with relevant descriptors. These are valuable to identify important documents in the past for a given temporal query. Focusing purely on recall, we analyzed more than 12,000 queries and find that using Delicious yields average recall values from 46% up to 100%, when limiting ourselves to the best represented queries in the considered dataset. This constitutes an attractive and low-overhead approach for quick access into Web archives by not dealing with the actual contents.

ASJC Scopus subject areas

Computer Science(all)
Information Systems
Software

Cite this

On the Applicability of Delicious for Temporal Search on Web Archives. / Holzmann, Helge; Nejdl, Wolfgang; Anand, Avishek.
SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2016. p. 929-932 (SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval).

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Holzmann, H, Nejdl, W & Anand, A 2016, On the Applicability of Delicious for Temporal Search on Web Archives. in SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 929-932, 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2016, Pisa, Italy, 17 Jul 2016. https://doi.org/10.1145/2911451.2914724

Holzmann, H., Nejdl, W., & Anand, A. (2016). On the Applicability of Delicious for Temporal Search on Web Archives. In SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 929-932). (SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval). https://doi.org/10.1145/2911451.2914724

Holzmann H, Nejdl W, Anand A. On the Applicability of Delicious for Temporal Search on Web Archives. In SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2016. p. 929-932. (SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval). doi: 10.1145/2911451.2914724

Holzmann, Helge ; Nejdl, Wolfgang ; Anand, Avishek. / On the Applicability of Delicious for Temporal Search on Web Archives. SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2016. pp. 929-932 (SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval).

@inproceedings{94e5a581cea940a79071f0b65ecbb2c5,

title = "On the Applicability of Delicious for Temporal Search on Web Archives",

abstract = "Web archives are large longitudinal collections that store webpages from the past, which might be missing on the current live Web. Consequently, temporal search over such collections is essential for finding prominent missing webpages and tasks like historical analysis. However, this has been challenging due to the lack of popularity information and proper ground truth to evaluate temporal retrieval models. In this paper we investigate the applicability of external longitudinal resources to identify important and popular websites in the past and analyze the social bookmarking service Delicious for this purpose. The timestamped bookmarks on Delicious provide explicit cues about popular time periods in the past along with relevant descriptors. These are valuable to identify important documents in the past for a given temporal query. Focusing purely on recall, we analyzed more than 12,000 queries and find that using Delicious yields average recall values from 46% up to 100%, when limiting ourselves to the best represented queries in the considered dataset. This constitutes an attractive and low-overhead approach for quick access into Web archives by not dealing with the actual contents.",

author = "Helge Holzmann and Wolfgang Nejdl and Avishek Anand",

year = "2016",

month = jul,

day = "7",

doi = "10.1145/2911451.2914724",

language = "English",

series = "SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval",

pages = "929--932",

booktitle = "SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval",

note = "39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2016 ; Conference date: 17-07-2016 Through 21-07-2016",

}

TY - GEN

T1 - On the Applicability of Delicious for Temporal Search on Web Archives

AU - Holzmann, Helge

AU - Nejdl, Wolfgang

AU - Anand, Avishek

PY - 2016/7/7

Y1 - 2016/7/7

AB - Web archives are large longitudinal collections that store webpages from the past, which might be missing on the current live Web. Consequently, temporal search over such collections is essential for finding prominent missing webpages and tasks like historical analysis. However, this has been challenging due to the lack of popularity information and proper ground truth to evaluate temporal retrieval models. In this paper we investigate the applicability of external longitudinal resources to identify important and popular websites in the past and analyze the social bookmarking service Delicious for this purpose. The timestamped bookmarks on Delicious provide explicit cues about popular time periods in the past along with relevant descriptors. These are valuable to identify important documents in the past for a given temporal query. Focusing purely on recall, we analyzed more than 12,000 queries and find that using Delicious yields average recall values from 46% up to 100%, when limiting ourselves to the best represented queries in the considered dataset. This constitutes an attractive and low-overhead approach for quick access into Web archives by not dealing with the actual contents.

UR - http://www.scopus.com/inward/record.url?scp=84980367608&partnerID=8YFLogxK

U2 - 10.1145/2911451.2914724

DO - 10.1145/2911451.2914724

M3 - Conference contribution

AN - SCOPUS:84980367608

T3 - SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval

SP - 929

EP - 932

BT - SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval

T2 - 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2016

Y2 - 17 July 2016 through 21 July 2016

ER -

By the same author(s)

Harnessing Empathy and Ethics for Relevance Detection and Information Categorization in Climate and COVID-19 Tweets

Open benchmark for filtering techniques in entity resolution

Beyond Accuracy: Investigating Error Types in GPT-4 Responses to USMLE Questions

Adaptive Dispatching of Mobile Charging Stations using Multi-Agent Graph Convolutional Cooperative-Competitive Reinforcement Learning

Robust Fusion of Time Series and Image Data for Improved Multimodal Clinical Prediction
