The Dawn of Today's Popular Domains: A Study of the Archived German Web over 18 Years

Research output: Chapter in book/report/conference proceeding; Conference contribution; Research; peer review

Authors

Helge Holzmann, Wolfgang Nejdl, Avishek Anand

Details

Original language: English
Title of host publication: JCDL 2016 - Proceedings of the 16th ACM/IEEE-CS Joint Conference on Digital Libraries
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 73-82
Number of pages: 10
ISBN (electronic): 9781450342292
Publication status: Published - 2016
Event: 16th ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL 2016 - Newark, United States
Duration: 19 Jun 2016 to 23 Jun 2016

Publication series

Name: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
Volume: 2016-September
ISSN (Print): 1552-5996

Abstract

The Web has been around and maturing for 25 years. The popular websites of today have undergone vast changes during this period, with a few being there almost since the beginning and many new ones becoming popular over the years. This makes it worthwhile to take a look at how these sites have evolved and what they might tell us about the future of the Web. We therefore embarked on a longitudinal study spanning almost the whole period of the Web, based on data collected by the Internet Archive starting in 1996, to retrospectively analyze how the popular Web as of now has evolved over the past 18 years. For our study we focused on the German Web, specifically on the top 100 most popular websites in 17 categories. This paper presents a selection of the most interesting findings in terms of volume, size as well as age of the Web. While related work in the field of Web Dynamics has mainly focused on change rates and analyzed datasets spanning less than a year, we looked at the evolution of websites over 18 years. We found that around 70% of the pages we investigated are younger than a year, with an observed exponential growth in age as well as in size up to now. If this growth rate continues, the number of pages from the popular domains will almost double in the next two years. In addition, we give insights into our data set, provided by the Internet Archive, which hosts the largest and most complete Web archive as of today.

Keywords

    Analysis, Longitudinal, Retrospective, Statistics, Web Dynamics

Cite this

Standard

The Dawn of Today's Popular Domains: A Study of the Archived German Web over 18 Years. / Holzmann, Helge; Nejdl, Wolfgang; Anand, Avishek. JCDL 2016 - Proceedings of the 16th ACM/IEEE-CS Joint Conference on Digital Libraries. Institute of Electrical and Electronics Engineers Inc., 2016. p. 73-82 7559567 (Proceedings of the ACM/IEEE Joint Conference on Digital Libraries; Vol. 2016-September).

Harvard

Holzmann, H, Nejdl, W & Anand, A 2016, The Dawn of Today's Popular Domains: A Study of the Archived German Web over 18 Years. in JCDL 2016 - Proceedings of the 16th ACM/IEEE-CS Joint Conference on Digital Libraries., 7559567, Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, vol. 2016-September, Institute of Electrical and Electronics Engineers Inc., pp. 73-82, 16th ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL 2016, Newark, United States, 19 Jun 2016. https://doi.org/10.1145/2910896.2910901

APA

Holzmann, H., Nejdl, W., & Anand, A. (2016). The Dawn of Today's Popular Domains: A Study of the Archived German Web over 18 Years. In JCDL 2016 - Proceedings of the 16th ACM/IEEE-CS Joint Conference on Digital Libraries (pp. 73-82). Article 7559567 (Proceedings of the ACM/IEEE Joint Conference on Digital Libraries; Vol. 2016-September). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/2910896.2910901

Vancouver

Holzmann H, Nejdl W, Anand A. The Dawn of Today's Popular Domains: A Study of the Archived German Web over 18 Years. In JCDL 2016 - Proceedings of the 16th ACM/IEEE-CS Joint Conference on Digital Libraries. Institute of Electrical and Electronics Engineers Inc. 2016. p. 73-82. 7559567. (Proceedings of the ACM/IEEE Joint Conference on Digital Libraries). doi: 10.1145/2910896.2910901

Author

Holzmann, Helge ; Nejdl, Wolfgang ; Anand, Avishek. / The Dawn of Today's Popular Domains : A Study of the Archived German Web over 18 Years. JCDL 2016 - Proceedings of the 16th ACM/IEEE-CS Joint Conference on Digital Libraries. Institute of Electrical and Electronics Engineers Inc., 2016. pp. 73-82 (Proceedings of the ACM/IEEE Joint Conference on Digital Libraries).

BibTeX

@inproceedings{6663e7a437e1413aa968bb9ea503d148,
title = "The Dawn of Today's Popular Domains: A Study of the Archived German Web over 18 Years",
abstract = "The Web has been around and maturing for 25 years. The popular websites of today have undergone vast changes during this period, with a few being there almost since the beginning and many new ones becoming popular over the years. This makes it worthwhile to take a look at how these sites have evolved and what they might tell us about the future of the Web. We therefore embarked on a longitudinal study spanning almost the whole period of the Web, based on data collected by the Internet Archive starting in 1996, to retrospectively analyze how the popular Web as of now has evolved over the past 18 years. For our study we focused on the German Web, specifically on the top 100 most popular websites in 17 categories. This paper presents a selection of the most interesting findings in terms of volume, size as well as age of the Web. While related work in the field of Web Dynamics has mainly focused on change rates and analyzed datasets spanning less than a year, we looked at the evolution of websites over 18 years. We found that around 70% of the pages we investigated are younger than a year, with an observed exponential growth in age as well as in size up to now. If this growth rate continues, the number of pages from the popular domains will almost double in the next two years. In addition, we give insights into our data set, provided by the Internet Archive, which hosts the largest and most complete Web archive as of today.",
keywords = "Analysis, Longitudinal, Retrospective, Statistics, Web Dynamics",
author = "Helge Holzmann and Wolfgang Nejdl and Avishek Anand",
year = "2016",
doi = "10.1145/2910896.2910901",
language = "English",
series = "Proceedings of the ACM/IEEE Joint Conference on Digital Libraries",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "73--82",
booktitle = "JCDL 2016 - Proceedings of the 16th ACM/IEEE-CS Joint Conference on Digital Libraries",
address = "United States",
note = "16th ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL 2016 ; Conference date: 19-06-2016 Through 23-06-2016",

}

RIS

TY - GEN
T1 - The Dawn of Today's Popular Domains
T2 - 16th ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL 2016
AU - Holzmann, Helge
AU - Nejdl, Wolfgang
AU - Anand, Avishek
PY - 2016
Y1 - 2016
AB - The Web has been around and maturing for 25 years. The popular websites of today have undergone vast changes during this period, with a few being there almost since the beginning and many new ones becoming popular over the years. This makes it worthwhile to take a look at how these sites have evolved and what they might tell us about the future of the Web. We therefore embarked on a longitudinal study spanning almost the whole period of the Web, based on data collected by the Internet Archive starting in 1996, to retrospectively analyze how the popular Web as of now has evolved over the past 18 years. For our study we focused on the German Web, specifically on the top 100 most popular websites in 17 categories. This paper presents a selection of the most interesting findings in terms of volume, size as well as age of the Web. While related work in the field of Web Dynamics has mainly focused on change rates and analyzed datasets spanning less than a year, we looked at the evolution of websites over 18 years. We found that around 70% of the pages we investigated are younger than a year, with an observed exponential growth in age as well as in size up to now. If this growth rate continues, the number of pages from the popular domains will almost double in the next two years. In addition, we give insights into our data set, provided by the Internet Archive, which hosts the largest and most complete Web archive as of today.

KW - Analysis
KW - Longitudinal
KW - Retrospective
KW - Statistics
KW - Web Dynamics
UR - http://www.scopus.com/inward/record.url?scp=84989902968&partnerID=8YFLogxK
U2 - 10.1145/2910896.2910901
DO - 10.1145/2910896.2910901
M3 - Conference contribution
AN - SCOPUS:84989902968
T3 - Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
SP - 73
EP - 82
BT - JCDL 2016 - Proceedings of the 16th ACM/IEEE-CS Joint Conference on Digital Libraries
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 19 June 2016 through 23 June 2016
ER -
