Details
Original language | English |
---|---|
Title of host publication | The Past Web |
Subtitle of host publication | Exploring Web Archives |
Place of Publication | Cham |
Publisher | Springer International Publishing AG |
Pages | 57-67 |
Number of pages | 11 |
ISBN (electronic) | 9783030632915 |
ISBN (print) | 9783030632908 |
Publication status | Published - 1 Jul 2021 |
Abstract
Web archives are an essential information source for research on historical events. However, the large scale and heterogeneity of web archives make it difficult for researchers to access relevant event-specific materials. In this chapter, we discuss methods for creating event-centric collections from large-scale web archives. These methods are manifold and may require manual curation, adopt search or deploy focused crawling. In this chapter, we focus on the crawl-based methods that identify relevant documents in and across web archives and include link networks as context in the resulting collections.
ASJC Scopus subject areas
- Computer Science(all)
- General Computer Science
- Arts and Humanities(all)
- General Arts and Humanities
- Social Sciences(all)
- General Social Sciences
Cite this
Demidova, Elena; Risse, Thomas. Creating Event-Centric Collections from Web Archives. The Past Web: Exploring Web Archives. Cham: Springer International Publishing AG, 2021. p. 57-67.
Research output: Chapter in book/report/conference proceeding › Contribution to book/anthology › Research › peer review
TY - CHAP
T1 - Creating Event-Centric Collections from Web Archives
AU - Demidova, Elena
AU - Risse, Thomas
PY - 2021/7/1
Y1 - 2021/7/1
N2 - Web archives are an essential information source for research on historical events. However, the large scale and heterogeneity of web archives make it difficult for researchers to access relevant event-specific materials. In this chapter, we discuss methods for creating event-centric collections from large-scale web archives. These methods are manifold and may require manual curation, adopt search or deploy focused crawling. In this chapter, we focus on the crawl-based methods that identify relevant documents in and across web archives and include link networks as context in the resulting collections.
UR - http://www.scopus.com/inward/record.url?scp=85150075966&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-63291-5_6
DO - 10.1007/978-3-030-63291-5_6
M3 - Contribution to book/anthology
AN - SCOPUS:85150075966
SN - 9783030632908
SP - 57
EP - 67
BT - The Past Web
PB - Springer International Publishing AG
CY - Cham
ER -