History by Diversity: Helping Historians search News Archives

Publication: Contribution to book/report/anthology/conference proceedings, Conference contribution, Research, Peer-reviewed

Authors: Jaspreet Singh, Wolfgang Nejdl, Avishek Anand

Details

Original language: English
Title of host publication: CHIIR 2016 - Proceedings of the 2016 ACM Conference on Human Information Interaction and Retrieval
Pages: 183-192
Number of pages: 10
ISBN (electronic): 9781450337519
Publication status: Published - 2016
Event: CHIIR 2016: ACM SIGIR Conference on Human Information Interaction and Retrieval - Carrboro, United States
Duration: 13 March 2016 to 17 March 2016

Abstract

Longitudinal corpora like newspaper archives are of immense value to historical research, and time as an important factor for historians strongly influences their search behaviour in these archives. While searching for articles published over time, a key preference is to retrieve documents which cover the important aspects from important points in time, which is different from standard search behavior. To support this search strategy, we introduce the notion of a Historical Query Intent to explicitly model a historian's search task and define an aspect-time diversification problem over news archives. We present a novel algorithm, HistDiv, that explicitly models the aspects and important time windows based on a historian's information seeking behavior. By incorporating temporal priors based on publication times and temporal expressions, we diversify both on the aspect and temporal dimensions. We test our methods by constructing a test collection based on The New York Times Collection with a workload of 30 queries of historical intent assessed manually. We find that HistDiv outperforms all competitors in subtopic recall with a slight loss in precision. We also present results of a qualitative user study to determine whether this drop in precision is detrimental to user experience. Our results show that users still preferred HistDiv's ranking.
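
The idea of diversifying over both aspects and time, and of measuring the result with subtopic recall, can be illustrated with a small sketch. This is not HistDiv: it is a plain greedy coverage heuristic, and the document ids, aspect labels, time windows, relevance scores, and function names are all hypothetical. It also assumes that each (aspect, time window) pair counts as one subtopic.

```python
from typing import Dict, List, Set, Tuple

# An (aspect, time window) pair; the labels used below are hypothetical toy values.
AspectTime = Tuple[str, int]


def greedy_diversify(
    coverage: Dict[str, Set[AspectTime]],
    relevance: Dict[str, float],
    k: int,
) -> List[str]:
    """Pick up to k documents, each time taking the one that adds the most
    not-yet-covered (aspect, window) pairs; ties go to the higher relevance."""
    selected: List[str] = []
    covered: Set[AspectTime] = set()
    remaining = sorted(coverage)  # sorted ids keep tie-breaking deterministic
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda d: (len(coverage[d] - covered), relevance[d]),
        )
        selected.append(best)
        covered |= coverage[best]
        remaining.remove(best)
    return selected


def subtopic_recall(
    ranking: List[str], coverage: Dict[str, Set[AspectTime]], k: int
) -> float:
    """Fraction of all known (aspect, window) pairs covered by the top-k documents."""
    all_pairs: Set[AspectTime] = set().union(*coverage.values())
    if not all_pairs:
        return 0.0
    found: Set[AspectTime] = set().union(*(coverage[d] for d in ranking[:k]))
    return len(found) / len(all_pairs)


# Toy collection: which (aspect, year) pairs each article covers.
coverage = {
    "d1": {("policy", 1990), ("policy", 1991)},
    "d2": {("policy", 1990)},
    "d3": {("protest", 1995)},
    "d4": {("policy", 1991), ("protest", 1995)},
}
relevance = {"d1": 0.9, "d2": 0.8, "d3": 0.5, "d4": 0.4}

diversified = greedy_diversify(coverage, relevance, k=2)
by_relevance = sorted(coverage, key=lambda d: relevance[d], reverse=True)

print(diversified, subtopic_recall(diversified, coverage, 2))
print(by_relevance[:2], subtopic_recall(by_relevance, coverage, 2))
```

On this toy data the relevance-only top 2 (d1, d2) covers two of the three pairs, while the greedy selection (d1, d3) covers all three. It does so by ranking a less relevant document second, which is the kind of recall-versus-precision trade-off the abstract describes.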

Cite this

History by Diversity: Helping Historians search News Archives. / Singh, Jaspreet; Nejdl, Wolfgang; Anand, Avishek.
CHIIR 2016 - Proceedings of the 2016 ACM Conference on Human Information Interaction and Retrieval. 2016. pp. 183-192.

Singh, J, Nejdl, W & Anand, A 2016, History by Diversity: Helping Historians search News Archives. in CHIIR 2016 - Proceedings of the 2016 ACM Conference on Human Information Interaction and Retrieval. pp. 183-192, CHIIR 2016, Carrboro, United States, 13 March 2016. https://doi.org/10.1145/2854946.2854959
Singh, J., Nejdl, W., & Anand, A. (2016). History by Diversity: Helping Historians search News Archives. In CHIIR 2016 - Proceedings of the 2016 ACM Conference on Human Information Interaction and Retrieval (pp. 183-192). https://doi.org/10.1145/2854946.2854959
Singh J, Nejdl W, Anand A. History by Diversity: Helping Historians search News Archives. In CHIIR 2016 - Proceedings of the 2016 ACM Conference on Human Information Interaction and Retrieval. 2016. pp. 183-192. doi: 10.1145/2854946.2854959
Singh, Jaspreet ; Nejdl, Wolfgang ; Anand, Avishek. / History by Diversity : Helping Historians search News Archives. CHIIR 2016 - Proceedings of the 2016 ACM Conference on Human Information Interaction and Retrieval. 2016. pp. 183-192
@inproceedings{ff8c6308bc784447b4f407d38cb02a5f,
title = "History by Diversity: Helping Historians search News Archives",
abstract = "Longitudinal corpora like newspaper archives are of immense value to historical research, and time as an important factor for historians strongly influences their search behaviour in these archives. While searching for articles published over time, a key preference is to retrieve documents which cover the important aspects from important points in time, which is different from standard search behavior. To support this search strategy, we introduce the notion of a Historical Query Intent to explicitly model a historian's search task and define an aspect-time diversification problem over news archives. We present a novel algorithm, HistDiv, that explicitly models the aspects and important time windows based on a historian's information seeking behavior. By incorporating temporal priors based on publication times and temporal expressions, we diversify both on the aspect and temporal dimensions. We test our methods by constructing a test collection based on The New York Times Collection with a workload of 30 queries of historical intent assessed manually. We find that HistDiv outperforms all competitors in subtopic recall with a slight loss in precision. We also present results of a qualitative user study to determine whether this drop in precision is detrimental to user experience. Our results show that users still preferred HistDiv's ranking.",
author = "Jaspreet Singh and Wolfgang Nejdl and Avishek Anand",
note = "Funding information: This work was carried out under the context of the ERC Grant (339233) ALEXANDRIA. We thank Prof. Jane Winters and her colleagues from the Institute of Historical Research at the University College of London for their help and cooperation.; CHIIR 2016 : ACM SIGIR Conference on Human Information Interaction and Retrieval ; Conference date: 13-03-2016 Through 17-03-2016",
year = "2016",
doi = "10.1145/2854946.2854959",
language = "English",
pages = "183--192",
booktitle = "CHIIR 2016 - Proceedings of the 2016 ACM Conference on Human Information Interaction and Retrieval",

}

TY  - GEN
T1  - History by Diversity
T2  - CHIIR 2016
AU  - Singh, Jaspreet
AU  - Nejdl, Wolfgang
AU  - Anand, Avishek
N1  - Funding information: This work was carried out under the context of the ERC Grant (339233) ALEXANDRIA. We thank Prof. Jane Winters and her colleagues from the Institute of Historical Research at the University College of London for their help and cooperation.
PY  - 2016
Y1  - 2016
N2  - Longitudinal corpora like newspaper archives are of immense value to historical research, and time as an important factor for historians strongly influences their search behaviour in these archives. While searching for articles published over time, a key preference is to retrieve documents which cover the important aspects from important points in time, which is different from standard search behavior. To support this search strategy, we introduce the notion of a Historical Query Intent to explicitly model a historian's search task and define an aspect-time diversification problem over news archives. We present a novel algorithm, HistDiv, that explicitly models the aspects and important time windows based on a historian's information seeking behavior. By incorporating temporal priors based on publication times and temporal expressions, we diversify both on the aspect and temporal dimensions. We test our methods by constructing a test collection based on The New York Times Collection with a workload of 30 queries of historical intent assessed manually. We find that HistDiv outperforms all competitors in subtopic recall with a slight loss in precision. We also present results of a qualitative user study to determine whether this drop in precision is detrimental to user experience. Our results show that users still preferred HistDiv's ranking.
AB  - Longitudinal corpora like newspaper archives are of immense value to historical research, and time as an important factor for historians strongly influences their search behaviour in these archives. While searching for articles published over time, a key preference is to retrieve documents which cover the important aspects from important points in time, which is different from standard search behavior. To support this search strategy, we introduce the notion of a Historical Query Intent to explicitly model a historian's search task and define an aspect-time diversification problem over news archives. We present a novel algorithm, HistDiv, that explicitly models the aspects and important time windows based on a historian's information seeking behavior. By incorporating temporal priors based on publication times and temporal expressions, we diversify both on the aspect and temporal dimensions. We test our methods by constructing a test collection based on The New York Times Collection with a workload of 30 queries of historical intent assessed manually. We find that HistDiv outperforms all competitors in subtopic recall with a slight loss in precision. We also present results of a qualitative user study to determine whether this drop in precision is detrimental to user experience. Our results show that users still preferred HistDiv's ranking.
UR  - http://www.scopus.com/inward/record.url?scp=84974602996&partnerID=8YFLogxK
U2  - 10.1145/2854946.2854959
DO  - 10.1145/2854946.2854959
M3  - Conference contribution
AN  - SCOPUS:84974602996
SP  - 183
EP  - 192
BT  - CHIIR 2016 - Proceedings of the 2016 ACM Conference on Human Information Interaction and Retrieval
Y2  - 13 March 2016 through 17 March 2016
ER  -
