Incremental Diversification for Very Large Sets: A Streaming-based Approach

Publikation: Beitrag in Buch/Bericht/Sammelwerk/KonferenzbandAufsatz in KonferenzbandForschungPeer-Review

Autoren

Organisationseinheiten

Forschungs-netzwerk anzeigen

Details

OriginalspracheEnglisch
Titel des SammelwerksSIGIR'11 - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval
Herausgeber (Verlag)Association for Computing Machinery (ACM)
Seiten585-594
Seitenumfang10
ISBN (Print)9781450309349
PublikationsstatusVeröffentlicht - 24 Juli 2011
Veranstaltung34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2011 - Beijing, China
Dauer: 24 Juli 201128 Juli 2011

Publikationsreihe

NameSIGIR'11 - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval

Abstract

Result diversification is an effective method to reduce the risk that none of the returned results satisfies a user's query intention. It has been shown to decrease query abandonment substantially. On the other hand, computing an optimally diverse set is NP-hard for the usual objectives. Existing greedy diversification algorithms require random access to the input set, rendering them impractical in the context of large result sets or continuous data. To solve this issue, we present a novel diversification approach which treats the input as a stream and processes each element in an incremental fashion, maintaining a near-optimal diverse set at any point in the stream. Our approach exhibits a linear computation and constant memory complexity with respect to input size, without significant loss of diversification quality. In an extensive evaluation on several real-world data sets, we show the applicability and efficiency of our algorithm for large result sets as well as for continuous query scenarios such as news stream subscriptions.

ASJC Scopus Sachgebiete

Zitieren

Incremental Diversification for Very Large Sets: A Streaming-based Approach. / Minack, Enrico; Siberski, Wolf; Nejdl, Wolfgang.
SIGIR'11 - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery (ACM), 2011. S. 585-594 (SIGIR'11 - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval).

Publikation: Beitrag in Buch/Bericht/Sammelwerk/KonferenzbandAufsatz in KonferenzbandForschungPeer-Review

Minack, E, Siberski, W & Nejdl, W 2011, Incremental Diversification for Very Large Sets: A Streaming-based Approach. in SIGIR'11 - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR'11 - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, Association for Computing Machinery (ACM), S. 585-594, 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2011, Beijing, China, 24 Juli 2011. https://doi.org/10.1145/2009916.2009996
Minack, E., Siberski, W., & Nejdl, W. (2011). Incremental Diversification for Very Large Sets: A Streaming-based Approach. In SIGIR'11 - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (S. 585-594). (SIGIR'11 - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval). Association for Computing Machinery (ACM). https://doi.org/10.1145/2009916.2009996
Minack E, Siberski W, Nejdl W. Incremental Diversification for Very Large Sets: A Streaming-based Approach. in SIGIR'11 - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery (ACM). 2011. S. 585-594. (SIGIR'11 - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval). doi: 10.1145/2009916.2009996
Minack, Enrico ; Siberski, Wolf ; Nejdl, Wolfgang. / Incremental Diversification for Very Large Sets : A Streaming-based Approach. SIGIR'11 - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery (ACM), 2011. S. 585-594 (SIGIR'11 - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval).
Download
@inproceedings{028a68cf60a94c9ba57f4773fa52e8bb,
title = "Incremental Diversification for Very Large Sets: A Streaming-based Approach",
abstract = "Result diversification is an effective method to reduce the risk that none of the returned results satisfies a user's query intention. It has been shown to decrease query abandonment substantially. On the other hand, computing an optimally diverse set is NP-hard for the usual objectives. Existing greedy diversification algorithms require random access to the input set, rendering them impractical in the context of large result sets or continuous data. To solve this issue, we present a novel diversification approach which treats the input as a stream and processes each element in an incremental fashion, maintaining a near-optimal diverse set at any point in the stream. Our approach exhibits a linear computation and constant memory complexity with respect to input size, without significant loss of diversification quality. In an extensive evaluation on several real-world data sets, we show the applicability and efficiency of our algorithm for large result sets as well as for continuous query scenarios such as news stream subscriptions.",
keywords = "Approximation, Diversification, Large sets, Streams",
author = "Enrico Minack and Wolf Siberski and Wolfgang Nejdl",
year = "2011",
month = jul,
day = "24",
doi = "10.1145/2009916.2009996",
language = "English",
isbn = "9781450309349",
series = "SIGIR'11 - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval",
publisher = "Association for Computing Machinery (ACM)",
pages = "585--594",
booktitle = "SIGIR'11 - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval",
address = "United States",
note = "34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2011 ; Conference date: 24-07-2011 Through 28-07-2011",

}

Download

TY - GEN

T1 - Incremental Diversification for Very Large Sets

T2 - 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2011

AU - Minack, Enrico

AU - Siberski, Wolf

AU - Nejdl, Wolfgang

PY - 2011/7/24

Y1 - 2011/7/24

N2 - Result diversification is an effective method to reduce the risk that none of the returned results satisfies a user's query intention. It has been shown to decrease query abandonment substantially. On the other hand, computing an optimally diverse set is NP-hard for the usual objectives. Existing greedy diversification algorithms require random access to the input set, rendering them impractical in the context of large result sets or continuous data. To solve this issue, we present a novel diversification approach which treats the input as a stream and processes each element in an incremental fashion, maintaining a near-optimal diverse set at any point in the stream. Our approach exhibits a linear computation and constant memory complexity with respect to input size, without significant loss of diversification quality. In an extensive evaluation on several real-world data sets, we show the applicability and efficiency of our algorithm for large result sets as well as for continuous query scenarios such as news stream subscriptions.

AB - Result diversification is an effective method to reduce the risk that none of the returned results satisfies a user's query intention. It has been shown to decrease query abandonment substantially. On the other hand, computing an optimally diverse set is NP-hard for the usual objectives. Existing greedy diversification algorithms require random access to the input set, rendering them impractical in the context of large result sets or continuous data. To solve this issue, we present a novel diversification approach which treats the input as a stream and processes each element in an incremental fashion, maintaining a near-optimal diverse set at any point in the stream. Our approach exhibits a linear computation and constant memory complexity with respect to input size, without significant loss of diversification quality. In an extensive evaluation on several real-world data sets, we show the applicability and efficiency of our algorithm for large result sets as well as for continuous query scenarios such as news stream subscriptions.

KW - Approximation

KW - Diversification

KW - Large sets

KW - Streams

UR - http://www.scopus.com/inward/record.url?scp=80052136714&partnerID=8YFLogxK

U2 - 10.1145/2009916.2009996

DO - 10.1145/2009916.2009996

M3 - Conference contribution

AN - SCOPUS:80052136714

SN - 9781450309349

T3 - SIGIR'11 - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval

SP - 585

EP - 594

BT - SIGIR'11 - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval

PB - Association for Computing Machinery (ACM)

Y2 - 24 July 2011 through 28 July 2011

ER -

Von denselben Autoren