DHTs over peer clusters for distributed information retrieval

Odysseas Papapetrou; Wolf Siberski; Wolf Tilo Balke; Wolfgang Nejdl

doi:10.1109/AINA.2007.60

Details

Original language	English
Title of host publication	Proceedings
Subtitle of host publication	21st International Conference on Advanced Information Networking and Applications, AINA 2007
Pages	84-93
Number of pages	10
ISBN (electronic)	978-1-5090-8717-4
Publication status	Published - 2007
Event	21st International Conference on Advanced Information Networking and Applications (AINA 2007) - Niagara Falls, ON, Canada
Duration	21 May 2007 → 23 May 2007
Conference number	21

Publication series

Name	Proceedings - International Conference on Advanced Information Networking and Applications, AINA
ISSN (Print)	1550-445X

Abstract

Distributed Hash Tables (DHTs) are very efficient for querying based on key lookups, if only a small number of keys has to be registered by each individual peer. However, building huge term indexes, as required for IR-style keyword search, is impractical with plain DHTs. Due to the large sizes of document term vocabularies, joining peers cause huge numbers of key inserts, and subsequently large numbers of index maintenance messages. Thus, the key to exploiting DHTs for distributed information retrieval is to reduce index maintenance. We show that this can be achieved by combining DHTs with peer clustering. Peers are first clustered into communities, each of the communities having a representative super-peer. Then all occurrences of a term in a community are published to the global DHT in a batch by the representative super-peer. Our evaluation shows that this reduces index maintenance cost by an order of magnitude, while still keeping a complete and correct term index for query processing.
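
The batching idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the in-memory `ToyDHT` class, the cost model of one message per (term, publisher) insert, the peer and community names, and the sample vocabularies are all assumptions made for illustration. It compares every peer inserting its own terms against one super-peer per community publishing the merged term set.

```python
from collections import defaultdict


class ToyDHT:
    """In-memory stand-in for a DHT: maps each term to the set of its publishers."""

    def __init__(self):
        self.index = defaultdict(set)
        self.messages = 0  # assumed cost model: one message per (term, publisher) insert

    def publish(self, term, publisher):
        self.index[term].add(publisher)
        self.messages += 1


def publish_per_peer(dht, peers):
    """Plain DHT: every peer inserts every distinct term it holds."""
    for peer_id, terms in peers.items():
        for term in set(terms):
            dht.publish(term, peer_id)


def publish_per_community(dht, peers, community_of):
    """Clustered variant: each community's terms are merged and published once."""
    terms_by_community = defaultdict(set)
    for peer_id, terms in peers.items():
        terms_by_community[community_of[peer_id]].update(terms)
    for community, terms in terms_by_community.items():
        # The community id stands in for its representative super-peer.
        for term in sorted(terms):
            dht.publish(term, community)


# Hypothetical peers, vocabularies, and community assignment.
peers = {
    "p1": ["dht", "index", "peer"],
    "p2": ["dht", "index", "query"],
    "p3": ["dht", "cluster", "peer"],
    "p4": ["music", "jazz"],
    "p5": ["music", "jazz", "blues"],
}
community_of = {"p1": "c1", "p2": "c1", "p3": "c1", "p4": "c2", "p5": "c2"}

plain, clustered = ToyDHT(), ToyDHT()
publish_per_peer(plain, peers)
publish_per_community(clustered, peers, community_of)

print("per-peer inserts:     ", plain.messages)
print("per-community inserts:", clustered.messages)
print("same term coverage:   ", set(plain.index) == set(clustered.index))
```

On this sample data the per-peer scheme issues 14 inserts and the per-community scheme 8, while both indexes cover the same terms. The saving grows with the vocabulary overlap inside a community; the order-of-magnitude figure reported in the abstract comes from the authors' evaluation, not from this toy model.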

ASJC Scopus subject areas

Engineering(all)
General Engineering

Cite this

DHTs over peer clusters for distributed information retrieval. / Papapetrou, Odysseas; Siberski, Wolf; Balke, Wolf Tilo et al.
Proceedings: 21st International Conference on Advanced Information Networking and Applications, AINA 2007. 2007. p. 84-93 4220880 (Proceedings - International Conference on Advanced Information Networking and Applications, AINA).

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Papapetrou, O, Siberski, W, Balke, WT & Nejdl, W 2007, DHTs over peer clusters for distributed information retrieval. in Proceedings: 21st International Conference on Advanced Information Networking and Applications, AINA 2007., 4220880, Proceedings - International Conference on Advanced Information Networking and Applications, AINA, pp. 84-93, 21st International Conference on Advanced Information Networking and Applications (AINA 2007), Niagara Falls, ON, Canada, 21 May 2007. https://doi.org/10.1109/AINA.2007.60

Papapetrou, O., Siberski, W., Balke, W. T., & Nejdl, W. (2007). DHTs over peer clusters for distributed information retrieval. In Proceedings: 21st International Conference on Advanced Information Networking and Applications, AINA 2007 (pp. 84-93). Article 4220880 (Proceedings - International Conference on Advanced Information Networking and Applications, AINA). https://doi.org/10.1109/AINA.2007.60

Papapetrou O, Siberski W, Balke WT, Nejdl W. DHTs over peer clusters for distributed information retrieval. In Proceedings: 21st International Conference on Advanced Information Networking and Applications, AINA 2007. 2007. p. 84-93. 4220880. (Proceedings - International Conference on Advanced Information Networking and Applications, AINA). doi: 10.1109/AINA.2007.60

Papapetrou, Odysseas ; Siberski, Wolf ; Balke, Wolf Tilo et al. / DHTs over peer clusters for distributed information retrieval. Proceedings: 21st International Conference on Advanced Information Networking and Applications, AINA 2007. 2007. pp. 84-93 (Proceedings - International Conference on Advanced Information Networking and Applications, AINA).


@inproceedings{d94e3580898742bcb8cf8697e82925e2,

title = "DHTs over peer clusters for distributed information retrieval",

abstract = "Distributed Hash Tables (DHTs) are very efficient for querying based on key lookups, if only a small number of keys has to be registered by each individual peer. However, building huge term indexes, as required for IR-style keyword search, is impractical with plain DHTs. Due to the large sizes of document term vocabularies, joining peers cause huge numbers of key inserts, and subsequently large numbers of index maintenance messages. Thus, the key to exploiting DHTs for distributed information retrieval is to reduce index maintenance. We show that this can be achieved by combining DHTs with peer clustering. Peers are first clustered into communities, each of the communities having a representative super-peer. Then all occurrences of a term in a community are published to the global DHT in a batch by the representative super-peer. Our evaluation shows that this reduces index maintenance cost by an order of magnitude, while still keeping a complete and correct term index for query processing.",

author = "Odysseas Papapetrou and Wolf Siberski and Balke, {Wolf Tilo} and Wolfgang Nejdl",

year = "2007",

doi = "10.1109/AINA.2007.60",

language = "English",

isbn = "0769528465",

series = "Proceedings - International Conference on Advanced Information Networking and Applications, AINA",

pages = "84--93",

booktitle = "Proceedings: 21st International Conference on Advanced Information Networking and Applications, AINA 2007",

note = "21st International Conference on Advanced Information Networking and Applications (AINA 2007) ; Conference date: 21-05-2007 Through 23-05-2007",

}


TY - GEN

T1 - DHTs over peer clusters for distributed information retrieval

AU - Papapetrou, Odysseas

AU - Siberski, Wolf

AU - Balke, Wolf Tilo

AU - Nejdl, Wolfgang

N1 - Conference code: 21

PY - 2007

Y1 - 2007

N2 - Distributed Hash Tables (DHTs) are very efficient for querying based on key lookups, if only a small number of keys has to be registered by each individual peer. However, building huge term indexes, as required for IR-style keyword search, is impractical with plain DHTs. Due to the large sizes of document term vocabularies, joining peers cause huge numbers of key inserts, and subsequently large numbers of index maintenance messages. Thus, the key to exploiting DHTs for distributed information retrieval is to reduce index maintenance. We show that this can be achieved by combining DHTs with peer clustering. Peers are first clustered into communities, each of the communities having a representative super-peer. Then all occurrences of a term in a community are published to the global DHT in a batch by the representative super-peer. Our evaluation shows that this reduces index maintenance cost by an order of magnitude, while still keeping a complete and correct term index for query processing.

AB - Distributed Hash Tables (DHTs) are very efficient for querying based on key lookups, if only a small number of keys has to be registered by each individual peer. However, building huge term indexes, as required for IR-style keyword search, is impractical with plain DHTs. Due to the large sizes of document term vocabularies, joining peers cause huge numbers of key inserts, and subsequently large numbers of index maintenance messages. Thus, the key to exploiting DHTs for distributed information retrieval is to reduce index maintenance. We show that this can be achieved by combining DHTs with peer clustering. Peers are first clustered into communities, each of the communities having a representative super-peer. Then all occurrences of a term in a community are published to the global DHT in a batch by the representative super-peer. Our evaluation shows that this reduces index maintenance cost by an order of magnitude, while still keeping a complete and correct term index for query processing.

UR - http://www.scopus.com/inward/record.url?scp=34548737053&partnerID=8YFLogxK

U2 - 10.1109/AINA.2007.60

DO - 10.1109/AINA.2007.60

M3 - Conference contribution

AN - SCOPUS:34548737053

SN - 0769528465

SN - 9780769528465

T3 - Proceedings - International Conference on Advanced Information Networking and Applications, AINA

SP - 84

EP - 93

BT - Proceedings

T2 - 21st International Conference on Advanced Information Networking and Applications (AINA 2007)

Y2 - 21 May 2007 through 23 May 2007

ER -
