DHTs over peer clusters for distributed information retrieval

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Details

Original language: English
Title of host publication: Proceedings
Subtitle of host publication: 21st International Conference on Advanced Information Networking and Applications, AINA 2007
Pages: 84-93
Number of pages: 10
ISBN (electronic): 978-1-5090-8717-4
Publication status: Published - 2007
Event: 21st International Conference on Advanced Information Networking and Applications (AINA 2007) - Niagara Falls, ON, Canada
Duration: 21 May 2007 to 23 May 2007
Conference number: 21

Publication series

Name: Proceedings - International Conference on Advanced Information Networking and Applications, AINA
ISSN (Print): 1550-445X

Abstract

Distributed Hash Tables (DHTs) are very efficient for querying based on key lookups, if only a small number of keys has to be registered by each individual peer. However, building huge term indexes, as required for IR-style keyword search, is impractical with plain DHTs. Due to the large sizes of document term vocabularies, joining peers cause huge numbers of key inserts, and subsequently large numbers of index maintenance messages. Thus, the key to exploiting DHTs for distributed information retrieval is to reduce index maintenance. We show that this can be achieved by combining DHTs with peer clustering. Peers are first clustered into communities, each of the communities having a representative super-peer. Then all occurrences of a term in a community are published to the global DHT in a batch by the representative super-peer. Our evaluation shows that this reduces index maintenance cost by an order of magnitude, while still keeping a complete and correct term index for query processing.
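
The batching idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the global DHT is modelled as a plain dictionary, one "message" is counted per key insert, and the function names (`publish_per_peer`, `publish_per_cluster`), the peer and super-peer identifiers, and the sample terms are all hypothetical. It also assumes that a batched insert carries the list of community peers holding the term, which the abstract does not specify.

```python
from collections import defaultdict

def publish_per_peer(peer_terms):
    """Plain DHT (assumed model): every peer inserts each of its terms itself."""
    index = defaultdict(set)   # term -> peers holding it
    messages = 0               # one message per key insert
    for peer, terms in peer_terms.items():
        for term in terms:
            index[term].add(peer)
            messages += 1
    return index, messages

def publish_per_cluster(peer_terms, clusters):
    """Clustered variant (assumed model): each super-peer collects the terms of
    its community and sends one batched insert per distinct term."""
    index = defaultdict(set)
    messages = 0
    for super_peer, members in clusters.items():
        batch = defaultdict(set)            # term -> community peers holding it
        for peer in members:
            for term in peer_terms[peer]:
                batch[term].add(peer)
        for term, peers in batch.items():   # one message per distinct term
            index[term] |= peers
            messages += 1
    return index, messages

# Hypothetical sample data: four peers, two communities.
peer_terms = {
    "p1": {"dht", "index"},
    "p2": {"dht", "peer"},
    "p3": {"dht", "index"},
    "p4": {"query", "peer"},
}
clusters = {"s1": ["p1", "p2", "p3"], "s2": ["p4"]}

plain_index, plain_msgs = publish_per_peer(peer_terms)
clustered_index, clustered_msgs = publish_per_cluster(peer_terms, clusters)

print(plain_msgs, clustered_msgs)                    # 8 5
print(dict(plain_index) == dict(clustered_index))    # True
```

In this toy setting the index built through super-peers is identical to the one built peer by peer, while the number of inserts drops because terms shared within a community are published once. The size of the saving depends entirely on how much vocabulary the peers in a community share; the figures here say nothing about the paper's measured results.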

Cite this

DHTs over peer clusters for distributed information retrieval. / Papapetrou, Odysseas; Siberski, Wolf; Balke, Wolf Tilo et al.
Proceedings: 21st International Conference on Advanced Information Networking and Applications, AINA 2007. 2007. p. 84-93 4220880 (Proceedings - International Conference on Advanced Information Networking and Applications, AINA).

Papapetrou, O, Siberski, W, Balke, WT & Nejdl, W 2007, DHTs over peer clusters for distributed information retrieval. in Proceedings: 21st International Conference on Advanced Information Networking and Applications, AINA 2007., 4220880, Proceedings - International Conference on Advanced Information Networking and Applications, AINA, pp. 84-93, 21st International Conference on Advanced Information Networking and Applications (AINA 2007), Niagara Falls, ON, Canada, 21 May 2007. https://doi.org/10.1109/AINA.2007.60
Papapetrou, O., Siberski, W., Balke, W. T., & Nejdl, W. (2007). DHTs over peer clusters for distributed information retrieval. In Proceedings: 21st International Conference on Advanced Information Networking and Applications, AINA 2007 (pp. 84-93). Article 4220880 (Proceedings - International Conference on Advanced Information Networking and Applications, AINA). https://doi.org/10.1109/AINA.2007.60
Papapetrou O, Siberski W, Balke WT, Nejdl W. DHTs over peer clusters for distributed information retrieval. In Proceedings: 21st International Conference on Advanced Information Networking and Applications, AINA 2007. 2007. p. 84-93. 4220880. (Proceedings - International Conference on Advanced Information Networking and Applications, AINA). doi: 10.1109/AINA.2007.60
Papapetrou, Odysseas ; Siberski, Wolf ; Balke, Wolf Tilo et al. / DHTs over peer clusters for distributed information retrieval. Proceedings: 21st International Conference on Advanced Information Networking and Applications, AINA 2007. 2007. pp. 84-93 (Proceedings - International Conference on Advanced Information Networking and Applications, AINA).
@inproceedings{d94e3580898742bcb8cf8697e82925e2,
title = "DHTs over peer clusters for distributed information retrieval",
abstract = "Distributed Hash Tables (DHTs) are very efficient for querying based on key lookups, if only a small number of keys has to be registered by each individual peer. However, building huge term indexes, as required for IR-style keyword search, are impractical with plain DHTs. Due to the large sizes of document term vocabularies, joining peers cause huge amounts of key inserts, and subsequently large numbers of index maintenance messages. Thus, the key to exploiting DHTs for distributed information retrieval is to reduce index maintenance. We show that this can be achieved by combining DHTs with peer clustering. Peers are first clustered into communities, each of the communities having a representative super-peer. Then all occurrences of a term in a community are published to the global DHT in a batch by the representative super-peer. Our evaluation shows that this reduces index maintenance cost by an order of magnitude, while still keeping a complete and correct term index for query processing.",
author = "Odysseas Papapetrou and Wolf Siberski and Balke, {Wolf Tilo} and Wolfgang Nejdl",
year = "2007",
doi = "10.1109/AINA.2007.60",
language = "English",
isbn = "0769528465",
series = "Proceedings - International Conference on Advanced Information Networking and Applications, AINA",
pages = "84--93",
booktitle = "Proceedings: 21st International Conference on Advanced Information Networking and Applications, AINA 2007",
note = "21st International Conference on Advanced Information Networking and Applications (AINA 2007) ; Conference date: 21-05-2007 Through 23-05-2007",

}

TY - GEN

T1 - DHTs over peer clusters for distributed information retrieval

AU - Papapetrou, Odysseas

AU - Siberski, Wolf

AU - Balke, Wolf Tilo

AU - Nejdl, Wolfgang

N1 - Conference code: 21

PY - 2007

Y1 - 2007

N2 - Distributed Hash Tables (DHTs) are very efficient for querying based on key lookups, if only a small number of keys has to be registered by each individual peer. However, building huge term indexes, as required for IR-style keyword search, are impractical with plain DHTs. Due to the large sizes of document term vocabularies, joining peers cause huge amounts of key inserts, and subsequently large numbers of index maintenance messages. Thus, the key to exploiting DHTs for distributed information retrieval is to reduce index maintenance. We show that this can be achieved by combining DHTs with peer clustering. Peers are first clustered into communities, each of the communities having a representative super-peer. Then all occurrences of a term in a community are published to the global DHT in a batch by the representative super-peer. Our evaluation shows that this reduces index maintenance cost by an order of magnitude, while still keeping a complete and correct term index for query processing.

AB - Distributed Hash Tables (DHTs) are very efficient for querying based on key lookups, if only a small number of keys has to be registered by each individual peer. However, building huge term indexes, as required for IR-style keyword search, are impractical with plain DHTs. Due to the large sizes of document term vocabularies, joining peers cause huge amounts of key inserts, and subsequently large numbers of index maintenance messages. Thus, the key to exploiting DHTs for distributed information retrieval is to reduce index maintenance. We show that this can be achieved by combining DHTs with peer clustering. Peers are first clustered into communities, each of the communities having a representative super-peer. Then all occurrences of a term in a community are published to the global DHT in a batch by the representative super-peer. Our evaluation shows that this reduces index maintenance cost by an order of magnitude, while still keeping a complete and correct term index for query processing.

UR - http://www.scopus.com/inward/record.url?scp=34548737053&partnerID=8YFLogxK

U2 - 10.1109/AINA.2007.60

DO - 10.1109/AINA.2007.60

M3 - Conference contribution

AN - SCOPUS:34548737053

SN - 0769528465

SN - 9780769528465

T3 - Proceedings - International Conference on Advanced Information Networking and Applications, AINA

SP - 84

EP - 93

BT - Proceedings

T2 - 21st International Conference on Advanced Information Networking and Applications (AINA 2007)

Y2 - 21 May 2007 through 23 May 2007

ER -
