Details
Original language | English |
---|---|
Title of host publication | Proceedings |
Subtitle of host publication | 21st International Conference on Advanced Information Networking and Applications, AINA 2007 |
Pages | 84-93 |
Number of pages | 10 |
ISBN (electronic) | 978-1-5090-8717-4 |
Publication status | Published - 2007 |
Event | 21st International Conference on Advanced Information Networking and Applications (AINA 2007) - Niagara Falls, ON, Canada. Duration: 21 May 2007 → 23 May 2007. Conference number: 21 |
Publication series
Name | Proceedings - International Conference on Advanced Information Networking and Applications, AINA |
---|---|
ISSN (Print) | 1550-445X |
Abstract
Distributed Hash Tables (DHTs) are very efficient for querying based on key lookups, if only a small number of keys has to be registered by each individual peer. However, building huge term indexes, as required for IR-style keyword search, is impractical with plain DHTs. Due to the large sizes of document term vocabularies, joining peers cause huge numbers of key inserts, and subsequently large numbers of index maintenance messages. Thus, the key to exploiting DHTs for distributed information retrieval is to reduce index maintenance. We show that this can be achieved by combining DHTs with peer clustering. Peers are first clustered into communities, each of the communities having a representative super-peer. Then all occurrences of a term in a community are published to the global DHT in a batch by the representative super-peer. Our evaluation shows that this reduces index maintenance cost by an order of magnitude, while still keeping a complete and correct term index for query processing.
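The batching idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the in-memory dictionary standing in for the global DHT, and the toy peers and communities are all assumptions made for illustration. It only counts initial insert messages and does not model clustering, churn, or the maintenance traffic measured in the paper.

```python
from collections import defaultdict


def plain_dht_inserts(peers):
    """Plain DHT: every peer inserts each of its own terms.

    `peers` maps a peer id to the set of terms in its documents.
    Returns the number of insert messages (one per peer/term pair).
    """
    return sum(len(terms) for terms in peers.values())


def clustered_dht_publish(peers, communities):
    """Clustered variant: each super-peer publishes per-term batches.

    `communities` maps a (hypothetical) super-peer id to its member peers.
    Returns the toy global index and the number of insert messages
    (one per distinct term per community).
    """
    index = defaultdict(dict)  # term -> {super_peer: set of member peers}
    messages = 0
    for super_peer, members in communities.items():
        # Collect all occurrences of each term inside this community.
        batch = defaultdict(set)
        for peer in members:
            for term in peers[peer]:
                batch[term].add(peer)
        # One batched insert per distinct term, sent by the super-peer.
        for term, holders in batch.items():
            index[term][super_peer] = holders
            messages += 1
    return index, messages


def lookup(index, term):
    """Return every peer holding `term`, across all communities."""
    holders = set()
    for members in index.get(term, {}).values():
        holders |= members
    return holders


# Toy data (assumed for illustration only).
peers = {
    "p1": {"dht", "index"},
    "p2": {"dht", "peer"},
    "p3": {"dht", "index"},
    "p4": {"query"},
}
communities = {"p1": ["p1", "p2", "p3"], "p4": ["p4"]}

plain_count = plain_dht_inserts(peers)
index, clustered_count = clustered_dht_publish(peers, communities)
```

In this toy setup the plain scheme sends one insert per peer/term pair, while the clustered scheme sends one per distinct term per community, and `lookup` still returns every peer that holds the term. The saving grows with the term overlap inside a community; the numbers here say nothing about the order-of-magnitude result reported in the paper.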
ASJC Scopus subject areas
- Engineering(all)
- General Engineering
Cite this
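DHTs over peer clusters for distributed information retrieval. / Papapetrou, Odysseas; Siberski, Wolf; Balke, Wolf Tilo; Nejdl, Wolfgang. Proceedings: 21st International Conference on Advanced Information Networking and Applications, AINA 2007. 2007. p. 84-93 4220880 (Proceedings - International Conference on Advanced Information Networking and Applications, AINA).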
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - DHTs over peer clusters for distributed information retrieval
AU - Papapetrou, Odysseas
AU - Siberski, Wolf
AU - Balke, Wolf Tilo
AU - Nejdl, Wolfgang
N1 - Conference code: 21
PY - 2007
Y1 - 2007
AB - Distributed Hash Tables (DHTs) are very efficient for querying based on key lookups, if only a small number of keys has to be registered by each individual peer. However, building huge term indexes, as required for IR-style keyword search, is impractical with plain DHTs. Due to the large sizes of document term vocabularies, joining peers cause huge numbers of key inserts, and subsequently large numbers of index maintenance messages. Thus, the key to exploiting DHTs for distributed information retrieval is to reduce index maintenance. We show that this can be achieved by combining DHTs with peer clustering. Peers are first clustered into communities, each of the communities having a representative super-peer. Then all occurrences of a term in a community are published to the global DHT in a batch by the representative super-peer. Our evaluation shows that this reduces index maintenance cost by an order of magnitude, while still keeping a complete and correct term index for query processing.
UR - http://www.scopus.com/inward/record.url?scp=34548737053&partnerID=8YFLogxK
U2 - 10.1109/AINA.2007.60
DO - 10.1109/AINA.2007.60
M3 - Conference contribution
AN - SCOPUS:34548737053
SN - 0769528465
SN - 9780769528465
T3 - Proceedings - International Conference on Advanced Information Networking and Applications, AINA
SP - 84
EP - 93
BT - Proceedings
T2 - 21st International Conference on Advanced Information Networking and Applications (AINA 2007)
Y2 - 21 May 2007 through 23 May 2007
ER -