DL meets P2P: Distributed document retrieval based on classification and content

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

Wolf Tilo Balke, Wolfgang Nejdl, Wolf Siberski, Uwe Thaden

Details

Original language: English
Title of host publication: ECDL 2005
Subtitle of host publication: Research and Advanced Technology for Digital Libraries
Pages: 379-390
Number of pages: 12
ISBN (electronic): 978-3-540-31931-3
Publication status: Published - 2005
Event: 9th European Conference on Research and Advanced Technology for Digital Libraries, ECDL 2005 - Vienna, Austria
Duration: 18 Sept 2005 to 23 Sept 2005

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3652 LNCS
ISSN (Print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

Peer-to-peer architectures are a potentially powerful paradigm for retrieving documents over networks of digital libraries avoiding single points of failure by massive federation of (independent) information sources. Today sharing files over P2P infrastructures is already immensely successful, but restricted to simple metadata matching. But when it comes to the retrieval of complex documents, capabilities as provided by digital libraries are needed. Digital libraries have to cope with compound documents. Though some document parts (like embedded images) can efficiently be retrieved using metadata matching, the text-based information needs different methods like full text search. However, for effective querying of texts, also information like inverted document frequencies are essential. But due to the distributed characteristics of P2P networks such 'collection-wide' information poses severe problems, e.g. that central updates whenever changes in any document collection occur use up valuable bandwidth. We will present a novel indexing technique that allows to query using collection-wide information with respect to different classifications and show the effectiveness of our scheme for practical applications. We will in detail discuss our findings and present simulations for the scheme's efficiency and scalability.
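
To make the abstract's point about collection-wide information concrete, the following minimal Python sketch shows why inverse document frequency (the abstract's "inverted document frequencies") cannot be computed from a single peer's collection alone. It is an illustration only and not the indexing technique the paper presents: the function names (local_stats, merge_stats, idf) and the toy peer collections are hypothetical, and the naive merge of every peer's statistics is the kind of central aggregation the abstract describes as costly in bandwidth.

```python
import math
from collections import Counter


def local_stats(docs):
    """Return (document count, document frequency per term) for one peer."""
    df = Counter()
    for text in docs:
        # Count each term at most once per document.
        df.update(set(text.lower().split()))
    return len(docs), df


def merge_stats(peer_stats):
    """Naively combine per-peer statistics into collection-wide statistics."""
    total_docs = 0
    total_df = Counter()
    for n, df in peer_stats:
        total_docs += n
        total_df.update(df)  # adds the per-term counts
    return total_docs, total_df


def idf(term, total_docs, total_df):
    """Plain log(N / df) weighting; 0.0 for terms that occur nowhere."""
    count = total_df[term]
    return math.log(total_docs / count) if count else 0.0


# Hypothetical toy collections held by two peers.
peer_a = ["digital library retrieval", "peer network retrieval"]
peer_b = ["digital image metadata", "library metadata"]

n_a, df_a = local_stats(peer_a)
n, df = merge_stats([local_stats(peer_a), local_stats(peer_b)])

# Within peer_a alone, "retrieval" occurs in every document and gets weight 0;
# across both peers it occurs in only half of the documents.
print(idf("retrieval", n_a, df_a), idf("retrieval", n, df))
```

The two weights differ because the term is common on one peer but less common in the federation as a whole, so ranking with purely local statistics would not be comparable across peers.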

Cite this

DL meets P2P: Distributed document retrieval based on classification and content. / Balke, Wolf Tilo; Nejdl, Wolfgang; Siberski, Wolf et al.
ECDL 2005: Research and Advanced Technology for Digital Libraries. 2005. p. 379-390 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 3652 LNCS).

Balke, WT, Nejdl, W, Siberski, W & Thaden, U 2005, DL meets P2P: Distributed document retrieval based on classification and content. in ECDL 2005: Research and Advanced Technology for Digital Libraries. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 3652 LNCS, pp. 379-390, 9th European Conference on Research and Advanced Technology for Digital Libraries, ECDL 2005, Vienna, Austria, 18 Sept 2005. https://doi.org/10.1007/11551362_34
Balke, W. T., Nejdl, W., Siberski, W., & Thaden, U. (2005). DL meets P2P: Distributed document retrieval based on classification and content. In ECDL 2005: Research and Advanced Technology for Digital Libraries (pp. 379-390). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 3652 LNCS). https://doi.org/10.1007/11551362_34
Balke WT, Nejdl W, Siberski W, Thaden U. DL meets P2P: Distributed document retrieval based on classification and content. In ECDL 2005: Research and Advanced Technology for Digital Libraries. 2005. p. 379-390. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/11551362_34
Balke, Wolf Tilo; Nejdl, Wolfgang; Siberski, Wolf et al. / DL meets P2P: Distributed document retrieval based on classification and content. ECDL 2005: Research and Advanced Technology for Digital Libraries. 2005. pp. 379-390 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{eebc5a0e31ec47ce8cb66b89291b94d1,
title = "DL meets P2P: Distributed document retrieval based on classification and content",
abstract = "Peer-to-peer architectures are a potentially powerful paradigm for retrieving documents over networks of digital libraries avoiding single points of failure by massive federation of (independent) information sources. Today sharing files over P2P infrastructures is already immensely successful, but restricted to simple metadata matching. But when it comes to the retrieval of complex documents, capabilities as provided by digital libraries are needed. Digital libraries have to cope with compound documents. Though some document parts (like embedded images) can efficiently be retrieved using metadata matching, the text-based information needs different methods like full text search. However, for effective querying of texts, also information like inverted document frequencies are essential. But due to the distributed characteristics of P2P networks such 'collection-wide' information poses severe problems, e.g. that central updates whenever changes in any document collection occur use up valuable bandwidth. We will present a novel indexing technique that allows to query using collection-wide information with respect to different classifications and show the effectiveness of our scheme for practical applications. We will in detail discuss our findings and present simulations for the scheme's efficiency and scalability.",
author = "Balke, {Wolf Tilo} and Wolfgang Nejdl and Wolf Siberski and Uwe Thaden",
year = "2005",
doi = "10.1007/11551362_34",
language = "English",
isbn = "978-3-540-28767-4",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
volume = "3652",
pages = "379--390",
booktitle = "ECDL 2005",
note = "9th European Conference on Research and Advanced Technology for Digital Libraries, ECDL 2005 ; Conference date: 18-09-2005 Through 23-09-2005",
}

TY - GEN
T1 - DL meets P2P: Distributed document retrieval based on classification and content
T2 - 9th European Conference on Research and Advanced Technology for Digital Libraries, ECDL 2005
AU - Balke, Wolf Tilo
AU - Nejdl, Wolfgang
AU - Siberski, Wolf
AU - Thaden, Uwe
PY - 2005
Y1 - 2005
N2 - Peer-to-peer architectures are a potentially powerful paradigm for retrieving documents over networks of digital libraries avoiding single points of failure by massive federation of (independent) information sources. Today sharing files over P2P infrastructures is already immensely successful, but restricted to simple metadata matching. But when it comes to the retrieval of complex documents, capabilities as provided by digital libraries are needed. Digital libraries have to cope with compound documents. Though some document parts (like embedded images) can efficiently be retrieved using metadata matching, the text-based information needs different methods like full text search. However, for effective querying of texts, also information like inverted document frequencies are essential. But due to the distributed characteristics of P2P networks such 'collection-wide' information poses severe problems, e.g. that central updates whenever changes in any document collection occur use up valuable bandwidth. We will present a novel indexing technique that allows to query using collection-wide information with respect to different classifications and show the effectiveness of our scheme for practical applications. We will in detail discuss our findings and present simulations for the scheme's efficiency and scalability.

AB - Peer-to-peer architectures are a potentially powerful paradigm for retrieving documents over networks of digital libraries avoiding single points of failure by massive federation of (independent) information sources. Today sharing files over P2P infrastructures is already immensely successful, but restricted to simple metadata matching. But when it comes to the retrieval of complex documents, capabilities as provided by digital libraries are needed. Digital libraries have to cope with compound documents. Though some document parts (like embedded images) can efficiently be retrieved using metadata matching, the text-based information needs different methods like full text search. However, for effective querying of texts, also information like inverted document frequencies are essential. But due to the distributed characteristics of P2P networks such 'collection-wide' information poses severe problems, e.g. that central updates whenever changes in any document collection occur use up valuable bandwidth. We will present a novel indexing technique that allows to query using collection-wide information with respect to different classifications and show the effectiveness of our scheme for practical applications. We will in detail discuss our findings and present simulations for the scheme's efficiency and scalability.

UR - http://www.scopus.com/inward/record.url?scp=33645994781&partnerID=8YFLogxK
U2 - 10.1007/11551362_34
DO - 10.1007/11551362_34
M3 - Conference contribution
AN - SCOPUS:33645994781
SN - 978-3-540-28767-4
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 379
EP - 390
BT - ECDL 2005
Y2 - 18 September 2005 through 23 September 2005
ER -
