DL meets P2P: Distributed document retrieval based on classification and content

Wolf Tilo Balke; Wolfgang Nejdl; Wolf Siberski; Uwe Thaden

doi:10.1007/11551362_34

Details

Original language	English
Title of host publication	ECDL 2005
Subtitle of host publication	Research and Advanced Technology for Digital Libraries
Pages	379-390
Number of pages	12
ISBN (electronic)	978-3-540-31931-3
Publication status	Published - 2005
Event	9th European Conference on Research and Advanced Technology for Digital Libraries, ECDL 2005 - Vienna, Austria
Duration	18 Sept 2005 → 23 Sept 2005

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	3652 LNCS
ISSN (Print)	0302-9743
ISSN (electronic)	1611-3349

Abstract

Peer-to-peer architectures are a potentially powerful paradigm for retrieving documents over networks of digital libraries, avoiding single points of failure through massive federation of (independent) information sources. Today, sharing files over P2P infrastructures is already immensely successful, but it is restricted to simple metadata matching. When it comes to retrieving complex documents, however, capabilities as provided by digital libraries are needed. Digital libraries have to cope with compound documents. Though some document parts (like embedded images) can be retrieved efficiently using metadata matching, text-based information needs different methods such as full-text search. Effective querying of texts, however, also requires information such as inverse document frequencies. Because of the distributed nature of P2P networks, such 'collection-wide' information poses severe problems; for example, central updates whenever any document collection changes use up valuable bandwidth. We present a novel indexing technique that allows querying with collection-wide information with respect to different classifications, and we show the effectiveness of our scheme for practical applications. We discuss our findings in detail and present simulations of the scheme's efficiency and scalability.

ASJC Scopus subject areas

Mathematics (all)
Theoretical Computer Science
Computer Science (all)
General Computer Science

Cite this

DL meets P2P: Distributed document retrieval based on classification and content. / Balke, Wolf Tilo; Nejdl, Wolfgang; Siberski, Wolf et al.
ECDL 2005: Research and Advanced Technology for Digital Libraries. 2005. p. 379-390 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 3652 LNCS).

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Balke, WT, Nejdl, W, Siberski, W & Thaden, U 2005, DL meets P2P: Distributed document retrieval based on classification and content. in ECDL 2005: Research and Advanced Technology for Digital Libraries. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 3652 LNCS, pp. 379-390, 9th European Conference on Research and Advanced Technology for Digital Libraries, ECDL 2005, Vienna, Austria, 18 Sept 2005. https://doi.org/10.1007/11551362_34

Balke, W. T., Nejdl, W., Siberski, W., & Thaden, U. (2005). DL meets P2P: Distributed document retrieval based on classification and content. In ECDL 2005: Research and Advanced Technology for Digital Libraries (pp. 379-390). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 3652 LNCS). https://doi.org/10.1007/11551362_34

Balke WT, Nejdl W, Siberski W, Thaden U. DL meets P2P: Distributed document retrieval based on classification and content. In ECDL 2005: Research and Advanced Technology for Digital Libraries. 2005. p. 379-390. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/11551362_34

Balke, Wolf Tilo; Nejdl, Wolfgang; Siberski, Wolf et al. / DL meets P2P: Distributed document retrieval based on classification and content. ECDL 2005: Research and Advanced Technology for Digital Libraries. 2005. pp. 379-390 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{eebc5a0e31ec47ce8cb66b89291b94d1,

title = "DL meets P2P: Distributed document retrieval based on classification and content",

abstract = "Peer-to-peer architectures are a potentially powerful paradigm for retrieving documents over networks of digital libraries avoiding single points of failure by massive federation of (independent) information sources. Today sharing files over P2P infrastructures is already immensely successful, but restricted to simple metadata matching. But when it comes to the retrieval of complex documents, capabilities as provided by digital libraries are needed. Digital libraries have to cope with compound documents. Though some document parts (like embedded images) can efficiently be retrieved using metadata matching, the text-based information needs different methods like full text search. However, for effective querying of texts, also information like inverted document frequencies are essential. But due to the distributed characteristics of P2P networks such 'collection-wide' information poses severe problems, e.g. that central updates whenever changes in any document collection occur use up valuable bandwidth. We will present a novel indexing technique that allows to query using collection-wide information with respect to different classifications and show the effectiveness of our scheme for practical applications. We will in detail discuss our findings and present simulations for the scheme's efficiency and scalability.",

author = "Balke, {Wolf Tilo} and Nejdl, Wolfgang and Siberski, Wolf and Thaden, Uwe",

year = "2005",

doi = "10.1007/11551362_34",

language = "English",

isbn = "978-3-540-28767-4",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

volume = "3652",

pages = "379--390",

booktitle = "ECDL 2005",

note = "9th European Conference on Research and Advanced Technology for Digital Libraries, ECDL 2005 ; Conference date: 18-09-2005 Through 23-09-2005",

}

TY - GEN

T1 - DL meets P2P

T2 - 9th European Conference on Research and Advanced Technology for Digital Libraries, ECDL 2005

AU - Balke, Wolf Tilo

AU - Nejdl, Wolfgang

AU - Siberski, Wolf

AU - Thaden, Uwe

PY - 2005

Y1 - 2005

N2 - Peer-to-peer architectures are a potentially powerful paradigm for retrieving documents over networks of digital libraries avoiding single points of failure by massive federation of (independent) information sources. Today sharing files over P2P infrastructures is already immensely successful, but restricted to simple metadata matching. But when it comes to the retrieval of complex documents, capabilities as provided by digital libraries are needed. Digital libraries have to cope with compound documents. Though some document parts (like embedded images) can efficiently be retrieved using metadata matching, the text-based information needs different methods like full text search. However, for effective querying of texts, also information like inverted document frequencies are essential. But due to the distributed characteristics of P2P networks such 'collection-wide' information poses severe problems, e.g. that central updates whenever changes in any document collection occur use up valuable bandwidth. We will present a novel indexing technique that allows to query using collection-wide information with respect to different classifications and show the effectiveness of our scheme for practical applications. We will in detail discuss our findings and present simulations for the scheme's efficiency and scalability.

AB - Peer-to-peer architectures are a potentially powerful paradigm for retrieving documents over networks of digital libraries avoiding single points of failure by massive federation of (independent) information sources. Today sharing files over P2P infrastructures is already immensely successful, but restricted to simple metadata matching. But when it comes to the retrieval of complex documents, capabilities as provided by digital libraries are needed. Digital libraries have to cope with compound documents. Though some document parts (like embedded images) can efficiently be retrieved using metadata matching, the text-based information needs different methods like full text search. However, for effective querying of texts, also information like inverted document frequencies are essential. But due to the distributed characteristics of P2P networks such 'collection-wide' information poses severe problems, e.g. that central updates whenever changes in any document collection occur use up valuable bandwidth. We will present a novel indexing technique that allows to query using collection-wide information with respect to different classifications and show the effectiveness of our scheme for practical applications. We will in detail discuss our findings and present simulations for the scheme's efficiency and scalability.

UR - http://www.scopus.com/inward/record.url?scp=33645994781&partnerID=8YFLogxK

U2 - 10.1007/11551362_34

DO - 10.1007/11551362_34

M3 - Conference contribution

AN - SCOPUS:33645994781

SN - 978-3-540-28767-4

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 379

EP - 390

BT - ECDL 2005

Y2 - 18 September 2005 through 23 September 2005

ER -
