Optimizing Near Duplicate Detection for P2P Networks

Publication: Contribution to book/report/anthology/conference proceedings › Conference paper › Research › Peer-reviewed


Details

Original language: English
Title of host publication: 2010 IEEE 10th International Conference on Peer-to-Peer Computing, P2P 2010 - Proceedings
Publication status: Published - 22 Nov. 2010
Event: 2010 IEEE 10th International Conference on Peer-to-Peer Computing, P2P 2010 - Delft, Netherlands
Duration: 25 Aug. 2010 to 27 Aug. 2010

Publication series

Name: 2010 IEEE 10th International Conference on Peer-to-Peer Computing, P2P 2010 - Proceedings

Abstract

In this paper, we propose a probabilistic algorithm for detecting near duplicate text, audio, and video resources efficiently and effectively in large-scale P2P systems. To this end, we present a thorough cost and probabilistic analysis that allows the algorithm to adapt to network and data collection characteristics for minimizing network cost. In addition, we extend the algorithm so that it can identify similar videos, even if some of the videos are split into different files. A thorough theoretical analysis as well as a large-scale experimental evaluation on networks of up to 100,000 peers using real-world datasets of more than 200 Gbytes demonstrate the viability of our approach.

Cite this

Optimizing Near Duplicate Detection for P2P Networks. / Papapetrou, Odysseas; Ramesh, Sukriti; Siersdorfer, Stefan et al.
2010 IEEE 10th International Conference on Peer-to-Peer Computing, P2P 2010 - Proceedings. 2010. 5570001 (2010 IEEE 10th International Conference on Peer-to-Peer Computing, P2P 2010 - Proceedings).

Papapetrou, O, Ramesh, S, Siersdorfer, S & Nejdl, W 2010, Optimizing Near Duplicate Detection for P2P Networks. in 2010 IEEE 10th International Conference on Peer-to-Peer Computing, P2P 2010 - Proceedings., 5570001, 2010 IEEE 10th International Conference on Peer-to-Peer Computing, P2P 2010 - Proceedings, 2010 IEEE 10th International Conference on Peer-to-Peer Computing, P2P 2010, Delft, Netherlands, 25 Aug. 2010. https://doi.org/10.1109/P2P.2010.5570001
Papapetrou, O., Ramesh, S., Siersdorfer, S., & Nejdl, W. (2010). Optimizing Near Duplicate Detection for P2P Networks. In 2010 IEEE 10th International Conference on Peer-to-Peer Computing, P2P 2010 - Proceedings, Article 5570001 (2010 IEEE 10th International Conference on Peer-to-Peer Computing, P2P 2010 - Proceedings). https://doi.org/10.1109/P2P.2010.5570001
Papapetrou O, Ramesh S, Siersdorfer S, Nejdl W. Optimizing Near Duplicate Detection for P2P Networks. in 2010 IEEE 10th International Conference on Peer-to-Peer Computing, P2P 2010 - Proceedings. 2010. 5570001. (2010 IEEE 10th International Conference on Peer-to-Peer Computing, P2P 2010 - Proceedings). doi: 10.1109/P2P.2010.5570001
Papapetrou, Odysseas ; Ramesh, Sukriti ; Siersdorfer, Stefan et al. / Optimizing Near Duplicate Detection for P2P Networks. 2010 IEEE 10th International Conference on Peer-to-Peer Computing, P2P 2010 - Proceedings. 2010. (2010 IEEE 10th International Conference on Peer-to-Peer Computing, P2P 2010 - Proceedings).
@inproceedings{e1dc42e13c9a48d587a78c5504ed586b,
title = "Optimizing Near Duplicate Detection for P2P Networks",
abstract = "In this paper, we propose a probabilistic algorithm for detecting near duplicate text, audio, and video resources efficiently and effectively in large-scale P2P systems. To this end, we present a thorough cost and probabilistic analysis that allows the algorithm to adapt to network and data collection characteristics for minimizing network cost. In addition, we extend the algorithm so that it can identify similar videos, even if some of the videos are split into different files. A thorough theoretical analysis as well as a large-scale experimental evaluation on networks of up to 100,000 peers using real-world datasets of more than 200 Gbytes demonstrate the viability of our approach.",
author = "Odysseas Papapetrou and Sukriti Ramesh and Stefan Siersdorfer and Wolfgang Nejdl",
year = "2010",
month = nov,
day = "22",
doi = "10.1109/P2P.2010.5570001",
language = "English",
isbn = "9781424471416",
series = "2010 IEEE 10th International Conference on Peer-to-Peer Computing, P2P 2010 - Proceedings",
booktitle = "2010 IEEE 10th International Conference on Peer-to-Peer Computing, P2P 2010 - Proceedings",
note = "2010 IEEE 10th International Conference on Peer-to-Peer Computing, P2P 2010 ; Conference date: 25-08-2010 Through 27-08-2010",

}

TY  - GEN
T1  - Optimizing Near Duplicate Detection for P2P Networks
AU  - Papapetrou, Odysseas
AU  - Ramesh, Sukriti
AU  - Siersdorfer, Stefan
AU  - Nejdl, Wolfgang
PY  - 2010/11/22
Y1  - 2010/11/22
N2  - In this paper, we propose a probabilistic algorithm for detecting near duplicate text, audio, and video resources efficiently and effectively in large-scale P2P systems. To this end, we present a thorough cost and probabilistic analysis that allows the algorithm to adapt to network and data collection characteristics for minimizing network cost. In addition, we extend the algorithm so that it can identify similar videos, even if some of the videos are split into different files. A thorough theoretical analysis as well as a large-scale experimental evaluation on networks of up to 100,000 peers using real-world datasets of more than 200 Gbytes demonstrate the viability of our approach.
AB  - In this paper, we propose a probabilistic algorithm for detecting near duplicate text, audio, and video resources efficiently and effectively in large-scale P2P systems. To this end, we present a thorough cost and probabilistic analysis that allows the algorithm to adapt to network and data collection characteristics for minimizing network cost. In addition, we extend the algorithm so that it can identify similar videos, even if some of the videos are split into different files. A thorough theoretical analysis as well as a large-scale experimental evaluation on networks of up to 100,000 peers using real-world datasets of more than 200 Gbytes demonstrate the viability of our approach.
UR  - http://www.scopus.com/inward/record.url?scp=78349238180&partnerID=8YFLogxK
U2  - 10.1109/P2P.2010.5570001
DO  - 10.1109/P2P.2010.5570001
M3  - Conference contribution
AN  - SCOPUS:78349238180
SN  - 9781424471416
T3  - 2010 IEEE 10th International Conference on Peer-to-Peer Computing, P2P 2010 - Proceedings
BT  - 2010 IEEE 10th International Conference on Peer-to-Peer Computing, P2P 2010 - Proceedings
T2  - 2010 IEEE 10th International Conference on Peer-to-Peer Computing, P2P 2010
Y2  - 25 August 2010 through 27 August 2010
ER  -
