Site level noise removal for search engines

Publication: Contribution to book/report/anthology/conference proceedings; conference paper; research; peer-reviewed

Authors

  • André Luiz Da Costa Carvalho
  • Paul Alexandru Chirita
  • Edleno Silva De Moura
  • Pável Calado
  • Wolfgang Nejdl

External organisations

  • Universidade Federal do Amazonas
  • INESC-ID

Details

Original language: English
Title of host publication: Proceedings of the 15th International Conference on World Wide Web
Publisher: Association for Computing Machinery (ACM)
Pages: 73-82
Number of pages: 10
ISBN (print): 1595933239, 9781595933232
Publication status: Published - 23 May 2006
Event: 15th International Conference on World Wide Web - Edinburgh, Scotland, United Kingdom
Duration: 23 May 2006 to 26 May 2006

Publication series

Name: Proceedings of the 15th International Conference on World Wide Web

Abstract

The currently booming search engine industry has determined many online organizations to attempt to artificially increase their ranking in order to attract more visitors to their web sites. At the same time, the growth of the web has also inherently generated several navigational hyperlink structures that have a negative impact on the importance measures employed by current search engines. In this paper we propose and evaluate algorithms for identifying all these noisy links on the web graph, may them be spam or simple relationships between real world entities represented by sites, replication of content, etc. Unlike prior work, we target a different type of noisy link structures, residing at the site level, instead of the page level. We thus investigate and annihilate site level mutual reinforcement relationships, abnormal support coming from one site towards another, as well as complex link alliances between web sites. Our experiments with the link database of the TodoBR search engine show a very strong increase in the quality of the output rankings after having applied our techniques.
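
To make the idea of site level mutual reinforcement concrete, here is a minimal Python sketch. It is not the algorithm from the paper: the rule (flag two sites that link to each other at least `threshold` times in both directions), the function names, the threshold value, and the use of the host name as the site identifier are all assumptions chosen for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def site_of(url):
    """Map a page URL to its site; here simply the lowercased host name (assumption)."""
    return urlsplit(url).netloc.lower()


def site_link_counts(page_links):
    """Aggregate page-level links into counts of links between distinct sites."""
    counts = Counter()
    for src, dst in page_links:
        s, d = site_of(src), site_of(dst)
        if s != d:  # links inside one site are not counted
            counts[(s, d)] += 1
    return counts


def mutually_reinforcing_pairs(counts, threshold=2):
    """Return site pairs that link to each other at least `threshold` times each way."""
    pairs = set()
    for (s, d), n in counts.items():
        # Visit each unordered pair once by requiring s < d.
        if s < d and n >= threshold and counts.get((d, s), 0) >= threshold:
            pairs.add((s, d))
    return pairs


def drop_links_between(page_links, pairs):
    """Remove page-level links whose endpoints belong to a flagged site pair."""
    kept = []
    for src, dst in page_links:
        s, d = site_of(src), site_of(dst)
        if (min(s, d), max(s, d)) in pairs:
            continue
        kept.append((src, dst))
    return kept


# Hypothetical toy link data; the URLs are placeholders.
page_links = [
    ("http://a.example/1", "http://b.example/x"),
    ("http://a.example/2", "http://b.example/y"),
    ("http://b.example/x", "http://a.example/1"),
    ("http://b.example/z", "http://a.example/2"),
    ("http://a.example/1", "http://c.example/"),
    ("http://a.example/1", "http://a.example/2"),
]

flagged = mutually_reinforcing_pairs(site_link_counts(page_links))
kept = drop_links_between(page_links, flagged)
```

In this toy data the sites a.example and b.example exchange two links in each direction, so the pair is flagged and the four links between them are dropped before any ranking would be computed. The link to c.example and the link within a.example are kept.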

Cite

Site level noise removal for search engines. / Da Costa Carvalho, André Luiz; Chirita, Paul Alexandru; De Moura, Edleno Silva et al.
Proceedings of the 15th International Conference on World Wide Web. Association for Computing Machinery (ACM), 2006. pp. 73-82 (Proceedings of the 15th International Conference on World Wide Web).

Da Costa Carvalho, AL, Chirita, PA, De Moura, ES, Calado, P & Nejdl, W 2006, Site level noise removal for search engines. in Proceedings of the 15th International Conference on World Wide Web. Proceedings of the 15th International Conference on World Wide Web, Association for Computing Machinery (ACM), pp. 73-82, 15th International Conference on World Wide Web, Edinburgh, Scotland, United Kingdom, 23 May 2006. https://doi.org/10.1145/1135777.1135793
Da Costa Carvalho, A. L., Chirita, P. A., De Moura, E. S., Calado, P., & Nejdl, W. (2006). Site level noise removal for search engines. In Proceedings of the 15th International Conference on World Wide Web (pp. 73-82). (Proceedings of the 15th International Conference on World Wide Web). Association for Computing Machinery (ACM). https://doi.org/10.1145/1135777.1135793
Da Costa Carvalho AL, Chirita PA, De Moura ES, Calado P, Nejdl W. Site level noise removal for search engines. in Proceedings of the 15th International Conference on World Wide Web. Association for Computing Machinery (ACM). 2006. pp. 73-82. (Proceedings of the 15th International Conference on World Wide Web). doi: 10.1145/1135777.1135793
Da Costa Carvalho, André Luiz ; Chirita, Paul Alexandru ; De Moura, Edleno Silva et al. / Site level noise removal for search engines. Proceedings of the 15th International Conference on World Wide Web. Association for Computing Machinery (ACM), 2006. pp. 73-82 (Proceedings of the 15th International Conference on World Wide Web).
@inproceedings{855776d89ec8461b8fd64c01f3264d16,
title = "Site level noise removal for search engines",
abstract = "The currently booming search engine industry has determined many online organizations to attempt to artificially increase their ranking in order to attract more visitors to their web sites. At the same time, the growth of the web has also inherently generated several navigational hyperlink structures that have a negative impact on the importance measures employed by current search engines. In this paper we propose and evaluate algorithms for identifying all these noisy links on the web graph, may them be spam or simple relationships between real world entities represented by sites, replication of content, etc. Unlike prior work, we target a different type of noisy link structures, residing at the site level, instead of the page level. We thus investigate and annihilate site level mutual reinforcement relationships, abnormal support coming from one site towards another, as well as complex link alliances between web sites. Our experiments with the link database of the TodoBR search engine show a very strong increase in the quality of the output rankings after having applied our techniques.",
keywords = "Link analysis, Noise reduction, PageRank, Spam",
author = "{Da Costa Carvalho}, {Andr{\'e} Luiz} and Chirita, {Paul Alexandru} and {De Moura}, {Edleno Silva} and P{\'a}vel Calado and Wolfgang Nejdl",
year = "2006",
month = may,
day = "23",
doi = "10.1145/1135777.1135793",
language = "English",
isbn = "1595933239",
series = "Proceedings of the 15th International Conference on World Wide Web",
publisher = "Association for Computing Machinery (ACM)",
pages = "73--82",
booktitle = "Proceedings of the 15th International Conference on World Wide Web",
address = "United States",
note = "15th International Conference on World Wide Web ; Conference date: 23-05-2006 Through 26-05-2006",

}

TY - GEN
T1 - Site level noise removal for search engines
AU - Da Costa Carvalho, André Luiz
AU - Chirita, Paul Alexandru
AU - De Moura, Edleno Silva
AU - Calado, Pável
AU - Nejdl, Wolfgang
PY - 2006/5/23
Y1 - 2006/5/23

N2 - The currently booming search engine industry has determined many online organizations to attempt to artificially increase their ranking in order to attract more visitors to their web sites. At the same time, the growth of the web has also inherently generated several navigational hyperlink structures that have a negative impact on the importance measures employed by current search engines. In this paper we propose and evaluate algorithms for identifying all these noisy links on the web graph, may them be spam or simple relationships between real world entities represented by sites, replication of content, etc. Unlike prior work, we target a different type of noisy link structures, residing at the site level, instead of the page level. We thus investigate and annihilate site level mutual reinforcement relationships, abnormal support coming from one site towards another, as well as complex link alliances between web sites. Our experiments with the link database of the TodoBR search engine show a very strong increase in the quality of the output rankings after having applied our techniques.

AB - The currently booming search engine industry has determined many online organizations to attempt to artificially increase their ranking in order to attract more visitors to their web sites. At the same time, the growth of the web has also inherently generated several navigational hyperlink structures that have a negative impact on the importance measures employed by current search engines. In this paper we propose and evaluate algorithms for identifying all these noisy links on the web graph, may them be spam or simple relationships between real world entities represented by sites, replication of content, etc. Unlike prior work, we target a different type of noisy link structures, residing at the site level, instead of the page level. We thus investigate and annihilate site level mutual reinforcement relationships, abnormal support coming from one site towards another, as well as complex link alliances between web sites. Our experiments with the link database of the TodoBR search engine show a very strong increase in the quality of the output rankings after having applied our techniques.

KW - Link analysis
KW - Noise reduction
KW - PageRank
KW - Spam
UR - http://www.scopus.com/inward/record.url?scp=34250686269&partnerID=8YFLogxK
U2 - 10.1145/1135777.1135793
DO - 10.1145/1135777.1135793
M3 - Conference contribution
AN - SCOPUS:34250686269
SN - 1595933239
SN - 9781595933232
T3 - Proceedings of the 15th International Conference on World Wide Web
SP - 73
EP - 82
BT - Proceedings of the 15th International Conference on World Wide Web
PB - Association for Computing Machinery (ACM)
T2 - 15th International Conference on World Wide Web
Y2 - 23 May 2006 through 26 May 2006
ER -
