Using Site-Level Connections to Estimate Link Confidence

Publication: Contribution to journal, Article, Research, Peer-reviewed

Authors

  • Jucimar Souza
  • André Carvalho
  • Marco Cristo
  • Edleno Moura
  • Pavel Calado
  • Paul Alexandru Chirita
  • Wolfgang Nejdl

Organisational units

External organisations

  • Universidade Federal do Amazonas
  • INESC-ID
  • Adobe Systems Incorporated

Details

Original language: English
Pages (from-to): 2294-2312
Number of pages: 19
Journal: Journal of the American Society for Information Science and Technology
Volume: 63
Issue number: 11
Publication status: Published - 16 Oct 2012

Abstract

Search engines are essential tools for web users today. They rely on a large number of features to compute the rank of search results for each given query. The estimated reputation of pages is among the effective features available for search engine designers, probably being adopted by most current commercial search engines. Page reputation is estimated by analyzing the linkage relationships between pages. This information is used by link analysis algorithms as a query-independent feature, to be taken into account when computing the rank of the results. Unfortunately, several types of links found on the web may damage the estimated page reputation and thus cause a negative effect on the quality of search results. This work studies alternatives to reduce the negative impact of such noisy links. More specifically, the authors propose and evaluate new methods that deal with noisy links, considering scenarios where the reputation of pages is computed using the PageRank algorithm. They show, through experiments with real web content, that their methods achieve significant improvements when compared to previous solutions proposed in the literature.
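As an illustration of the setting the abstract describes, the following is a minimal sketch of PageRank computed by power iteration over a link graph in which each link carries a weight. It is not the authors' method: the per-link weights, the function name `weighted_pagerank`, and the toy graph are assumptions made here for illustration only. The sketch shows how lowering the weight of one link, relative to the other links leaving the same page, reduces the reputation passed to its target.

```python
from collections import defaultdict


def weighted_pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over weighted links.

    links maps (source, target) -> weight >= 0. The weights are a
    hypothetical stand-in for a link confidence score; they are not
    taken from the paper.
    """
    nodes = sorted({node for edge in links for node in edge})
    n = len(nodes)

    # Total outgoing weight per page, used to normalise each page's links.
    out_weight = defaultdict(float)
    for (src, _dst), weight in links.items():
        out_weight[src] += weight

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by pages without weighted out-links is spread uniformly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0.0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for (src, dst), weight in links.items():
            if weight > 0.0:
                new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        rank = new_rank
    return rank


# Toy graph (assumed): page "a" links to "b" and "c"; both link back to "a".
uniform = weighted_pagerank(
    {("a", "b"): 1.0, ("a", "c"): 1.0, ("b", "a"): 1.0, ("c", "a"): 1.0}
)
# Same graph, but the link a -> c is treated as less trustworthy.
damped = weighted_pagerank(
    {("a", "b"): 1.0, ("a", "c"): 0.2, ("b", "a"): 1.0, ("c", "a"): 1.0}
)
```

Because each page's outgoing weights are normalised, scaling all links of a page by the same factor changes nothing in this sketch; only the relative weights among a page's out-links matter.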

ASJC Scopus subject areas

Cite this

Using Site-Level Connections to Estimate Link Confidence. / Souza, Jucimar; Carvalho, André; Cristo, Marco et al.
In: Journal of the American Society for Information Science and Technology, Vol. 63, No. 11, 16.10.2012, p. 2294-2312.

Souza J, Carvalho A, Cristo M, Moura E, Calado P, Chirita PA et al. Using Site-Level Connections to Estimate Link Confidence. Journal of the American Society for Information Science and Technology. 2012 Oct 16;63(11):2294-2312. doi: 10.1002/asi.22729
Souza, Jucimar ; Carvalho, André ; Cristo, Marco et al. / Using Site-Level Connections to Estimate Link Confidence. In: Journal of the American Society for Information Science and Technology. 2012 ; Vol. 63, No. 11. pp. 2294-2312.
@article{2354b992202c422988cd686830a19db7,
title = "Using Site-Level Connections to Estimate Link Confidence",
abstract = "Search engines are essential tools for web users today. They rely on a large number of features to compute the rank of search results for each given query. The estimated reputation of pages is among the effective features available for search engine designers, probably being adopted by most current commercial search engines. Page reputation is estimated by analyzing the linkage relationships between pages. This information is used by link analysis algorithms as a query-independent feature, to be taken into account when computing the rank of the results. Unfortunately, several types of links found on the web may damage the estimated page reputation and thus cause a negative effect on the quality of search results. This work studies alternatives to reduce the negative impact of such noisy links. More specifically, the authors propose and evaluate new methods that deal with noisy links, considering scenarios where the reputation of pages is computed using the PageRank algorithm. They show, through experiments with real web content, that their methods achieve significant improvements when compared to previous solutions proposed in the literature.",
keywords = "information retrieval software, information storage and retrieval systems, search engines",
author = "Jucimar Souza and Andr{\'e} Carvalho and Marco Cristo and Edleno Moura and Pavel Calado and Chirita, {Paul Alexandru} and Wolfgang Nejdl",
year = "2012",
month = oct,
day = "16",
doi = "10.1002/asi.22729",
language = "English",
volume = "63",
pages = "2294--2312",
journal = "Journal of the American Society for Information Science and Technology",
issn = "1532-2882",
publisher = "John Wiley and Sons Inc.",
number = "11",

}

TY - JOUR

T1 - Using Site-Level Connections to Estimate Link Confidence

AU - Souza, Jucimar

AU - Carvalho, André

AU - Cristo, Marco

AU - Moura, Edleno

AU - Calado, Pavel

AU - Chirita, Paul Alexandru

AU - Nejdl, Wolfgang

PY - 2012/10/16

Y1 - 2012/10/16

N2 - Search engines are essential tools for web users today. They rely on a large number of features to compute the rank of search results for each given query. The estimated reputation of pages is among the effective features available for search engine designers, probably being adopted by most current commercial search engines. Page reputation is estimated by analyzing the linkage relationships between pages. This information is used by link analysis algorithms as a query-independent feature, to be taken into account when computing the rank of the results. Unfortunately, several types of links found on the web may damage the estimated page reputation and thus cause a negative effect on the quality of search results. This work studies alternatives to reduce the negative impact of such noisy links. More specifically, the authors propose and evaluate new methods that deal with noisy links, considering scenarios where the reputation of pages is computed using the PageRank algorithm. They show, through experiments with real web content, that their methods achieve significant improvements when compared to previous solutions proposed in the literature.

KW - information retrieval software

KW - information storage and retrieval systems

KW - search engines

UR - http://www.scopus.com/inward/record.url?scp=84868203478&partnerID=8YFLogxK

U2 - 10.1002/asi.22729

DO - 10.1002/asi.22729

M3 - Article

AN - SCOPUS:84868203478

VL - 63

SP - 2294

EP - 2312

JO - Journal of the American Society for Information Science and Technology

JF - Journal of the American Society for Information Science and Technology

SN - 1532-2882

IS - 11

ER -
