Details
Original language | English |
---|---|
Pages (from-to) | 2294-2312 |
Number of pages | 19 |
Journal | Journal of the American Society for Information Science and Technology |
Volume | 63 |
Issue number | 11 |
Publication status | Published - 16 Oct 2012 |
Abstract
Search engines are essential tools for web users today. They rely on a large number of features to compute the rank of search results for each given query. The estimated reputation of pages is among the effective features available for search engine designers, probably being adopted by most current commercial search engines. Page reputation is estimated by analyzing the linkage relationships between pages. This information is used by link analysis algorithms as a query-independent feature, to be taken into account when computing the rank of the results. Unfortunately, several types of links found on the web may damage the estimated page reputation and thus cause a negative effect on the quality of search results. This work studies alternatives to reduce the negative impact of such noisy links. More specifically, the authors propose and evaluate new methods that deal with noisy links, considering scenarios where the reputation of pages is computed using the PageRank algorithm. They show, through experiments with real web content, that their methods achieve significant improvements when compared to previous solutions proposed in the literature.
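The abstract's central idea, that page reputation comes from link structure and that untrustworthy links can distort it, can be illustrated with a small sketch. The code below is not the authors' method: it is a plain power-iteration PageRank in which each link carries a hypothetical weight between 0 and 1, and the function name `weighted_pagerank`, the toy graph, and the weight values are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# (source page, target page, link weight in [0, 1]); the weight is a
# hypothetical stand-in for how much a link is trusted.
Edge = Tuple[str, str, float]


def weighted_pagerank(edges: List[Edge], damping: float = 0.85,
                      iterations: int = 100) -> Dict[str, float]:
    """Power-iteration PageRank in which each link carries a weight."""
    nodes = sorted({n for src, dst, _ in edges for n in (src, dst)})
    n = len(nodes)

    # Total outgoing weight per page, used to normalise that page's votes.
    out_weight = {node: 0.0 for node in nodes}
    for src, _, w in edges:
        out_weight[src] += w

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Pages with no weighted out-links spread their rank uniformly.
        dangling = sum(rank[v] for v in nodes if out_weight[v] == 0.0)
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n
                    for v in nodes}
        for src, dst, w in edges:
            if out_weight[src] > 0.0:
                new_rank[dst] += damping * rank[src] * w / out_weight[src]
        rank = new_rank
    return rank


def toy_graph(noisy_weight: float) -> List[Edge]:
    """Three pages; the hub -> promo link is treated as the noisy one."""
    return [
        ("hub", "good", 1.0),
        ("hub", "promo", noisy_weight),
        ("good", "hub", 1.0),
        ("promo", "hub", 1.0),
    ]


all_links_trusted = weighted_pagerank(toy_graph(1.0))
noisy_link_discounted = weighted_pagerank(toy_graph(0.2))
print(all_links_trusted)
print(noisy_link_discounted)
```

Because each page's votes are normalised by its total outgoing weight, a weight only matters relative to the other links on the same page: discounting the `hub -> promo` link shifts reputation from `promo` to `good`, while the scores still sum to 1.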
Keywords
- information retrieval software, information storage and retrieval systems, search engines
ASJC Scopus subject areas
- Computer Science(all)
- Software
- Information Systems
- Human-Computer Interaction
- Computer Networks and Communications
- Artificial Intelligence
Cite this
In: Journal of the American Society for Information Science and Technology, Vol. 63, No. 11, 16.10.2012, p. 2294-2312.
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - Using Site-Level Connections to Estimate Link Confidence
AU - Souza, Jucimar
AU - Carvalho, André
AU - Cristo, Marco
AU - Moura, Edleno
AU - Calado, Pavel
AU - Chirita, Paul Alexandru
AU - Nejdl, Wolfgang
PY - 2012/10/16
Y1 - 2012/10/16
AB - Search engines are essential tools for web users today. They rely on a large number of features to compute the rank of search results for each given query. The estimated reputation of pages is among the effective features available for search engine designers, probably being adopted by most current commercial search engines. Page reputation is estimated by analyzing the linkage relationships between pages. This information is used by link analysis algorithms as a query-independent feature, to be taken into account when computing the rank of the results. Unfortunately, several types of links found on the web may damage the estimated page reputation and thus cause a negative effect on the quality of search results. This work studies alternatives to reduce the negative impact of such noisy links. More specifically, the authors propose and evaluate new methods that deal with noisy links, considering scenarios where the reputation of pages is computed using the PageRank algorithm. They show, through experiments with real web content, that their methods achieve significant improvements when compared to previous solutions proposed in the literature.
KW - information retrieval software
KW - information storage and retrieval systems
KW - search engines
UR - http://www.scopus.com/inward/record.url?scp=84868203478&partnerID=8YFLogxK
U2 - 10.1002/asi.22729
DO - 10.1002/asi.22729
M3 - Article
AN - SCOPUS:84868203478
VL - 63
SP - 2294
EP - 2312
JO - Journal of the American Society for Information Science and Technology
JF - Journal of the American Society for Information Science and Technology
SN - 1532-2882
IS - 11
ER -