An architecture for finding entities on the web

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

  • Gianluca Demartini
  • Claudiu S. Firan
  • Mihai Georgescu
  • Tereza Iofciu
  • Ralf Krestel
  • Wolfgang Nejdl

Details

Original language: English
Title of host publication: 2009 Latin American Web Congress
Subtitle of host publication: Joint LA-WEB/CLIHC Conference
Pages: 230-237
Number of pages: 8
Publication status: Published - 1 Dec 2009
Event: 2009 Latin American Web Congress - Joint LA-WEB/CLIHC Conference - Merida, Yucatan, Mexico
Duration: 9 Nov 2009 to 11 Nov 2009

Publication series

Name: 2009 Latin American Web Congress - Joint LA-WEB/CLIHC Conference

Abstract

Recent progress in research fields such as Information Extraction and Information Retrieval enables the creation of systems providing better search experiences to web users. For example, systems that retrieve entities instead of just documents have been built. In this paper we present an approach for large-scale Entity Retrieval using web collections as underlying corpus. We propose an architecture for entity extraction and entity ranking starting from web documents. This is obtained (1) using an existing web document index and (2) creating an entity centric index. We describe advantages and feasibility of our approach using state-of-the-art tools.
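
The two-step idea in the abstract (start from a document index, then build an entity-centric index used for ranking) can be illustrated with a toy sketch. This is not the authors' implementation: the corpus `DOCS`, the gazetteer `KNOWN_ENTITIES`, the substring-based extractor, and the term-overlap scoring are all hypothetical stand-ins chosen only to make the data flow concrete.

```python
from collections import defaultdict

# Hypothetical toy corpus standing in for an existing web document index.
DOCS = {
    "d1": "Wolfgang Nejdl works in Hannover on web search",
    "d2": "Hannover hosts research on entity retrieval",
    "d3": "Ralf Krestel studies web search and entity retrieval",
}

# Hypothetical gazetteer standing in for a real entity extraction tool.
KNOWN_ENTITIES = ["Wolfgang Nejdl", "Ralf Krestel", "Hannover"]


def extract_entities(text):
    """Return the known entities mentioned in text (naive substring match)."""
    return [entity for entity in KNOWN_ENTITIES if entity in text]


def build_entity_index(docs):
    """Build an entity-centric index: entity -> counts of co-occurring terms."""
    index = defaultdict(lambda: defaultdict(int))
    for text in docs.values():
        terms = text.lower().split()
        for entity in extract_entities(text):
            for term in terms:
                index[entity][term] += 1
    return index


def rank_entities(index, query):
    """Rank entities by how often the query terms co-occur with them."""
    query_terms = query.lower().split()
    scores = {}
    for entity, term_counts in index.items():
        score = sum(term_counts.get(term, 0) for term in query_terms)
        if score > 0:
            scores[entity] = score
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda entity: (-scores[entity], entity))


if __name__ == "__main__":
    entity_index = build_entity_index(DOCS)
    print(rank_entities(entity_index, "entity retrieval"))
```

The point of the sketch is only that a query returns entities rather than documents; a realistic system would replace each stand-in with the kind of state-of-the-art tools the abstract refers to.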

Keywords

    Entity retrieval, Natural language processing, Web search

Cite this

An architecture for finding entities on the web. / Demartini, Gianluca; Firan, Claudiu S.; Georgescu, Mihai et al.
2009 Latin American Web Congress: Joint LA-WEB/CLIHC Conference. 2009. p. 230-237 5341521 (2009 Latin American Web Congress - Joint LA-WEB/CLIHC Conference).

Demartini, G, Firan, CS, Georgescu, M, Iofciu, T, Krestel, R & Nejdl, W 2009, An architecture for finding entities on the web. in 2009 Latin American Web Congress: Joint LA-WEB/CLIHC Conference., 5341521, 2009 Latin American Web Congress - Joint LA-WEB/CLIHC Conference, pp. 230-237, 2009 Latin American Web Congress - Joint LA-WEB/CLIHC Conference, Merida, Yucatan, Mexico, 9 Nov 2009. https://doi.org/10.1109/LA-WEB.2009.14
Demartini, G., Firan, C. S., Georgescu, M., Iofciu, T., Krestel, R., & Nejdl, W. (2009). An architecture for finding entities on the web. In 2009 Latin American Web Congress: Joint LA-WEB/CLIHC Conference (pp. 230-237). Article 5341521 (2009 Latin American Web Congress - Joint LA-WEB/CLIHC Conference). https://doi.org/10.1109/LA-WEB.2009.14
Demartini G, Firan CS, Georgescu M, Iofciu T, Krestel R, Nejdl W. An architecture for finding entities on the web. In 2009 Latin American Web Congress: Joint LA-WEB/CLIHC Conference. 2009. p. 230-237. 5341521. (2009 Latin American Web Congress - Joint LA-WEB/CLIHC Conference). doi: 10.1109/LA-WEB.2009.14
Demartini, Gianluca ; Firan, Claudiu S. ; Georgescu, Mihai et al. / An architecture for finding entities on the web. 2009 Latin American Web Congress: Joint LA-WEB/CLIHC Conference. 2009. pp. 230-237 (2009 Latin American Web Congress - Joint LA-WEB/CLIHC Conference).
@inproceedings{828d3be6e9454552b97b0447294410a3,
title = "An architecture for finding entities on the web",
abstract = "Recent progress in research fields such as Information Extraction and Information Retrieval enables the creation of systems providing better search experiences to web users. For example, systems that retrieve entities instead of just documents have been built. In this paper we present an approach for large-scale Entity Retrieval using web collections as underlying corpus. We propose an architecture for entity extraction and entity ranking starting from web documents. This is obtained (1) using an existing web document index and (2) creating an entity centric index. We describe advantages and feasibility of our approach using state-of-the-art tools.",
keywords = "Entity retrieval, Natural language processing, Web search",
author = "Gianluca Demartini and Firan, {Claudiu S.} and Mihai Georgescu and Tereza Iofciu and Ralf Krestel and Wolfgang Nejdl",
year = "2009",
month = dec,
day = "1",
doi = "10.1109/LA-WEB.2009.14",
language = "English",
isbn = "9780769538563",
series = "2009 Latin American Web Congress - Joint LA-WEB/CLIHC Conference",
pages = "230--237",
booktitle = "2009 Latin American Web Congress",
note = "2009 Latin American Web Congress - Joint LA-WEB/CLIHC Conference ; Conference date: 09-11-2009 Through 11-11-2009",

}

TY - GEN

T1 - An architecture for finding entities on the web

AU - Demartini, Gianluca

AU - Firan, Claudiu S.

AU - Georgescu, Mihai

AU - Iofciu, Tereza

AU - Krestel, Ralf

AU - Nejdl, Wolfgang

PY - 2009/12/1

Y1 - 2009/12/1

N2 - Recent progress in research fields such as Information Extraction and Information Retrieval enables the creation of systems providing better search experiences to web users. For example, systems that retrieve entities instead of just documents have been built. In this paper we present an approach for large-scale Entity Retrieval using web collections as underlying corpus. We propose an architecture for entity extraction and entity ranking starting from web documents. This is obtained (1) using an existing web document index and (2) creating an entity centric index. We describe advantages and feasibility of our approach using state-of-the-art tools.

AB - Recent progress in research fields such as Information Extraction and Information Retrieval enables the creation of systems providing better search experiences to web users. For example, systems that retrieve entities instead of just documents have been built. In this paper we present an approach for large-scale Entity Retrieval using web collections as underlying corpus. We propose an architecture for entity extraction and entity ranking starting from web documents. This is obtained (1) using an existing web document index and (2) creating an entity centric index. We describe advantages and feasibility of our approach using state-of-the-art tools.

KW - Entity retrieval

KW - Natural language processing

KW - Web search

UR - http://www.scopus.com/inward/record.url?scp=72449182171&partnerID=8YFLogxK

U2 - 10.1109/LA-WEB.2009.14

DO - 10.1109/LA-WEB.2009.14

M3 - Conference contribution

AN - SCOPUS:72449182171

SN - 9780769538563

T3 - 2009 Latin American Web Congress - Joint LA-WEB/CLIHC Conference

SP - 230

EP - 237

BT - 2009 Latin American Web Congress

T2 - 2009 Latin American Web Congress - Joint LA-WEB/CLIHC Conference

Y2 - 9 November 2009 through 11 November 2009

ER -