Details
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 429-438 |
| Number of pages | 10 |
| Journal | Proceedings of the VLDB Endowment |
| Volume | 3 |
| Issue number | 1 |
| Publication status | Published - Sept 2010 |
Abstract
Entity linkage is central to almost every data integration and data cleaning scenario. Traditional techniques use some computed similarity among data structures to perform merges and then answer queries on the merged data. We describe a novel framework for entity linkage with uncertainty. Instead of using the linkage information to merge structures a priori, possible linkages are stored alongside the data together with their belief values. A new probabilistic query answering technique is used to take the probabilistic linkage into consideration. The framework introduces a series of novelties: (i) it performs merges at run time based not only on existing linkages but also on the given query; (ii) it allows results that may contain structures not explicitly represented in the data, but generated by reasoning over the linkages; and (iii) it enables an evaluation of the query conditions that spans linked structures, offering functionality not currently supported by any traditional probabilistic database. We formally define the semantics, describe an efficient implementation, and report on the findings of our experimental evaluation.
ASJC Scopus subject areas
- Computer Science (all)
- Computer Science (miscellaneous)
- General Computer Science
Cite this
In: Proceedings of the VLDB Endowment, Vol. 3, No. 1, 09.2010, p. 429-438.
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - On-the-Fly Entity-Aware Query Processing in the Presence of Linkage
AU - Ioannou, Ekaterini
AU - Nejdl, Wolfgang
AU - Niederée, Claudia
AU - Velegrakis, Yannis
PY - 2010/9
Y1 - 2010/9
UR - http://www.scopus.com/inward/record.url?scp=79959927816&partnerID=8YFLogxK
U2 - 10.14778/1920841.1920898
DO - 10.14778/1920841.1920898
M3 - Article
AN - SCOPUS:79959927816
VL - 3
SP - 429
EP - 438
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
IS - 1
ER -