Efficient Entity Resolution Methods for Heterogeneous Information Spaces

Publication: Contribution to book/report/anthology/conference proceedings › Conference paper › Research › Peer-reviewed

Authors

George Papadakis, Wolfgang Nejdl

Details

Original language: English
Title of host publication: ICDE Workshops 2011 - 2011 IEEE 27th International Conference on Data Engineering Workshops
Pages: 304-307
Number of pages: 4
Publication status: Published - 10 June 2011
Event: 2011 IEEE 27th International Conference on Data Engineering Workshops, ICDE 2011 - Hannover, Germany
Duration: 11 Apr. 2011 to 16 Apr. 2011

Publication series

Name: Proceedings - International Conference on Data Engineering
ISSN (Print): 1084-4627

Abstract

The Web of Data encompasses a voluminous, yet constantly expanding collection of structured and semi-structured data sets. An important prerequisite for leveraging them is the detection (and merging) of information that describes the same real-world entities, a task known as Entity Resolution. To enhance the efficiency of this quadratic task, blocking techniques are typically employed. They are, however, inapplicable to the Web of Data, due to the noise, the loose schema binding, and the unprecedented heterogeneity inherent in it. In the context of my thesis, I focus on developing novel blocking methods that scale up Entity Resolution within such large, noisy, and heterogeneous information spaces. At their core lies an attribute-agnostic mechanism that relies exclusively on the values of entity profiles in order to build blocks effectively. The resulting set of blocks is processed efficiently by intelligent techniques that minimize the required number of comparisons. Any combination of block building and block processing methods is possible, allowing for high flexibility of the overall approach. Initial experimental studies on large, real-world data sets have produced promising results.
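
To make the idea of attribute-agnostic blocking concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not the authors' implementation: it assumes that one block is created per token appearing in any attribute value, that tokens are lowercase alphanumeric runs, and that block processing amounts to counting each co-occurring pair once. The function names `build_token_blocks` and `candidate_pairs` and the toy profiles are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
import re


def build_token_blocks(profiles):
    """Map each token found in any attribute value to the ids of the profiles containing it."""
    blocks = defaultdict(set)
    for entity_id, profile in profiles.items():
        for value in profile.values():  # attribute names are deliberately ignored
            for token in re.findall(r"\w+", str(value).lower()):
                blocks[token].add(entity_id)
    # A block holding a single profile yields no comparison, so drop it.
    return {token: ids for token, ids in blocks.items() if len(ids) > 1}


def candidate_pairs(blocks):
    """Collect each co-occurring pair once, even if it appears in several blocks."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Toy profiles with different schemas (hypothetical data).
profiles = {
    "a1": {"name": "Ada Lovelace", "city": "London"},
    "b7": {"fullName": "Lovelace, Ada", "born": "1815"},
    "c3": {"label": "Charles Babbage", "location": "London"},
    "d9": {"title": "Analytical Engine"},
}

blocks = build_token_blocks(profiles)
pairs = candidate_pairs(blocks)

# Exhaustive Entity Resolution compares every pair: n * (n - 1) / 2.
brute_force = len(profiles) * (len(profiles) - 1) // 2
print(f"{len(pairs)} candidate pairs instead of {brute_force}")
```

Because the blocks are keyed on values alone, profiles "a1" and "b7" end up in the same blocks although they share no attribute name. The pair also co-occurs in two blocks, which is the kind of redundancy a block processing step would need to eliminate; the set-based deduplication here is only the simplest possible stand-in for it.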

Cite this

Efficient Entity Resolution Methods for Heterogeneous Information Spaces. / Papadakis, George; Nejdl, Wolfgang.
ICDE Workshops 2011 - 2011 IEEE 27th International Conference on Data Engineering Workshops. 2011. pp. 304-307 5767671 (Proceedings - International Conference on Data Engineering).

Papadakis, G & Nejdl, W 2011, Efficient Entity Resolution Methods for Heterogeneous Information Spaces. in ICDE Workshops 2011 - 2011 IEEE 27th International Conference on Data Engineering Workshops., 5767671, Proceedings - International Conference on Data Engineering, pp. 304-307, 2011 IEEE 27th International Conference on Data Engineering Workshops, ICDE 2011, Hannover, Germany, 11 Apr. 2011. https://doi.org/10.1109/ICDEW.2011.5767671
Papadakis, G., & Nejdl, W. (2011). Efficient Entity Resolution Methods for Heterogeneous Information Spaces. In ICDE Workshops 2011 - 2011 IEEE 27th International Conference on Data Engineering Workshops (pp. 304-307). Article 5767671 (Proceedings - International Conference on Data Engineering). https://doi.org/10.1109/ICDEW.2011.5767671
Papadakis G, Nejdl W. Efficient Entity Resolution Methods for Heterogeneous Information Spaces. in ICDE Workshops 2011 - 2011 IEEE 27th International Conference on Data Engineering Workshops. 2011. pp. 304-307. 5767671. (Proceedings - International Conference on Data Engineering). doi: 10.1109/ICDEW.2011.5767671
Papadakis, George ; Nejdl, Wolfgang. / Efficient Entity Resolution Methods for Heterogeneous Information Spaces. ICDE Workshops 2011 - 2011 IEEE 27th International Conference on Data Engineering Workshops. 2011. pp. 304-307 (Proceedings - International Conference on Data Engineering).
@inproceedings{bc0e6bade362404db4ffee73dc2dc3cd,
title = "Efficient Entity Resolution Methods for Heterogeneous Information Spaces",
abstract = "The Web of Data encompasses a voluminous, yet constantly expanding collection of structured and semi-structured data sets. An important prerequisite for leveraging them is the detection (and merging) of information that describes the same real-world entities, a task known as Entity Resolution. To enhance the efficiency of this quadratic task, blocking techniques are typically employed. They are, however, inapplicable to the Web of Data, due to the noise, the loose schema binding, and the unprecedented heterogeneity inherent in it. In the context of my thesis, I focus on developing novel blocking methods that scale up Entity Resolution within such large, noisy, and heterogeneous information spaces. At their core lies an attribute-agnostic mechanism that relies exclusively on the values of entity profiles in order to build blocks effectively. The resulting set of blocks is processed efficiently by intelligent techniques that minimize the required number of comparisons. Any combination of block building and block processing methods is possible, allowing for high flexibility of the overall approach. Initial experimental studies on large, real-world data sets have produced promising results.",
author = "George Papadakis and Wolfgang Nejdl",
year = "2011",
month = jun,
day = "10",
doi = "10.1109/ICDEW.2011.5767671",
language = "English",
isbn = "9781424491940",
series = "Proceedings - International Conference on Data Engineering",
pages = "304--307",
booktitle = "ICDE Workshops 2011 - 2011 IEEE 27th International Conference on Data Engineering Workshops",
note = "2011 IEEE 27th International Conference on Data Engineering Workshops, ICDE 2011 ; Conference date: 11-04-2011 Through 16-04-2011",

}

TY - GEN

T1 - Efficient Entity Resolution Methods for Heterogeneous Information Spaces

AU - Papadakis, George

AU - Nejdl, Wolfgang

PY - 2011/6/10

Y1 - 2011/6/10

AB - The Web of Data encompasses a voluminous, yet constantly expanding collection of structured and semi-structured data sets. An important prerequisite for leveraging them is the detection (and merging) of information that describes the same real-world entities, a task known as Entity Resolution. To enhance the efficiency of this quadratic task, blocking techniques are typically employed. They are, however, inapplicable to the Web of Data, due to the noise, the loose schema binding, and the unprecedented heterogeneity inherent in it. In the context of my thesis, I focus on developing novel blocking methods that scale up Entity Resolution within such large, noisy, and heterogeneous information spaces. At their core lies an attribute-agnostic mechanism that relies exclusively on the values of entity profiles in order to build blocks effectively. The resulting set of blocks is processed efficiently by intelligent techniques that minimize the required number of comparisons. Any combination of block building and block processing methods is possible, allowing for high flexibility of the overall approach. Initial experimental studies on large, real-world data sets have produced promising results.

UR - http://www.scopus.com/inward/record.url?scp=79958070483&partnerID=8YFLogxK

U2 - 10.1109/ICDEW.2011.5767671

DO - 10.1109/ICDEW.2011.5767671

M3 - Conference contribution

AN - SCOPUS:79958070483

SN - 9781424491940

T3 - Proceedings - International Conference on Data Engineering

SP - 304

EP - 307

BT - ICDE Workshops 2011 - 2011 IEEE 27th International Conference on Data Engineering Workshops

T2 - 2011 IEEE 27th International Conference on Data Engineering Workshops, ICDE 2011

Y2 - 11 April 2011 through 16 April 2011

ER -
