Details
Original language | English |
---|---|
Title of host publication | JCDL'11 - Proceedings of the 2011 ACM/IEEE Joint Conference on Digital Libraries |
Pages | 95-104 |
Number of pages | 10 |
Publication status | Published - 13 June 2011 |
Event | 11th Annual International ACM/IEEE Joint Conference on Digital Libraries, JCDL'11 - Ottawa, ON, Canada. Duration: 13 June 2011 → 17 June 2011 |
Publication series
Name | Proceedings of the ACM/IEEE Joint Conference on Digital Libraries |
---|---|
ISSN (Print) | 1552-5996 |
Abstract
Individuals contribute content on the Web at an unprecedented rate, accumulating immense quantities of (semi-)structured data. The Wisdom of the Crowds theory holds that such information (or parts of it) is constantly overwritten, updated, or even deleted by other users, with the goal of rendering it more accurate or up to date. This is particularly true for the collaboratively edited, semi-structured data of entity repositories, whose entity profiles are consistently kept fresh. Therefore, the parts of their core information that remain stable over time, despite being reviewed by numerous users, are particularly useful for describing an entity. Based on this hypothesis, we introduce a classification scheme that predicts, on the basis of statistical and content patterns, whether an attribute (i.e., a name-value pair) is going to be modified in the future. We apply our scheme to a large, real-world, versioned dataset and verify its effectiveness. Our thorough experimental study also suggests that reducing entity profiles to their stable parts conveys significant benefits to two common tasks in computer science: information retrieval and information integration.
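To make the notion of a stable attribute concrete, the following is a minimal sketch and not the classification scheme of the paper. It assumes each version of an entity profile is a dictionary of name-value pairs with string values, and it only measures how often a pair has persisted across past versions, whereas the paper predicts future modifications from statistical and content patterns. The function names, the threshold parameter, and the sample data are hypothetical.

```python
from collections import Counter


def attribute_persistence(versions):
    """Return, for each (name, value) pair, the fraction of versions containing it."""
    counts = Counter()
    for profile in versions:
        # The items of a profile dict are its name-value pairs.
        counts.update(set(profile.items()))
    total = len(versions)
    return {pair: n / total for pair, n in counts.items()}


def stable_attributes(versions, threshold=1.0):
    """Keep the pairs of the latest version whose persistence meets the threshold.

    Assumes `versions` is a non-empty list ordered from oldest to newest.
    """
    persistence = attribute_persistence(versions)
    latest = versions[-1]
    return {
        name: value
        for name, value in latest.items()
        if persistence[(name, value)] >= threshold
    }


# Hypothetical version history of one entity profile (made-up values).
history = [
    {"name": "Example Entity", "type": "city", "population": "100"},
    {"name": "Example Entity", "type": "city", "population": "110"},
    {"name": "Example Entity", "type": "city", "population": "125"},
]

print(stable_attributes(history))
```

In this toy history the `population` value changes in every version, so only `name` and `type` survive the reduction. A real predictor would replace the fixed threshold with a trained classifier.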
ASJC Scopus subject areas
- Engineering (all)
- General Engineering
Cite
JCDL'11 - Proceedings of the 2011 ACM/IEEE Joint Conference on Digital Libraries. 2011. pp. 95-104 (Proceedings of the ACM/IEEE Joint Conference on Digital Libraries).
Publication: Chapter in book/report/anthology/conference proceedings › Conference contribution › Research › Peer-reviewed
TY - GEN
T1 - Detecting and Exploiting Stability in Evolving Heterogeneous Information Space
AU - Papadakis, George
AU - Giannakopoulos, George
AU - Niederée, Claudia
AU - Palpanas, Themis
AU - Nejdl, Wolfgang
PY - 2011/6/13
Y1 - 2011/6/13
AB - Individuals contribute content on the Web at an unprecedented rate, accumulating immense quantities of (semi-)structured data. The Wisdom of the Crowds theory holds that such information (or parts of it) is constantly overwritten, updated, or even deleted by other users, with the goal of rendering it more accurate or up to date. This is particularly true for the collaboratively edited, semi-structured data of entity repositories, whose entity profiles are consistently kept fresh. Therefore, the parts of their core information that remain stable over time, despite being reviewed by numerous users, are particularly useful for describing an entity. Based on this hypothesis, we introduce a classification scheme that predicts, on the basis of statistical and content patterns, whether an attribute (i.e., a name-value pair) is going to be modified in the future. We apply our scheme to a large, real-world, versioned dataset and verify its effectiveness. Our thorough experimental study also suggests that reducing entity profiles to their stable parts conveys significant benefits to two common tasks in computer science: information retrieval and information integration.
KW - entity evolution
KW - n-gram graphs
KW - stability detection
UR - http://www.scopus.com/inward/record.url?scp=79960545856&partnerID=8YFLogxK
U2 - 10.1145/1998076.1998094
DO - 10.1145/1998076.1998094
M3 - Conference contribution
AN - SCOPUS:79960545856
SN - 9781450307444
T3 - Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
SP - 95
EP - 104
BT - JCDL'11 - Proceedings of the 2011 ACM/IEEE Joint Conference on Digital Libraries
T2 - 11th Annual International ACM/IEEE Joint Conference on Digital Libraries, JCDL'11
Y2 - 13 June 2011 through 17 June 2011
ER -