Detecting and Exploiting Stability in Evolving Heterogeneous Information Space

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

  • George Papadakis
  • George Giannakopoulos
  • Claudia Niederée
  • Themis Palpanas
  • Wolfgang Nejdl

Research Organisations

External Research Organisations

  • National Technical University of Athens (NTUA)
  • National Centre For Scientific Research Demokritos (NCSR Demokritos)
  • University of Trento

Details

Original language: English
Title of host publication: JCDL'11 - Proceedings of the 2011 ACM/IEEE Joint Conference on Digital Libraries
Pages: 95-104
Number of pages: 10
Publication status: Published - 13 Jun 2011
Event: 11th Annual International ACM/IEEE Joint Conference on Digital Libraries, JCDL'11 - Ottawa, ON, Canada
Duration: 13 Jun 2011 to 17 Jun 2011

Publication series

Name: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
ISSN (Print): 1552-5996

Abstract

Individuals contribute content on the Web at an unprecedented rate, accumulating immense quantities of (semi-)structured data. Wisdom of the Crowds theory advocates that such information (or parts of it) is constantly overwritten, updated, or even deleted by other users, with the goal of rendering it more accurate or up-to-date. This is particularly true for the collaboratively edited, semi-structured data of entity repositories, whose entity profiles are consistently kept fresh. Therefore, the parts of their core information that remain stable over time, despite being reviewed by numerous users, are particularly useful for describing an entity. Based on this hypothesis, we introduce a classification scheme that predicts, on the basis of statistical and content patterns, whether an attribute (i.e., a name-value pair) is going to be modified in the future. We apply our scheme to a large, real-world, versioned dataset and verify its effectiveness. Our thorough experimental study also suggests that reducing entity profiles to their stable parts conveys significant benefits to two common tasks in computer science: information retrieval and information integration.
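
The prediction task described in the abstract can be illustrated with a small sketch. It is not the authors' method: it only shows, under simplifying assumptions, how stability labels and one simple statistical feature might be derived from a versioned entity profile. The function names `label_stability` and `persistence_ratio`, the rule "stable means present in every later version", and the toy profile are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple

# An attribute is a (name, value) pair, as in the abstract.
Attribute = Tuple[str, str]


def label_stability(versions: List[Set[Attribute]], t: int) -> Dict[Attribute, bool]:
    """Label each attribute of version t.

    Assumption (not from the paper): an attribute is "stable" if the
    exact name-value pair is present in every version after t.
    """
    later = versions[t + 1:]
    return {attr: all(attr in v for v in later) for attr in versions[t]}


def persistence_ratio(versions: List[Set[Attribute]], attr: Attribute, t: int) -> float:
    """Fraction of versions 0..t that contain attr (a toy statistical feature)."""
    seen = versions[: t + 1]
    return sum(attr in v for v in seen) / len(seen)


# Made-up version history of one entity profile (illustrative values only).
versions = [
    {("name", "Example City"), ("population", "100")},
    {("name", "Example City"), ("population", "120")},
    {("name", "Example City"), ("population", "150")},
]

labels = label_stability(versions, 0)
for attr, stable in sorted(labels.items()):
    print(attr, "stable" if stable else "modified later")
```

In this toy history the name pair is labelled stable and the first population value is labelled as modified later. A classifier of the kind the abstract describes would be trained to predict such labels from features available at version t, without looking at later versions.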

Keywords

    entity evolution, n-gram graphs, stability detection

Cite this

Detecting and Exploiting Stability in Evolving Heterogeneous Information Space. / Papadakis, George; Giannakopoulos, George; Niederée, Claudia et al.
JCDL'11 - Proceedings of the 2011 ACM/IEEE Joint Conference on Digital Libraries. 2011. p. 95-104 (Proceedings of the ACM/IEEE Joint Conference on Digital Libraries).

Papadakis, G, Giannakopoulos, G, Niederée, C, Palpanas, T & Nejdl, W 2011, Detecting and Exploiting Stability in Evolving Heterogeneous Information Space. in JCDL'11 - Proceedings of the 2011 ACM/IEEE Joint Conference on Digital Libraries. Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, pp. 95-104, 11th Annual International ACM/IEEE Joint Conference on Digital Libraries, JCDL'11, Ottawa, ON, Canada, 13 Jun 2011. https://doi.org/10.1145/1998076.1998094
Papadakis, G., Giannakopoulos, G., Niederée, C., Palpanas, T., & Nejdl, W. (2011). Detecting and Exploiting Stability in Evolving Heterogeneous Information Space. In JCDL'11 - Proceedings of the 2011 ACM/IEEE Joint Conference on Digital Libraries (pp. 95-104). (Proceedings of the ACM/IEEE Joint Conference on Digital Libraries). https://doi.org/10.1145/1998076.1998094
Papadakis G, Giannakopoulos G, Niederée C, Palpanas T, Nejdl W. Detecting and Exploiting Stability in Evolving Heterogeneous Information Space. In JCDL'11 - Proceedings of the 2011 ACM/IEEE Joint Conference on Digital Libraries. 2011. p. 95-104. (Proceedings of the ACM/IEEE Joint Conference on Digital Libraries). doi: 10.1145/1998076.1998094
Papadakis, George ; Giannakopoulos, George ; Niederée, Claudia et al. / Detecting and Exploiting Stability in Evolving Heterogeneous Information Space. JCDL'11 - Proceedings of the 2011 ACM/IEEE Joint Conference on Digital Libraries. 2011. pp. 95-104 (Proceedings of the ACM/IEEE Joint Conference on Digital Libraries).
@inproceedings{f6354293e50748dc901213a8c837e87e,
title = "Detecting and Exploiting Stability in Evolving Heterogeneous Information Space",
abstract = "Individuals contribute content on the Web at an unprecedented rate, accumulating immense quantities of (semi-)structured data. Wisdom of the Crowds theory advocates that such information (or parts of it) is constantly overwritten, updated, or even deleted by other users, with the goal of rendering it more accurate, or up-to-date. This is particularly true for the collaboratively edited, semi-structured data of entity repositories, whose entity profiles are consistently kept fresh. Therefore, their core information that remain stable with the passage of time, despite being reviewed by numerous users, are particularly useful for the description of an entity. Based on the above hypothesis, we introduce a classification scheme that predicts, on the basis of statistical and content patterns, whether an attribute (i.e., name-value pair) is going to be modified in the future. We apply our scheme on a large, real-world, versioned dataset and verify its effectiveness. Our thorough experimental study also suggests that reducing entity profiles to their stable parts conveys significant benefits to two common tasks in computer science: information retrieval and information integration.",
keywords = "entity evolution, n-gram graphs, stability detection",
author = "George Papadakis and George Giannakopoulos and Claudia Nieder{\'e}e and Themis Palpanas and Wolfgang Nejdl",
year = "2011",
month = jun,
day = "13",
doi = "10.1145/1998076.1998094",
language = "English",
isbn = "9781450307444",
series = "Proceedings of the ACM/IEEE Joint Conference on Digital Libraries",
pages = "95--104",
booktitle = "JCDL'11 - Proceedings of the 2011 ACM/IEEE Joint Conference on Digital Libraries",
note = "11th Annual International ACM/IEEE Joint Conference on Digital Libraries, JCDL'11 ; Conference date: 13-06-2011 Through 17-06-2011",

}

TY - GEN

T1 - Detecting and Exploiting Stability in Evolving Heterogeneous Information Space

AU - Papadakis, George

AU - Giannakopoulos, George

AU - Niederée, Claudia

AU - Palpanas, Themis

AU - Nejdl, Wolfgang

PY - 2011/6/13

Y1 - 2011/6/13

AB - Individuals contribute content on the Web at an unprecedented rate, accumulating immense quantities of (semi-)structured data. Wisdom of the Crowds theory advocates that such information (or parts of it) is constantly overwritten, updated, or even deleted by other users, with the goal of rendering it more accurate, or up-to-date. This is particularly true for the collaboratively edited, semi-structured data of entity repositories, whose entity profiles are consistently kept fresh. Therefore, their core information that remain stable with the passage of time, despite being reviewed by numerous users, are particularly useful for the description of an entity. Based on the above hypothesis, we introduce a classification scheme that predicts, on the basis of statistical and content patterns, whether an attribute (i.e., name-value pair) is going to be modified in the future. We apply our scheme on a large, real-world, versioned dataset and verify its effectiveness. Our thorough experimental study also suggests that reducing entity profiles to their stable parts conveys significant benefits to two common tasks in computer science: information retrieval and information integration.

KW - entity evolution

KW - n-gram graphs

KW - stability detection

UR - http://www.scopus.com/inward/record.url?scp=79960545856&partnerID=8YFLogxK

U2 - 10.1145/1998076.1998094

DO - 10.1145/1998076.1998094

M3 - Conference contribution

AN - SCOPUS:79960545856

SN - 9781450307444

T3 - Proceedings of the ACM/IEEE Joint Conference on Digital Libraries

SP - 95

EP - 104

BT - JCDL'11 - Proceedings of the 2011 ACM/IEEE Joint Conference on Digital Libraries

T2 - 11th Annual International ACM/IEEE Joint Conference on Digital Libraries, JCDL'11

Y2 - 13 June 2011 through 17 June 2011

ER -