Exploiting Entity Information for Stream Classification over a Stream of Reviews

Christian Beyer; Pawel Matuszyk; Vishnu Unnikrishnan; Eirini Ntoutsi; Uli Niemann; Myra Spiliopoulou

doi:10.1145/3297280.3297333

Details

Original language	English
Title of host publication	SAC '19
Subtitle	Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing
Pages	564-573
Number of pages	10
Publication status	Published - 8 Apr 2019
Event	34th Annual ACM Symposium on Applied Computing, SAC 2019 - Limassol, Cyprus. Duration: 8 Apr 2019 → 12 Apr 2019

Abstract

Opinion stream classification algorithms adapt the model to the arriving review texts and, depending on the forgetting scheme, reduce the contribution old reviews have upon the model. Reviews are assumed independent, and information on the entity to which a review refers, i.e. to the opinion target, is thereby ignored. This implies that the prediction of a review's label is based more on reviews referring to other, more popular or simply more recently inspected entities, while reviews referring to the same entity might be ignored as too old. In this study, we enforce that the reviews to each entity are taken into account for learning, adaption and forgetting. We split the original stream to substreams, each substream comprised by the reviews referring to the same entity (opinion target). This allows us to deal with differences in the speed of each substream and to exploit the impact of the entity itself on the labels of the reviews referring to it. For this constellation of substreams we propose a pair of two voting classifiers, one being the global, “entity-ignorant” classifier trained on the whole stream of reviews, the other one consisting of one “entity-centric” classifier per entity. We show that the entity-ignorant classifier contributes most for entities with very few reviews, i.e. during the cold-start, while the entity-centric classifiers contribute most after acquiring enough information on the corresponding entities. We study our approach on a stream of product reviews, show that our ensemble improves the performance of its members, and we discuss the conditions under which one member contributes more than the other.
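
The two-member ensemble described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the base learner, the voting rule, or the forgetting scheme, so the incremental Naive Bayes learner, the `warmup` parameter, and the weighting `n / (n + warmup)` are all assumptions chosen only to show how a global "entity-ignorant" model can dominate during cold-start while a per-entity model takes over as its substream grows. Forgetting is omitted.

```python
import math
from collections import Counter, defaultdict


class IncrementalNB:
    """Minimal multinomial Naive Bayes that learns one review at a time.

    Stand-in base learner (assumption); the abstract does not name one.
    """

    def __init__(self, labels):
        self.labels = list(labels)
        self.doc_counts = Counter()
        self.word_counts = {lab: Counter() for lab in self.labels}
        self.vocab = set()
        self.n_seen = 0

    def learn(self, words, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(words)
        self.vocab.update(words)
        self.n_seen += 1

    def predict_proba(self, words):
        # Laplace-smoothed log scores, normalised to probabilities.
        v = len(self.vocab) + 1
        scores = {}
        for lab in self.labels:
            total_words = sum(self.word_counts[lab].values())
            score = math.log(
                (self.doc_counts[lab] + 1) / (self.n_seen + len(self.labels))
            )
            for w in words:
                score += math.log((self.word_counts[lab][w] + 1) / (total_words + v))
            scores[lab] = score
        top = max(scores.values())
        exp = {lab: math.exp(s - top) for lab, s in scores.items()}
        norm = sum(exp.values())
        return {lab: e / norm for lab, e in exp.items()}


class EntityEnsemble:
    """One global classifier plus one classifier per entity substream."""

    def __init__(self, labels, warmup=5):
        self.labels = list(labels)
        self.warmup = warmup  # hypothetical knob, not taken from the paper
        self.global_clf = IncrementalNB(self.labels)
        self.entity_clfs = defaultdict(lambda: IncrementalNB(self.labels))

    def entity_weight(self, entity):
        # Assumed weighting: 0 for an unseen entity, approaching 1 as the
        # entity's substream accumulates reviews.
        clf = self.entity_clfs.get(entity)
        n = clf.n_seen if clf is not None else 0
        return n / (n + self.warmup)

    def predict(self, entity, words):
        w = self.entity_weight(entity)
        combined = self.global_clf.predict_proba(words)
        if w > 0:
            local = self.entity_clfs[entity].predict_proba(words)
            combined = {
                lab: (1 - w) * combined[lab] + w * local[lab]
                for lab in self.labels
            }
        return max(self.labels, key=lambda lab: combined[lab])

    def learn(self, entity, words, label):
        # Every review updates the global model and its own entity's model.
        self.global_clf.learn(words, label)
        self.entity_clfs[entity].learn(words, label)
```

In a prequential (test-then-train) loop, each arriving review would first be passed to `predict(entity, words)` and then to `learn(entity, words, label)` once its label is known. For an entity with no reviews yet, the prediction comes from the global model alone; for an entity whose reviews disagree with the global trend, the per-entity model gains weight with each additional review.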

ASJC Scopus subject areas

General Computer Science
Software

Cite this

Exploiting Entity Information for Stream Classification over a Stream of Reviews. / Beyer, Christian; Matuszyk, Pawel; Unnikrishnan, Vishnu et al.
SAC '19: Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing. 2019. pp. 564-573.

Publication: Chapter in book/report/conference proceeding › Conference contribution › Research › Peer-reviewed

Beyer, C, Matuszyk, P, Unnikrishnan, V, Ntoutsi, E, Niemann, U & Spiliopoulou, M 2019, Exploiting Entity Information for Stream Classification over a Stream of Reviews. in SAC '19: Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing. pp. 564-573, 34th Annual ACM Symposium on Applied Computing, SAC 2019, Limassol, Cyprus, 8 Apr 2019. https://doi.org/10.1145/3297280.3297333

Beyer, C., Matuszyk, P., Unnikrishnan, V., Ntoutsi, E., Niemann, U., & Spiliopoulou, M. (2019). Exploiting Entity Information for Stream Classification over a Stream of Reviews. In SAC '19: Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing (pp. 564-573). https://doi.org/10.1145/3297280.3297333

Beyer C, Matuszyk P, Unnikrishnan V, Ntoutsi E, Niemann U, Spiliopoulou M. Exploiting Entity Information for Stream Classification over a Stream of Reviews. in SAC '19: Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing. 2019. pp. 564-573. doi: 10.1145/3297280.3297333

Beyer, Christian ; Matuszyk, Pawel ; Unnikrishnan, Vishnu et al. / Exploiting Entity Information for Stream Classification over a Stream of Reviews. SAC '19: Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing. 2019. pp. 564-573

@inproceedings{fdd928539bc448c4b294be4211bfe6ab,

title = "Exploiting Entity Information for Stream Classification over a Stream of Reviews",

abstract = "Opinion stream classification algorithms adapt the model to the arriving review texts and, depending on the forgetting scheme, reduce the contribution old reviews have upon the model. Reviews are assumed independent, and information on the entity to which a review refers, i.e. to the opinion target, is thereby ignored. This implies that the prediction of a review's label is based more on reviews referring to other, more popular or simply more recently inspected entities, while reviews referring to the same entity might be ignored as too old. In this study, we enforce that the reviews to each entity are taken into account for learning, adaption and forgetting. We split the original stream to substreams, each substream comprised by the reviews referring to the same entity (opinion target). This allows us to deal with differences in the speed of each substream and to exploit the impact of the entity itself on the labels of the reviews referring to it. For this constellation of substreams we propose a pair of two voting classifiers, one being the global, “entity-ignorant” classifier trained on the whole stream of reviews, the other one consisting of one “entity-centric” classifier per entity. We show that the entity-ignorant classifier contributes most for entities with very few reviews, i.e. during the cold-start, while the entity-centric classifiers contribute most after acquiring enough information on the corresponding entities. We study our approach on a stream of product reviews, show that our ensemble improves the performance of its members, and we discuss the conditions under which one member contributes more than the other.",

keywords = "Document Prediction, Entity-Centric Learning, Stream Classification",

author = "Christian Beyer and Pawel Matuszyk and Vishnu Unnikrishnan and Eirini Ntoutsi and Uli Niemann and Myra Spiliopoulou",

note = "Funding Information: This work is partially funded by the German Research Foundation, project OSCAR {"}Opinion Stream Classification with Ensembles and Active Learners{"}. Additionally, the first author is also partially funded by a PhD grant from the federal state of Saxony-Anhalt.; 34th Annual ACM Symposium on Applied Computing, SAC 2019 ; Conference date: 08-04-2019 Through 12-04-2019",

year = "2019",

month = apr,

day = "8",

doi = "10.1145/3297280.3297333",

language = "English",

isbn = "9781450359337",

pages = "564--573",

booktitle = "SAC '19",

}

TY - GEN

T1 - Exploiting Entity Information for Stream Classification over a Stream of Reviews

AU - Beyer, Christian

AU - Matuszyk, Pawel

AU - Unnikrishnan, Vishnu

AU - Ntoutsi, Eirini

AU - Niemann, Uli

AU - Spiliopoulou, Myra

N1 - Funding Information: This work is partially funded by the German Research Foundation, project OSCAR "Opinion Stream Classification with Ensembles and Active Learners". Additionally, the first author is also partially funded by a PhD grant from the federal state of Saxony-Anhalt.

PY - 2019/4/8

Y1 - 2019/4/8

N2 - Opinion stream classification algorithms adapt the model to the arriving review texts and, depending on the forgetting scheme, reduce the contribution old reviews have upon the model. Reviews are assumed independent, and information on the entity to which a review refers, i.e. to the opinion target, is thereby ignored. This implies that the prediction of a review's label is based more on reviews referring to other, more popular or simply more recently inspected entities, while reviews referring to the same entity might be ignored as too old. In this study, we enforce that the reviews to each entity are taken into account for learning, adaption and forgetting. We split the original stream to substreams, each substream comprised by the reviews referring to the same entity (opinion target). This allows us to deal with differences in the speed of each substream and to exploit the impact of the entity itself on the labels of the reviews referring to it. For this constellation of substreams we propose a pair of two voting classifiers, one being the global, “entity-ignorant” classifier trained on the whole stream of reviews, the other one consisting of one “entity-centric” classifier per entity. We show that the entity-ignorant classifier contributes most for entities with very few reviews, i.e. during the cold-start, while the entity-centric classifiers contribute most after acquiring enough information on the corresponding entities. We study our approach on a stream of product reviews, show that our ensemble improves the performance of its members, and we discuss the conditions under which one member contributes more than the other.

KW - Document Prediction

KW - Entity-Centric Learning

KW - Stream Classification

UR - http://www.scopus.com/inward/record.url?scp=85065637268&partnerID=8YFLogxK

U2 - 10.1145/3297280.3297333

DO - 10.1145/3297280.3297333

M3 - Conference contribution

AN - SCOPUS:85065637268

SN - 9781450359337

SP - 564

EP - 573

BT - SAC '19

T2 - 34th Annual ACM Symposium on Applied Computing, SAC 2019

Y2 - 8 April 2019 through 12 April 2019

ER -
