On the Limitations of Combining Sentiment Analysis Tools in a Cross-Platform Setting

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review


Details

Original language: English
Title of host publication: Product-Focused Software Process Improvement
Editors: Davide Taibi, Marco Kuhrmann, Tommi Mikkonen, Pekka Abrahamsson, Jil Klünder
Place of publication: Cham
Publisher: Springer International Publishing AG
Pages: 108-123
Number of pages: 16
ISBN (electronic): 978-3-031-21388-5
ISBN (print): 978-3-031-21387-8
Publication status: Published - 14 Nov 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13709 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

A positive working climate is essential in modern software development. It enhances productivity, since a satisfied developer tends to deliver better results. Sentiment analysis tools are a means to analyze and classify textual communication between developers according to the polarity of the statements. Most of these tools deliver promising results when used with test data from the domain they are developed for (e.g., GitHub). However, the tools' outcomes lack reliability when used in a different domain (e.g., Stack Overflow). One possible way to mitigate this problem is to combine different tools trained in different domains. In this paper, we analyze a combination of three sentiment analysis tools in a voting classifier with respect to their reliability and performance. The tools are trained and evaluated on five existing polarity data sets (e.g., from GitHub). The results indicate that this kind of tool combination is a good choice in the within-platform setting. However, a majority vote does not necessarily lead to better results when applied in cross-platform domains. In most cases, the best individual tool in the ensemble is preferable. This is mainly due to the often large difference in performance of the individual tools, even on the same data set. However, it may also be due to differences between the annotated data sets.

Keywords

    Cross-platform setting, Development team, Machine learning, Majority voting, Sentiment analysis

Cite this

On the Limitations of Combining Sentiment Analysis Tools in a Cross-Platform Setting. / Obaidi, Martin; Holm, Henrik; Schneider, Kurt et al.
Product-Focused Software Process Improvement. ed. / Davide Taibi; Marco Kuhrmann; Tommi Mikkonen; Pekka Abrahamsson; Jil Klünder. Cham: Springer International Publishing AG, 2022. p. 108-123 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 13709 LNCS).


Obaidi, M, Holm, H, Schneider, K & Klünder, J 2022, On the Limitations of Combining Sentiment Analysis Tools in a Cross-Platform Setting. in D Taibi, M Kuhrmann, T Mikkonen, P Abrahamsson & J Klünder (eds), Product-Focused Software Process Improvement. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 13709 LNCS, Springer International Publishing AG, Cham, pp. 108-123. https://doi.org/10.1007/978-3-031-21388-5_8
Obaidi, M., Holm, H., Schneider, K., & Klünder, J. (2022). On the Limitations of Combining Sentiment Analysis Tools in a Cross-Platform Setting. In D. Taibi, M. Kuhrmann, T. Mikkonen, P. Abrahamsson, & J. Klünder (Eds.), Product-Focused Software Process Improvement (pp. 108-123). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 13709 LNCS). Springer International Publishing AG. https://doi.org/10.1007/978-3-031-21388-5_8
Obaidi M, Holm H, Schneider K, Klünder J. On the Limitations of Combining Sentiment Analysis Tools in a Cross-Platform Setting. In Taibi D, Kuhrmann M, Mikkonen T, Abrahamsson P, Klünder J, editors, Product-Focused Software Process Improvement. Cham: Springer International Publishing AG. 2022. p. 108-123. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-031-21388-5_8
Obaidi, Martin ; Holm, Henrik ; Schneider, Kurt et al. / On the Limitations of Combining Sentiment Analysis Tools in a Cross-Platform Setting. Product-Focused Software Process Improvement. editor / Davide Taibi ; Marco Kuhrmann ; Tommi Mikkonen ; Pekka Abrahamsson ; Jil Klünder. Cham : Springer International Publishing AG, 2022. pp. 108-123 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{e4202e29bab34a4e8e1dc62c373e76b9,
title = "On the Limitations of Combining Sentiment Analysis Tools in a Cross-Platform Setting",
abstract = "A positive working climate is essential in modern software development. It enhances productivity, since a satisfied developer tends to deliver better results. Sentiment analysis tools are a means to analyze and classify textual communication between developers according to the polarity of the statements. Most of these tools deliver promising results when used with test data from the domain they are developed for (e.g., GitHub). However, the tools' outcomes lack reliability when used in a different domain (e.g., Stack Overflow). One possible way to mitigate this problem is to combine different tools trained in different domains. In this paper, we analyze a combination of three sentiment analysis tools in a voting classifier with respect to their reliability and performance. The tools are trained and evaluated on five existing polarity data sets (e.g., from GitHub). The results indicate that this kind of tool combination is a good choice in the within-platform setting. However, a majority vote does not necessarily lead to better results when applied in cross-platform domains. In most cases, the best individual tool in the ensemble is preferable. This is mainly due to the often large difference in performance of the individual tools, even on the same data set. However, it may also be due to differences between the annotated data sets.",
keywords = "Cross-platform setting, Development team, Machine learning, Majority voting, Sentiment analysis",
author = "Martin Obaidi and Henrik Holm and Kurt Schneider and Jil Kl{\"u}nder",
note = "Funding Information: This research was funded by the Leibniz University Hannover as a Leibniz Young Investigator Grant (Project ComContA, Project Number 85430128, 2020–2022).",
year = "2022",
month = nov,
day = "14",
doi = "10.1007/978-3-031-21388-5_8",
language = "English",
isbn = "978-3-031-21387-8",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer International Publishing AG",
pages = "108--123",
editor = "Davide Taibi and Marco Kuhrmann and Tommi Mikkonen and Pekka Abrahamsson and Jil Kl{\"u}nder",
booktitle = "Product-Focused Software Process Improvement",
address = "Switzerland",

}


TY - GEN

T1 - On the Limitations of Combining Sentiment Analysis Tools in a Cross-Platform Setting

AU - Obaidi, Martin

AU - Holm, Henrik

AU - Schneider, Kurt

AU - Klünder, Jil

N1 - Funding Information: This research was funded by the Leibniz University Hannover as a Leibniz Young Investigator Grant (Project ComContA, Project Number 85430128, 2020–2022).

PY - 2022/11/14

Y1 - 2022/11/14

N2 - A positive working climate is essential in modern software development. It enhances productivity, since a satisfied developer tends to deliver better results. Sentiment analysis tools are a means to analyze and classify textual communication between developers according to the polarity of the statements. Most of these tools deliver promising results when used with test data from the domain they are developed for (e.g., GitHub). However, the tools' outcomes lack reliability when used in a different domain (e.g., Stack Overflow). One possible way to mitigate this problem is to combine different tools trained in different domains. In this paper, we analyze a combination of three sentiment analysis tools in a voting classifier with respect to their reliability and performance. The tools are trained and evaluated on five existing polarity data sets (e.g., from GitHub). The results indicate that this kind of tool combination is a good choice in the within-platform setting. However, a majority vote does not necessarily lead to better results when applied in cross-platform domains. In most cases, the best individual tool in the ensemble is preferable. This is mainly due to the often large difference in performance of the individual tools, even on the same data set. However, it may also be due to differences between the annotated data sets.

AB - A positive working climate is essential in modern software development. It enhances productivity, since a satisfied developer tends to deliver better results. Sentiment analysis tools are a means to analyze and classify textual communication between developers according to the polarity of the statements. Most of these tools deliver promising results when used with test data from the domain they are developed for (e.g., GitHub). However, the tools' outcomes lack reliability when used in a different domain (e.g., Stack Overflow). One possible way to mitigate this problem is to combine different tools trained in different domains. In this paper, we analyze a combination of three sentiment analysis tools in a voting classifier with respect to their reliability and performance. The tools are trained and evaluated on five existing polarity data sets (e.g., from GitHub). The results indicate that this kind of tool combination is a good choice in the within-platform setting. However, a majority vote does not necessarily lead to better results when applied in cross-platform domains. In most cases, the best individual tool in the ensemble is preferable. This is mainly due to the often large difference in performance of the individual tools, even on the same data set. However, it may also be due to differences between the annotated data sets.

KW - Cross-platform setting

KW - Development team

KW - Machine learning

KW - Majority voting

KW - Sentiment analysis

UR - http://www.scopus.com/inward/record.url?scp=85142722433&partnerID=8YFLogxK

U2 - 10.1007/978-3-031-21388-5_8

DO - 10.1007/978-3-031-21388-5_8

M3 - Conference contribution

SN - 978-3-031-21387-8

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 108

EP - 123

BT - Product-Focused Software Process Improvement

A2 - Taibi, Davide

A2 - Kuhrmann, Marco

A2 - Mikkonen, Tommi

A2 - Abrahamsson, Pekka

A2 - Klünder, Jil

PB - Springer International Publishing AG

CY - Cham

ER -
