Understanding and mitigating worker biases in the crowdsourced collection of subjective judgments

Research output: Chapter in book/report/conference proceeding, Conference contribution, Research, peer review

Authors

  • Christoph Hube
  • Besnik Fetahu
  • Ujwal Gadiraju

Details

Original language: English
Title of host publication: CHI 2019
Subtitle of host publication: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems
Editors: Stephen Brewster, Geraldine Fitzpatrick, Anna Cox, Vassilis Kostakos
Place of Publication: New York
Publisher: Association for Computing Machinery (ACM)
ISBN (electronic): 9781450359702
Publication status: Published - 2 May 2019
Event: 2019 CHI Conference on Human Factors in Computing Systems, CHI 2019 - Glasgow, United Kingdom (UK)
Duration: 4 May 2019 - 9 May 2019

Abstract

Crowdsourced data acquired from tasks that comprise a subjective component (e.g. opinion detection, sentiment analysis) is potentially affected by the inherent bias of crowd workers who contribute to the tasks. This can lead to biased and noisy ground-truth data, propagating the undesirable bias and noise when used in turn to train machine learning models or evaluate systems. In this work, we aim to understand the influence of workers’ own opinions on their performance in the subjective task of bias detection. We analyze the influence of workers’ opinions on their annotations corresponding to different topics. Our findings reveal that workers with strong opinions tend to produce biased annotations. We show that such bias can be mitigated to improve the overall quality of the data collected. Experienced crowd workers also fail to distance themselves from their own opinions to provide unbiased annotations.

Cite this

Understanding and mitigating worker biases in the crowdsourced collection of subjective judgments. / Hube, Christoph; Fetahu, Besnik; Gadiraju, Ujwal.
CHI 2019: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. ed. / Stephen Brewster; Geraldine Fitzpatrick; Anna Cox; Vassilis Kostakos. New York: Association for Computing Machinery (ACM), 2019.

Hube, C, Fetahu, B & Gadiraju, U 2019, Understanding and mitigating worker biases in the crowdsourced collection of subjective judgments. in S Brewster, G Fitzpatrick, A Cox & V Kostakos (eds), CHI 2019: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery (ACM), New York, 2019 CHI Conference on Human Factors in Computing Systems, CHI 2019, Glasgow, United Kingdom (UK), 4 May 2019. https://doi.org/10.1145/3290605.3300637
Hube, C., Fetahu, B., & Gadiraju, U. (2019). Understanding and mitigating worker biases in the crowdsourced collection of subjective judgments. In S. Brewster, G. Fitzpatrick, A. Cox, & V. Kostakos (Eds.), CHI 2019: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery (ACM). https://doi.org/10.1145/3290605.3300637
Hube C, Fetahu B, Gadiraju U. Understanding and mitigating worker biases in the crowdsourced collection of subjective judgments. In Brewster S, Fitzpatrick G, Cox A, Kostakos V, editors, CHI 2019: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. New York: Association for Computing Machinery (ACM). 2019 doi: 10.1145/3290605.3300637
Hube, Christoph ; Fetahu, Besnik ; Gadiraju, Ujwal. / Understanding and mitigating worker biases in the crowdsourced collection of subjective judgments. CHI 2019: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. editor / Stephen Brewster ; Geraldine Fitzpatrick ; Anna Cox ; Vassilis Kostakos. New York : Association for Computing Machinery (ACM), 2019.
@inproceedings{a9517df1c0fc448487912e7d1e3707cf,
title = "Understanding and mitigating worker biases in the crowdsourced collection of subjective judgments",
abstract = "Crowdsourced data acquired from tasks that comprise a subjective component (e.g. opinion detection, sentiment analysis) is potentially affected by the inherent bias of crowd workers who contribute to the tasks. This can lead to biased and noisy ground-truth data, propagating the undesirable bias and noise when used in turn to train machine learning models or evaluate systems. In this work, we aim to understand the influence of workers{\textquoteright} own opinions on their performance in the subjective task of bias detection. We analyze the influence of workers{\textquoteright} opinions on their annotations corresponding to different topics. Our findings reveal that workers with strong opinions tend to produce biased annotations. We show that such bias can be mitigated to improve the overall quality of the data collected. Experienced crowd workers also fail to distance themselves from their own opinions to provide unbiased annotations.",
author = "Christoph Hube and Besnik Fetahu and Ujwal Gadiraju",
note = "Funding information: This work is partially supported by the ERC Advanced Grant ALEXANDRIA (grant no. 339233), DESIR (grant no. 731081), AFEL (grant no. 687916), DISKOW (grant no. 60171990) and SimpleML (grant no. 01IS18054).; 2019 CHI Conference on Human Factors in Computing Systems, CHI 2019 ; Conference date: 04-05-2019 Through 09-05-2019",
year = "2019",
month = may,
day = "2",
doi = "10.1145/3290605.3300637",
language = "English",
editor = "Stephen Brewster and Geraldine Fitzpatrick and Anna Cox and Vassilis Kostakos",
booktitle = "CHI 2019",
publisher = "Association for Computing Machinery (ACM)",
address = "New York",

}

TY - GEN

T1 - Understanding and mitigating worker biases in the crowdsourced collection of subjective judgments

AU - Hube, Christoph

AU - Fetahu, Besnik

AU - Gadiraju, Ujwal

N1 - Funding information: This work is partially supported by the ERC Advanced Grant ALEXANDRIA (grant no. 339233), DESIR (grant no. 731081), AFEL (grant no. 687916), DISKOW (grant no. 60171990) and SimpleML (grant no. 01IS18054).

PY - 2019/5/2

Y1 - 2019/5/2

AB - Crowdsourced data acquired from tasks that comprise a subjective component (e.g. opinion detection, sentiment analysis) is potentially affected by the inherent bias of crowd workers who contribute to the tasks. This can lead to biased and noisy ground-truth data, propagating the undesirable bias and noise when used in turn to train machine learning models or evaluate systems. In this work, we aim to understand the influence of workers’ own opinions on their performance in the subjective task of bias detection. We analyze the influence of workers’ opinions on their annotations corresponding to different topics. Our findings reveal that workers with strong opinions tend to produce biased annotations. We show that such bias can be mitigated to improve the overall quality of the data collected. Experienced crowd workers also fail to distance themselves from their own opinions to provide unbiased annotations.

UR - http://www.scopus.com/inward/record.url?scp=85066894365&partnerID=8YFLogxK

U2 - 10.1145/3290605.3300637

DO - 10.1145/3290605.3300637

M3 - Conference contribution

AN - SCOPUS:85066894365

BT - CHI 2019

A2 - Brewster, Stephen

A2 - Fitzpatrick, Geraldine

A2 - Cox, Anna

A2 - Kostakos, Vassilis

PB - Association for Computing Machinery (ACM)

CY - New York

T2 - 2019 CHI Conference on Human Factors in Computing Systems, CHI 2019

Y2 - 4 May 2019 through 9 May 2019

ER -