Details
Original language | English |
---|---|
Title of host publication | 4th International Conference on Web Intelligence, Mining and Semantics, WIMS 2014 |
Publisher | Association for Computing Machinery (ACM) |
ISBN (Print) | 9781450325387 |
Publication status | Published - 2 Jun 2014 |
Event | 4th International Conference on Web Intelligence, Mining and Semantics, WIMS 2014 - Thessaloniki, Greece Duration: 2 Jun 2014 → 4 Jun 2014 |
Publication series
Name | ACM International Conference Proceeding Series |
---|
Abstract
Crowdsourcing has become ubiquitous in machine learning as a cost-effective method to gather training labels. In this paper we examine the challenges that appear when employing crowdsourcing for active learning, in an integrated environment where an automatic method and human labelers work together towards improving their performance at a certain task. By using Active Learning techniques on crowd-labeled data, we optimize the performance of the automatic method towards better accuracy, while keeping the costs low by gathering data on demand. In order to verify our proposed methods, we apply them to the task of deduplication of publications in a digital library by examining metadata. We investigate the problems created by noisy labels produced by the crowd and explore methods to aggregate them. We analyze how different automatic methods are affected by the quantity and quality of the allocated resources as well as the instance selection strategies for each active learning round, aiming towards attaining a balance between cost and performance.
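The two core ideas in the abstract, aggregating noisy crowd labels and selecting which instances to send to the crowd in each active learning round, can be sketched roughly as below. This is an illustrative example only (simple majority voting plus uncertainty sampling); it is not the exact aggregation or selection method evaluated in the paper, and all names and data are hypothetical.

```python
from collections import Counter

def aggregate_majority(crowd_labels):
    """Aggregate noisy crowd labels per instance by majority vote.

    crowd_labels: dict mapping instance id -> list of worker labels.
    Returns dict mapping instance id -> (winning label, agreement ratio),
    where the agreement ratio hints at label quality.
    """
    aggregated = {}
    for instance, labels in crowd_labels.items():
        label, count = Counter(labels).most_common(1)[0]
        aggregated[instance] = (label, count / len(labels))
    return aggregated

def select_most_uncertain(probabilities):
    """Uncertainty sampling: pick the instance whose predicted
    positive-class probability is closest to 0.5, i.e. the one the
    automatic method is least sure about and would most benefit
    from having labeled by the crowd.

    probabilities: dict mapping instance id -> P(duplicate).
    """
    return min(probabilities, key=lambda i: abs(probabilities[i] - 0.5))

# Hypothetical example: three workers label candidate publication
# pairs as duplicate (1) or not (0).
votes = {"a": [1, 1, 0], "b": [0, 0, 0], "c": [1, 0, 1]}
print(aggregate_majority(votes))

# The classifier is most unsure about pair "c", so "c" is sent to
# the crowd in the next active learning round.
print(select_most_uncertain({"a": 0.9, "b": 0.1, "c": 0.55}))  # → 'c'
```

In a full loop, the aggregated crowd labels for the selected instances would be added to the training set and the classifier retrained each round, trading labeling cost against accuracy as the abstract describes.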
ASJC Scopus subject areas
- Computer Science (all)
- Software
- Computer Science (all)
- Human-Computer Interaction
- Computer Science (all)
- Computer Vision and Pattern Recognition
- Computer Science (all)
- Computer Networks and Communications
Cite this
- Standard
- Harvard
- APA
- Vancouver
- BibTeX
- RIS
4th International Conference on Web Intelligence, Mining and Semantics, WIMS 2014. Association for Computing Machinery (ACM), 2014. (ACM International Conference Proceeding Series).
Publication: Chapter in book/report/conference proceeding › Conference contribution › Research › Peer-review
TY - GEN
T1 - When in Doubt Ask the Crowd
T2 - 4th International Conference on Web Intelligence, Mining and Semantics, WIMS 2014
AU - Georgescu, Mihai
AU - Pham, Dang Duc
AU - Firan, Claudiu S.
AU - Gadiraju, Ujwal
AU - Nejdl, Wolfgang
PY - 2014/6/2
Y1 - 2014/6/2
N2 - Crowdsourcing has become ubiquitous in machine learning as a cost effective method to gather training labels. In this paper we examine the challenges that appear when employing crowdsourcing for active learning, in an integrated environment where an automatic method and human labelers work together towards improving their performance at a certain task. By using Active Learning techniques on crowd-labeled data, we optimize the performance of the automatic method towards better accuracy, while keeping the costs low by gathering data on demand. In order to verify our proposed methods, we apply them to the task of deduplication of publications in a digital library by examining metadata. We investigate the problems created by noisy labels produced by the crowd and explore methods to aggregate them. We analyze how different automatic methods are affected by the quantity and quality of the allocated resources as well as the instance selection strategies for each active learning round, aiming towards attaining a balance between cost and performance.
AB - Crowdsourcing has become ubiquitous in machine learning as a cost effective method to gather training labels. In this paper we examine the challenges that appear when employing crowdsourcing for active learning, in an integrated environment where an automatic method and human labelers work together towards improving their performance at a certain task. By using Active Learning techniques on crowd-labeled data, we optimize the performance of the automatic method towards better accuracy, while keeping the costs low by gathering data on demand. In order to verify our proposed methods, we apply them to the task of deduplication of publications in a digital library by examining metadata. We investigate the problems created by noisy labels produced by the crowd and explore methods to aggregate them. We analyze how different automatic methods are affected by the quantity and quality of the allocated resources as well as the instance selection strategies for each active learning round, aiming towards attaining a balance between cost and performance.
KW - Active Learning
KW - Crowdsourcing
KW - Human Computation
KW - Machine Learning
UR - http://www.scopus.com/inward/record.url?scp=84903649754&partnerID=8YFLogxK
U2 - 10.1145/2611040.2611047
DO - 10.1145/2611040.2611047
M3 - Conference contribution
AN - SCOPUS:84903649754
SN - 9781450325387
T3 - ACM International Conference Proceeding Series
BT - 4th International Conference on Web Intelligence, Mining and Semantics, WIMS 2014
PB - Association for Computing Machinery (ACM)
Y2 - 2 June 2014 through 4 June 2014
ER -