TinyGenius: Intertwining natural language processing with microtask crowdsourcing for scholarly knowledge graph creation

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

  • Allard Oelen
  • Markus Stocker
  • Sören Auer

External Research Organisations

  • German National Library of Science and Technology (TIB)

Details

Original language: English
Title of host publication: JCDL 2022 - Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 5
ISBN (electronic): 9781450393454
Publication status: Published - 20 Jun 2022
Externally published: Yes
Event: 22nd ACM/IEEE Joint Conference on Digital Libraries, JCDL 2022 - Virtual, Online, Germany
Duration: 20 Jun 2022 to 24 Jun 2022

Publication series

Name: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
ISSN (Print): 1552-5996

Abstract

As the number of published scholarly articles grows steadily each year, new methods are needed to organize scholarly knowledge so that it can be more efficiently discovered and used. Natural Language Processing (NLP) techniques are able to autonomously process scholarly articles at scale and to create machine readable representations of the article content. However, autonomous NLP methods are by far not sufficiently accurate to create a high-quality knowledge graph. Yet quality is crucial for the graph to be useful in practice. We present TinyGenius, a methodology to validate NLP-extracted scholarly knowledge statements using microtasks performed with crowdsourcing. The scholarly context in which the crowd workers operate has multiple challenges. The explainability of the employed NLP methods is crucial to provide context in order to support the decision process of crowd workers. We employed TinyGenius to populate a paper-centric knowledge graph, using five distinct NLP methods. In the end, the resulting knowledge graph serves as a digital library for scholarly articles.

Keywords

    Crowdsourcing Microtasks, Intelligent User Interfaces, Knowledge Graph Validation, Scholarly Knowledge Graphs

Cite this

TinyGenius: Intertwining natural language processing with microtask crowdsourcing for scholarly knowledge graph creation. / Oelen, Allard; Stocker, Markus; Auer, Sören.
JCDL 2022 - Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2022. Institute of Electrical and Electronics Engineers Inc., 2022. 5 (Proceedings of the ACM/IEEE Joint Conference on Digital Libraries).

Oelen, A, Stocker, M & Auer, S 2022, TinyGenius: Intertwining natural language processing with microtask crowdsourcing for scholarly knowledge graph creation. in JCDL 2022 - Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2022, 5, Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, Institute of Electrical and Electronics Engineers Inc., 22nd ACM/IEEE Joint Conference on Digital Libraries, JCDL 2022, Virtual, Online, Germany, 20 Jun 2022. https://doi.org/10.1145/3529372.3533285
Oelen, A., Stocker, M., & Auer, S. (2022). TinyGenius: Intertwining natural language processing with microtask crowdsourcing for scholarly knowledge graph creation. In JCDL 2022 - Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2022 Article 5 (Proceedings of the ACM/IEEE Joint Conference on Digital Libraries). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/3529372.3533285
Oelen A, Stocker M, Auer S. TinyGenius: Intertwining natural language processing with microtask crowdsourcing for scholarly knowledge graph creation. In JCDL 2022 - Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2022. Institute of Electrical and Electronics Engineers Inc. 2022. 5. (Proceedings of the ACM/IEEE Joint Conference on Digital Libraries). doi: 10.1145/3529372.3533285
Oelen, Allard ; Stocker, Markus ; Auer, Sören. / TinyGenius : Intertwining natural language processing with microtask crowdsourcing for scholarly knowledge graph creation. JCDL 2022 - Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2022. Institute of Electrical and Electronics Engineers Inc., 2022. (Proceedings of the ACM/IEEE Joint Conference on Digital Libraries).
@inproceedings{17af10da5af44de29f5c1838d75a12ff,
title = "TinyGenius: Intertwining natural language processing with microtask crowdsourcing for scholarly knowledge graph creation",
abstract = "As the number of published scholarly articles grows steadily each year, new methods are needed to organize scholarly knowledge so that it can be more efficiently discovered and used. Natural Language Processing (NLP) techniques are able to autonomously process scholarly articles at scale and to create machine readable representations of the article content. However, autonomous NLP methods are by far not sufficiently accurate to create a high-quality knowledge graph. Yet quality is crucial for the graph to be useful in practice. We present TinyGenius, a methodology to validate NLP-extracted scholarly knowledge statements using microtasks performed with crowdsourcing. The scholarly context in which the crowd workers operate has multiple challenges. The explainability of the employed NLP methods is crucial to provide context in order to support the decision process of crowd workers. We employed TinyGenius to populate a paper-centric knowledge graph, using five distinct NLP methods. In the end, the resulting knowledge graph serves as a digital library for scholarly articles.",
keywords = "Crowdsourcing Microtasks, Intelligent User Interfaces, Knowledge Graph Validation, Scholarly Knowledge Graphs",
author = "Allard Oelen and Markus Stocker and S{\"o}ren Auer",
note = "Funding Information: This work was co-funded by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536) and the TIB Leibniz Information Centre for Science and Technology. We would like to thank Mohamad Yaser Jaradeh and Jennifer D{\textquoteright}Souza for their contributions to this work. ; 22nd ACM/IEEE Joint Conference on Digital Libraries, JCDL 2022 ; Conference date: 20-06-2022 Through 24-06-2022",
year = "2022",
month = jun,
day = "20",
doi = "10.1145/3529372.3533285",
language = "English",
series = "Proceedings of the ACM/IEEE Joint Conference on Digital Libraries",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
booktitle = "JCDL 2022 - Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2022",
address = "United States",
}

TY - GEN
T1 - TinyGenius
T2 - 22nd ACM/IEEE Joint Conference on Digital Libraries, JCDL 2022
AU - Oelen, Allard
AU - Stocker, Markus
AU - Auer, Sören
N1 - Funding Information: This work was co-funded by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536) and the TIB Leibniz Information Centre for Science and Technology. We would like to thank Mohamad Yaser Jaradeh and Jennifer D’Souza for their contributions to this work.
PY - 2022/6/20
Y1 - 2022/6/20
N2 - As the number of published scholarly articles grows steadily each year, new methods are needed to organize scholarly knowledge so that it can be more efficiently discovered and used. Natural Language Processing (NLP) techniques are able to autonomously process scholarly articles at scale and to create machine readable representations of the article content. However, autonomous NLP methods are by far not sufficiently accurate to create a high-quality knowledge graph. Yet quality is crucial for the graph to be useful in practice. We present TinyGenius, a methodology to validate NLP-extracted scholarly knowledge statements using microtasks performed with crowdsourcing. The scholarly context in which the crowd workers operate has multiple challenges. The explainability of the employed NLP methods is crucial to provide context in order to support the decision process of crowd workers. We employed TinyGenius to populate a paper-centric knowledge graph, using five distinct NLP methods. In the end, the resulting knowledge graph serves as a digital library for scholarly articles.
AB - As the number of published scholarly articles grows steadily each year, new methods are needed to organize scholarly knowledge so that it can be more efficiently discovered and used. Natural Language Processing (NLP) techniques are able to autonomously process scholarly articles at scale and to create machine readable representations of the article content. However, autonomous NLP methods are by far not sufficiently accurate to create a high-quality knowledge graph. Yet quality is crucial for the graph to be useful in practice. We present TinyGenius, a methodology to validate NLP-extracted scholarly knowledge statements using microtasks performed with crowdsourcing. The scholarly context in which the crowd workers operate has multiple challenges. The explainability of the employed NLP methods is crucial to provide context in order to support the decision process of crowd workers. We employed TinyGenius to populate a paper-centric knowledge graph, using five distinct NLP methods. In the end, the resulting knowledge graph serves as a digital library for scholarly articles.
KW - Crowdsourcing Microtasks
KW - Intelligent User Interfaces
KW - Knowledge Graph Validation
KW - Scholarly Knowledge Graphs
UR - http://www.scopus.com/inward/record.url?scp=85133225808&partnerID=8YFLogxK
U2 - 10.1145/3529372.3533285
DO - 10.1145/3529372.3533285
M3 - Conference contribution
AN - SCOPUS:85133225808
T3 - Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
BT - JCDL 2022 - Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2022
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 20 June 2022 through 24 June 2022
ER -
