SemEval-2021 Task 11: NLPCONTRIBUTIONGRAPH - Structuring Scholarly NLP Contributions for a Research Knowledge Graph

Jennifer D’Souza; Sören Auer; Ted Pedersen

doi:10.18653/v1/2021.semeval-1.44

Details

Original language	English
Title of host publication	SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop
Editors	Alexis Palmer, Nathan Schneider, Natalie Schluter, Guy Emerson, Aurelie Herbelot, Xiaodan Zhu
Publisher	Association for Computational Linguistics (ACL)
Pages	364-376
Number of pages	13
ISBN (electronic)	9781954085701
Publication status	Published - 2021
Externally published	Yes
Event	15th International Workshop on Semantic Evaluation, SemEval 2021 - Virtual, Bangkok, Thailand
Duration	5 Aug 2021 → 6 Aug 2021

Publication series

Name	SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop

Abstract

There is currently a gap between the natural language expression of scholarly publications and their structured semantic content modeling to enable intelligent content search. With the volume of research growing exponentially every year, a search feature operating over semantically structured content is compelling. The SemEval-2021 Shared Task NLPCONTRIBUTIONGRAPH (a.k.a. ‘the NCG task’) tasks participants to develop automated systems that structure contributions from NLP scholarly articles in the English language. Being the first-of-its-kind in the SemEval series, the task released structured data from NLP scholarly articles at three levels of information granularity, i.e. at sentence-level, phrase-level, and phrases organized as triples toward Knowledge Graph (KG) building. The sentence-level annotations comprised the few sentences about the article’s contribution. The phrase-level annotations were scientific term and predicate phrases from the contribution sentences. Finally, the triples constituted the research overview KG. For the Shared Task, participating systems were then expected to automatically classify contribution sentences, extract scientific terms and relations from the sentences, and organize them as KG triples. Overall, the task drew a strong participation demographic of seven teams and 27 participants. The best end-to-end task system classified contribution sentences at 57.27% F1, phrases at 46.41% F1, and triples at 22.28% F1. While the absolute performance to generate triples remains low, in the conclusion of this article, the difficulty of producing such data and as a consequence of modeling it is highlighted.
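The three annotation levels described in the abstract (contribution sentences, phrases, and triples) can be pictured as nested records. The following is a minimal sketch only: the class names, field names, example sentence, and example triples are invented for illustration and are not the task's released data format. The exact-match F1 function is likewise an assumption about how a set of predicted triples might be scored against a gold set, not the official evaluation script.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Triple:
    """One (subject, predicate, object) statement destined for the KG."""
    subject: str
    predicate: str
    obj: str


@dataclass
class ContributionSentence:
    """Hypothetical container tying the three granularities together."""
    text: str                                             # sentence level
    phrases: list[str] = field(default_factory=list)      # phrase level
    triples: list[Triple] = field(default_factory=list)   # triple level


def f1_score(predicted: set, gold: set) -> float:
    """Exact-match F1 between a predicted set and a gold set."""
    if not predicted or not gold:
        return 0.0
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Invented example, not taken from the task data.
sentence = ContributionSentence(
    text="We propose a BiLSTM-CRF model for sequence labeling.",
    phrases=["propose", "BiLSTM-CRF model", "for", "sequence labeling"],
    triples=[
        Triple("Contribution", "propose", "BiLSTM-CRF model"),
        Triple("BiLSTM-CRF model", "for", "sequence labeling"),
    ],
)

gold_triples = set(sentence.triples)
predicted_triples = {
    Triple("Contribution", "propose", "BiLSTM-CRF model"),   # matches gold
    Triple("BiLSTM-CRF model", "uses", "sequence labeling"),  # wrong predicate
}
triple_f1 = f1_score(predicted_triples, gold_triples)
```

Because a triple only counts as correct when all three of its parts match, a single wrong predicate costs a full triple, which is one plausible reason triple-level scores sit well below sentence-level and phrase-level scores.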

ASJC Scopus subject areas

Computer Science(all)
Computational Theory and Mathematics
Computer Science Applications
Mathematics(all)
Theoretical Computer Science

Cite this

SemEval-2021 Task 11: NLPCONTRIBUTIONGRAPH - Structuring Scholarly NLP Contributions for a Research Knowledge Graph. / D’Souza, Jennifer; Auer, Sören; Pedersen, Ted.
SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop. ed. / Alexis Palmer; Nathan Schneider; Natalie Schluter; Guy Emerson; Aurelie Herbelot; Xiaodan Zhu. Association for Computational Linguistics (ACL), 2021. p. 364-376 (SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop).

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

D’Souza, J, Auer, S & Pedersen, T 2021, SemEval-2021 Task 11: NLPCONTRIBUTIONGRAPH - Structuring Scholarly NLP Contributions for a Research Knowledge Graph. in A Palmer, N Schneider, N Schluter, G Emerson, A Herbelot & X Zhu (eds), SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop. SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop, Association for Computational Linguistics (ACL), pp. 364-376, 15th International Workshop on Semantic Evaluation, SemEval 2021, Virtual, Bangkok, Thailand, 5 Aug 2021. https://doi.org/10.18653/v1/2021.semeval-1.44

D’Souza, J., Auer, S., & Pedersen, T. (2021). SemEval-2021 Task 11: NLPCONTRIBUTIONGRAPH - Structuring Scholarly NLP Contributions for a Research Knowledge Graph. In A. Palmer, N. Schneider, N. Schluter, G. Emerson, A. Herbelot, & X. Zhu (Eds.), SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop (pp. 364-376). (SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.semeval-1.44

D’Souza J, Auer S, Pedersen T. SemEval-2021 Task 11: NLPCONTRIBUTIONGRAPH - Structuring Scholarly NLP Contributions for a Research Knowledge Graph. In Palmer A, Schneider N, Schluter N, Emerson G, Herbelot A, Zhu X, editors, SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop. Association for Computational Linguistics (ACL). 2021. p. 364-376. (SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop). doi: 10.18653/v1/2021.semeval-1.44

D’Souza, Jennifer ; Auer, Sören ; Pedersen, Ted. / SemEval-2021 Task 11 : NLPCONTRIBUTIONGRAPH - Structuring Scholarly NLP Contributions for a Research Knowledge Graph. SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop. editor / Alexis Palmer ; Nathan Schneider ; Natalie Schluter ; Guy Emerson ; Aurelie Herbelot ; Xiaodan Zhu. Association for Computational Linguistics (ACL), 2021. pp. 364-376 (SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop).

@inproceedings{1cd87ff79e2f4082af24f6dfcb53c19b,

title = "SemEval-2021 Task 11: NLPCONTRIBUTIONGRAPH - Structuring Scholarly NLP Contributions for a Research Knowledge Graph",

abstract = "There is currently a gap between the natural language expression of scholarly publications and their structured semantic content modeling to enable intelligent content search. With the volume of research growing exponentially every year, a search feature operating over semantically structured content is compelling. The SemEval-2021 Shared Task NLPCONTRIBUTIONGRAPH (a.k.a. {\textquoteleft}the NCG task{\textquoteright}) tasks participants to develop automated systems that structure contributions from NLP scholarly articles in the English language. Being the first-of-its-kind in the SemEval series, the task released structured data from NLP scholarly articles at three levels of information granularity, i.e. at sentence-level, phrase-level, and phrases organized as triples toward Knowledge Graph (KG) building. The sentence-level annotations comprised the few sentences about the article{\textquoteright}s contribution. The phrase-level annotations were scientific term and predicate phrases from the contribution sentences. Finally, the triples constituted the research overview KG. For the Shared Task, participating systems were then expected to automatically classify contribution sentences, extract scientific terms and relations from the sentences, and organize them as KG triples. Overall, the task drew a strong participation demographic of seven teams and 27 participants. The best end-to-end task system classified contribution sentences at 57.27% F1, phrases at 46.41% F1, and triples at 22.28% F1. While the absolute performance to generate triples remains low, in the conclusion of this article, the difficulty of producing such data and as a consequence of modeling it is highlighted.",

author = "Jennifer D{\textquoteright}Souza and S{\"o}ren Auer and Ted Pedersen",

note = "Funding Information: We thank the anonymous reviewers for their comments and suggestions. This work was co-funded by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536) and by the TIB Leibniz Information Centre for Science and Technology. ; 15th International Workshop on Semantic Evaluation, SemEval 2021 ; Conference date: 05-08-2021 Through 06-08-2021",

year = "2021",

doi = "10.18653/v1/2021.semeval-1.44",

language = "English",

series = "SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop",

publisher = "Association for Computational Linguistics (ACL)",

pages = "364--376",

editor = "Alexis Palmer and Nathan Schneider and Natalie Schluter and Guy Emerson and Aurelie Herbelot and Xiaodan Zhu",

booktitle = "SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop",

address = "Bangkok, Thailand",

}

TY - GEN

T1 - SemEval-2021 Task 11: NLPCONTRIBUTIONGRAPH - Structuring Scholarly NLP Contributions for a Research Knowledge Graph

T2 - 15th International Workshop on Semantic Evaluation, SemEval 2021

AU - D’Souza, Jennifer

AU - Auer, Sören

AU - Pedersen, Ted

N1 - Funding Information: We thank the anonymous reviewers for their comments and suggestions. This work was co-funded by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536) and by the TIB Leibniz Information Centre for Science and Technology.

PY - 2021

Y1 - 2021

AB - There is currently a gap between the natural language expression of scholarly publications and their structured semantic content modeling to enable intelligent content search. With the volume of research growing exponentially every year, a search feature operating over semantically structured content is compelling. The SemEval-2021 Shared Task NLPCONTRIBUTIONGRAPH (a.k.a. ‘the NCG task’) tasks participants to develop automated systems that structure contributions from NLP scholarly articles in the English language. Being the first-of-its-kind in the SemEval series, the task released structured data from NLP scholarly articles at three levels of information granularity, i.e. at sentence-level, phrase-level, and phrases organized as triples toward Knowledge Graph (KG) building. The sentence-level annotations comprised the few sentences about the article’s contribution. The phrase-level annotations were scientific term and predicate phrases from the contribution sentences. Finally, the triples constituted the research overview KG. For the Shared Task, participating systems were then expected to automatically classify contribution sentences, extract scientific terms and relations from the sentences, and organize them as KG triples. Overall, the task drew a strong participation demographic of seven teams and 27 participants. The best end-to-end task system classified contribution sentences at 57.27% F1, phrases at 46.41% F1, and triples at 22.28% F1. While the absolute performance to generate triples remains low, in the conclusion of this article, the difficulty of producing such data and as a consequence of modeling it is highlighted.

UR - http://www.scopus.com/inward/record.url?scp=85132941197&partnerID=8YFLogxK

U2 - 10.18653/v1/2021.semeval-1.44

DO - 10.18653/v1/2021.semeval-1.44

M3 - Conference contribution

AN - SCOPUS:85132941197

T3 - SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop

SP - 364

EP - 376

BT - SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop

A2 - Palmer, Alexis

A2 - Schneider, Nathan

A2 - Schluter, Natalie

A2 - Emerson, Guy

A2 - Herbelot, Aurelie

A2 - Zhu, Xiaodan

PB - Association for Computational Linguistics (ACL)

Y2 - 5 August 2021 through 6 August 2021

ER -
