Details
| Original language | English |
| --- | --- |
| Title of host publication | 2023 ACM/IEEE Joint Conference on Digital Libraries |
| Subtitle of host publication | JCDL |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 237-241 |
| Number of pages | 5 |
| ISBN (electronic) | 979-8-3503-9931-8 |
| ISBN (print) | 979-8-3503-9932-5 |
| Publication status | Published - 2023 |
| Event | 2023 ACM/IEEE Joint Conference on Digital Libraries, JCDL 2023, Santa Fe, United States, 26 Jun 2023 → 30 Jun 2023 |
Publication series
| Name | Proceedings of the ACM/IEEE Joint Conference on Digital Libraries |
| --- | --- |
| Volume | 2023-June |
| ISSN (print) | 1552-5996 |
Abstract
We present a large-scale empirical investigation of the zero-shot learning phenomenon in a specific recognizing textual entailment (RTE) task category: the automated mining of LEADERBOARDS for Empirical AI Research. Previously reported state-of-the-art models for LEADERBOARD extraction, formulated as an RTE task in a non-zero-shot setting, are promising, with reported performances above 90%. However, a central research question remains unexamined: did the models actually learn entailment? For the experiments in this paper, two previously reported state-of-the-art models are therefore tested out of the box for their ability to generalize, that is, their capacity for entailment, given LEADERBOARD labels that were unseen during training. We hypothesize that if the models learned entailment, their zero-shot performances should be moderately high as well; concretely, at least better than chance. As a result of this work, a zero-shot labeled dataset is created via distant labeling, formulating the LEADERBOARD extraction RTE task.
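For context, the sketch below illustrates the entailment formulation the abstract describes: a passage from a paper serves as the premise, and candidate LEADERBOARD labels serve as hypotheses scored by an NLI model; labels the model never saw during training make the evaluation zero-shot. This is a minimal illustration under stated assumptions, not the paper's own systems: the checkpoint (facebook/bart-large-mnli), the example passage, and the label strings are all illustrative stand-ins.

```python
from transformers import pipeline

# Hedged sketch: an off-the-shelf NLI checkpoint stands in for the models
# evaluated in the paper; model name and labels below are assumptions.
nli = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# Premise: text drawn from a scholarly article (illustrative example).
premise = (
    "We evaluate our approach on the SQuAD benchmark and report an F1 "
    "score of 93.2, outperforming previously published systems."
)

# Hypotheses: candidate LEADERBOARD (task, dataset, metric) labels,
# including ones unseen during training, i.e., the zero-shot case.
candidate_labels = [
    "question answering on SQuAD evaluated with F1",
    "machine translation on WMT14 evaluated with BLEU",
    "image classification on ImageNet evaluated with top-1 accuracy",
]

# The pipeline scores each label via entailment between premise and label.
result = nli(premise, candidate_labels)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{score:.3f}  {label}")
```

A model that truly learned entailment should rank the correct label above chance even when that label never appeared in its training data; chance-level ranking would instead suggest the model memorized label strings rather than learning entailment.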
Keywords
- Entailment
- Information-Extraction
- Leaderboard
- Natural-Language-Inference
ASJC Scopus subject areas
- General Engineering
Cite this
Kabongo, S., D'Souza, J., & Auer, S. (2023). Zero-shot Entailment of Leaderboards for Empirical AI Research. In 2023 ACM/IEEE Joint Conference on Digital Libraries: JCDL (pp. 237-241). Institute of Electrical and Electronics Engineers Inc. (Proceedings of the ACM/IEEE Joint Conference on Digital Libraries; Vol. 2023-June). https://doi.org/10.48550/arXiv.2303.16835
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
RIS
TY - GEN
T1 - Zero-shot Entailment of Leaderboards for Empirical AI Research
AU - Kabongo, Salomon
AU - D'Souza, Jennifer
AU - Auer, Sören
N1 - Funding Information: This work was co-funded by the Federal Ministry of Education and Research (BMBF) of Germany for the project LeibnizKILabor (grant no. 01DD20003), the BMBF project SCINEXT (GA ID: 01IS22070), NFDI4DataScience (grant no. 460234259), and by the European Research Council for the project ScienceGRAPH (grant agreement ID: 819536).
PY - 2023
Y1 - 2023
N2 - We present a large-scale empirical investigation of the zero-shot learning phenomenon in a specific recognizing textual entailment (RTE) task category: the automated mining of LEADERBOARDS for Empirical AI Research. Previously reported state-of-the-art models for LEADERBOARD extraction, formulated as an RTE task in a non-zero-shot setting, are promising, with reported performances above 90%. However, a central research question remains unexamined: did the models actually learn entailment? For the experiments in this paper, two previously reported state-of-the-art models are therefore tested out of the box for their ability to generalize, that is, their capacity for entailment, given LEADERBOARD labels that were unseen during training. We hypothesize that if the models learned entailment, their zero-shot performances should be moderately high as well; concretely, at least better than chance. As a result of this work, a zero-shot labeled dataset is created via distant labeling, formulating the LEADERBOARD extraction RTE task.
AB - We present a large-scale empirical investigation of the zero-shot learning phenomenon in a specific recognizing textual entailment (RTE) task category: the automated mining of LEADERBOARDS for Empirical AI Research. Previously reported state-of-the-art models for LEADERBOARD extraction, formulated as an RTE task in a non-zero-shot setting, are promising, with reported performances above 90%. However, a central research question remains unexamined: did the models actually learn entailment? For the experiments in this paper, two previously reported state-of-the-art models are therefore tested out of the box for their ability to generalize, that is, their capacity for entailment, given LEADERBOARD labels that were unseen during training. We hypothesize that if the models learned entailment, their zero-shot performances should be moderately high as well; concretely, at least better than chance. As a result of this work, a zero-shot labeled dataset is created via distant labeling, formulating the LEADERBOARD extraction RTE task.
KW - Entailment
KW - Information-Extraction
KW - Leaderboard
KW - Natural-Language-Inference
UR - http://www.scopus.com/inward/record.url?scp=85174576379&partnerID=8YFLogxK
U2 - 10.48550/arXiv.2303.16835
DO - 10.48550/arXiv.2303.16835
M3 - Conference contribution
AN - SCOPUS:85174576379
SN - 979-8-3503-9932-5
T3 - Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
SP - 237
EP - 241
BT - 2023 ACM/IEEE Joint Conference on Digital Libraries
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 ACM/IEEE Joint Conference on Digital Libraries, JCDL 2023
Y2 - 26 June 2023 through 30 June 2023
ER -