Automated Mining of Leaderboards for Empirical AI Research

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

  • Salomon Kabongo
  • Jennifer D’Souza
  • Sören Auer

Research Organisations

External Research Organisations

  • German National Library of Science and Technology (TIB)

Details

Original language: English
Title of host publication: Towards Open and Trustworthy Digital Societies
Subtitle of host publication: 23rd International Conference on Asia-Pacific Digital Libraries, ICADL 2021, Proceedings
Editors: Hao-Ren Ke, Chei Sian Lee, Kazunari Sugiyama
Publisher: Springer Nature Switzerland AG
Pages: 453-470
Number of pages: 18
ISBN (electronic): 978-3-030-91669-5
ISBN (print): 9783030916688
Publication status: Published - 2021
Event: 23rd International Conference on Asia-Pacific Digital Libraries, ICADL 2021 - Virtual, Online
Duration: 1 Dec 2021 - 3 Dec 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13133
ISSN (Print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

With the rapid growth of research publications, empowering scientists to maintain an overview of scientific progress is of paramount importance. In this regard, the leaderboards facet of information organization provides an overview of the state of the art by aggregating empirical results from various studies addressing the same research challenge. Crowdsourcing efforts such as PapersWithCode, among others, are devoted to constructing leaderboards, predominantly for subdomains of Artificial Intelligence. Leaderboards provide machine-readable scholarly knowledge that has proven directly useful for scientists tracking research progress, and their construction could be greatly expedited with automated text mining. This study presents a comprehensive approach for generating leaderboards for knowledge-graph-based scholarly information organization. Specifically, we investigate the problem of automated leaderboard construction using state-of-the-art transformer models, viz. BERT, SciBERT, and XLNet. Our analysis reveals an optimal approach that significantly outperforms existing baselines for the task, with evaluation scores above 90% in F1. This, in turn, offers new state-of-the-art results for leaderboard extraction. As a result, a vast share of empirical AI research can be organized in next-generation digital libraries as knowledge graphs.

Keywords

    Information extraction, Knowledge graphs, Neural machine learning, Scholarly text mining, Table mining

Cite this

Automated Mining of Leaderboards for Empirical AI Research. / Kabongo, Salomon; D’Souza, Jennifer; Auer, Sören.
Towards Open and Trustworthy Digital Societies: 23rd International Conference on Asia-Pacific Digital Libraries, ICADL 2021, Proceedings. ed. / Hao-Ren Ke; Chei Sian Lee; Kazunari Sugiyama. Springer Nature Switzerland AG, 2021. p. 453-470 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 13133).

Kabongo, S, D’Souza, J & Auer, S 2021, Automated Mining of Leaderboards for Empirical AI Research. in H-R Ke, CS Lee & K Sugiyama (eds), Towards Open and Trustworthy Digital Societies: 23rd International Conference on Asia-Pacific Digital Libraries, ICADL 2021, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 13133, Springer Nature Switzerland AG, pp. 453-470, 23rd International Conference on Asia-Pacific Digital Libraries, ICADL 2021, Virtual, Online, 1 Dec 2021. https://doi.org/10.1007/978-3-030-91669-5_35
Kabongo, S., D’Souza, J., & Auer, S. (2021). Automated Mining of Leaderboards for Empirical AI Research. In H.-R. Ke, C. S. Lee, & K. Sugiyama (Eds.), Towards Open and Trustworthy Digital Societies: 23rd International Conference on Asia-Pacific Digital Libraries, ICADL 2021, Proceedings (pp. 453-470). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 13133). Springer Nature Switzerland AG. https://doi.org/10.1007/978-3-030-91669-5_35
Kabongo S, D’Souza J, Auer S. Automated Mining of Leaderboards for Empirical AI Research. In Ke HR, Lee CS, Sugiyama K, editors, Towards Open and Trustworthy Digital Societies: 23rd International Conference on Asia-Pacific Digital Libraries, ICADL 2021, Proceedings. Springer Nature Switzerland AG. 2021. p. 453-470. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). Epub 2021 Nov 30. doi: 10.1007/978-3-030-91669-5_35
Kabongo, Salomon ; D’Souza, Jennifer ; Auer, Sören. / Automated Mining of Leaderboards for Empirical AI Research. Towards Open and Trustworthy Digital Societies: 23rd International Conference on Asia-Pacific Digital Libraries, ICADL 2021, Proceedings. editor / Hao-Ren Ke ; Chei Sian Lee ; Kazunari Sugiyama. Springer Nature Switzerland AG, 2021. pp. 453-470 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{065f5bac741f4214a36a6fc4397a0884,
title = "Automated Mining of Leaderboards for Empirical AI Research",
abstract = "With the rapid growth of research publications, empowering scientists to keep an oversight over scientific progress is of paramount importance. In this regard, the leaderboards facet of information organization provides an overview on the state-of-the-art by aggregating empirical results from various studies addressing the same research challenge. Crowdsourcing efforts like PapersWithCode among others are devoted to the construction of leaderboards predominantly for various subdomains in Artificial Intelligence. Leaderboards provide machine-readable scholarly knowledge that has proven to be directly useful for scientists to keep track of research progress – their construction could be greatly expedited with automated text mining. This study presents a comprehensive approach for generating leaderboards for knowledge-graph-based scholarly information organization. Specifically, we investigate the problem of automated leaderboard construction using state-of-the-art transformer models, viz. Bert, SciBert, and XLNet. Our analysis reveals an optimal approach that significantly outperforms existing baselines for the task with evaluation scores above 90% in F1. This, in turn, offers new state-of-the-art results for leaderboard extraction. As a result, a vast share of empirical AI research can be organized in the next-generation digital libraries as knowledge graphs.",
keywords = "Information extraction, Knowledge graphs, Neural machine learning, Scholarly text mining, Table mining",
author = "Salomon Kabongo and Jennifer D{\textquoteright}Souza and S{\"o}ren Auer",
note = "Funding Information: This work was co-funded by the Federal Ministry of Education and Research (BMBF) of Germany for the project LeibnizKILabor (grant no. 01DD20003) and by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536).; 23rd International Conference on Asia-Pacific Digital Libraries, ICADL 2021 ; Conference date: 01-12-2021 Through 03-12-2021",
year = "2021",
doi = "10.1007/978-3-030-91669-5_35",
language = "English",
isbn = "9783030916688",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Nature Switzerland AG",
pages = "453--470",
editor = "Hao-Ren Ke and Lee, {Chei Sian} and Kazunari Sugiyama",
booktitle = "Towards Open and Trustworthy Digital Societies",
address = "Switzerland",

}

TY - GEN

T1 - Automated Mining of Leaderboards for Empirical AI Research

AU - Kabongo, Salomon

AU - D’Souza, Jennifer

AU - Auer, Sören

N1 - Funding Information: This work was co-funded by the Federal Ministry of Education and Research (BMBF) of Germany for the project LeibnizKILabor (grant no. 01DD20003) and by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536).

PY - 2021

Y1 - 2021

AB - With the rapid growth of research publications, empowering scientists to keep an oversight over scientific progress is of paramount importance. In this regard, the leaderboards facet of information organization provides an overview on the state-of-the-art by aggregating empirical results from various studies addressing the same research challenge. Crowdsourcing efforts like PapersWithCode among others are devoted to the construction of leaderboards predominantly for various subdomains in Artificial Intelligence. Leaderboards provide machine-readable scholarly knowledge that has proven to be directly useful for scientists to keep track of research progress – their construction could be greatly expedited with automated text mining. This study presents a comprehensive approach for generating leaderboards for knowledge-graph-based scholarly information organization. Specifically, we investigate the problem of automated leaderboard construction using state-of-the-art transformer models, viz. Bert, SciBert, and XLNet. Our analysis reveals an optimal approach that significantly outperforms existing baselines for the task with evaluation scores above 90% in F1. This, in turn, offers new state-of-the-art results for leaderboard extraction. As a result, a vast share of empirical AI research can be organized in the next-generation digital libraries as knowledge graphs.

KW - Information extraction

KW - Knowledge graphs

KW - Neural machine learning

KW - Scholarly text mining

KW - Table mining

UR - http://www.scopus.com/inward/record.url?scp=85121928250&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-91669-5_35

DO - 10.1007/978-3-030-91669-5_35

M3 - Conference contribution

AN - SCOPUS:85121928250

SN - 9783030916688

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 453

EP - 470

BT - Towards Open and Trustworthy Digital Societies

A2 - Ke, Hao-Ren

A2 - Lee, Chei Sian

A2 - Sugiyama, Kazunari

PB - Springer Nature Switzerland AG

T2 - 23rd International Conference on Asia-Pacific Digital Libraries, ICADL 2021

Y2 - 1 December 2021 through 3 December 2021

ER -
