Effective Context Selection in LLM-Based Leaderboard Generation: An Empirical Study

Salomon Kabongo; Jennifer D’Souza; Sören Auer

doi:10.1007/978-3-031-70242-6_15

Details

Original language	English
Title of host publication	Natural Language Processing and Information Systems
Subtitle	29th International Conference on Applications of Natural Language to Information Systems, NLDB 2024, Proceedings
Editors	Amon Rapp, Luigi Di Caro, Farid Meziane, Vijayan Sugumaran
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	150-160
Number of pages	11
ISBN (electronic)	978-3-031-70242-6
ISBN (print)	9783031702419
Publication status	Published - 20 Sept. 2024
Event	29th International Conference on Natural Language and Information Systems, NLDB 2024 - Turin, Italy. Duration: 25 June 2024 → 27 June 2024

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	14763 LNCS
ISSN (print)	0302-9743
ISSN (electronic)	1611-3349

Abstract

This paper explores the impact of context selection on the efficiency of Large Language Models (LLMs) in generating Artificial Intelligence (AI) research leaderboards, a task defined as the extraction of (Task, Dataset, Metric, Score) quadruples from scholarly articles. By framing this challenge as a text generation objective and employing instruction finetuning with the FLAN-T5 collection, we introduce a novel method that surpasses traditional Natural Language Inference (NLI) approaches in adapting to new developments without a predefined taxonomy. Through experimentation with three distinct context types of varying selectivity and length, our study demonstrates the importance of effective context selection in enhancing LLM accuracy and reducing hallucinations, providing a new pathway for the reliable and efficient generation of AI leaderboards. This contribution not only advances the state of the art in leaderboard generation but also sheds light on strategies to mitigate common challenges in LLM-based information extraction.

ASJC Scopus subject areas

Mathematics (all)
Theoretical Computer Science
Computer Science (all)
General Computer Science

Cite

Effective Context Selection in LLM-Based Leaderboard Generation: An Empirical Study. / Kabongo, Salomon; D’Souza, Jennifer; Auer, Sören.
Natural Language Processing and Information Systems : 29th International Conference on Applications of Natural Language to Information Systems, NLDB 2024, Proceedings. ed. / Amon Rapp; Luigi Di Caro; Farid Meziane; Vijayan Sugumaran. Springer Science and Business Media Deutschland GmbH, 2024. p. 150-160 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 14763 LNCS).

Publication: Chapter in book/report/conference proceeding › Conference contribution › Research › Peer-reviewed

Kabongo, S, D’Souza, J & Auer, S 2024, Effective Context Selection in LLM-Based Leaderboard Generation: An Empirical Study. in A Rapp, L Di Caro, F Meziane & V Sugumaran (eds), Natural Language Processing and Information Systems : 29th International Conference on Applications of Natural Language to Information Systems, NLDB 2024, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 14763 LNCS, Springer Science and Business Media Deutschland GmbH, pp. 150-160, 29th International Conference on Natural Language and Information Systems, NLDB 2024, Turin, Italy, 25 June 2024. https://doi.org/10.1007/978-3-031-70242-6_15

Kabongo, S., D’Souza, J., & Auer, S. (2024). Effective Context Selection in LLM-Based Leaderboard Generation: An Empirical Study. In A. Rapp, L. Di Caro, F. Meziane, & V. Sugumaran (Eds.), Natural Language Processing and Information Systems : 29th International Conference on Applications of Natural Language to Information Systems, NLDB 2024, Proceedings (pp. 150-160). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 14763 LNCS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-70242-6_15

Kabongo S, D’Souza J, Auer S. Effective Context Selection in LLM-Based Leaderboard Generation: An Empirical Study. In Rapp A, Di Caro L, Meziane F, Sugumaran V, editors, Natural Language Processing and Information Systems : 29th International Conference on Applications of Natural Language to Information Systems, NLDB 2024, Proceedings. Springer Science and Business Media Deutschland GmbH. 2024. p. 150-160. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-031-70242-6_15

Kabongo, Salomon ; D’Souza, Jennifer ; Auer, Sören. / Effective Context Selection in LLM-Based Leaderboard Generation : An Empirical Study. Natural Language Processing and Information Systems : 29th International Conference on Applications of Natural Language to Information Systems, NLDB 2024, Proceedings. ed. / Amon Rapp ; Luigi Di Caro ; Farid Meziane ; Vijayan Sugumaran. Springer Science and Business Media Deutschland GmbH, 2024. p. 150-160 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{2ade8ea55580423cb8c7f6de5411d9fb,
title = "Effective Context Selection in LLM-Based Leaderboard Generation: An Empirical Study",
abstract = "This paper explores the impact of context selection on the efficiency of Large Language Models (LLMs) in generating Artificial Intelligence (AI) research leaderboards, a task defined as the extraction of (Task, Dataset, Metric, Score) quadruples from scholarly articles. By framing this challenge as a text generation objective and employing instruction finetuning with the FLAN-T5 collection, we introduce a novel method that surpasses traditional Natural Language Inference (NLI) approaches in adapting to new developments without a predefined taxonomy. Through experimentation with three distinct context types of varying selectivity and length, our study demonstrates the importance of effective context selection in enhancing LLM accuracy and reducing hallucinations, providing a new pathway for the reliable and efficient generation of AI leaderboards. This contribution not only advances the state of the art in leaderboard generation but also sheds light on strategies to mitigate common challenges in LLM-based information extraction.",

author = "Salomon Kabongo and Jennifer D{\textquoteright}Souza and S{\"o}ren Auer",
note = "Publisher Copyright: {\textcopyright} The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.; 29th International Conference on Natural Language and Information Systems, NLDB 2024 ; Conference date: 25-06-2024 Through 27-06-2024",
year = "2024",
month = sep,
day = "20",
doi = "10.1007/978-3-031-70242-6_15",
language = "English",
isbn = "9783031702419",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Science and Business Media Deutschland GmbH",
pages = "150--160",
editor = "Amon Rapp and {Di Caro}, Luigi and Farid Meziane and Vijayan Sugumaran",
booktitle = "Natural Language Processing and Information Systems",
address = "Germany",
}

TY - GEN
T1 - Effective Context Selection in LLM-Based Leaderboard Generation
T2 - 29th International Conference on Natural Language and Information Systems, NLDB 2024
AU - Kabongo, Salomon
AU - D’Souza, Jennifer
AU - Auer, Sören
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024/9/20
Y1 - 2024/9/20
N2 - This paper explores the impact of context selection on the efficiency of Large Language Models (LLMs) in generating Artificial Intelligence (AI) research leaderboards, a task defined as the extraction of (Task, Dataset, Metric, Score) quadruples from scholarly articles. By framing this challenge as a text generation objective and employing instruction finetuning with the FLAN-T5 collection, we introduce a novel method that surpasses traditional Natural Language Inference (NLI) approaches in adapting to new developments without a predefined taxonomy. Through experimentation with three distinct context types of varying selectivity and length, our study demonstrates the importance of effective context selection in enhancing LLM accuracy and reducing hallucinations, providing a new pathway for the reliable and efficient generation of AI leaderboards. This contribution not only advances the state of the art in leaderboard generation but also sheds light on strategies to mitigate common challenges in LLM-based information extraction.

UR - http://www.scopus.com/inward/record.url?scp=85205459491&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-70242-6_15
DO - 10.1007/978-3-031-70242-6_15
M3 - Conference contribution
AN - SCOPUS:85205459491
SN - 9783031702419
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 150
EP - 160
BT - Natural Language Processing and Information Systems
A2 - Rapp, Amon
A2 - Di Caro, Luigi
A2 - Meziane, Farid
A2 - Sugumaran, Vijayan
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 25 June 2024 through 27 June 2024
ER -

By the same authors

DataDesc: A framework for creating and sharing technical metadata for research software interfaces

Organizing Scientific Knowledge from Engineering Sciences Using the Open Research Knowledge Graph: The Tailored Forming Process Chain Use Case

A Neuro-Symbolic Approach for Faceted Search in Digital Libraries

Leveraging GPT Models For Semantic Table Annotation

Managing Comprehensive Research Instrument Descriptions Within a Scholarly Knowledge Graph
