
Argument Quality Assessment in the Age of Instruction-Following Large Language Models

Publication: Contribution to book/report/anthology/conference proceedings › Conference paper › Research › Peer-reviewed

Authorship

External organisations

  • Universität Stuttgart
  • Université Côte d'Azur
  • Universität Hamburg
  • University of Richmond

Details

Original language: English
Title of host publication: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Pages: 1519-1538
Publication status: Published - May 2024
Event: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 - Hybrid, Torino, Italy
Duration: 20 May 2024 – 25 May 2024

Abstract

The computational treatment of arguments on controversial issues has been subject to extensive NLP research, due to its envisioned impact on opinion formation, decision making, writing education, and the like. A critical task in any such application is the assessment of an argument’s quality - but it is also particularly challenging. In this position paper, we start from a brief survey of argument quality research, where we identify the diversity of quality notions and the subjectiveness of their perception as the main hurdles towards substantial progress on argument quality assessment. We argue that the capabilities of instruction-following large language models (LLMs) to leverage knowledge across contexts enable a much more reliable assessment. Rather than just fine-tuning LLMs towards leaderboard chasing on assessment tasks, they need to be instructed systematically with argumentation theories and scenarios as well as with ways to solve argument-related problems. We discuss the real-world opportunities and ethical issues emerging thereby.
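
The paper gives no implementation here, but a minimal sketch of the instruction-based assessment it advocates (as opposed to fine-tuning) might look as follows. This assumes the OpenAI Python client; the model name, rubric wording, and example argument are hypothetical illustrations, not the authors' method. The three dimensions in the rubric (cogency, effectiveness, reasonableness) follow the taxonomy of Wachsmuth et al. (2017).

# Hypothetical sketch: instructing an LLM with an argumentation-theoretic
# rubric at inference time, rather than fine-tuning it on a leaderboard task.
# Assumes the OpenAI Python client (openai>=1.0); the model name and rubric
# wording are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

RUBRIC = """You assess argument quality. Rate the argument below on three
dimensions from argumentation theory, each from 1 (very low) to 5 (very high):
- Cogency: are the premises acceptable, relevant, and sufficient?
- Effectiveness: is the argument likely to persuade its audience?
- Reasonableness: does it contribute to resolving the issue fairly?
Return one line per dimension: <dimension>: <score> - <one-sentence reason>."""

def assess_argument(argument: str, issue: str) -> str:
    """Ask the model to score a single argument against the rubric."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user", "content": f"Issue: {issue}\nArgument: {argument}"},
        ],
        temperature=0,  # reduce variance in the scores
    )
    return response.choices[0].message.content

print(assess_argument(
    "School uniforms reduce bullying because visible income differences shrink.",
    "Should schools require uniforms?",
))

The design point the sketch illustrates is the paper's thesis in miniature: the quality notions live in the instructions, where they can be made explicit, varied per scenario, and audited, rather than being baked implicitly into model weights.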

Cite

Argument Quality Assessment in the Age of Instruction-Following Large Language Models. / Wachsmuth, Henning; Lapesa, Gabriella; Cabrio, Elena et al.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). Ed. / Nicoletta Calzolari; Min-Yen Kan; Veronique Hoste; Alessandro Lenci; Sakriani Sakti; Nianwen Xue. 2024. pp. 1519-1538.


Wachsmuth, H, Lapesa, G, Cabrio, E, Lauscher, A, Park, J, Vecchi, EM, Villata, S & Ziegenbein, T 2024, Argument Quality Assessment in the Age of Instruction-Following Large Language Models. in N Calzolari, M-Y Kan, V Hoste, A Lenci, S Sakti & N Xue (eds), Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). pp. 1519-1538, Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024, Hybrid, Torino, Italy, 20 May 2024. https://doi.org/10.48550/arXiv.2403.16084
Wachsmuth, H., Lapesa, G., Cabrio, E., Lauscher, A., Park, J., Vecchi, E. M., Villata, S., & Ziegenbein, T. (2024). Argument Quality Assessment in the Age of Instruction-Following Large Language Models. In N. Calzolari, M.-Y. Kan, V. Hoste, A. Lenci, S. Sakti, & N. Xue (Eds.), Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) (pp. 1519-1538). https://doi.org/10.48550/arXiv.2403.16084
Wachsmuth H, Lapesa G, Cabrio E, Lauscher A, Park J, Vecchi EM et al. Argument Quality Assessment in the Age of Instruction-Following Large Language Models. In: Calzolari N, Kan MY, Hoste V, Lenci A, Sakti S, Xue N, editors. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). 2024. p. 1519-1538. doi: 10.48550/arXiv.2403.16084
Wachsmuth, Henning ; Lapesa, Gabriella ; Cabrio, Elena et al. / Argument Quality Assessment in the Age of Instruction-Following Large Language Models. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). Ed. / Nicoletta Calzolari ; Min-Yen Kan ; Veronique Hoste ; Alessandro Lenci ; Sakriani Sakti ; Nianwen Xue. 2024. pp. 1519-1538
BibTeX
@inproceedings{b60225532de64dab98cf8ccc39f26b6d,
title = "Argument Quality Assessment in the Age of Instruction-Following Large Language Models",
abstract = "The computational treatment of arguments on controversial issues has been subject to extensive NLP research, due to its envisioned impact on opinion formation, decision making, writing education, and the like. A critical task in any such application is the assessment of an argument{\textquoteright}s quality - but it is also particularly challenging. In this position paper, we start from a brief survey of argument quality research, where we identify the diversity of quality notions and the subjectiveness of their perception as the main hurdles towards substantial progress on argument quality assessment. We argue that the capabilities of instruction-following large language models (LLMs) to leverage knowledge across contexts enable a much more reliable assessment. Rather than just fine-tuning LLMs towards leaderboard chasing on assessment tasks, they need to be instructed systematically with argumentation theories and scenarios as well as with ways to solve argument-related problems. We discuss the real-world opportunities and ethical issues emerging thereby.",
author = "Henning Wachsmuth and Gabriella Lapesa and Elena Cabrio and Anne Lauscher and Joonsuk Park and Vecchi, {Eva Maria} and Serena Villata and Timon Ziegenbein",
note = "{\textcopyright} 2024 ELRA Language Resource Association; Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 ; Conference date: 20-05-2024 Through 25-05-2024",
year = "2024",
month = may,
doi = "10.48550/arXiv.2403.16084",
language = "English",
pages = "1519--1538",
editor = "Nicoletta Calzolari and Min-Yen Kan and Veronique Hoste and Alessandro Lenci and Sakriani Sakti and Nianwen Xue",
booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",

}

RIS

TY - GEN

T1 - Argument Quality Assessment in the Age of Instruction-Following Large Language Models

AU - Wachsmuth, Henning

AU - Lapesa, Gabriella

AU - Cabrio, Elena

AU - Lauscher, Anne

AU - Park, Joonsuk

AU - Vecchi, Eva Maria

AU - Villata, Serena

AU - Ziegenbein, Timon

N1 - © 2024 ELRA Language Resource Association

PY - 2024/5

Y1 - 2024/5

N2 - The computational treatment of arguments on controversial issues has been subject to extensive NLP research, due to its envisioned impact on opinion formation, decision making, writing education, and the like. A critical task in any such application is the assessment of an argument’s quality - but it is also particularly challenging. In this position paper, we start from a brief survey of argument quality research, where we identify the diversity of quality notions and the subjectiveness of their perception as the main hurdles towards substantial progress on argument quality assessment. We argue that the capabilities of instruction-following large language models (LLMs) to leverage knowledge across contexts enable a much more reliable assessment. Rather than just fine-tuning LLMs towards leaderboard chasing on assessment tasks, they need to be instructed systematically with argumentation theories and scenarios as well as with ways to solve argument-related problems. We discuss the real-world opportunities and ethical issues emerging thereby.

AB - The computational treatment of arguments on controversial issues has been subject to extensive NLP research, due to its envisioned impact on opinion formation, decision making, writing education, and the like. A critical task in any such application is the assessment of an argument’s quality - but it is also particularly challenging. In this position paper, we start from a brief survey of argument quality research, where we identify the diversity of quality notions and the subjectiveness of their perception as the main hurdles towards substantial progress on argument quality assessment. We argue that the capabilities of instruction-following large language models (LLMs) to leverage knowledge across contexts enable a much more reliable assessment. Rather than just fine-tuning LLMs towards leaderboard chasing on assessment tasks, they need to be instructed systematically with argumentation theories and scenarios as well as with ways to solve argument-related problems. We discuss the real-world opportunities and ethical issues emerging thereby.

U2 - 10.48550/arXiv.2403.16084

DO - 10.48550/arXiv.2403.16084

M3 - Conference contribution

SP - 1519

EP - 1538

BT - Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

A2 - Calzolari, Nicoletta

A2 - Kan, Min-Yen

A2 - Hoste, Veronique

A2 - Lenci, Alessandro

A2 - Sakti, Sakriani

A2 - Xue, Nianwen

T2 - Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024

Y2 - 20 May 2024 through 25 May 2024

ER -
