Selecting textual analysis tools to classify sustainability information in corporate reporting

Publication: Contribution to journal › Article › Research › Peer-reviewed

Authors

  • Frederik Maibaum
  • Johannes Kriebel
  • Johann Nils Foege

External organisations

  • Westfälische Wilhelms-Universität Münster (WWU)

Details

Original language: English
Article number: 114269
Number of pages: 11
Journal: Decision support systems
Volume: 183
Early online date: 11 Jun 2024
Publication status: Published - Aug 2024

Abstract

Information on firms' sustainability often partly resides in unstructured data published, for instance, in annual reports, news, and transcripts of earnings calls. In recent years, researchers and practitioners have started to extract information from these data sources using a broad range of natural language processing (NLP) methods. While there is much to be gained from these endeavors, studies that employ these methods rarely reflect upon the validity and quality of the chosen method—that is, how adequately NLP captures the sustainability information from text. This practice is problematic, as different NLP techniques lead to different results regarding the extraction of information. Hence, the choice of method may affect the outcome of the application and thus the inferences that users draw from their results. In this study, we examine how different types of NLP methods influence the validity and quality of extracted information. In particular, we compare four primary methods, namely (1) dictionary-based techniques, (2) topic modeling approaches, (3) word embeddings, and (4) large language models such as BERT and ChatGPT, and evaluate them on 75,000 manually labeled sentences from 10-K annual reports that serve as the ground truth. Our results show that dictionaries have a large variation in quality, topic models outperform other approaches that do not rely on large language models, and large language models show the strongest performance. In large language models, individual fine-tuning remains crucial. One-shot approaches (i.e., ChatGPT) have lately surpassed earlier approaches when using well-designed prompts and the most recent models.
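The first of the four compared approaches, dictionary-based classification, can be illustrated with a minimal sketch. Note that the term list and function below are hypothetical examples for illustration only, not the dictionary or implementation used in the paper:

```python
# Illustrative sketch (not the authors' implementation): a dictionary-based
# classifier flags a sentence as sustainability-related when it contains
# any term from a predefined keyword list.

SUSTAINABILITY_TERMS = {  # hypothetical mini-dictionary for illustration
    "sustainability", "emissions", "renewable", "carbon", "climate",
}

def is_sustainability_sentence(sentence: str) -> bool:
    """Return True if any dictionary term occurs in the sentence."""
    tokens = {token.strip(".,;:()").lower() for token in sentence.split()}
    return not tokens.isdisjoint(SUSTAINABILITY_TERMS)

sentences = [
    "We reduced carbon emissions by upgrading our fleet.",
    "Revenue grew 4% year over year.",
]
labels = [is_sustainability_sentence(s) for s in sentences]
```

As the abstract notes, the quality of such classifiers varies widely with the chosen word list, which is one reason the study benchmarks them against topic models, word embeddings, and large language models on manually labeled ground truth.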

ASJC Scopus subject areas

Cite

Selecting textual analysis tools to classify sustainability information in corporate reporting. / Maibaum, Frederik; Kriebel, Johannes; Foege, Johann Nils.
In: Decision support systems, Vol. 183, 114269, 08.2024.


Maibaum F, Kriebel J, Foege JN. Selecting textual analysis tools to classify sustainability information in corporate reporting. Decision support systems. 2024 Aug;183:114269. Epub 2024 Jun 11. doi: 10.1016/j.dss.2024.114269
Maibaum, Frederik; Kriebel, Johannes; Foege, Johann Nils. / Selecting textual analysis tools to classify sustainability information in corporate reporting. In: Decision support systems. 2024; Vol. 183.
Download citation (BibTeX)
@article{4f9247bb360644ba8b45eb29ae0f56cb,
title = "Selecting textual analysis tools to classify sustainability information in corporate reporting",
abstract = "Information on firms' sustainability often partly resides in unstructured data published, for instance, in annual reports, news, and transcripts of earnings calls. In recent years, researchers and practitioners have started to extract information from these data sources using a broad range of natural language processing (NLP) methods. While there is much to be gained from these endeavors, studies that employ these methods rarely reflect upon the validity and quality of the chosen method—that is, how adequately NLP captures the sustainability information from text. This practice is problematic, as different NLP techniques lead to different results regarding the extraction of information. Hence, the choice of method may affect the outcome of the application and thus the inferences that users draw from their results. In this study, we examine how different types of NLP methods influence the validity and quality of extracted information. In particular, we compare four primary methods, namely (1) dictionary-based techniques, (2) topic modeling approaches, (3) word embeddings, and (4) large language models such as BERT and ChatGPT, and evaluate them on 75,000 manually labeled sentences from 10-K annual reports that serve as the ground truth. Our results show that dictionaries have a large variation in quality, topic models outperform other approaches that do not rely on large language models, and large language models show the strongest performance. In large language models, individual fine-tuning remains crucial. One-shot approaches (i.e., ChatGPT) have lately surpassed earlier approaches when using well-designed prompts and the most recent models.",
keywords = "ChatGPT, Corporate reporting, Natural language processing, Performance evaluation, Sustainability",
author = "Frederik Maibaum and Johannes Kriebel and Foege, {Johann Nils}",
note = "Publisher Copyright: {\textcopyright} 2024 The Authors",
year = "2024",
month = aug,
doi = "10.1016/j.dss.2024.114269",
language = "English",
volume = "183",
journal = "Decision support systems",
issn = "0167-9236",
publisher = "Elsevier",

}

Download citation (RIS)

TY - JOUR

T1 - Selecting textual analysis tools to classify sustainability information in corporate reporting

AU - Maibaum, Frederik

AU - Kriebel, Johannes

AU - Foege, Johann Nils

N1 - Publisher Copyright: © 2024 The Authors

PY - 2024/8

Y1 - 2024/8

N2 - Information on firms' sustainability often partly resides in unstructured data published, for instance, in annual reports, news, and transcripts of earnings calls. In recent years, researchers and practitioners have started to extract information from these data sources using a broad range of natural language processing (NLP) methods. While there is much to be gained from these endeavors, studies that employ these methods rarely reflect upon the validity and quality of the chosen method—that is, how adequately NLP captures the sustainability information from text. This practice is problematic, as different NLP techniques lead to different results regarding the extraction of information. Hence, the choice of method may affect the outcome of the application and thus the inferences that users draw from their results. In this study, we examine how different types of NLP methods influence the validity and quality of extracted information. In particular, we compare four primary methods, namely (1) dictionary-based techniques, (2) topic modeling approaches, (3) word embeddings, and (4) large language models such as BERT and ChatGPT, and evaluate them on 75,000 manually labeled sentences from 10-K annual reports that serve as the ground truth. Our results show that dictionaries have a large variation in quality, topic models outperform other approaches that do not rely on large language models, and large language models show the strongest performance. In large language models, individual fine-tuning remains crucial. One-shot approaches (i.e., ChatGPT) have lately surpassed earlier approaches when using well-designed prompts and the most recent models.

AB - Information on firms' sustainability often partly resides in unstructured data published, for instance, in annual reports, news, and transcripts of earnings calls. In recent years, researchers and practitioners have started to extract information from these data sources using a broad range of natural language processing (NLP) methods. While there is much to be gained from these endeavors, studies that employ these methods rarely reflect upon the validity and quality of the chosen method—that is, how adequately NLP captures the sustainability information from text. This practice is problematic, as different NLP techniques lead to different results regarding the extraction of information. Hence, the choice of method may affect the outcome of the application and thus the inferences that users draw from their results. In this study, we examine how different types of NLP methods influence the validity and quality of extracted information. In particular, we compare four primary methods, namely (1) dictionary-based techniques, (2) topic modeling approaches, (3) word embeddings, and (4) large language models such as BERT and ChatGPT, and evaluate them on 75,000 manually labeled sentences from 10-K annual reports that serve as the ground truth. Our results show that dictionaries have a large variation in quality, topic models outperform other approaches that do not rely on large language models, and large language models show the strongest performance. In large language models, individual fine-tuning remains crucial. One-shot approaches (i.e., ChatGPT) have lately surpassed earlier approaches when using well-designed prompts and the most recent models.

KW - ChatGPT

KW - Corporate reporting

KW - Natural language processing

KW - Performance evaluation

KW - Sustainability

UR - http://www.scopus.com/inward/record.url?scp=85196207417&partnerID=8YFLogxK

U2 - 10.1016/j.dss.2024.114269

DO - 10.1016/j.dss.2024.114269

M3 - Article

AN - SCOPUS:85196207417

VL - 183

JO - Decision support systems

JF - Decision support systems

SN - 0167-9236

M1 - 114269

ER -