Details
Original language | English |
---|---|
Article number | 114269 |
Number of pages | 11 |
Journal | Decision Support Systems |
Volume | 183 |
Early online date | 11 Jun 2024 |
Publication status | Published - Aug 2024 |
Abstract
Information on firms' sustainability often partly resides in unstructured data published, for instance, in annual reports, news, and transcripts of earnings calls. In recent years, researchers and practitioners have started to extract information from these data sources using a broad range of natural language processing (NLP) methods. While there is much to be gained from these endeavors, studies that employ these methods rarely reflect upon the validity and quality of the chosen method—that is, how adequately NLP captures the sustainability information from text. This practice is problematic, as different NLP techniques lead to different results regarding the extraction of information. Hence, the choice of method may affect the outcome of the application and thus the inferences that users draw from their results. In this study, we examine how different types of NLP methods influence the validity and quality of extracted information. In particular, we compare four primary methods, namely (1) dictionary-based techniques, (2) topic modeling approaches, (3) word embeddings, and (4) large language models such as BERT and ChatGPT, and evaluate them on 75,000 manually labeled sentences from 10-K annual reports that serve as the ground truth. Our results show that dictionaries have a large variation in quality, topic models outperform other approaches that do not rely on large language models, and large language models show the strongest performance. In large language models, individual fine-tuning remains crucial. One-shot approaches (i.e., ChatGPT) have lately surpassed earlier approaches when using well-designed prompts and the most recent models.
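The comparison described in the abstract rests on scoring each method's predictions against manually labeled sentences. The sketch below illustrates that idea for the simplest of the four method families, a dictionary-based classifier. It is an illustration only, not the authors' implementation: the keyword list, the example sentences, and their labels are invented here, and precision, recall, and F1 are assumed as metrics because the abstract does not state which measures the study reports.

```python
import re

# Hypothetical keyword list for illustration; it is NOT one of the
# dictionaries evaluated in the study.
SUSTAINABILITY_TERMS = {"emissions", "renewable", "recycling", "diversity", "climate"}


def dictionary_predict(sentence: str) -> int:
    """Return 1 if the sentence contains any dictionary term, else 0."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    return int(bool(tokens & SUSTAINABILITY_TERMS))


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1


# Invented sentences with invented ground-truth labels
# (1 = sustainability-related, 0 = not sustainability-related).
LABELED = [
    ("We reduced greenhouse gas emissions across our plants.", 1),
    ("Net revenue increased due to higher product pricing.", 0),
    ("The board adopted a policy on workforce diversity.", 1),
    ("We invest in energy-efficient buildings to lower our footprint.", 1),
    ("The political climate in our key markets remains uncertain.", 0),
]

y_true = [label for _, label in LABELED]
y_pred = [dictionary_predict(sentence) for sentence, _ in LABELED]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

The toy data includes one sentence the keyword list misses (no listed term appears) and one it flags wrongly ("climate" used in a political sense). These are the kinds of errors that could make dictionary quality vary, and the same scoring function could be applied to the predictions of any other method.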
Keywords
- ChatGPT, Corporate reporting, Natural language processing, Performance evaluation, Sustainability
ASJC Scopus subject areas
- Business, Management and Accounting(all)
- Management Information Systems
- Computer Science(all)
- Information Systems
- Psychology(all)
- Developmental and Educational Psychology
- Arts and Humanities(all)
- Arts and Humanities (miscellaneous)
- Decision Sciences(all)
- Information Systems and Management
Cite this
In: Decision Support Systems, Vol. 183, 114269, 08.2024.
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - Selecting textual analysis tools to classify sustainability information in corporate reporting
AU - Maibaum, Frederik
AU - Kriebel, Johannes
AU - Foege, Johann Nils
N1 - Publisher Copyright: © 2024 The Authors
PY - 2024/8
Y1 - 2024/8
KW - ChatGPT
KW - Corporate reporting
KW - Natural language processing
KW - Performance evaluation
KW - Sustainability
UR - http://www.scopus.com/inward/record.url?scp=85196207417&partnerID=8YFLogxK
U2 - 10.1016/j.dss.2024.114269
DO - 10.1016/j.dss.2024.114269
M3 - Article
AN - SCOPUS:85196207417
VL - 183
JO - Decision support systems
JF - Decision support systems
SN - 0167-9236
M1 - 114269
ER -