Data Drift in Clinical Outcome Prediction from Admission Notes

Publication: Contribution to book/report/anthology/conference proceedings | Conference contribution | Research | Peer-reviewed

Authors

  • Paul Grundmann
  • Jens Michalis Papaioannou
  • Tom Oberhauser
  • Thomas Steffek
  • Amy Siu
  • Wolfgang Nejdl
  • Alexander Löser

Organizational units

External organizations

  • Berliner Hochschule für Technik (BHT)

Details

Original language: English
Title of host publication: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024
Subtitle: Main Conference Proceedings
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Pages: 4381-4391
Number of pages: 11
ISBN (electronic): 9782493814104
Publication status: Published - 20 May 2024
Event: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 - Hybrid, Torino, Italy
Duration: 20 May 2024 to 25 May 2024

Abstract

Clinical NLP research faces a scarcity of publicly available datasets due to privacy concerns. MIMIC-III marked a significant milestone, enabling substantial progress, and now, with MIMIC-IV, the dataset has expanded significantly, offering a broader scope. In this paper, we focus on the task of predicting clinical outcomes from clinical text. This is crucial in modern healthcare, aiding in preventive care, differential diagnosis, and capacity planning. We introduce a novel clinical outcome prediction dataset derived from MIMIC-IV. Furthermore, we provide initial insights into the performance of models trained on MIMIC-III when applied to our new dataset, with specific attention to potential data drift. We investigate challenges tied to evolving documentation standards and changing codes in the International Classification of Diseases (ICD) taxonomy, such as the transition from ICD-9 to ICD-10. We also explore variations in clinical text across different hospital wards. Our study aims to probe the robustness and generalization of clinical outcome prediction models, contributing to the ongoing advancement of clinical NLP in healthcare.

ASJC Scopus subject areas

Cite this

Data Drift in Clinical Outcome Prediction from Admission Notes. / Grundmann, Paul; Papaioannou, Jens Michalis; Oberhauser, Tom et al.
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 : Main Conference Proceedings. Ed. / Nicoletta Calzolari; Min-Yen Kan; Veronique Hoste; Alessandro Lenci; Sakriani Sakti; Nianwen Xue. 2024. pp. 4381-4391.

Grundmann, P, Papaioannou, JM, Oberhauser, T, Steffek, T, Siu, A, Nejdl, W & Löser, A 2024, Data Drift in Clinical Outcome Prediction from Admission Notes. in N Calzolari, M-Y Kan, V Hoste, A Lenci, S Sakti & N Xue (eds), 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 : Main Conference Proceedings. pp. 4381-4391, Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024, Hybrid, Torino, Italy, 20 May 2024. <https://aclanthology.org/2024.lrec-main.391/>
Grundmann, P., Papaioannou, J. M., Oberhauser, T., Steffek, T., Siu, A., Nejdl, W., & Löser, A. (2024). Data Drift in Clinical Outcome Prediction from Admission Notes. In N. Calzolari, M.-Y. Kan, V. Hoste, A. Lenci, S. Sakti, & N. Xue (Eds.), 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 : Main Conference Proceedings (pp. 4381-4391) https://aclanthology.org/2024.lrec-main.391/
Grundmann P, Papaioannou JM, Oberhauser T, Steffek T, Siu A, Nejdl W et al. Data Drift in Clinical Outcome Prediction from Admission Notes. in Calzolari N, Kan MY, Hoste V, Lenci A, Sakti S, Xue N, editors, 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 : Main Conference Proceedings. 2024. p. 4381-4391
Grundmann, Paul ; Papaioannou, Jens Michalis ; Oberhauser, Tom et al. / Data Drift in Clinical Outcome Prediction from Admission Notes. 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 : Main Conference Proceedings. Ed. / Nicoletta Calzolari ; Min-Yen Kan ; Veronique Hoste ; Alessandro Lenci ; Sakriani Sakti ; Nianwen Xue. 2024. pp. 4381-4391
@inproceedings{971dabae3c764b9c8d14602dff1a2e7a,
title = "Data Drift in Clinical Outcome Prediction from Admission Notes",
abstract = "Clinical NLP research faces a scarcity of publicly available datasets due to privacy concerns. MIMIC-III marked a significant milestone, enabling substantial progress, and now, with MIMIC-IV, the dataset has expanded significantly, offering a broader scope. In this paper, we focus on the task of predicting clinical outcomes from clinical text. This is crucial in modern healthcare, aiding in preventive care, differential diagnosis, and capacity planning. We introduce a novel clinical outcome prediction dataset derived from MIMIC-IV. Furthermore, we provide initial insights into the performance of models trained on MIMIC-III when applied to our new dataset, with specific attention to potential data drift. We investigate challenges tied to evolving documentation standards and changing codes in the International Classification of Diseases (ICD) taxonomy, such as the transition from ICD-9 to ICD-10. We also explore variations in clinical text across different hospital wards. Our study aims to probe the robustness and generalization of clinical outcome prediction models, contributing to the ongoing advancement of clinical NLP in healthcare.",
keywords = "Corpus (Creation, Annotation, etc.), Document Classification, Neural language representation models, Text categorisation",
author = "Paul Grundmann and Papaioannou, {Jens Michalis} and Tom Oberhauser and Thomas Steffek and Amy Siu and Wolfgang Nejdl and Alexander L{\"o}ser",
note = "Publisher Copyright: {\textcopyright} 2024 ELRA Language Resource Association: CC BY-NC 4.0.; Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 ; Conference date: 20-05-2024 Through 25-05-2024",
year = "2024",
month = may,
day = "20",
language = "English",
pages = "4381--4391",
editor = "Nicoletta Calzolari and Min-Yen Kan and Veronique Hoste and Alessandro Lenci and Sakriani Sakti and Nianwen Xue",
booktitle = "2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024",

}

TY - GEN

T1 - Data Drift in Clinical Outcome Prediction from Admission Notes

AU - Grundmann, Paul

AU - Papaioannou, Jens Michalis

AU - Oberhauser, Tom

AU - Steffek, Thomas

AU - Siu, Amy

AU - Nejdl, Wolfgang

AU - Löser, Alexander

N1 - Publisher Copyright: © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

PY - 2024/5/20

Y1 - 2024/5/20

AB - Clinical NLP research faces a scarcity of publicly available datasets due to privacy concerns. MIMIC-III marked a significant milestone, enabling substantial progress, and now, with MIMIC-IV, the dataset has expanded significantly, offering a broader scope. In this paper, we focus on the task of predicting clinical outcomes from clinical text. This is crucial in modern healthcare, aiding in preventive care, differential diagnosis, and capacity planning. We introduce a novel clinical outcome prediction dataset derived from MIMIC-IV. Furthermore, we provide initial insights into the performance of models trained on MIMIC-III when applied to our new dataset, with specific attention to potential data drift. We investigate challenges tied to evolving documentation standards and changing codes in the International Classification of Diseases (ICD) taxonomy, such as the transition from ICD-9 to ICD-10. We also explore variations in clinical text across different hospital wards. Our study aims to probe the robustness and generalization of clinical outcome prediction models, contributing to the ongoing advancement of clinical NLP in healthcare.

KW - Corpus (Creation, Annotation, etc.)

KW - Document Classification

KW - Neural language representation models

KW - Text categorisation

UR - http://www.scopus.com/inward/record.url?scp=85195942831&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85195942831

SP - 4381

EP - 4391

BT - 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024

A2 - Calzolari, Nicoletta

A2 - Kan, Min-Yen

A2 - Hoste, Veronique

A2 - Lenci, Alessandro

A2 - Sakti, Sakriani

A2 - Xue, Nianwen

T2 - Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024

Y2 - 20 May 2024 through 25 May 2024

ER -
