Robust Fusion of Time Series and Image Data for Improved Multimodal Clinical Prediction

Publication: Contribution to journal › Article › Research › Peer-reviewed

Authors

  • Ali Rasekh
  • Reza Heidari
  • Amir Hosein Haji Mohammad Rezaie
  • Parsa Sharifi Sedeh
  • Zahra Ahmadi
  • Prasenjit Mitra
  • Wolfgang Nejdl


Details

Original language: English
Pages (from-to): 174107-174121
Number of pages: 15
Journal: IEEE ACCESS
Volume: 12
Publication status: Published - 13 Nov 2024

Abstract

With the increasing availability of diverse data types, particularly images and time series data from medical experiments, there is a growing demand for techniques designed to combine various modalities of data effectively. Our motivation comes from the important areas of predicting mortality and phenotyping where using different modalities of data could significantly improve our ability to predict. To tackle this challenge, we introduce a new method that uses two separate encoders, one for each type of data, allowing the model to understand complex patterns in both visual and time-based information. Apart from the technical challenges, our goal is to make the predictive model more robust in noisy conditions and perform better than current methods. We also deal with imbalanced datasets and use an uncertainty loss function, yielding improved results while simultaneously providing a principled means of modeling uncertainty. Additionally, we include attention mechanisms to fuse different modalities, allowing the model to focus on what's important for each task. We tested our approach using the comprehensive multimodal MIMIC dataset, combining MIMIC-IV and MIMIC-CXR datasets. Our experiments show that our method is effective in improving multimodal deep learning for clinical applications. The code for this work is publicly available at: https://github.com/AliRasekh/TSImageFusion.
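
To make the described approach concrete: the abstract combines two modality-specific encoders, attention-based fusion, and an uncertainty-weighted loss. The PyTorch sketch below is a minimal illustration of that idea, not the authors' implementation (see the linked repository for the actual code); the encoder choices, the tensor sizes, and the Kendall-and-Gal-style homoscedastic uncertainty weighting are all illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DualEncoderFusion(nn.Module):
    """Hypothetical dual-encoder fusion model, not the published architecture."""
    def __init__(self, ts_features=76, d_model=256, num_labels=25):
        super().__init__()
        # Time-series branch: an LSTM over clinical measurement sequences.
        self.ts_encoder = nn.LSTM(ts_features, d_model, batch_first=True)
        # Image branch: a tiny CNN standing in for a chest X-ray backbone.
        self.img_encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, d_model, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Cross-modal attention: the image representation queries the
        # time-series tokens, so fusion can focus on task-relevant steps.
        self.fusion = nn.MultiheadAttention(d_model, num_heads=4,
                                            batch_first=True)
        self.head = nn.Linear(d_model, num_labels)
        # Learned log-variance for uncertainty weighting (one per task in a
        # multi-task setup; a single scalar here for brevity).
        self.log_var = nn.Parameter(torch.zeros(()))

    def forward(self, ts, img):
        ts_tokens, _ = self.ts_encoder(ts)               # (B, T, d_model)
        img_token = self.img_encoder(img).unsqueeze(1)   # (B, 1, d_model)
        fused, _ = self.fusion(img_token, ts_tokens, ts_tokens)
        return self.head(fused.squeeze(1))               # (B, num_labels)

def uncertainty_weighted_loss(logits, targets, log_var):
    # Homoscedastic-uncertainty weighting in the style of Kendall & Gal
    # (2017) applied to a multi-label BCE objective; one plausible reading
    # of the "uncertainty loss function" mentioned in the abstract.
    bce = F.binary_cross_entropy_with_logits(logits, targets)
    return torch.exp(-log_var) * bce + log_var

# Hypothetical usage with random stand-in data:
model = DualEncoderFusion()
ts = torch.randn(4, 48, 76)        # 48 time steps, 76 clinical features
img = torch.randn(4, 1, 224, 224)  # single-channel chest X-rays
labels = torch.randint(0, 2, (4, 25)).float()
loss = uncertainty_weighted_loss(model(ts, img), labels, model.log_var)

Under this weighting scheme, the learned log-variance down-weights a task's loss when its predictive uncertainty is high, while the additive log_var term keeps the model from inflating uncertainty indefinitely; this is one plausible mechanism behind the improved behaviour on imbalanced labels that the abstract reports.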


Cite

Robust Fusion of Time Series and Image Data for Improved Multimodal Clinical Prediction. / Rasekh, Ali; Heidari, Reza; Hosein Haji Mohammad Rezaie, Amir et al.
In: IEEE ACCESS, Vol. 12, 13.11.2024, pp. 174107-174121.


Rasekh, A, Heidari, R, Hosein Haji Mohammad Rezaie, A, Sharifi Sedeh, P, Ahmadi, Z, Mitra, P & Nejdl, W 2024, 'Robust Fusion of Time Series and Image Data for Improved Multimodal Clinical Prediction', IEEE ACCESS, vol. 12, pp. 174107-174121. https://doi.org/10.1109/ACCESS.2024.3497668
Rasekh, A., Heidari, R., Hosein Haji Mohammad Rezaie, A., Sharifi Sedeh, P., Ahmadi, Z., Mitra, P., & Nejdl, W. (2024). Robust Fusion of Time Series and Image Data for Improved Multimodal Clinical Prediction. IEEE ACCESS, 12, 174107-174121. https://doi.org/10.1109/ACCESS.2024.3497668
Rasekh A, Heidari R, Hosein Haji Mohammad Rezaie A, Sharifi Sedeh P, Ahmadi Z, Mitra P et al. Robust Fusion of Time Series and Image Data for Improved Multimodal Clinical Prediction. IEEE ACCESS. 2024 Nov 13;12:174107-174121. doi: 10.1109/ACCESS.2024.3497668
Rasekh, Ali ; Heidari, Reza ; Hosein Haji Mohammad Rezaie, Amir et al. / Robust Fusion of Time Series and Image Data for Improved Multimodal Clinical Prediction. In: IEEE ACCESS. 2024 ; Vol. 12. pp. 174107-174121.
BibTeX
@article{b8b931e3d94f41a5b4336d480afd2052,
title = "Robust Fusion of Time Series and Image Data for Improved Multimodal Clinical Prediction",
abstract = "With the increasing availability of diverse data types, particularly images and time series data from medical experiments, there is a growing demand for techniques designed to combine various modalities of data effectively. Our motivation comes from the important areas of predicting mortality and phenotyping where using different modalities of data could significantly improve our ability to predict. To tackle this challenge, we introduce a new method that uses two separate encoders, one for each type of data, allowing the model to understand complex patterns in both visual and time-based information. Apart from the technical challenges, our goal is to make the predictive model more robust in noisy conditions and perform better than current methods. We also deal with imbalanced datasets and use an uncertainty loss function, yielding improved results while simultaneously providing a principled means of modeling uncertainty. Additionally, we include attention mechanisms to fuse different modalities, allowing the model to focus on what's important for each task. We tested our approach using the comprehensive multimodal MIMIC dataset, combining MIMIC-IV and MIMIC-CXR datasets. Our experiments show that our method is effective in improving multimodal deep learning for clinical applications. The code for this work is publicly available at: https://github.com/AliRasekh/TSImageFusion.",
keywords = "attention mechanism, Multimodal learning, phenotyping, robustness, time series",
author = "Ali Rasekh and Reza Heidari and {Hosein Haji Mohammad Rezaie}, Amir and {Sharifi Sedeh}, Parsa and Zahra Ahmadi and Prasenjit Mitra and Wolfgang Nejdl",
note = "Publisher Copyright: {\textcopyright} 2024 The Authors.",
year = "2024",
month = nov,
day = "13",
doi = "10.1109/ACCESS.2024.3497668",
language = "English",
volume = "12",
pages = "174107--174121",
journal = "IEEE ACCESS",
issn = "2169-3536",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

RIS

TY - JOUR

T1 - Robust Fusion of Time Series and Image Data for Improved Multimodal Clinical Prediction

AU - Rasekh, Ali

AU - Heidari, Reza

AU - Hosein Haji Mohammad Rezaie, Amir

AU - Sharifi Sedeh, Parsa

AU - Ahmadi, Zahra

AU - Mitra, Prasenjit

AU - Nejdl, Wolfgang

N1 - Publisher Copyright: © 2024 The Authors.

PY - 2024/11/13

Y1 - 2024/11/13

N2 - With the increasing availability of diverse data types, particularly images and time series data from medical experiments, there is a growing demand for techniques designed to combine various modalities of data effectively. Our motivation comes from the important areas of predicting mortality and phenotyping where using different modalities of data could significantly improve our ability to predict. To tackle this challenge, we introduce a new method that uses two separate encoders, one for each type of data, allowing the model to understand complex patterns in both visual and time-based information. Apart from the technical challenges, our goal is to make the predictive model more robust in noisy conditions and perform better than current methods. We also deal with imbalanced datasets and use an uncertainty loss function, yielding improved results while simultaneously providing a principled means of modeling uncertainty. Additionally, we include attention mechanisms to fuse different modalities, allowing the model to focus on what's important for each task. We tested our approach using the comprehensive multimodal MIMIC dataset, combining MIMIC-IV and MIMIC-CXR datasets. Our experiments show that our method is effective in improving multimodal deep learning for clinical applications. The code for this work is publicly available at: https://github.com/AliRasekh/TSImageFusion.

AB - With the increasing availability of diverse data types, particularly images and time series data from medical experiments, there is a growing demand for techniques designed to combine various modalities of data effectively. Our motivation comes from the important areas of predicting mortality and phenotyping where using different modalities of data could significantly improve our ability to predict. To tackle this challenge, we introduce a new method that uses two separate encoders, one for each type of data, allowing the model to understand complex patterns in both visual and time-based information. Apart from the technical challenges, our goal is to make the predictive model more robust in noisy conditions and perform better than current methods. We also deal with imbalanced datasets and use an uncertainty loss function, yielding improved results while simultaneously providing a principled means of modeling uncertainty. Additionally, we include attention mechanisms to fuse different modalities, allowing the model to focus on what's important for each task. We tested our approach using the comprehensive multimodal MIMIC dataset, combining MIMIC-IV and MIMIC-CXR datasets. Our experiments show that our method is effective in improving multimodal deep learning for clinical applications. The code for this work is publicly available at: https://github.com/AliRasekh/TSImageFusion.

KW - attention mechanism

KW - Multimodal learning

KW - phenotyping

KW - robustness

KW - time series

UR - http://www.scopus.com/inward/record.url?scp=85210390662&partnerID=8YFLogxK

U2 - 10.1109/ACCESS.2024.3497668

DO - 10.1109/ACCESS.2024.3497668

M3 - Article

AN - SCOPUS:85210390662

VL - 12

SP - 174107

EP - 174121

JO - IEEE ACCESS

JF - IEEE ACCESS

SN - 2169-3536

ER -
