Knowledge-Aware Neural Networks for Medical Forum Question Classification

Publication: Contribution to book/report/anthology/conference proceedings; Conference paper; Research; Peer-reviewed

Authors

  • Soumyadeep Roy
  • Sudip Chakraborty
  • Aishik Mandal
  • Gunjan Balde
  • Prakhar Sharma
  • Anandhavelu Natarajan
  • Megha Khosla
  • Shamik Sural
  • Niloy Ganguly

Organisational units

External organisations

  • Indian Institute of Technology Kharagpur (IITKGP)
  • Adobe Research

Details

Original language: English
Title of host publication: CIKM '21
Subtitle: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
Publisher: Association for Computing Machinery (ACM)
Pages: 3398-3402
Number of pages: 5
ISBN (electronic): 9781450384469
Publication status: Published - 30 Oct 2021
Event: 30th ACM International Conference on Information and Knowledge Management, CIKM 2021 - Virtual, Online, Australia
Duration: 1 Nov 2021 to 5 Nov 2021

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Abstract

Online medical forums have become a predominant platform for answering health-related information needs of consumers. However, with a significant rise in the number of queries and the limited availability of experts, it is necessary to automatically classify medical queries based on a consumer's intention, so that these questions may be directed to the right set of medical experts. Here, we develop a novel medical knowledge-aware BERT-based model (MedBERT) that explicitly gives more weightage to medical concept-bearing words, and utilize domain-specific side information obtained from a popular medical knowledge base. We also contribute a multi-label dataset for the Medical Forum Question Classification (MFQC) task. MedBERT achieves state-of-the-art performance on two benchmark datasets and performs very well in low resource settings.

Cite

Knowledge-Aware Neural Networks for Medical Forum Question Classification. / Roy, Soumyadeep; Chakraborty, Sudip; Mandal, Aishik et al.
CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management. Association for Computing Machinery (ACM), 2021. pp. 3398-3402 (International Conference on Information and Knowledge Management, Proceedings).

Roy, S, Chakraborty, S, Mandal, A, Balde, G, Sharma, P, Natarajan, A, Khosla, M, Sural, S & Ganguly, N 2021, Knowledge-Aware Neural Networks for Medical Forum Question Classification. in CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management. International Conference on Information and Knowledge Management, Proceedings, Association for Computing Machinery (ACM), pp. 3398-3402, 30th ACM International Conference on Information and Knowledge Management, CIKM 2021, Virtual, Online, Australia, 1 Nov. 2021. https://doi.org/10.48550/arXiv.2109.13141, https://doi.org/10.1145/3459637.3482128
Roy, S., Chakraborty, S., Mandal, A., Balde, G., Sharma, P., Natarajan, A., Khosla, M., Sural, S., & Ganguly, N. (2021). Knowledge-Aware Neural Networks for Medical Forum Question Classification. In CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management (pp. 3398-3402). (International Conference on Information and Knowledge Management, Proceedings). Association for Computing Machinery (ACM). https://doi.org/10.48550/arXiv.2109.13141, https://doi.org/10.1145/3459637.3482128
Roy S, Chakraborty S, Mandal A, Balde G, Sharma P, Natarajan A et al. Knowledge-Aware Neural Networks for Medical Forum Question Classification. in CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management. Association for Computing Machinery (ACM). 2021. pp. 3398-3402. (International Conference on Information and Knowledge Management, Proceedings). doi: 10.48550/arXiv.2109.13141, 10.1145/3459637.3482128
Roy, Soumyadeep ; Chakraborty, Sudip ; Mandal, Aishik et al. / Knowledge-Aware Neural Networks for Medical Forum Question Classification. CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management. Association for Computing Machinery (ACM), 2021. pp. 3398-3402 (International Conference on Information and Knowledge Management, Proceedings).
@inproceedings{ae021af247df4cd6a2935865c61c807f,
title = "Knowledge-Aware Neural Networks for Medical Forum Question Classification",
abstract = "Online medical forums have become a predominant platform for answering health-related information needs of consumers. However, with a significant rise in the number of queries and the limited availability of experts, it is necessary to automatically classify medical queries based on a consumer's intention, so that these questions may be directed to the right set of medical experts. Here, we develop a novel medical knowledge-aware BERT-based model (MedBERT) that explicitly gives more weightage to medical concept-bearing words, and utilize domain-specific side information obtained from a popular medical knowledge base. We also contribute a multi-label dataset for the Medical Forum Question Classification (MFQC) task. MedBERT achieves state-of-the-art performance on two benchmark datasets and performs very well in low resource settings.",
keywords = "clinical text classification, online health communities",
author = "Soumyadeep Roy and Sudip Chakraborty and Aishik Mandal and Gunjan Balde and Prakhar Sharma and Anandhavelu Natarajan and Megha Khosla and Shamik Sural and Niloy Ganguly",
note = "Funding Information: We propose MedBERT, a novel application of transformers-based dual encoder model, for MFQC task, which is also medical domain knowledge-aware. We contribute a multi-label MFQC dataset; MedBERT achieves state-of-the-art performance on ICHI (accuracy of 0.7) and CADEC dataset (accuracy of 0.9 and macro F1 score of 0.71), and generalizes very well in low-resource settings. Through extensive experimentation, we learn that incorporating medical concept-bearing terms as side information, contribute significantly to MedBERT. We learn that certain target classes heavily depend on keywords, while others require one to learn optimal representation of medical context. An interesting future direction will be to extend MedBERT to structured prediction tasks like entity and relation prediction, or broadly link prediction. Instead of BERT, we will work with BioBERT [16], which is a domain-specific pretrained model trained on biomedical articles. Acknowledgements. This work is supported in part by the Institute PhD Fellowship of IIT Kharagpur, the Federal Ministry of Education and Research (BMBF), Germany under the project LeibnizKILabor (grant no. 01DD20003), the Adobe-funded project titled “Computational Aspects and Role of Content for Persuasive Brand Positioning”, and IMPRINT-1 Project RCO (project no. 6537).; 30th ACM International Conference on Information and Knowledge Management, CIKM 2021 ; Conference date: 01-11-2021 Through 05-11-2021",
year = "2021",
month = oct,
day = "30",
doi = "10.48550/arXiv.2109.13141",
language = "English",
series = "International Conference on Information and Knowledge Management, Proceedings",
publisher = "Association for Computing Machinery (ACM)",
pages = "3398--3402",
booktitle = "CIKM '21",
address = "United States",

}


TY - GEN

T1 - Knowledge-Aware Neural Networks for Medical Forum Question Classification

AU - Roy, Soumyadeep

AU - Chakraborty, Sudip

AU - Mandal, Aishik

AU - Balde, Gunjan

AU - Sharma, Prakhar

AU - Natarajan, Anandhavelu

AU - Khosla, Megha

AU - Sural, Shamik

AU - Ganguly, Niloy

N1 - Funding Information: We propose MedBERT, a novel application of transformers-based dual encoder model, for MFQC task, which is also medical domain knowledge-aware. We contribute a multi-label MFQC dataset; MedBERT achieves state-of-the-art performance on ICHI (accuracy of 0.7) and CADEC dataset (accuracy of 0.9 and macro F1 score of 0.71), and generalizes very well in low-resource settings. Through extensive experimentation, we learn that incorporating medical concept-bearing terms as side information, contribute significantly to MedBERT. We learn that certain target classes heavily depend on keywords, while others require one to learn optimal representation of medical context. An interesting future direction will be to extend MedBERT to structured prediction tasks like entity and relation prediction, or broadly link prediction. Instead of BERT, we will work with BioBERT [16], which is a domain-specific pretrained model trained on biomedical articles. Acknowledgements. This work is supported in part by the Institute PhD Fellowship of IIT Kharagpur, the Federal Ministry of Education and Research (BMBF), Germany under the project LeibnizKILabor (grant no. 01DD20003), the Adobe-funded project titled “Computational Aspects and Role of Content for Persuasive Brand Positioning”, and IMPRINT-1 Project RCO (project no. 6537).

PY - 2021/10/30

Y1 - 2021/10/30

N2 - Online medical forums have become a predominant platform for answering health-related information needs of consumers. However, with a significant rise in the number of queries and the limited availability of experts, it is necessary to automatically classify medical queries based on a consumer's intention, so that these questions may be directed to the right set of medical experts. Here, we develop a novel medical knowledge-aware BERT-based model (MedBERT) that explicitly gives more weightage to medical concept-bearing words, and utilize domain-specific side information obtained from a popular medical knowledge base. We also contribute a multi-label dataset for the Medical Forum Question Classification (MFQC) task. MedBERT achieves state-of-the-art performance on two benchmark datasets and performs very well in low resource settings.

AB - Online medical forums have become a predominant platform for answering health-related information needs of consumers. However, with a significant rise in the number of queries and the limited availability of experts, it is necessary to automatically classify medical queries based on a consumer's intention, so that these questions may be directed to the right set of medical experts. Here, we develop a novel medical knowledge-aware BERT-based model (MedBERT) that explicitly gives more weightage to medical concept-bearing words, and utilize domain-specific side information obtained from a popular medical knowledge base. We also contribute a multi-label dataset for the Medical Forum Question Classification (MFQC) task. MedBERT achieves state-of-the-art performance on two benchmark datasets and performs very well in low resource settings.

KW - clinical text classification

KW - online health communities

UR - http://www.scopus.com/inward/record.url?scp=85119178431&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2109.13141

DO - 10.48550/arXiv.2109.13141

M3 - Conference contribution

AN - SCOPUS:85119178431

T3 - International Conference on Information and Knowledge Management, Proceedings

SP - 3398

EP - 3402

BT - CIKM '21

PB - Association for Computing Machinery (ACM)

T2 - 30th ACM International Conference on Information and Knowledge Management, CIKM 2021

Y2 - 1 November 2021 through 5 November 2021

ER -