Knowledge-Aware Neural Networks for Medical Forum Question Classification

Research output: Chapter in book/report/conference proceeding > Conference contribution > Research > peer review

Authors

  • Soumyadeep Roy
  • Sudip Chakraborty
  • Aishik Mandal
  • Gunjan Balde
  • Prakhar Sharma
  • Anandhavelu Natarajan
  • Megha Khosla
  • Shamik Sural
  • Niloy Ganguly

Research Organisations

External Research Organisations

  • Indian Institute of Technology Kharagpur (IITKGP)
  • Adobe Research

Details

Original language: English
Title of host publication: CIKM '21
Subtitle of host publication: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
Publisher: Association for Computing Machinery (ACM)
Pages: 3398-3402
Number of pages: 5
ISBN (electronic): 9781450384469
Publication status: Published - 30 Oct 2021
Event: 30th ACM International Conference on Information and Knowledge Management, CIKM 2021 - Virtual, Online, Australia
Duration: 1 Nov 2021 to 5 Nov 2021

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Abstract

Online medical forums have become a predominant platform for answering health-related information needs of consumers. However, with a significant rise in the number of queries and the limited availability of experts, it is necessary to automatically classify medical queries based on a consumer's intention, so that these questions may be directed to the right set of medical experts. Here, we develop a novel medical knowledge-aware BERT-based model (MedBERT) that explicitly gives more weightage to medical concept-bearing words, and utilize domain-specific side information obtained from a popular medical knowledge base. We also contribute a multi-label dataset for the Medical Forum Question Classification (MFQC) task. MedBERT achieves state-of-the-art performance on two benchmark datasets and performs very well in low resource settings.

Keywords

    clinical text classification, online health communities

Cite this

Knowledge-Aware Neural Networks for Medical Forum Question Classification. / Roy, Soumyadeep; Chakraborty, Sudip; Mandal, Aishik et al.
CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management. Association for Computing Machinery (ACM), 2021. p. 3398-3402 (International Conference on Information and Knowledge Management, Proceedings).

Roy, S, Chakraborty, S, Mandal, A, Balde, G, Sharma, P, Natarajan, A, Khosla, M, Sural, S & Ganguly, N 2021, Knowledge-Aware Neural Networks for Medical Forum Question Classification. in CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management. International Conference on Information and Knowledge Management, Proceedings, Association for Computing Machinery (ACM), pp. 3398-3402, 30th ACM International Conference on Information and Knowledge Management, CIKM 2021, Virtual, Online, Australia, 1 Nov 2021. https://doi.org/10.48550/arXiv.2109.13141, https://doi.org/10.1145/3459637.3482128
Roy, S., Chakraborty, S., Mandal, A., Balde, G., Sharma, P., Natarajan, A., Khosla, M., Sural, S., & Ganguly, N. (2021). Knowledge-Aware Neural Networks for Medical Forum Question Classification. In CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management (pp. 3398-3402). (International Conference on Information and Knowledge Management, Proceedings). Association for Computing Machinery (ACM). https://doi.org/10.48550/arXiv.2109.13141, https://doi.org/10.1145/3459637.3482128
Roy S, Chakraborty S, Mandal A, Balde G, Sharma P, Natarajan A et al. Knowledge-Aware Neural Networks for Medical Forum Question Classification. In CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management. Association for Computing Machinery (ACM). 2021. p. 3398-3402. (International Conference on Information and Knowledge Management, Proceedings). doi: 10.48550/arXiv.2109.13141, 10.1145/3459637.3482128
Roy, Soumyadeep ; Chakraborty, Sudip ; Mandal, Aishik et al. / Knowledge-Aware Neural Networks for Medical Forum Question Classification. CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management. Association for Computing Machinery (ACM), 2021. pp. 3398-3402 (International Conference on Information and Knowledge Management, Proceedings).
@inproceedings{ae021af247df4cd6a2935865c61c807f,
title = "Knowledge-Aware Neural Networks for Medical Forum Question Classification",
abstract = "Online medical forums have become a predominant platform for answering health-related information needs of consumers. However, with a significant rise in the number of queries and the limited availability of experts, it is necessary to automatically classify medical queries based on a consumer's intention, so that these questions may be directed to the right set of medical experts. Here, we develop a novel medical knowledge-aware BERT-based model (MedBERT) that explicitly gives more weightage to medical concept-bearing words, and utilize domain-specific side information obtained from a popular medical knowledge base. We also contribute a multi-label dataset for the Medical Forum Question Classification (MFQC) task. MedBERT achieves state-of-the-art performance on two benchmark datasets and performs very well in low resource settings.",
keywords = "clinical text classification, online health communities",
author = "Soumyadeep Roy and Sudip Chakraborty and Aishik Mandal and Gunjan Balde and Prakhar Sharma and Anandhavelu Natarajan and Megha Khosla and Shamik Sural and Niloy Ganguly",
note = "Funding Information: We propose MedBERT, a novel application of transformers-based dual encoder model, for MFQC task, which is also medical domain knowledge-aware. We contribute a multi-label MFQC dataset; MedBERT achieves state-of-the-art performance on ICHI (accuracy of 0.7) and CADEC dataset (accuracy of 0.9 and macro F1 score of 0.71), and generalizes very well in low-resource settings. Through extensive experimentation, we learn that incorporating medical concept-bearing terms as side information, contribute significantly to MedBERT. We learn that certain target classes heavily depend on keywords, while others require one to learn optimal representation of medical context. An interesting future direction will be to extend MedBERT to structured prediction tasks like entity and relation prediction, or broadly link prediction. Instead of BERT, we will work with BioBERT [16], which is a domain-specific pretrained model trained on biomedical articles. Acknowledgements. This work is supported in part by the Institute PhD Fellowship of IIT Kharagpur, the Federal Ministry of Education and Research (BMBF), Germany under the project LeibnizKILabor (grant no. 01DD20003), the Adobe-funded project titled “Computational Aspects and Role of Content for Persuasive Brand Positioning”, and IMPRINT-1 Project RCO (project no. 6537).; 30th ACM International Conference on Information and Knowledge Management, CIKM 2021 ; Conference date: 01-11-2021 Through 05-11-2021",
year = "2021",
month = oct,
day = "30",
doi = "10.48550/arXiv.2109.13141",
language = "English",
series = "International Conference on Information and Knowledge Management, Proceedings",
publisher = "Association for Computing Machinery (ACM)",
pages = "3398--3402",
booktitle = "CIKM '21",
address = "United States",

}

TY - GEN

T1 - Knowledge-Aware Neural Networks for Medical Forum Question Classification

AU - Roy, Soumyadeep

AU - Chakraborty, Sudip

AU - Mandal, Aishik

AU - Balde, Gunjan

AU - Sharma, Prakhar

AU - Natarajan, Anandhavelu

AU - Khosla, Megha

AU - Sural, Shamik

AU - Ganguly, Niloy

N1 - Funding Information: We propose MedBERT, a novel application of transformers-based dual encoder model, for MFQC task, which is also medical domain knowledge-aware. We contribute a multi-label MFQC dataset; MedBERT achieves state-of-the-art performance on ICHI (accuracy of 0.7) and CADEC dataset (accuracy of 0.9 and macro F1 score of 0.71), and generalizes very well in low-resource settings. Through extensive experimentation, we learn that incorporating medical concept-bearing terms as side information, contribute significantly to MedBERT. We learn that certain target classes heavily depend on keywords, while others require one to learn optimal representation of medical context. An interesting future direction will be to extend MedBERT to structured prediction tasks like entity and relation prediction, or broadly link prediction. Instead of BERT, we will work with BioBERT [16], which is a domain-specific pretrained model trained on biomedical articles. Acknowledgements. This work is supported in part by the Institute PhD Fellowship of IIT Kharagpur, the Federal Ministry of Education and Research (BMBF), Germany under the project LeibnizKILabor (grant no. 01DD20003), the Adobe-funded project titled “Computational Aspects and Role of Content for Persuasive Brand Positioning”, and IMPRINT-1 Project RCO (project no. 6537).

PY - 2021/10/30

Y1 - 2021/10/30

N2 - Online medical forums have become a predominant platform for answering health-related information needs of consumers. However, with a significant rise in the number of queries and the limited availability of experts, it is necessary to automatically classify medical queries based on a consumer's intention, so that these questions may be directed to the right set of medical experts. Here, we develop a novel medical knowledge-aware BERT-based model (MedBERT) that explicitly gives more weightage to medical concept-bearing words, and utilize domain-specific side information obtained from a popular medical knowledge base. We also contribute a multi-label dataset for the Medical Forum Question Classification (MFQC) task. MedBERT achieves state-of-the-art performance on two benchmark datasets and performs very well in low resource settings.

AB - Online medical forums have become a predominant platform for answering health-related information needs of consumers. However, with a significant rise in the number of queries and the limited availability of experts, it is necessary to automatically classify medical queries based on a consumer's intention, so that these questions may be directed to the right set of medical experts. Here, we develop a novel medical knowledge-aware BERT-based model (MedBERT) that explicitly gives more weightage to medical concept-bearing words, and utilize domain-specific side information obtained from a popular medical knowledge base. We also contribute a multi-label dataset for the Medical Forum Question Classification (MFQC) task. MedBERT achieves state-of-the-art performance on two benchmark datasets and performs very well in low resource settings.

KW - clinical text classification

KW - online health communities

UR - http://www.scopus.com/inward/record.url?scp=85119178431&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2109.13141

DO - 10.48550/arXiv.2109.13141

M3 - Conference contribution

AN - SCOPUS:85119178431

T3 - International Conference on Information and Knowledge Management, Proceedings

SP - 3398

EP - 3402

BT - CIKM '21

PB - Association for Computing Machinery (ACM)

T2 - 30th ACM International Conference on Information and Knowledge Management, CIKM 2021

Y2 - 1 November 2021 through 5 November 2021

ER -