Empowering machine learning models with contextual knowledge for enhancing the detection of eating disorders in social media posts

José Alberto Benítez-Andrades; María Teresa García-Ordás; Mayra Russo; Ahmad Sakor; Luis Daniel Fernandes Rotger; Maria Esther Vidal

doi:10.3233/SW-223269

Details

Original language	English
Pages (from-to)	873-892
Number of pages	20
Journal	Semantic Web
Volume	14
Issue number	5
Publication status	Published - 8 May 2023

Abstract

Social networks have become information dissemination channels, where announcements are posted frequently; they also serve as frameworks for debates in various areas (e.g., scientific, political, and social). In the health area in particular, social networks are a channel for communicating and disseminating the success of novel treatments; they also allow ordinary people to express their concerns about a disease or disorder. The Artificial Intelligence (AI) community has developed analytical methods to uncover and predict patterns from posts, which make it possible to explain news about a particular topic, e.g., mental disorders expressed as eating disorders or depression. Although posts can be rich in expressing an idea or concern, they are short texts, which prevents AI models from accurately encoding their contextual knowledge. We propose a hybrid approach where knowledge encoded in community-maintained knowledge graphs (e.g., Wikidata) is combined with deep learning to categorize social media posts using existing classification models. The proposed approach relies on state-of-the-art named entity recognizers and linkers (e.g., Falcon 2.0) to extract entities in short posts and link them to concepts in knowledge graphs. Then, knowledge graph embeddings (KGEs) are used to compute latent representations of the extracted entities, which result in vector representations of the posts that encode these entities' contextual knowledge extracted from the knowledge graphs. These KGEs are combined with contextualized word embeddings (e.g., BERT) to generate a context-based representation of the posts that empowers prediction models. We apply our proposed approach in the health domain to detect whether a post is related to an eating disorder (e.g., anorexia or bulimia) and to uncover concepts within the discourse that could help healthcare providers diagnose this type of mental disorder.
We evaluate our approach on a dataset of 2,000 tweets about eating disorders. Our experimental results suggest that combining the contextual knowledge encoded in word embeddings with that built from knowledge graphs increases the reliability of the predictive models. The ambition is that the proposed method can support health domain experts in discovering patterns that may forecast a mental disorder, enhancing early detection and more precise diagnosis towards personalized medicine.
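
The combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the two representations are merged, so averaging the entity vectors and concatenating the result with the text embedding are assumptions made here for illustration. The entity identifiers (`ENT_A`, `ENT_B`, `ENT_X`), the helper names, and all numeric values are hypothetical; in practice the text vector would come from a model such as BERT and the entity vectors from KGEs trained on a knowledge graph such as Wikidata.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def mean_entity_vector(entity_ids: Sequence[str],
                       kge: Dict[str, Vector],
                       dim: int) -> Vector:
    """Average the KGE vectors of the entities linked in one post.

    Entities without an embedding are skipped. A post with no usable
    entity gets a zero vector so that every post has the same width.
    (Averaging is an assumption of this sketch.)
    """
    known = [kge[e] for e in entity_ids if e in kge]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def combine(text_vec: Sequence[float], kge_vec: Sequence[float]) -> Vector:
    """Concatenate the text embedding with the entity-based vector.

    (Concatenation is an assumption of this sketch.)
    """
    return list(text_vec) + list(kge_vec)


# Toy, made-up values: a 3-d "text embedding" and 2-d "KGE" vectors.
toy_kge = {"ENT_A": [1.0, 0.0], "ENT_B": [0.0, 1.0]}
toy_text_vec = [0.2, 0.4, 0.6]

# ENT_X has no embedding and is ignored when averaging.
features = combine(
    toy_text_vec,
    mean_entity_vector(["ENT_A", "ENT_B", "ENT_X"], toy_kge, dim=2),
)
print(features)  # [0.2, 0.4, 0.6, 0.5, 0.5]
```

The resulting feature vector could then be passed to any existing classifier; the zero-vector fallback keeps the input width fixed for posts in which no entity was linked.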

Keywords

deep learning, health data, knowledge graphs, Named entity linking, natural language processing, Wikidata

ASJC Scopus subject areas

Computer Science(all): Information Systems, Computer Science Applications, Computer Networks and Communications

Sustainable Development Goals

SDG 3 - Good Health and Well-being

Cite this

Empowering machine learning models with contextual knowledge for enhancing the detection of eating disorders in social media posts. / Benítez-Andrades, José Alberto; García-Ordás, María Teresa; Russo, Mayra et al.
In: Semantic Web, Vol. 14, No. 5, 08.05.2023, p. 873-892.

Research output: Contribution to journal › Article › Research › peer review

@article{1407059b79214b3298ac91fa938448ac,

title = "Empowering machine learning models with contextual knowledge for enhancing the detection of eating disorders in social media posts",

keywords = "deep learning, health data, knowledge graphs, Named entity linking, natural language processing, Wikidata",

author = "Ben{\'i}tez-Andrades, {Jos{\'e} Alberto} and Garc{\'i}a-Ord{\'a}s, {Mar{\'i}a Teresa} and Russo, Mayra and Sakor, Ahmad and {Fernandes Rotger}, {Luis Daniel} and Vidal, {Maria Esther}",

note = "Funding Information: Part of this research was funded by the European Union's Horizon 2020 research and innovation programme under Marie Sklodowska-Curie Actions (grant agreement number 860630) for the project {"}NoBIAS - Artificial Intelligence without Bias{"}. This work reflects only the authors' views, and the European Research Executive Agency (REA) is not responsible for any use that may be made of the information it contains. Furthermore, Maria-Esther Vidal is partially supported by Leibniz Association in the program {"}Leibniz Best Minds: Programme for Women Professors{"}, project TrustKG-Transforming Data in Trustable Insights with grant P99/2020. ",

year = "2023",

month = may,

day = "8",

doi = "10.3233/SW-223269",

language = "English",

volume = "14",

pages = "873--892",

journal = "Semantic Web",

issn = "1570-0844",

publisher = "IOS Press",

number = "5",

}

TY - JOUR

T1 - Empowering machine learning models with contextual knowledge for enhancing the detection of eating disorders in social media posts

AU - Benítez-Andrades, José Alberto

AU - García-Ordás, María Teresa

AU - Russo, Mayra

AU - Sakor, Ahmad

AU - Fernandes Rotger, Luis Daniel

AU - Vidal, Maria Esther

N1 - Funding Information: Part of this research was funded by the European Union's Horizon 2020 research and innovation programme under Marie Sklodowska-Curie Actions (grant agreement number 860630) for the project "NoBIAS - Artificial Intelligence without Bias". This work reflects only the authors' views, and the European Research Executive Agency (REA) is not responsible for any use that may be made of the information it contains. Furthermore, Maria-Esther Vidal is partially supported by Leibniz Association in the program "Leibniz Best Minds: Programme for Women Professors", project TrustKG-Transforming Data in Trustable Insights with grant P99/2020.

PY - 2023/5/8

Y1 - 2023/5/8

KW - deep learning

KW - health data

KW - knowledge graphs

KW - Named entity linking

KW - natural language processing

KW - Wikidata

UR - http://www.scopus.com/inward/record.url?scp=85168387508&partnerID=8YFLogxK

U2 - 10.3233/SW-223269

DO - 10.3233/SW-223269

M3 - Article

AN - SCOPUS:85168387508

VL - 14

SP - 873

EP - 892

JO - Semantic Web

JF - Semantic Web

SN - 1570-0844

IS - 5

ER -