Details
Original language | English |
---|---|
Pages (from-to) | 873-892 |
Number of pages | 20 |
Journal | Semantic web |
Volume | 14 |
Issue number | 5 |
Publication status | Published - 8 May 2023 |
Abstract
Social networks have become information dissemination channels, where announcements are posted frequently; they also serve as frameworks for debates in various areas (e.g., scientific, political, and social). In particular, in the health area, social networks represent a channel to communicate and disseminate novel treatments' success; they also allow ordinary people to express their concerns about a disease or disorder. The Artificial Intelligence (AI) community has developed analytical methods to uncover and predict patterns from posts that enable it to explain news about a particular topic, e.g., mental disorders expressed as eating disorders or depression. Albeit potentially rich while expressing an idea or concern, posts are presented as short texts, preventing, thus, AI models from accurately encoding these posts' contextual knowledge. We propose a hybrid approach where knowledge encoded in community-maintained knowledge graphs (e.g., Wikidata) is combined with deep learning to categorize social media posts using existing classification models. The proposed approach resorts to state-of-the-art named entity recognizers and linkers (e.g., Falcon 2.0) to extract entities in short posts and link them to concepts in knowledge graphs. Then, knowledge graph embeddings (KGEs) are utilized to compute latent representations of the extracted entities, which result in vector representations of the posts that encode these entities' contextual knowledge extracted from the knowledge graphs. These KGEs are combined with contextualized word embeddings (e.g., BERT) to generate a context-based representation of the posts that empower prediction models. We apply our proposed approach in the health domain to detect whether a publication is related to an eating disorder (e.g., anorexia or bulimia) and uncover concepts within the discourse that could help healthcare providers diagnose this type of mental disorder. We evaluate our approach on a dataset of 2,000 tweets about eating disorders. Our experimental results suggest that combining contextual knowledge encoded in word embeddings with the one built from knowledge graphs increases the reliability of the predictive models. The ambition is that the proposed method can support health domain experts in discovering patterns that may forecast a mental disorder, enhancing early detection and more precise diagnosis towards personalized medicine.
Keywords
- deep learning, health data, knowledge graphs, Name entity linking, natural language processing, Wikidata
ASJC Scopus subject areas
- Computer Science(all)
- Information Systems
- Computer Science(all)
- Computer Science Applications
- Computer Science(all)
- Computer Networks and Communications
Sustainable Development Goals
Cite this
- Standard
- Harvard
- Apa
- Vancouver
- BibTeX
- RIS
In: Semantic web, Vol. 14, No. 5, 08.05.2023, p. 873-892.
Research output: Contribution to journal › Article › Research › peer review
}
TY - JOUR
T1 - Empowering machine learning models with contextual knowledge for enhancing the detection of eating disorders in social media posts
AU - Benítez-Andrades, José Alberto
AU - García-Ordás, María Teresa
AU - Russo, Mayra
AU - Sakor, Ahmad
AU - Fernandes Rotger, Luis Daniel
AU - Vidal, Maria Esther
N1 - Funding Information: Part of this research was funded by the European Union's Horizon 2020 research and innovation programme under Marie Sklodowska-Curie Actions (grant agreement number 860630) for the project "NoBIAS - Artificial Intelligence without Bias". This work reflects only the authors' views, and the European Research Executive Agency (REA) is not responsible for any use that may be made of the information it contains. Furthermore, Maria-Esther Vidal is partially supported by Leibniz Association in the program "Leibniz Best Minds: Programme for Women Professors", project TrustKG-Transforming Data in Trustable Insights with grant P99/2020.
PY - 2023/5/8
Y1 - 2023/5/8
N2 - Social networks have become information dissemination channels, where announcements are posted frequently; they also serve as frameworks for debates in various areas (e.g., scientific, political, and social). In particular, in the health area, social networks represent a channel to communicate and disseminate novel treatments' success; they also allow ordinary people to express their concerns about a disease or disorder. The Artificial Intelligence (AI) community has developed analytical methods to uncover and predict patterns from posts that enable it to explain news about a particular topic, e.g., mental disorders expressed as eating disorders or depression. Albeit potentially rich while expressing an idea or concern, posts are presented as short texts, preventing, thus, AI models from accurately encoding these posts' contextual knowledge. We propose a hybrid approach where knowledge encoded in community-maintained knowledge graphs (e.g., Wikidata) is combined with deep learning to categorize social media posts using existing classification models. The proposed approach resorts to state-of-the-art named entity recognizers and linkers (e.g., Falcon 2.0) to extract entities in short posts and link them to concepts in knowledge graphs. Then, knowledge graph embeddings (KGEs) are utilized to compute latent representations of the extracted entities, which result in vector representations of the posts that encode these entities' contextual knowledge extracted from the knowledge graphs. These KGEs are combined with contextualized word embeddings (e.g., BERT) to generate a context-based representation of the posts that empower prediction models. We apply our proposed approach in the health domain to detect whether a publication is related to an eating disorder (e.g., anorexia or bulimia) and uncover concepts within the discourse that could help healthcare providers diagnose this type of mental disorder. We evaluate our approach on a dataset of 2,000 tweets about eating disorders. Our experimental results suggest that combining contextual knowledge encoded in word embeddings with the one built from knowledge graphs increases the reliability of the predictive models. The ambition is that the proposed method can support health domain experts in discovering patterns that may forecast a mental disorder, enhancing early detection and more precise diagnosis towards personalized medicine.
AB - Social networks have become information dissemination channels, where announcements are posted frequently; they also serve as frameworks for debates in various areas (e.g., scientific, political, and social). In particular, in the health area, social networks represent a channel to communicate and disseminate novel treatments' success; they also allow ordinary people to express their concerns about a disease or disorder. The Artificial Intelligence (AI) community has developed analytical methods to uncover and predict patterns from posts that enable it to explain news about a particular topic, e.g., mental disorders expressed as eating disorders or depression. Albeit potentially rich while expressing an idea or concern, posts are presented as short texts, preventing, thus, AI models from accurately encoding these posts' contextual knowledge. We propose a hybrid approach where knowledge encoded in community-maintained knowledge graphs (e.g., Wikidata) is combined with deep learning to categorize social media posts using existing classification models. The proposed approach resorts to state-of-the-art named entity recognizers and linkers (e.g., Falcon 2.0) to extract entities in short posts and link them to concepts in knowledge graphs. Then, knowledge graph embeddings (KGEs) are utilized to compute latent representations of the extracted entities, which result in vector representations of the posts that encode these entities' contextual knowledge extracted from the knowledge graphs. These KGEs are combined with contextualized word embeddings (e.g., BERT) to generate a context-based representation of the posts that empower prediction models. We apply our proposed approach in the health domain to detect whether a publication is related to an eating disorder (e.g., anorexia or bulimia) and uncover concepts within the discourse that could help healthcare providers diagnose this type of mental disorder. We evaluate our approach on a dataset of 2,000 tweets about eating disorders. Our experimental results suggest that combining contextual knowledge encoded in word embeddings with the one built from knowledge graphs increases the reliability of the predictive models. The ambition is that the proposed method can support health domain experts in discovering patterns that may forecast a mental disorder, enhancing early detection and more precise diagnosis towards personalized medicine.
KW - deep learning
KW - health data
KW - knowledge graphs
KW - Name entity linking
KW - natural language processing
KW - Wikidata
UR - http://www.scopus.com/inward/record.url?scp=85168387508&partnerID=8YFLogxK
U2 - 10.3233/SW-223269
DO - 10.3233/SW-223269
M3 - Article
AN - SCOPUS:85168387508
VL - 14
SP - 873
EP - 892
JO - Semantic web
JF - Semantic web
SN - 1570-0844
IS - 5
ER -