Details
Original language | English |
---|---|
Title of host publication | CLEF 2020 Working Notes |
Subtitle of host publication | Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum |
Publication status | Published - 2020 |
Event | 11th Conference and Labs of the Evaluation Forum, CLEF 2020 - Thessaloniki, Greece; Duration: 22 Sept 2020 → 25 Sept 2020 |
Publication series
Name | CEUR Workshop Proceedings |
---|---|
Publisher | CEUR Workshop Proceedings |
Volume | 2696 |
ISSN (Print) | 1613-0073 |
Abstract
In this digital age of news consumption, a news reader can react, express and share opinions with others in a highly interactive and fast manner. As a consequence, fake news has made its way into our daily lives, because large companies and individuals alike have very limited capacity to verify news on the Internet. In this paper, we focus on two problems in the fact-checking ecosystem whose solutions can help automate the fact-checking of claims in an ever-increasing stream of content on social media. For the first problem, claim check-worthiness prediction, we explore the fusion of syntactic features and embeddings from the deep transformer model Bidirectional Encoder Representations from Transformers (BERT) to classify the check-worthiness of a tweet, i.e. whether it includes a claim or not. We conduct a detailed feature analysis and present our best-performing models for English and Arabic tweets. For the second problem, claim retrieval, we explore pre-trained embeddings from a Siamese network transformer model (sentence-transformers) specifically trained for semantic textual similarity, and perform KD-search to retrieve verified claims with respect to a query tweet.
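The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "KD-search" refers to nearest-neighbour lookup in a k-d tree, and it replaces real sentence-transformers embeddings with hypothetical three-dimensional toy vectors. The names `Node`, `normalize`, `build`, `nearest` and the `claim_*` labels are all invented for illustration.

```python
import math
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]


class Node:
    """One k-d tree node holding a stored vector and its claim label."""

    def __init__(self, point: Vector, label: str, axis: int,
                 left: Optional["Node"], right: Optional["Node"]) -> None:
        self.point, self.label, self.axis = point, label, axis
        self.left, self.right = left, right


def normalize(v: Vector) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def build(items: List[Tuple[Vector, str]], depth: int = 0) -> Optional[Node]:
    """Build a k-d tree by splitting on the median, cycling through axes."""
    if not items:
        return None
    axis = depth % len(items[0][0])
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2
    point, label = items[mid]
    return Node(point, label, axis,
                build(items[:mid], depth + 1),
                build(items[mid + 1:], depth + 1))


def nearest(node: Optional[Node], query: Vector,
            best: Optional[Tuple[float, str]] = None) -> Optional[Tuple[float, str]]:
    """Return (squared distance, label) of the stored point closest to query."""
    if node is None:
        return best
    dist = sum((a - b) ** 2 for a, b in zip(node.point, query))
    if best is None or dist < best[0]:
        best = (dist, node.label)
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if diff * diff < best[0]:
        best = nearest(far, query, best)
    return best


# Hypothetical toy "embeddings" of verified claims (real ones have hundreds of dimensions).
claims = [
    (normalize([0.9, 0.1, 0.0]), "claim_a"),
    (normalize([0.1, 0.9, 0.1]), "claim_b"),
    (normalize([0.0, 0.2, 0.9]), "claim_c"),
]
tree = build(claims)
query = normalize([0.8, 0.2, 0.1])  # stands in for an embedded query tweet
match = nearest(tree, query)
print(match[1])
```

Normalizing every vector to unit length makes the Euclidean nearest neighbour coincide with the highest cosine similarity, which is why a distance-based tree can serve a semantic-similarity lookup in this sketch.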
Keywords
- BERT, Check-worthiness, COVID-19, Fact-checking, Retrieval, Social media, SVM, Text classification, Twitter
ASJC Scopus subject areas
- Computer Science(all)
- General Computer Science
Cite this
CLEF 2020 Working Notes: Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum. 2020. (CEUR Workshop Proceedings; Vol. 2696).
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Check square at CheckThat! 2020
T2 - 11th Conference and Labs of the Evaluation Forum, CLEF 2020
AU - Cheema, Gullal S.
AU - Hakimov, Sherzod
AU - Ewerth, Ralph
N1 - Funding Information: This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no 812997.
PY - 2020
Y1 - 2020
KW - BERT
KW - Check-worthiness
KW - COVID-19
KW - Fact-checking
KW - Retrieval
KW - Social media
KW - SVM
KW - Text classification
KW - Twitter
UR - http://www.scopus.com/inward/record.url?scp=85121794546&partnerID=8YFLogxK
U2 - 10.48550/arXiv.2007.10534
DO - 10.48550/arXiv.2007.10534
M3 - Conference contribution
AN - SCOPUS:85121794546
T3 - CEUR Workshop Proceedings
BT - CLEF 2020 Working Notes
Y2 - 22 September 2020 through 25 September 2020
ER -