Check square at CheckThat! 2020: Claim Detection in Social Media via Fusion of Transformer and Syntactic Features

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

  • Gullal S. Cheema
  • Sherzod Hakimov
  • Ralph Ewerth

Research Organisations

External Research Organisations

  • German National Library of Science and Technology (TIB)

Details

Original language: English
Title of host publication: CLEF 2020 Working Notes
Subtitle of host publication: Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum
Publication status: Published - 2020
Event: 11th Conference and Labs of the Evaluation Forum, CLEF 2020 - Thessaloniki, Greece
Duration: 22 Sept 2020 - 25 Sept 2020

Publication series

Name: CEUR Workshop Proceedings
Publisher: CEUR Workshop Proceedings
Volume: 2696
ISSN (Print): 1613-0073

Abstract

In this digital age of news consumption, readers can react to news and express and share opinions with others quickly and interactively. As a consequence, fake news has made its way into our daily lives, because both large companies and individuals have very limited capacity to verify news on the Internet. In this paper, we address two problems in the fact-checking ecosystem whose solutions can help automate the fact-checking of claims in an ever-increasing stream of social media content. For the first problem, claim check-worthiness prediction, we explore the fusion of syntactic features with embeddings from the deep transformer model BERT (Bidirectional Encoder Representations from Transformers) to classify the check-worthiness of a tweet, i.e., whether it includes a claim or not. We conduct a detailed feature analysis and present our best-performing models for English and Arabic tweets. For the second problem, claim retrieval, we explore pre-trained embeddings from a Siamese network transformer model (sentence-transformers) trained specifically for semantic textual similarity, and perform KD-search to retrieve verified claims with respect to a query tweet.
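
The two ideas named in the abstract, feature fusion and KD-search retrieval, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "fusion" means concatenating the two feature vectors after L2-normalizing each, and that "KD-search" means nearest-neighbour search with a k-d tree. The three-dimensional vectors and the labels `claim-A`, `claim-B`, `claim-C` are hypothetical stand-ins for real sentence embeddings and verified claims, and all function names are illustrative.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def fuse(embedding, syntactic):
    """Assumed fusion scheme: normalize each feature view, then concatenate."""
    return l2_normalize(embedding) + l2_normalize(syntactic)

def build_kdtree(points, depth=0):
    """Build a k-d tree from (vector, label) pairs, splitting at the median."""
    if not points:
        return None
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, query, best=None):
    """Return (squared_distance, label) of the stored vector closest to query."""
    if node is None:
        return best
    vec, label = node["point"]
    dist = sum((a - b) ** 2 for a, b in zip(vec, query))
    if best is None or dist < best[0]:
        best = (dist, label)
    diff = query[node["axis"]] - vec[node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if diff * diff < best[0]:
        best = nearest(far, query, best)
    return best

# Hypothetical toy "embeddings" of verified claims (real ones have hundreds of dimensions).
verified_claims = [
    ([0.9, 0.1, 0.0], "claim-A"),
    ([0.1, 0.9, 0.1], "claim-B"),
    ([0.0, 0.2, 0.9], "claim-C"),
]
tree = build_kdtree([(l2_normalize(v), label) for v, label in verified_claims])

# Hypothetical embedding of a query tweet.
query = l2_normalize([0.2, 0.8, 0.0])
dist, label = nearest(tree, query)
print(label)
```

Because every vector is scaled to unit length, ranking by Euclidean distance gives the same order as ranking by cosine similarity, which is why a k-d tree can serve a semantic-similarity lookup here. In a check-worthiness pipeline, the output of `fuse` would be passed to a classifier such as the SVM named in the keywords. That step is omitted from this sketch.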

Keywords

    BERT, Check-worthiness, COVID-19, Fact-checking, Retrieval, Social media, SVM, Text classification, Twitter

Cite this

Check square at CheckThat! 2020: Claim Detection in Social Media via Fusion of Transformer and Syntactic Features. / Cheema, Gullal S.; Hakimov, Sherzod; Ewerth, Ralph.
CLEF 2020 Working Notes: Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum. 2020. (CEUR Workshop Proceedings; Vol. 2696).

Cheema, GS, Hakimov, S & Ewerth, R 2020, Check square at CheckThat! 2020: Claim Detection in Social Media via Fusion of Transformer and Syntactic Features. in CLEF 2020 Working Notes: Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum. CEUR Workshop Proceedings, vol. 2696, 11th Conference and Labs of the Evaluation Forum, CLEF 2020, Thessaloniki, Greece, 22 Sept 2020. https://doi.org/10.48550/arXiv.2007.10534
Cheema, G. S., Hakimov, S., & Ewerth, R. (2020). Check square at CheckThat! 2020: Claim Detection in Social Media via Fusion of Transformer and Syntactic Features. In CLEF 2020 Working Notes: Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum (CEUR Workshop Proceedings; Vol. 2696). https://doi.org/10.48550/arXiv.2007.10534
Cheema GS, Hakimov S, Ewerth R. Check square at CheckThat! 2020: Claim Detection in Social Media via Fusion of Transformer and Syntactic Features. In CLEF 2020 Working Notes: Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum. 2020. (CEUR Workshop Proceedings). doi: 10.48550/arXiv.2007.10534
Cheema, Gullal S. ; Hakimov, Sherzod ; Ewerth, Ralph. / Check square at CheckThat! 2020 : Claim Detection in Social Media via Fusion of Transformer and Syntactic Features. CLEF 2020 Working Notes: Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum. 2020. (CEUR Workshop Proceedings).
@inproceedings{6e0b6ce0398b4612bddde17d760d722e,
title = "Check square at CheckThat! 2020: Claim Detection in Social Media via Fusion of Transformer and Syntactic Features",
abstract = "In this digital age of news consumption, a news reader has the ability to react, express and share opinions with others in a highly interactive and fast manner. As a consequence, fake news has made its way into our daily life because of very limited capacity to verify news on the Internet by large companies as well as individuals. In this paper, we focus on solving two problems which are part of the fact-checking ecosystem that can help to automate fact-checking of claims in an ever increasing stream of content on social media. For the first problem, claim check-worthiness prediction, we explore the fusion of syntactic features and deep transformer Bidirectional Encoder Representations from Transformers (BERT) embeddings, to classify check-worthiness of a tweet, i.e. whether it includes a claim or not. We conduct a detailed feature analysis and present our best performing models for English and Arabic tweets. For the second problem, claim retrieval, we explore the pre-trained embeddings from a Siamese network transformer model (sentence-transformers) specifically trained for semantic textual similarity, and perform KD-search to retrieve verified claims with respect to a query tweet.",
keywords = "BERT, Check-worthiness, COVID-19, Fact-checking, Retrieval, Social media, SVM, Text classification, Twitter",
author = "Cheema, {Gullal S.} and Sherzod Hakimov and Ralph Ewerth",
note = "Funding Information: This project has received funding from the European Union{\textquoteright}s Horizon 2020 research and innovation programme under the Marie Sk{\l}odowska-Curie grant agreement no 812997. ; 11th Conference and Labs of the Evaluation Forum, CLEF 2020 ; Conference date: 22-09-2020 Through 25-09-2020",
year = "2020",
doi = "10.48550/arXiv.2007.10534",
language = "English",
series = "CEUR Workshop Proceedings",
publisher = "CEUR Workshop Proceedings",
booktitle = "CLEF 2020 Working Notes",

}

TY - GEN

T1 - Check square at CheckThat! 2020

T2 - 11th Conference and Labs of the Evaluation Forum, CLEF 2020

AU - Cheema, Gullal S.

AU - Hakimov, Sherzod

AU - Ewerth, Ralph

N1 - Funding Information: This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no 812997.

PY - 2020

Y1 - 2020

AB - In this digital age of news consumption, a news reader has the ability to react, express and share opinions with others in a highly interactive and fast manner. As a consequence, fake news has made its way into our daily life because of very limited capacity to verify news on the Internet by large companies as well as individuals. In this paper, we focus on solving two problems which are part of the fact-checking ecosystem that can help to automate fact-checking of claims in an ever increasing stream of content on social media. For the first problem, claim check-worthiness prediction, we explore the fusion of syntactic features and deep transformer Bidirectional Encoder Representations from Transformers (BERT) embeddings, to classify check-worthiness of a tweet, i.e. whether it includes a claim or not. We conduct a detailed feature analysis and present our best performing models for English and Arabic tweets. For the second problem, claim retrieval, we explore the pre-trained embeddings from a Siamese network transformer model (sentence-transformers) specifically trained for semantic textual similarity, and perform KD-search to retrieve verified claims with respect to a query tweet.

KW - BERT

KW - Check-worthiness

KW - COVID-19

KW - Fact-checking

KW - Retrieval

KW - Social media

KW - SVM

KW - Text classification

KW - Twitter

UR - http://www.scopus.com/inward/record.url?scp=85121794546&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2007.10534

DO - 10.48550/arXiv.2007.10534

M3 - Conference contribution

AN - SCOPUS:85121794546

T3 - CEUR Workshop Proceedings

BT - CLEF 2020 Working Notes

Y2 - 22 September 2020 through 25 September 2020

ER -