Details
Original language | English |
---|---|
Title of host publication | The Web Conference 2021 - Companion of the World Wide Web Conference, WWW 2021 |
Pages | 380-384 |
Number of pages | 5 |
ISBN (electronic) | 9781450383134 |
Publication status | Published - 3 June 2021 |
Event | 30th World Wide Web Conference, WWW 2021 - Ljubljana, Slovenia. Duration: 19 Apr 2021 → 23 Apr 2021 |
Publication series
Name | The Web Conference 2021 - Companion of the World Wide Web Conference, WWW 2021 |
---|
Abstract
News media reflect the present state of a country or region to their audiences. Media outlets in a region publish different kinds of news for their local and global audiences. In this paper, we focus on Europe (specifically the EU) and propose a method to identify news that has an impact on Europe in any respect, such as finance, business, crime, or politics. Predicting the location of a news item is itself a challenging task. Most approaches restrict themselves to named entities or handcrafted features. We try to overcome that limitation: instead of focusing only on named entities (European locations, politicians, etc.) and hand-crafted rules, we also explore the context of news articles with the help of the pre-trained language model BERT. The resulting language-model-based European news detector shows an improvement of about 9-19% in F-score over baseline models. Interestingly, we observe that such models automatically capture named entities, their origin, etc.; hence, no separate information is required. We also evaluate the role of such entities in the prediction and explore the tokens that BERT really looks at when deciding the news category. Entities such as persons, locations, and organizations turn out to be good rationale tokens for the prediction.
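To make the comparison in the abstract concrete, the following is a minimal sketch of the kind of entity-matching baseline it contrasts against, together with the F-score used for evaluation. It is not the paper's code: the gazetteer `EU_TERMS`, the function names, and the four toy headlines are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical gazetteer, for illustration only; not the paper's feature set.
EU_TERMS = {"eu", "european union", "european commission", "brussels", "germany", "france"}


def gazetteer_predict(text: str) -> bool:
    """Flag a text as Europe-related if it mentions any gazetteer term as a whole word."""
    lowered = text.lower()
    return any(re.search(r"\b" + re.escape(term) + r"\b", lowered) for term in EU_TERMS)


def f_score(gold, pred) -> float:
    """F1 for the positive class: harmonic mean of precision and recall."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy examples (invented): (headline, is_europe_related)
articles = [
    ("The European Commission proposed new budget rules.", True),
    ("Energy prices rise across the continent as gas supplies tighten.", True),
    ("A local baseball team won its season opener.", False),
    ("Brussels sprouts are back on the menu this winter.", False),
]

gold = [label for _, label in articles]
pred = [gazetteer_predict(text) for text, _ in articles]
score = f_score(gold, pred)
print(pred, score)
```

On these toy headlines the term matcher misses the second item, which contains no listed entity, and wrongly flags the fourth because of the word "Brussels". Errors of this kind are what a context-aware model is meant to reduce, although the toy data says nothing about the size of the improvement reported in the abstract.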
ASJC Scopus subject areas
- Computer Science (all)
- Computer Networks and Communications
- Software
Cite this
The Web Conference 2021 - Companion of the World Wide Web Conference, WWW 2021. 2021. pp. 380-384 (The Web Conference 2021 - Companion of the World Wide Web Conference, WWW 2021).
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › Peer-reviewed
TY - GEN
T1 - Eudetector
T2 - 30th World Wide Web Conference, WWW 2021
AU - Rudra, Koustav
AU - Tran, Danny
AU - Shaltev, Miroslav
N1 - Funding Information: Funding for this project was in part provided by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 832921.
PY - 2021/6/3
Y1 - 2021/6/3
UR - http://www.scopus.com/inward/record.url?scp=85107673999&partnerID=8YFLogxK
U2 - 10.1145/3442442.3452324
DO - 10.1145/3442442.3452324
M3 - Conference contribution
AN - SCOPUS:85107673999
T3 - The Web Conference 2021 - Companion of the World Wide Web Conference, WWW 2021
SP - 380
EP - 384
BT - The Web Conference 2021 - Companion of the World Wide Web Conference, WWW 2021
Y2 - 19 April 2021 through 23 April 2021
ER -