Details
Original language | English |
---|---|
Title of host publication | The Semantic Web |
Subtitle of host publication | 15th International Conference |
Editors | Aldo Gangemi, Raphaël Troncy, Roberto Navigli, Laura Hollink, Maria-Esther Vidal, Pascal Hitzler, Anna Tordai, Mehwish Alam |
Publisher | Springer Verlag |
Pages | 177-190 |
Number of pages | 14 |
ISBN (electronic) | 9783319934174 |
ISBN (print) | 9783319934167 |
Publication status | Published - 3 Jun 2018 |
Event | 15th International Conference on Extended Semantic Web Conference, ESWC 2018 - Heraklion, Greece |
Duration | 3 Jun 2018 → 7 Jun 2018 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 10843 |
ISSN (Print) | 0302-9743 |
ISSN (electronic) | 1611-3349 |
Abstract
Publicly available social media archives facilitate research in a variety of fields, such as data science, sociology or the digital humanities, where Twitter has emerged as one of the most prominent sources. However, obtaining, archiving and annotating large amounts of tweets is costly. In this paper, we describe TweetsKB, a publicly available corpus of currently more than 1.5 billion tweets, spanning almost 5 years (Jan’13–Nov’17). Metadata information about the tweets as well as extracted entities, hashtags, user mentions and sentiment information are exposed using established RDF/S vocabularies. Next to a description of the extraction and annotation process, we present use cases to illustrate scenarios for entity-centric information exploration, data integration and knowledge discovery facilitated by TweetsKB.
Keywords
- Entity linking, RDF, Sentiment analysis, Social media archives, Twitter
ASJC Scopus subject areas
- Mathematics(all)
- Theoretical Computer Science
- Computer Science(all)
- General Computer Science
Cite this
TweetsKB. / Fafalios, Pavlos; Iosifidis, Vasileios; Ntoutsi, Eirini; Dietze, Stefan. The Semantic Web: 15th International Conference. ed. / Aldo Gangemi; Raphaël Troncy; Roberto Navigli; Laura Hollink; Maria-Esther Vidal; Pascal Hitzler; Anna Tordai; Mehwish Alam. Springer Verlag, 2018. p. 177-190 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10843).
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - TweetsKB
T2 - 15th International Conference on Extended Semantic Web Conference, ESWC 2018
AU - Fafalios, Pavlos
AU - Iosifidis, Vasileios
AU - Ntoutsi, Eirini
AU - Dietze, Stefan
N1 - Funding information: The work was partially funded by the European Commission for the ERC Advanced Grant ALEXANDRIA under grant No. 339233 and the H2020 Grant No. 687916 (AFEL project), and by the German Research Foundation (DFG) project OSCAR (Opinion Stream Classification with Ensembles and Active leaRners).
PY - 2018/6/3
Y1 - 2018/6/3
N2 - Publicly available social media archives facilitate research in a variety of fields, such as data science, sociology or the digital humanities, where Twitter has emerged as one of the most prominent sources. However, obtaining, archiving and annotating large amounts of tweets is costly. In this paper, we describe TweetsKB, a publicly available corpus of currently more than 1.5 billion tweets, spanning almost 5 years (Jan’13–Nov’17). Metadata information about the tweets as well as extracted entities, hashtags, user mentions and sentiment information are exposed using established RDF/S vocabularies. Next to a description of the extraction and annotation process, we present use cases to illustrate scenarios for entity-centric information exploration, data integration and knowledge discovery facilitated by TweetsKB.
KW - Entity linking
KW - RDF
KW - Sentiment analysis
KW - Social media archives
KW - Twitter
UR - http://www.scopus.com/inward/record.url?scp=85048487300&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-93417-4_12
DO - 10.1007/978-3-319-93417-4_12
M3 - Conference contribution
AN - SCOPUS:85048487300
SN - 9783319934167
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 177
EP - 190
BT - The Semantic Web
A2 - Gangemi, Aldo
A2 - Troncy, Raphaël
A2 - Navigli, Roberto
A2 - Hollink, Laura
A2 - Vidal, Maria-Esther
A2 - Hitzler, Pascal
A2 - Tordai, Anna
A2 - Alam, Mehwish
PB - Springer Verlag
Y2 - 3 June 2018 through 7 June 2018
ER -