TamperedNews & News400 (IJMIR'21 Update)

Details

Date made available	2022
Publisher	Forschungsdaten-Repositorium der LUH

Description

Multimodal Analytics for Real-world News using Measures of Cross-modal Entity Consistency

Content

For both datasets, TamperedNews and News400, we provide:

- *dataset*.tar.gz containing the *dataset*.jsonl with:
  - Web links to the news texts
  - Web links to the news images
  - Outputs of the named entity recognition and disambiguation (NERD) approach
  - Untampered and tampered entities
- *dataset*_features.tar.gz with visual features for events, locations, and persons
- news400_wordembeddings.tar.gz: word embeddings of all nouns in the news texts of the News400 dataset

Please note that the word embeddings of the TamperedNews dataset (tamperednews_wordembeddings.tar.gz) were already provided in the first version (Link).

For all entities detected in both datasets, we provide:

- entities.tar.gz containing an *entity_type*.jsonl for each entity type (events, locations, and persons) with:
  - Wikidata ID
  - Wikidata label
  - Meta information used for tampering
  - Web links to all reference images crawled from Google, Bing, and Wikidata
- entities_features.tar.gz containing the visual features of the reference images for all entities

Source Code

The source code to reproduce our results, as well as download scripts to crawl the news texts and images, can be found on our GitHub page: https://github.com/TIBHannover/cross-modal_entity_consistency
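The description says each archive contains a .jsonl file, so the records can probably be read straight from the .tar.gz without unpacking it first. The following is a minimal sketch using only the Python standard library. The function name iter_jsonl_records is hypothetical, and the sketch assumes one UTF-8 JSON object per line. It makes no assumption about field names, which should be checked against the files themselves or the GitHub repository.

```python
import json
import tarfile
from typing import Iterator


def iter_jsonl_records(archive_path: str) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of every
    .jsonl member found in a gzip-compressed tar archive.

    Hypothetical helper: assumes UTF-8 text and one JSON object per line.
    """
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            # Skip directories and anything that is not a .jsonl file.
            if not (member.isfile() and member.name.endswith(".jsonl")):
                continue
            handle = archive.extractfile(member)
            if handle is None:
                continue
            for raw_line in handle:
                line = raw_line.decode("utf-8").strip()
                if line:
                    yield json.loads(line)
```

For example, calling list(iter_jsonl_records("news400.tar.gz")) would collect all records, where the file name is an assumption that follows the *dataset*.tar.gz pattern above. Streaming the lines this way avoids writing the extracted file to disk.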
