Multimodal analytics for real-world news using measures of cross-modal entity consistency

Publication: Contribution to book/report/anthology/conference proceedings - Conference paper - Research - Peer-reviewed

Authors

  • Eric Müller-Budack
  • Jonas Theiner
  • Sebastian Diering
  • Maximilian Idahl
  • Ralph Ewerth

External organisations

  • Technische Informationsbibliothek (TIB) Leibniz-Informationszentrum Technik und Naturwissenschaften und Universitätsbibliothek

Details

Original language: English
Title of host publication: ICMR 2020
Subtitle: Proceedings of the 2020 International Conference on Multimedia Retrieval
Place of publication: New York
Pages: 16-25
Number of pages: 10
ISBN (electronic): 9781450370875
Publication status: Published - 8 June 2020
Event: 10th ACM International Conference on Multimedia Retrieval, ICMR 2020 - Dublin, Ireland
Duration: 8 June 2020 - 11 June 2020

Abstract

The World Wide Web has become a popular source for gathering information and news. Multimodal information, e.g., enriching text with photos, is typically used to convey the news more effectively or to attract attention. The photos can be decorative, depict additional details, or even contain misleading information. Quantifying the cross-modal consistency of entity representations can assist human assessors in evaluating the overall multimodal message. In some cases such measures might give hints to detect fake news, which is an increasingly important topic in today's society. In this paper, we present a multimodal approach to quantify the entity coherence between image and text in real-world news. Named entity linking is applied to extract persons, locations, and events from news texts. Several measures are suggested to calculate the cross-modal similarity of these entities with the news photo, using state-of-the-art computer vision approaches. In contrast to previous work, our system automatically gathers example data from the Web and is applicable to real-world news. The feasibility is demonstrated on two novel datasets that cover different languages, topics, and domains.
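To make the kind of measure described in the abstract more concrete, below is a minimal Python sketch of a cross-modal consistency score for person entities. It assumes face embeddings have already been extracted from the news photo and from reference images gathered via a Web image search for each linked person; the embedding model, entity linker, and aggregation scheme are hypothetical placeholders, not the paper's actual implementation.

import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def person_consistency(photo_face_embeddings, reference_face_embeddings):
    # Cross-modal similarity of one person entity with the news photo:
    # the best match between any face detected in the photo and any
    # reference face gathered from the Web for that person.
    if not photo_face_embeddings or not reference_face_embeddings:
        return 0.0
    return max(cosine_similarity(f, r)
               for f in photo_face_embeddings
               for r in reference_face_embeddings)

def document_person_consistency(per_entity_scores):
    # One plausible document-level aggregation (an assumption): the mean
    # over all person entities linked in the news text, 0.0 if none.
    return float(np.mean(per_entity_scores)) if per_entity_scores else 0.0

For example, if two persons mentioned in the text reach best cross-modal matches of 0.9 and 0.2, the document-level score of 0.55 would hint that one textual entity is not visually supported by the photo; the published system defines its own measures and aggregation, which this sketch does not reproduce.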


Cite

Multimodal analytics for real-world news using measures of cross-modal entity consistency. / Müller-Budack, Eric; Theiner, Jonas; Diering, Sebastian et al.
ICMR 2020: Proceedings of the 2020 International Conference on Multimedia Retrieval. New York, 2020. pp. 16-25.


Müller-Budack, E, Theiner, J, Diering, S, Idahl, M & Ewerth, R 2020, Multimodal analytics for real-world news using measures of cross-modal entity consistency. in ICMR 2020: Proceedings of the 2020 International Conference on Multimedia Retrieval. New York, pp. 16-25, 10th ACM International Conference on Multimedia Retrieval, ICMR 2020, Dublin, Ireland, 8 June 2020. https://doi.org/10.1145/3372278.3390670
Müller-Budack, E., Theiner, J., Diering, S., Idahl, M., & Ewerth, R. (2020). Multimodal analytics for real-world news using measures of cross-modal entity consistency. In ICMR 2020: Proceedings of the 2020 International Conference on Multimedia Retrieval (pp. 16-25). https://doi.org/10.1145/3372278.3390670
Müller-Budack E, Theiner J, Diering S, Idahl M, Ewerth R. Multimodal analytics for real-world news using measures of cross-modal entity consistency. In: ICMR 2020: Proceedings of the 2020 International Conference on Multimedia Retrieval. New York. 2020. pp. 16-25. doi: 10.1145/3372278.3390670
Müller-Budack, Eric ; Theiner, Jonas ; Diering, Sebastian et al. / Multimodal analytics for real-world news using measures of cross-modal entity consistency. ICMR 2020: Proceedings of the 2020 International Conference on Multimedia Retrieval. New York, 2020. pp. 16-25.
BibTeX
@inproceedings{459f31b0e9914e84a4bf1f175e2b900f,
title = "Multimodal analytics for real-world news using measures of cross-modal entity consistency",
abstract = "The World Wide Web has become a popular source for gathering information and news. Multimodal information, e.g., enriching text with photos, is typically used to convey the news more effectively or to attract attention. The photos can be decorative, depict additional details, or even contain misleading information. Quantifying the cross-modal consistency of entity representations can assist human assessors in evaluating the overall multimodal message. In some cases such measures might give hints to detect fake news, which is an increasingly important topic in today's society. In this paper, we present a multimodal approach to quantify the entity coherence between image and text in real-world news. Named entity linking is applied to extract persons, locations, and events from news texts. Several measures are suggested to calculate the cross-modal similarity of these entities with the news photo, using state-of-the-art computer vision approaches. In contrast to previous work, our system automatically gathers example data from the Web and is applicable to real-world news. The feasibility is demonstrated on two novel datasets that cover different languages, topics, and domains.",
keywords = "Cross-modal consistency, Cross-modal entity verification, Deep learning, Image repurposing detection, Multimodal retrieval",
author = "Eric M{\"u}ller-Budack and Jonas Theiner and Sebastian Diering and Maximilian Idahl and Ralph Ewerth",
note = "Funding information: This project has received funding from the European Union{\textquoteright}s Horizon 2020 research and innovation programme under the Marie Sk{\l}odowska-Curie grant agreement no 812997, and the German Research Foundation (DFG: Deutsche Forschungsgemeinschaft, project number: 388420599). We are very grateful to Avishek Anand (L3S Research Center, Leibniz Universit{\"a}t Hannover) for his valuable comments that improved the quality of the paper.; 10th ACM International Conference on Multimedia Retrieval, ICMR 2020 ; Conference date: 08-06-2020 Through 11-06-2020",
year = "2020",
month = jun,
day = "8",
doi = "10.1145/3372278.3390670",
language = "English",
pages = "16--25",
booktitle = "ICMR 2020",

}

RIS

TY - GEN

T1 - Multimodal analytics for real-world news using measures of cross-modal entity consistency

AU - Müller-Budack, Eric

AU - Theiner, Jonas

AU - Diering, Sebastian

AU - Idahl, Maximilian

AU - Ewerth, Ralph

N1 - Funding information: This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no 812997, and the German Research Foundation (DFG: Deutsche Forschungsgemeinschaft, project number: 388420599). We are very grateful to Avishek Anand (L3S Research Center, Leibniz Universität Hannover) for his valuable comments that improved the quality of the paper.

PY - 2020/6/8

Y1 - 2020/6/8

N2 - The World Wide Web has become a popular source for gathering information and news. Multimodal information, e.g., enriching text with photos, is typically used to convey the news more effectively or to attract attention. The photos can be decorative, depict additional details, or even contain misleading information. Quantifying the cross-modal consistency of entity representations can assist human assessors in evaluating the overall multimodal message. In some cases such measures might give hints to detect fake news, which is an increasingly important topic in today's society. In this paper, we present a multimodal approach to quantify the entity coherence between image and text in real-world news. Named entity linking is applied to extract persons, locations, and events from news texts. Several measures are suggested to calculate the cross-modal similarity of these entities with the news photo, using state-of-the-art computer vision approaches. In contrast to previous work, our system automatically gathers example data from the Web and is applicable to real-world news. The feasibility is demonstrated on two novel datasets that cover different languages, topics, and domains.

AB - The World Wide Web has become a popular source for gathering information and news. Multimodal information, e.g., enriching text with photos, is typically used to convey the news more effectively or to attract attention. The photos can be decorative, depict additional details, or even contain misleading information. Quantifying the cross-modal consistency of entity representations can assist human assessors in evaluating the overall multimodal message. In some cases such measures might give hints to detect fake news, which is an increasingly important topic in today's society. In this paper, we present a multimodal approach to quantify the entity coherence between image and text in real-world news. Named entity linking is applied to extract persons, locations, and events from news texts. Several measures are suggested to calculate the cross-modal similarity of these entities with the news photo, using state-of-the-art computer vision approaches. In contrast to previous work, our system automatically gathers example data from the Web and is applicable to real-world news. The feasibility is demonstrated on two novel datasets that cover different languages, topics, and domains.

KW - Cross-modal consistency

KW - Cross-modal entity verification

KW - Deep learning

KW - Image repurposing detection

KW - Multimodal retrieval

UR - http://www.scopus.com/inward/record.url?scp=85086904454&partnerID=8YFLogxK

U2 - 10.1145/3372278.3390670

DO - 10.1145/3372278.3390670

M3 - Conference contribution

AN - SCOPUS:85086904454

SP - 16

EP - 25

BT - ICMR 2020

CY - New York

T2 - 10th ACM International Conference on Multimedia Retrieval, ICMR 2020

Y2 - 8 June 2020 through 11 June 2020

ER -