A feature analysis for multimodal news retrieval

Golsa Tahmasebzadeh; Sherzod Hakimov; Eric Müller-Budack; Ralph Ewerth

doi:10.48550/arXiv.2007.06390

Details

Original language	English
Title of host publication	Cross-lingual Event-centric Open Analytics
Subtitle of host publication	Proceedings of the 1st International Workshop on Cross-lingual Event-centric Open Analytics co-located with the 17th Extended Semantic Web Conference (ESWC 2020)
Pages	43-56
Number of pages	14
Publication status	Published - 2020
Externally published	Yes
Event	1st International Workshop on Cross-Lingual Event-Centric Open Analytics, CLEOPATRA 2020 - Heraklion, Crete, Greece; Duration: 3 Jun 2020 → …

Publication series

Name	CEUR Workshop Proceedings
Publisher	CEUR WS
Volume	2611
ISSN (Print)	1613-0073

Abstract

Content-based information retrieval is based on the information contained in documents rather than using metadata such as keywords. Most information retrieval methods are either based on text or image. In this paper, we investigate the usefulness of multimodal features for cross-lingual news search in various domains: politics, health, environment, sport, and finance. To this end, we consider five feature types for image and text and compare the performance of the retrieval system using different combinations. Experimental results show that retrieval results can be improved when considering both visual and textual information. In addition, it is observed that among textual features entity overlap outperforms word embeddings, while geolocation embeddings achieve better performance among visual features in the retrieval task.

Keywords

Computer Vision, Multimodal Features, Multimodal News Retrieval, Natural Language Processing

ASJC Scopus subject areas

Computer Science(all)
General Computer Science

Cite this

A feature analysis for multimodal news retrieval. / Tahmasebzadeh, Golsa; Hakimov, Sherzod; Müller-Budack, Eric et al.
Cross-lingual Event-centric Open Analytics: Proceedings of the 1st International Workshop on Cross-lingual Event-centric Open Analytics co-located with the 17th Extended Semantic Web Conference (ESWC 2020). 2020. p. 43-56 (CEUR Workshop Proceedings; Vol. 2611).

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Tahmasebzadeh, G, Hakimov, S, Müller-Budack, E & Ewerth, R 2020, A feature analysis for multimodal news retrieval. in Cross-lingual Event-centric Open Analytics: Proceedings of the 1st International Workshop on Cross-lingual Event-centric Open Analytics co-located with the 17th Extended Semantic Web Conference (ESWC 2020). CEUR Workshop Proceedings, vol. 2611, pp. 43-56, 1st International Workshop on Cross-Lingual Event-Centric Open Analytics, CLEOPATRA 2020, Heraklion, Crete, Greece, 3 Jun 2020. https://doi.org/10.48550/arXiv.2007.06390

Tahmasebzadeh, G., Hakimov, S., Müller-Budack, E., & Ewerth, R. (2020). A feature analysis for multimodal news retrieval. In Cross-lingual Event-centric Open Analytics: Proceedings of the 1st International Workshop on Cross-lingual Event-centric Open Analytics co-located with the 17th Extended Semantic Web Conference (ESWC 2020) (pp. 43-56). (CEUR Workshop Proceedings; Vol. 2611). https://doi.org/10.48550/arXiv.2007.06390

Tahmasebzadeh G, Hakimov S, Müller-Budack E, Ewerth R. A feature analysis for multimodal news retrieval. In Cross-lingual Event-centric Open Analytics: Proceedings of the 1st International Workshop on Cross-lingual Event-centric Open Analytics co-located with the 17th Extended Semantic Web Conference (ESWC 2020). 2020. p. 43-56. (CEUR Workshop Proceedings). doi: 10.48550/arXiv.2007.06390

Tahmasebzadeh, Golsa ; Hakimov, Sherzod ; Müller-Budack, Eric et al. / A feature analysis for multimodal news retrieval. Cross-lingual Event-centric Open Analytics: Proceedings of the 1st International Workshop on Cross-lingual Event-centric Open Analytics co-located with the 17th Extended Semantic Web Conference (ESWC 2020). 2020. pp. 43-56 (CEUR Workshop Proceedings).

Download

@inproceedings{dfab0fba49684055837473d2edc19405,

title = "A feature analysis for multimodal news retrieval",

abstract = "Content-based information retrieval is based on the information contained in documents rather than using metadata such as keywords. Most information retrieval methods are either based on text or image. In this paper, we investigate the usefulness of multimodal features for cross-lingual news search in various domains: politics, health, environment, sport, and finance. To this end, we consider five feature types for image and text and compare the performance of the retrieval system using different combinations. Experimental results show that retrieval results can be improved when considering both visual and textual information. In addition, it is observed that among textual features entity overlap outperforms word embeddings, while geolocation embeddings achieve better performance among visual features in the retrieval task.",

keywords = "Computer Vision, Multimodal Features, Multimodal News Retrieval, Natural Language Processing",

author = "Golsa Tahmasebzadeh and Sherzod Hakimov and Eric M{\"u}ller-Budack and Ralph Ewerth",

note = "Funding information: This project has received funding from the European Union{\textquoteright}s Horizon 2020 research and innovation programme under the Marie Sk{\l}odowska-Curie grant agreement no 812997.; 1st International Workshop on Cross-Lingual Event-Centric Open Analytics, CLEOPATRA 2020 ; Conference date: 03-06-2020",

year = "2020",

doi = "10.48550/arXiv.2007.06390",

language = "English",

series = "CEUR Workshop Proceedings",

publisher = "CEUR WS",

pages = "43--56",

booktitle = "Cross-lingual Event-centric Open Analytics",

}

Download

TY - GEN

T1 - A feature analysis for multimodal news retrieval

AU - Tahmasebzadeh, Golsa

AU - Hakimov, Sherzod

AU - Müller-Budack, Eric

AU - Ewerth, Ralph

N1 - Funding information: This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no 812997.

PY - 2020

Y1 - 2020

N2 - Content-based information retrieval is based on the information contained in documents rather than using metadata such as keywords. Most information retrieval methods are either based on text or image. In this paper, we investigate the usefulness of multimodal features for cross-lingual news search in various domains: politics, health, environment, sport, and finance. To this end, we consider five feature types for image and text and compare the performance of the retrieval system using different combinations. Experimental results show that retrieval results can be improved when considering both visual and textual information. In addition, it is observed that among textual features entity overlap outperforms word embeddings, while geolocation embeddings achieve better performance among visual features in the retrieval task.

AB - Content-based information retrieval is based on the information contained in documents rather than using metadata such as keywords. Most information retrieval methods are either based on text or image. In this paper, we investigate the usefulness of multimodal features for cross-lingual news search in various domains: politics, health, environment, sport, and finance. To this end, we consider five feature types for image and text and compare the performance of the retrieval system using different combinations. Experimental results show that retrieval results can be improved when considering both visual and textual information. In addition, it is observed that among textual features entity overlap outperforms word embeddings, while geolocation embeddings achieve better performance among visual features in the retrieval task.

KW - Computer Vision

KW - Multimodal Features

KW - Multimodal News Retrieval

KW - Natural Language Processing

UR - http://www.scopus.com/inward/record.url?scp=85091095061&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2007.06390

DO - 10.48550/arXiv.2007.06390

M3 - Conference contribution

AN - SCOPUS:85091095061

T3 - CEUR Workshop Proceedings

SP - 43

EP - 56

BT - Cross-lingual Event-centric Open Analytics

T2 - 1st International Workshop on Cross-Lingual Event-Centric Open Analytics, CLEOPATRA 2020

Y2 - 3 June 2020

ER -
