A feature analysis for multimodal news retrieval

Research output: Chapter in book/report/conference proceeding; Conference contribution; Research; peer review

Authors

  • Golsa Tahmasebzadeh
  • Sherzod Hakimov
  • Eric Müller-Budack
  • Ralph Ewerth

External Research Organisations

  • German National Library of Science and Technology (TIB)

Details

Original language: English
Title of host publication: Cross-lingual Event-centric Open Analytics
Subtitle of host publication: Proceedings of the 1st International Workshop on Cross-lingual Event-centric Open Analytics co-located with the 17th Extended Semantic Web Conference (ESWC 2020)
Pages: 43-56
Number of pages: 14
Publication status: Published - 2020
Externally published: Yes
Event: 1st International Workshop on Cross-Lingual Event-Centric Open Analytics, CLEOPATRA 2020 - Heraklion, Crete, Greece
Duration: 3 Jun 2020 → …

Publication series

Name: CEUR Workshop Proceedings
Publisher: CEUR WS
Volume: 2611
ISSN (Print): 1613-0073

Abstract

Content-based information retrieval is based on the information contained in documents rather than using metadata such as keywords. Most information retrieval methods are either based on text or image. In this paper, we investigate the usefulness of multimodal features for cross-lingual news search in various domains: politics, health, environment, sport, and finance. To this end, we consider five feature types for image and text and compare the performance of the retrieval system using different combinations. Experimental results show that retrieval results can be improved when considering both visual and textual information. In addition, it is observed that among textual features entity overlap outperforms word embeddings, while geolocation embeddings achieve better performance among visual features in the retrieval task.
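
The abstract does not state how the features are scored or combined, so the following is only a minimal sketch of one plausible setup, not the authors' method. It assumes that entity overlap is a Jaccard overlap of named-entity sets, that word and geolocation embeddings are compared by cosine similarity, and that the per-feature scores are merged by a weighted average. All function names, dictionary keys, and weights are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def entity_overlap(query_entities: Set[str], doc_entities: Set[str]) -> float:
    """Jaccard overlap of two named-entity sets (an assumed definition)."""
    union = query_entities | doc_entities
    if not union:
        return 0.0
    return len(query_entities & doc_entities) / len(union)


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def fused_score(query: dict, doc: dict, weights: Dict[str, float]) -> float:
    """Weighted average of the per-feature similarities named in `weights`.

    Leaving a feature out of `weights` drops it from the combination, which
    makes it easy to compare text-only, image-only, and multimodal setups.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    scores = {
        # Textual features (hypothetical keys).
        "entities": entity_overlap(query["entities"], doc["entities"]),
        "text_emb": cosine(query["text_emb"], doc["text_emb"]),
        # Visual feature: an embedding predicted from the article image.
        "geo_emb": cosine(query["geo_emb"], doc["geo_emb"]),
    }
    return sum(weights[name] * scores[name] for name in weights) / total


def rank(query: dict, docs: Dict[str, dict],
         weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, fused_score(query, doc, weights))
              for doc_id, doc in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Calling rank with weights such as {"entities": 1.0, "geo_emb": 1.0} scores a combination of one textual and one visual feature, while {"text_emb": 1.0} alone gives a text-only baseline under the same assumptions.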

Keywords

    Computer Vision, Multimodal Features, Multimodal News Retrieval, Natural Language Processing

Cite this

A feature analysis for multimodal news retrieval. / Tahmasebzadeh, Golsa; Hakimov, Sherzod; Müller-Budack, Eric et al.
Cross-lingual Event-centric Open Analytics: Proceedings of the 1st International Workshop on Cross-lingual Event-centric Open Analytics co-located with the 17th Extended Semantic Web Conference (ESWC 2020). 2020. p. 43-56 (CEUR Workshop Proceedings; Vol. 2611).

Tahmasebzadeh, G, Hakimov, S, Müller-Budack, E & Ewerth, R 2020, A feature analysis for multimodal news retrieval. in Cross-lingual Event-centric Open Analytics: Proceedings of the 1st International Workshop on Cross-lingual Event-centric Open Analytics co-located with the 17th Extended Semantic Web Conference (ESWC 2020). CEUR Workshop Proceedings, vol. 2611, pp. 43-56, 1st International Workshop on Cross-Lingual Event-Centric Open Analytics, CLEOPATRA 2020, Heraklion, Crete, Greece, 3 Jun 2020. https://doi.org/10.48550/arXiv.2007.06390
Tahmasebzadeh, G., Hakimov, S., Müller-Budack, E., & Ewerth, R. (2020). A feature analysis for multimodal news retrieval. In Cross-lingual Event-centric Open Analytics: Proceedings of the 1st International Workshop on Cross-lingual Event-centric Open Analytics co-located with the 17th Extended Semantic Web Conference (ESWC 2020) (pp. 43-56). (CEUR Workshop Proceedings; Vol. 2611). https://doi.org/10.48550/arXiv.2007.06390
Tahmasebzadeh G, Hakimov S, Müller-Budack E, Ewerth R. A feature analysis for multimodal news retrieval. In Cross-lingual Event-centric Open Analytics: Proceedings of the 1st International Workshop on Cross-lingual Event-centric Open Analytics co-located with the 17th Extended Semantic Web Conference (ESWC 2020). 2020. p. 43-56. (CEUR Workshop Proceedings). doi: 10.48550/arXiv.2007.06390
Tahmasebzadeh, Golsa ; Hakimov, Sherzod ; Müller-Budack, Eric et al. / A feature analysis for multimodal news retrieval. Cross-lingual Event-centric Open Analytics: Proceedings of the 1st International Workshop on Cross-lingual Event-centric Open Analytics co-located with the 17th Extended Semantic Web Conference (ESWC 2020). 2020. pp. 43-56 (CEUR Workshop Proceedings).
@inproceedings{dfab0fba49684055837473d2edc19405,
title = "A feature analysis for multimodal news retrieval",
abstract = "Content-based information retrieval is based on the information contained in documents rather than using metadata such as keywords. Most information retrieval methods are either based on text or image. In this paper, we investigate the usefulness of multimodal features for cross-lingual news search in various domains: politics, health, environment, sport, and finance. To this end, we consider five feature types for image and text and compare the performance of the retrieval system using different combinations. Experimental results show that retrieval results can be improved when considering both visual and textual information. In addition, it is observed that among textual features entity overlap outperforms word embeddings, while geolocation embeddings achieve better performance among visual features in the retrieval task.",
keywords = "Computer Vision, Multimodal Features, Multimodal News Retrieval, Natural Language Processing",
author = "Golsa Tahmasebzadeh and Sherzod Hakimov and Eric M{\"u}ller-Budack and Ralph Ewerth",
note = "Funding information: This project has received funding from the European Union{\textquoteright}s Horizon 2020 research and innovation programme under the Marie Sk{\l}odowska-Curie grant agreement no 812997.; 1st International Workshop on Cross-Lingual Event-Centric Open Analytics, CLEOPATRA 2020 ; Conference date: 03-06-2020",
year = "2020",
doi = "10.48550/arXiv.2007.06390",
language = "English",
series = "CEUR Workshop Proceedings",
publisher = "CEUR WS",
pages = "43--56",
booktitle = "Cross-lingual Event-centric Open Analytics",

}

TY - GEN

T1 - A feature analysis for multimodal news retrieval

AU - Tahmasebzadeh, Golsa

AU - Hakimov, Sherzod

AU - Müller-Budack, Eric

AU - Ewerth, Ralph

N1 - Funding information: This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no 812997.

PY - 2020

Y1 - 2020

N2 - Content-based information retrieval is based on the information contained in documents rather than using metadata such as keywords. Most information retrieval methods are either based on text or image. In this paper, we investigate the usefulness of multimodal features for cross-lingual news search in various domains: politics, health, environment, sport, and finance. To this end, we consider five feature types for image and text and compare the performance of the retrieval system using different combinations. Experimental results show that retrieval results can be improved when considering both visual and textual information. In addition, it is observed that among textual features entity overlap outperforms word embeddings, while geolocation embeddings achieve better performance among visual features in the retrieval task.

KW - Computer Vision

KW - Multimodal Features

KW - Multimodal News Retrieval

KW - Natural Language Processing

UR - http://www.scopus.com/inward/record.url?scp=85091095061&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2007.06390

DO - 10.48550/arXiv.2007.06390

M3 - Conference contribution

AN - SCOPUS:85091095061

T3 - CEUR Workshop Proceedings

SP - 43

EP - 56

BT - Cross-lingual Event-centric Open Analytics

T2 - 1st International Workshop on Cross-Lingual Event-Centric Open Analytics, CLEOPATRA 2020

Y2 - 3 June 2020

ER -