Few-Shot Event Classification in Images using Knowledge Graphs for Prompting

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › Peer-reviewed

Authors

  • Golsa Tahmasebzadeh
  • Matthias Springstein
  • Ralph Ewerth
  • Eric Müller-Budack

External Research Organisations

  • German National Library of Science and Technology (TIB)

Details

Original language: English
Title of host publication: IEEE Winter Conference on Applications of Computer Vision
Subtitle of host publication: WACV 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 7271-7280
Number of pages: 10
ISBN (electronic): 9798350318920
ISBN (print): 979-8-3503-1893-7
Publication status: Published - 2024
Event: IEEE/CVF Winter Conference on Applications of Computer Vision 2024, Waikoloa, United States
Duration: 3 Jan 2024 - 8 Jan 2024

Abstract

Event classification in images plays a vital role in multimedia analysis, especially given the prevalence of fake news on social media and the Web. Most approaches to event classification rely on large sets of labeled training data. However, image labels for fine-grained event instances (e.g., the 2016 Summer Olympics) can be sparse, incorrect, or ambiguous. A few approaches have addressed the lack of labeled data for event classification, but they cover only a few events. Moreover, vision-language models that allow for zero-shot and few-shot classification with prompting have not yet been extensively exploited. In this paper, we propose four different techniques to create hard prompts that incorporate knowledge graph information from Wikidata and Wikipedia, as well as an ensemble approach for zero-shot event classification. We also integrate prompt learning for state-of-the-art vision-language models to address few-shot event classification. Experimental results on six benchmarks, including a new dataset comprising event instances from various domains such as politics and natural disasters, show that our proposed approaches require far fewer training images than supervised baselines and the state of the art while achieving better results.
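The core zero-shot idea described in the abstract (hard prompts enriched with knowledge-graph facts, ensembled into one embedding per event class) can be sketched as follows. This is an illustrative sketch only, not the paper's implementation: `embed_text` is a deterministic toy stand-in for a real vision-language text encoder (e.g., a CLIP-style model), and the prompt templates, function names, and example facts are hypothetical.

```python
import zlib

import numpy as np


def embed_text(text: str, dim: int = 64) -> np.ndarray:
    """Deterministic stand-in for a vision-language text encoder.

    A seeded random projection keeps the sketch runnable without model
    weights; a real system would use a pretrained encoder instead.
    """
    rng = np.random.default_rng(zlib.crc32(text.encode("utf-8")))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)


def build_hard_prompts(event: str, kg_facts: list[str]) -> list[str]:
    """Hypothetical hard-prompt variants: a plain template plus one
    variant per knowledge-graph fact (e.g., from Wikidata/Wikipedia)."""
    prompts = [f"a photo of the event {event}."]
    prompts += [f"a photo of the event {event}, {fact}." for fact in kg_facts]
    return prompts


def class_embedding(event: str, kg_facts: list[str]) -> np.ndarray:
    """Prompt ensemble: average the embeddings of all prompt variants."""
    embs = np.stack([embed_text(p) for p in build_hard_prompts(event, kg_facts)])
    mean = embs.mean(axis=0)
    return mean / np.linalg.norm(mean)


def zero_shot_classify(image_emb: np.ndarray, class_embs: dict) -> str:
    """Return the class whose ensembled prompt embedding is most similar
    to the image embedding (cosine similarity; all vectors are unit-norm,
    so a dot product suffices)."""
    return max(class_embs, key=lambda c: float(image_emb @ class_embs[c]))
```

In use, one would build a `class_embedding` per candidate event from its knowledge-graph facts, embed the query image with the same vision-language model, and pick the most similar class; no labeled training images are needed for the zero-shot case.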

Keywords

    Algorithms, Applications, Arts / games / social media, Vision + language and/or other modalities

Cite this

Few-Shot Event Classification in Images using Knowledge Graphs for Prompting. / Tahmasebzadeh, Golsa; Springstein, Matthias; Ewerth, Ralph et al.
IEEE Winter Conference on Applications of Computer Vision: WACV 2024. Institute of Electrical and Electronics Engineers Inc., 2024. p. 7271-7280.


Tahmasebzadeh, G, Springstein, M, Ewerth, R & Müller-Budack, E 2024, Few-Shot Event Classification in Images using Knowledge Graphs for Prompting. in IEEE Winter Conference on Applications of Computer Vision: WACV 2024. Institute of Electrical and Electronics Engineers Inc., pp. 7271-7280, IEEE/CVF Winter Conference on Applications of Computer Vision 2024, Waikoloa, United States, 3 Jan 2024. https://doi.org/10.1109/WACV57701.2024.00712
Tahmasebzadeh, G., Springstein, M., Ewerth, R., & Müller-Budack, E. (2024). Few-Shot Event Classification in Images using Knowledge Graphs for Prompting. In IEEE Winter Conference on Applications of Computer Vision: WACV 2024 (pp. 7271-7280). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/WACV57701.2024.00712
Tahmasebzadeh G, Springstein M, Ewerth R, Müller-Budack E. Few-Shot Event Classification in Images using Knowledge Graphs for Prompting. In IEEE Winter Conference on Applications of Computer Vision: WACV 2024. Institute of Electrical and Electronics Engineers Inc. 2024. p. 7271-7280 doi: 10.1109/WACV57701.2024.00712
Tahmasebzadeh, Golsa ; Springstein, Matthias ; Ewerth, Ralph et al. / Few-Shot Event Classification in Images using Knowledge Graphs for Prompting. IEEE Winter Conference on Applications of Computer Vision: WACV 2024. Institute of Electrical and Electronics Engineers Inc., 2024. pp. 7271-7280
BibTeX
@inproceedings{6f9fed4e2b8a4006bdf5a6001e432cc4,
title = "Few-Shot Event Classification in Images using Knowledge Graphs for Prompting",
abstract = "Event classification in images plays a vital role in multimedia analysis especially with the prevalence of fake news on social media and the Web. The majority of approaches for event classification rely on large sets of labeled training data. However, image labels for fine-grained event instances (e.g., 2016 Summer Olympics) can be sparse, incorrect, ambiguous, etc. A few approaches have addressed the lack of labeled data for event classification but cover only few events. Moreover, vision-language models that allow for zero-shot and few-shot classification with prompting have not yet been extensively exploited. In this paper, we propose four different techniques to create hard prompts including knowledge graph information from Wikidata and Wikipedia as well as an ensemble approach for zero-shot event classification. We also integrate prompt learning for state-of-the-art vision-language models to address few-shot event classification. Experimental results on six benchmarks including a new dataset comprising event instances from various domains, such as politics and natural disasters, show that our proposed approaches require much fewer training images than supervised baselines and the state-of-the-art while achieving better results.",
keywords = "Algorithms, Applications, Arts / games / social media, Vision + language and/or other modalities",
author = "Golsa Tahmasebzadeh and Matthias Springstein and Ralph Ewerth and Eric M{\"u}ller-Budack",
note = "Publisher Copyright: {\textcopyright} 2024 IEEE.; IEEE/CVF Winter Conference on Applications of Computer Vision 2024, WACV ; Conference date: 03-01-2024 Through 08-01-2024",
year = "2024",
doi = "10.1109/WACV57701.2024.00712",
language = "English",
isbn = "979-8-3503-1893-7",
pages = "7271--7280",
booktitle = "IEEE Winter Conference on Applications of Computer Vision",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

RIS

TY - GEN
T1 - Few-Shot Event Classification in Images using Knowledge Graphs for Prompting
AU - Tahmasebzadeh, Golsa
AU - Springstein, Matthias
AU - Ewerth, Ralph
AU - Müller-Budack, Eric
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Event classification in images plays a vital role in multimedia analysis especially with the prevalence of fake news on social media and the Web. The majority of approaches for event classification rely on large sets of labeled training data. However, image labels for fine-grained event instances (e.g., 2016 Summer Olympics) can be sparse, incorrect, ambiguous, etc. A few approaches have addressed the lack of labeled data for event classification but cover only few events. Moreover, vision-language models that allow for zero-shot and few-shot classification with prompting have not yet been extensively exploited. In this paper, we propose four different techniques to create hard prompts including knowledge graph information from Wikidata and Wikipedia as well as an ensemble approach for zero-shot event classification. We also integrate prompt learning for state-of-the-art vision-language models to address few-shot event classification. Experimental results on six benchmarks including a new dataset comprising event instances from various domains, such as politics and natural disasters, show that our proposed approaches require much fewer training images than supervised baselines and the state-of-the-art while achieving better results.
AB - Event classification in images plays a vital role in multimedia analysis especially with the prevalence of fake news on social media and the Web. The majority of approaches for event classification rely on large sets of labeled training data. However, image labels for fine-grained event instances (e.g., 2016 Summer Olympics) can be sparse, incorrect, ambiguous, etc. A few approaches have addressed the lack of labeled data for event classification but cover only few events. Moreover, vision-language models that allow for zero-shot and few-shot classification with prompting have not yet been extensively exploited. In this paper, we propose four different techniques to create hard prompts including knowledge graph information from Wikidata and Wikipedia as well as an ensemble approach for zero-shot event classification. We also integrate prompt learning for state-of-the-art vision-language models to address few-shot event classification. Experimental results on six benchmarks including a new dataset comprising event instances from various domains, such as politics and natural disasters, show that our proposed approaches require much fewer training images than supervised baselines and the state-of-the-art while achieving better results.
KW - Algorithms
KW - Applications
KW - Arts / games / social media
KW - Vision + language and/or other modalities
UR - http://www.scopus.com/inward/record.url?scp=85191986086&partnerID=8YFLogxK
U2 - 10.1109/WACV57701.2024.00712
DO - 10.1109/WACV57701.2024.00712
M3 - Conference contribution
AN - SCOPUS:85191986086
SN - 979-8-3503-1893-7
SP - 7271
EP - 7280
BT - IEEE Winter Conference on Applications of Computer Vision
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - IEEE/CVF Winter Conference on Applications of Computer Vision 2024
Y2 - 3 January 2024 through 8 January 2024
ER -