Few-Shot Event Classification in Images using Knowledge Graphs for Prompting

Publication: Contribution to book/report/anthology/conference proceedings › Conference paper › Research › Peer reviewed

Authors

  • Golsa Tahmasebzadeh
  • Matthias Springstein
  • Ralph Ewerth
  • Eric Müller-Budack

Organisational units

External organisations

  • Technische Informationsbibliothek (TIB) Leibniz-Informationszentrum Technik und Naturwissenschaften und Universitätsbibliothek

Details

Original language: English
Title of host publication: IEEE Winter Conference on Applications of Computer Vision
Subtitle: WACV 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 7271-7280
Number of pages: 10
ISBN (electronic): 979-8-3503-1892-0
ISBN (print): 979-8-3503-1893-7
Publication status: Published - 2024
Event: IEEE/CVF Winter Conference on Applications of Computer Vision 2024 - Waikoloa, United States
Duration: 3 Jan 2024 - 8 Jan 2024

Abstract

Event classification in images plays a vital role in multimedia analysis, especially with the prevalence of fake news on social media and the Web. The majority of approaches for event classification rely on large sets of labeled training data. However, image labels for fine-grained event instances (e.g., 2016 Summer Olympics) can be sparse, incorrect, ambiguous, etc. A few approaches have addressed the lack of labeled data for event classification but cover only a few events. Moreover, vision-language models that allow for zero-shot and few-shot classification with prompting have not yet been extensively exploited. In this paper, we propose four different techniques to create hard prompts that include knowledge graph information from Wikidata and Wikipedia, as well as an ensemble approach for zero-shot event classification. We also integrate prompt learning for state-of-the-art vision-language models to address few-shot event classification. Experimental results on six benchmarks, including a new dataset comprising event instances from various domains such as politics and natural disasters, show that our proposed approaches require far fewer training images than supervised baselines and the state of the art while achieving better results.
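
The approach described above follows a recognizable recipe: write textual "hard prompts" for each event class, enrich them with knowledge-graph context (e.g., a short Wikidata description), encode prompts and images with a vision-language model, and ensemble several prompt variants for zero-shot classification. The Python sketch below illustrates that general idea using CLIP via the Hugging Face transformers library; it is not the authors' implementation, and the event labels, descriptions, prompt templates, and image path are illustrative assumptions.

# Minimal sketch: zero-shot event classification with knowledge-graph-enriched
# hard prompts and a simple prompt ensemble (general idea only, NOT the paper's code).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Hypothetical event classes with short Wikidata-style descriptions.
events = {
    "2016 Summer Olympics": "multi-sport event held in Rio de Janeiro, Brazil",
    "Hurricane Katrina": "tropical cyclone that struck the US Gulf Coast in 2005",
}

# Several hard-prompt templates per class; their text embeddings are averaged
# to form a simple prompt ensemble.
templates = [
    "a photo of the event {name}.",
    "a photo of {name}, {description}.",
]

class_embeddings = []
for name, description in events.items():
    prompts = [t.format(name=name, description=description) for t in templates]
    text_inputs = processor(text=prompts, return_tensors="pt", padding=True)
    with torch.no_grad():
        text_feat = model.get_text_features(**text_inputs)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
    class_embeddings.append(text_feat.mean(dim=0))  # ensemble over templates
class_embeddings = torch.stack(class_embeddings)
class_embeddings = class_embeddings / class_embeddings.norm(dim=-1, keepdim=True)

# Classify one image by cosine similarity to the ensembled class prompts.
image = Image.open("example_event_image.jpg")  # hypothetical image path
image_inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    image_feat = model.get_image_features(**image_inputs)
image_feat = image_feat / image_feat.norm(dim=-1, keepdim=True)

probs = (100.0 * image_feat @ class_embeddings.T).softmax(dim=-1)
for (name, _), p in zip(events.items(), probs[0].tolist()):
    print(f"{name}: {p:.3f}")

For the few-shot setting, the abstract refers to prompt learning, i.e., replacing such hand-written templates with learnable prompt vectors tuned on a handful of labeled images per class; that part is not shown in the sketch.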

ASJC Scopus subject areas

Cite this

Few-Shot Event Classification in Images using Knowledge Graphs for Prompting. / Tahmasebzadeh, Golsa; Springstein, Matthias; Ewerth, Ralph et al.
IEEE Winter Conference on Applications of Computer Vision: WACV 2024. Institute of Electrical and Electronics Engineers Inc., 2024. pp. 7271-7280.


Tahmasebzadeh, G, Springstein, M, Ewerth, R & Müller-Budack, E 2024, Few-Shot Event Classification in Images using Knowledge Graphs for Prompting. in IEEE Winter Conference on Applications of Computer Vision: WACV 2024. Institute of Electrical and Electronics Engineers Inc., pp. 7271-7280, IEEE/CVF Winter Conference on Applications of Computer Vision 2024, Waikoloa, United States, 3 Jan 2024. https://doi.org/10.1109/WACV57701.2024.00712
Tahmasebzadeh, G., Springstein, M., Ewerth, R., & Müller-Budack, E. (2024). Few-Shot Event Classification in Images using Knowledge Graphs for Prompting. In IEEE Winter Conference on Applications of Computer Vision: WACV 2024 (pp. 7271-7280). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/WACV57701.2024.00712
Tahmasebzadeh G, Springstein M, Ewerth R, Müller-Budack E. Few-Shot Event Classification in Images using Knowledge Graphs for Prompting. In IEEE Winter Conference on Applications of Computer Vision: WACV 2024. Institute of Electrical and Electronics Engineers Inc. 2024. p. 7271-7280. doi: 10.1109/WACV57701.2024.00712
Tahmasebzadeh, Golsa ; Springstein, Matthias ; Ewerth, Ralph et al. / Few-Shot Event Classification in Images using Knowledge Graphs for Prompting. IEEE Winter Conference on Applications of Computer Vision: WACV 2024. Institute of Electrical and Electronics Engineers Inc., 2024. pp. 7271-7280
BibTeX
@inproceedings{6f9fed4e2b8a4006bdf5a6001e432cc4,
title = "Few-Shot Event Classification in Images using Knowledge Graphs for Prompting",
abstract = "Event classification in images plays a vital role in multimedia analysis especially with the prevalence of fake news on social media and the Web. The majority of approaches for event classification rely on large sets of labeled training data. However, image labels for fine-grained event instances (e.g., 2016 Summer Olympics) can be sparse, incorrect, ambiguous, etc. A few approaches have addressed the lack of labeled data for event classification but cover only few events. Moreover, vision-language models that allow for zero-shot and few-shot classification with prompting have not yet been extensively exploited. In this paper, we propose four different techniques to create hard prompts including knowledge graph information from Wikidata and Wikipedia as well as an ensemble approach for zero-shot event classification. We also integrate prompt learning for state-of-the-art vision-language models to address few-shot event classification. Experimental results on six benchmarks including a new dataset comprising event instances from various domains, such as politics and natural disasters, show that our proposed approaches require much fewer training images than supervised baselines and the state-of-the-art while achieving better results.",
keywords = "Algorithms, Applications, Arts / games / social media, Vision + language and/or other modalities",
author = "Golsa Tahmasebzadeh and Matthias Springstein and Ralph Ewerth and Eric M{\"u}ller-Budack",
note = "Publisher Copyright: {\textcopyright} 2024 IEEE.; IEEE/CVF Winter Conference on Applications of Computer Vision 2024, WACV ; Conference date: 03-01-2024 Through 08-01-2024",
year = "2024",
doi = "10.1109/WACV57701.2024.00712",
language = "English",
isbn = "979-8-3503-1893-7",
pages = "7271--7280",
booktitle = "IEEE Winter Conference on Applications of Computer Vision",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

RIS

TY - GEN

T1 - Few-Shot Event Classification in Images using Knowledge Graphs for Prompting

AU - Tahmasebzadeh, Golsa

AU - Springstein, Matthias

AU - Ewerth, Ralph

AU - Müller-Budack, Eric

N1 - Publisher Copyright: © 2024 IEEE.

PY - 2024

Y1 - 2024

N2 - Event classification in images plays a vital role in multimedia analysis especially with the prevalence of fake news on social media and the Web. The majority of approaches for event classification rely on large sets of labeled training data. However, image labels for fine-grained event instances (e.g., 2016 Summer Olympics) can be sparse, incorrect, ambiguous, etc. A few approaches have addressed the lack of labeled data for event classification but cover only few events. Moreover, vision-language models that allow for zero-shot and few-shot classification with prompting have not yet been extensively exploited. In this paper, we propose four different techniques to create hard prompts including knowledge graph information from Wikidata and Wikipedia as well as an ensemble approach for zero-shot event classification. We also integrate prompt learning for state-of-the-art vision-language models to address few-shot event classification. Experimental results on six benchmarks including a new dataset comprising event instances from various domains, such as politics and natural disasters, show that our proposed approaches require much fewer training images than supervised baselines and the state-of-the-art while achieving better results.

AB - Event classification in images plays a vital role in multimedia analysis especially with the prevalence of fake news on social media and the Web. The majority of approaches for event classification rely on large sets of labeled training data. However, image labels for fine-grained event instances (e.g., 2016 Summer Olympics) can be sparse, incorrect, ambiguous, etc. A few approaches have addressed the lack of labeled data for event classification but cover only few events. Moreover, vision-language models that allow for zero-shot and few-shot classification with prompting have not yet been extensively exploited. In this paper, we propose four different techniques to create hard prompts including knowledge graph information from Wikidata and Wikipedia as well as an ensemble approach for zero-shot event classification. We also integrate prompt learning for state-of-the-art vision-language models to address few-shot event classification. Experimental results on six benchmarks including a new dataset comprising event instances from various domains, such as politics and natural disasters, show that our proposed approaches require much fewer training images than supervised baselines and the state-of-the-art while achieving better results.

KW - Algorithms

KW - Applications

KW - Arts / games / social media

KW - Vision + language and/or other modalities

UR - http://www.scopus.com/inward/record.url?scp=85191986086&partnerID=8YFLogxK

U2 - 10.1109/WACV57701.2024.00712

DO - 10.1109/WACV57701.2024.00712

M3 - Conference contribution

AN - SCOPUS:85191986086

SN - 979-8-3503-1893-7

SP - 7271

EP - 7280

BT - IEEE Winter Conference on Applications of Computer Vision

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - IEEE/CVF Winter Conference on Applications of Computer Vision 2024

Y2 - 3 January 2024 through 8 January 2024

ER -