Unsupervised Video Summarization via Multi-source Features

Hussain Kanafani; Junaid Ahmed Ghauri; Sherzod Hakimov; Ralph Ewerth

doi:10.48550/arXiv.2105.12532

Details

Original language	English
Title of host publication	ICMR 2021
Subtitle	Proceedings of the 2021 International Conference on Multimedia Retrieval
Pages	466-470
Number of pages	5
ISBN (electronic)	9781450384636
Publication status	Published - 1 Sep 2021
Event	11th ACM International Conference on Multimedia Retrieval, ICMR 2021 - Taipei, Taiwan. Duration: 16 Nov 2021 → 19 Nov 2021

Publication series

Name	ICMR 2021 - Proceedings of the 2021 International Conference on Multimedia Retrieval

Abstract

Video summarization aims at generating a compact yet representative visual summary that conveys the essence of the original video. The advantage of unsupervised approaches is that they do not require human annotations to learn the summarization capability and generalize to a wider range of domains. Previous work relies on the same type of deep features, typically based on a model pre-trained on ImageNet data. Therefore, we propose to incorporate multiple feature sources with chunk and stride fusion to provide more information about the visual content. For a comprehensive evaluation on the two benchmarks TVSum and SumMe, we compare our method with four state-of-the-art approaches. Two of these approaches were implemented by ourselves to reproduce the reported results. Our evaluation shows that we obtain state-of-the-art results on both datasets while also highlighting the shortcomings of previous work with regard to the evaluation methodology. Finally, we perform error analysis on videos for the two benchmark datasets to summarize and spot the factors that lead to misclassifications.
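The abstract does not spell out how chunk and stride fusion operate, so the Python sketch below is only one plausible reading and not the authors' implementation. It assumes that each feature source yields a per-frame vector of the same length. The function names `chunk_fusion` and `stride_fusion`, and the parameters `chunk_size` and `stride`, are hypothetical. In this reading, chunk fusion interleaves fixed-size chunks taken in turn from each source, and stride fusion keeps every `stride`-th element of each source before concatenating.

```python
from typing import List, Sequence


def _common_length(sources: Sequence[Sequence[float]]) -> int:
    """Return the shared vector length, or raise if the sources disagree."""
    if not sources:
        raise ValueError("at least one feature source is required")
    length = len(sources[0])
    if any(len(s) != length for s in sources):
        raise ValueError("all feature vectors must have the same length")
    return length


def chunk_fusion(sources: Sequence[Sequence[float]], chunk_size: int) -> List[float]:
    """Assumed reading: interleave chunks of `chunk_size` from each source."""
    length = _common_length(sources)
    if chunk_size <= 0 or length % chunk_size != 0:
        raise ValueError("chunk_size must be positive and divide the vector length")
    fused: List[float] = []
    for start in range(0, length, chunk_size):
        for vector in sources:
            # Append the chunk at the same position from every source in turn.
            fused.extend(vector[start:start + chunk_size])
    return fused


def stride_fusion(sources: Sequence[Sequence[float]], stride: int) -> List[float]:
    """Assumed reading: subsample each source with `stride`, then concatenate."""
    _common_length(sources)
    if stride <= 0:
        raise ValueError("stride must be positive")
    fused: List[float] = []
    for vector in sources:
        # Keep elements 0, stride, 2*stride, ... of this source.
        fused.extend(vector[::stride])
    return fused


if __name__ == "__main__":
    # Toy per-frame features from two hypothetical sources.
    source_a = [1.0, 2.0, 3.0, 4.0]
    source_b = [10.0, 20.0, 30.0, 40.0]
    print(chunk_fusion([source_a, source_b], chunk_size=2))
    print(stride_fusion([source_a, source_b], stride=2))
```

Under these assumptions, chunk fusion keeps every value and only changes how the sources are ordered in the fused vector, whereas stride fusion reduces the dimensionality by a factor of `stride`.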

ASJC Scopus subject areas

Computer Science (all) › Computer Graphics and Computer-Aided Design
Computer Science (all) › Computer Science Applications
Computer Science (all) › Computer Vision and Pattern Recognition
Computer Science (all) › Human-Computer Interaction
Computer Science (all) › Software

Cite this

Unsupervised Video Summarization via Multi-source Features. / Kanafani, Hussain; Ghauri, Junaid Ahmed; Hakimov, Sherzod et al.
ICMR 2021: Proceedings of the 2021 International Conference on Multimedia Retrieval. 2021. pp. 466-470 (ICMR 2021 - Proceedings of the 2021 International Conference on Multimedia Retrieval).

Publication: Chapter in book/report/conference proceeding › Conference contribution › Research › Peer review

Kanafani, H, Ghauri, JA, Hakimov, S & Ewerth, R 2021, Unsupervised Video Summarization via Multi-source Features. in ICMR 2021: Proceedings of the 2021 International Conference on Multimedia Retrieval. ICMR 2021 - Proceedings of the 2021 International Conference on Multimedia Retrieval, pp. 466-470, 11th ACM International Conference on Multimedia Retrieval, ICMR 2021, Taipei, Taiwan, 16 Nov. 2021. https://doi.org/10.48550/arXiv.2105.12532, https://doi.org/10.1145/3460426.3463597

Kanafani, H., Ghauri, J. A., Hakimov, S., & Ewerth, R. (2021). Unsupervised Video Summarization via Multi-source Features. In ICMR 2021: Proceedings of the 2021 International Conference on Multimedia Retrieval (pp. 466-470). (ICMR 2021 - Proceedings of the 2021 International Conference on Multimedia Retrieval). https://doi.org/10.48550/arXiv.2105.12532, https://doi.org/10.1145/3460426.3463597

Kanafani H, Ghauri JA, Hakimov S, Ewerth R. Unsupervised Video Summarization via Multi-source Features. in ICMR 2021: Proceedings of the 2021 International Conference on Multimedia Retrieval. 2021. pp. 466-470. (ICMR 2021 - Proceedings of the 2021 International Conference on Multimedia Retrieval). doi: 10.48550/arXiv.2105.12532, 10.1145/3460426.3463597

Kanafani, Hussain ; Ghauri, Junaid Ahmed ; Hakimov, Sherzod et al. / Unsupervised Video Summarization via Multi-source Features. ICMR 2021: Proceedings of the 2021 International Conference on Multimedia Retrieval. 2021. pp. 466-470 (ICMR 2021 - Proceedings of the 2021 International Conference on Multimedia Retrieval).

@inproceedings{0c8637f956bc49e0bba75d033c4a3edf,
  title = "Unsupervised Video Summarization via Multi-source Features",
  abstract = "Video summarization aims at generating a compact yet representative visual summary that conveys the essence of the original video. The advantage of unsupervised approaches is that they do not require human annotations to learn the summarization capability and generalize to a wider range of domains. Previous work relies on the same type of deep features, typically based on a model pre-trained on ImageNet data. Therefore, we propose to incorporate multiple feature sources with chunk and stride fusion to provide more information about the visual content. For a comprehensive evaluation on the two benchmarks TVSum and SumMe, we compare our method with four state-of-the-art approaches. Two of these approaches were implemented by ourselves to reproduce the reported results. Our evaluation shows that we obtain state-of-the-art results on both datasets while also highlighting the shortcomings of previous work with regard to the evaluation methodology. Finally, we perform error analysis on videos for the two benchmark datasets to summarize and spot the factors that lead to misclassifications.",
  keywords = "Deep learning, Multi-source combination, Multi-source fusion, Unsupervised video summarization, Video analysis",
  author = "Hussain Kanafani and Ghauri, {Junaid Ahmed} and Sherzod Hakimov and Ralph Ewerth",
  year = "2021",
  month = sep,
  day = "1",
  doi = "10.48550/arXiv.2105.12532",
  language = "English",
  series = "ICMR 2021 - Proceedings of the 2021 International Conference on Multimedia Retrieval",
  pages = "466--470",
  booktitle = "ICMR 2021",
  note = "11th ACM International Conference on Multimedia Retrieval, ICMR 2021 ; Conference date: 16-11-2021 Through 19-11-2021",
}

TY  - GEN
T1  - Unsupervised Video Summarization via Multi-source Features
AU  - Kanafani, Hussain
AU  - Ghauri, Junaid Ahmed
AU  - Hakimov, Sherzod
AU  - Ewerth, Ralph
PY  - 2021/9/1
Y1  - 2021/9/1
N2  - Video summarization aims at generating a compact yet representative visual summary that conveys the essence of the original video. The advantage of unsupervised approaches is that they do not require human annotations to learn the summarization capability and generalize to a wider range of domains. Previous work relies on the same type of deep features, typically based on a model pre-trained on ImageNet data. Therefore, we propose to incorporate multiple feature sources with chunk and stride fusion to provide more information about the visual content. For a comprehensive evaluation on the two benchmarks TVSum and SumMe, we compare our method with four state-of-the-art approaches. Two of these approaches were implemented by ourselves to reproduce the reported results. Our evaluation shows that we obtain state-of-the-art results on both datasets while also highlighting the shortcomings of previous work with regard to the evaluation methodology. Finally, we perform error analysis on videos for the two benchmark datasets to summarize and spot the factors that lead to misclassifications.
AB  - Video summarization aims at generating a compact yet representative visual summary that conveys the essence of the original video. The advantage of unsupervised approaches is that they do not require human annotations to learn the summarization capability and generalize to a wider range of domains. Previous work relies on the same type of deep features, typically based on a model pre-trained on ImageNet data. Therefore, we propose to incorporate multiple feature sources with chunk and stride fusion to provide more information about the visual content. For a comprehensive evaluation on the two benchmarks TVSum and SumMe, we compare our method with four state-of-the-art approaches. Two of these approaches were implemented by ourselves to reproduce the reported results. Our evaluation shows that we obtain state-of-the-art results on both datasets while also highlighting the shortcomings of previous work with regard to the evaluation methodology. Finally, we perform error analysis on videos for the two benchmark datasets to summarize and spot the factors that lead to misclassifications.
KW  - Deep learning
KW  - Multi-source combination
KW  - Multi-source fusion
KW  - Unsupervised video summarization
KW  - Video analysis
UR  - http://www.scopus.com/inward/record.url?scp=85114886998&partnerID=8YFLogxK
U2  - 10.48550/arXiv.2105.12532
DO  - 10.48550/arXiv.2105.12532
M3  - Conference contribution
AN  - SCOPUS:85114886998
T3  - ICMR 2021 - Proceedings of the 2021 International Conference on Multimedia Retrieval
SP  - 466
EP  - 470
BT  - ICMR 2021
T2  - 11th ACM International Conference on Multimedia Retrieval, ICMR 2021
Y2  - 16 November 2021 through 19 November 2021
ER  -
