Unsupervised Video Summarization via Multi-source Features

Publication: Contribution to book/report/anthology/conference proceedings, Conference paper, Research, Peer-reviewed

Authors

  • Hussain Kanafani
  • Junaid Ahmed Ghauri
  • Sherzod Hakimov
  • Ralph Ewerth

Organisational units

External organisations

  • Technische Informationsbibliothek (TIB) Leibniz-Informationszentrum Technik und Naturwissenschaften und Universitätsbibliothek

Details

Original language: English
Title of host publication: ICMR 2021
Subtitle: Proceedings of the 2021 International Conference on Multimedia Retrieval
Pages: 466-470
Number of pages: 5
ISBN (electronic): 9781450384636
Publication status: Published - 1 Sept. 2021
Event: 11th ACM International Conference on Multimedia Retrieval, ICMR 2021 - Taipei, Taiwan
Duration: 16 Nov. 2021 to 19 Nov. 2021

Publication series

Name: ICMR 2021 - Proceedings of the 2021 International Conference on Multimedia Retrieval

Abstract

Video summarization aims at generating a compact yet representative visual summary that conveys the essence of the original video. The advantage of unsupervised approaches is that they do not require human annotations to learn the summarization capability and generalize to a wider range of domains. Previous work relies on the same type of deep features, typically based on a model pre-trained on ImageNet data. Therefore, we propose to incorporate multiple feature sources with chunk and stride fusion to provide more information about the visual content. For a comprehensive evaluation on the two benchmarks TVSum and SumMe, we compare our method with four state-of-the-art approaches. Two of these approaches were implemented by ourselves to reproduce the reported results. Our evaluation shows that we obtain state-of-the-art results on both datasets while also highlighting the shortcomings of previous work with regard to the evaluation methodology. Finally, we perform error analysis on videos for the two benchmark datasets to summarize and spot the factors that lead to misclassifications.
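
The abstract names "chunk and stride fusion" but does not define either operation, so the following is only a rough sketch of one plausible reading, not the authors' implementation. It assumes two per-frame feature vectors of equal length from different sources, and the function names `chunk_fusion` and `stride_fusion` as well as the `chunk` and `stride` parameters are hypothetical. Here, chunk fusion interleaves fixed-size blocks of the two vectors, and stride fusion subsamples each vector at a fixed step before concatenating.

```python
from typing import List, Sequence

Vector = Sequence[float]


def chunk_fusion(a: Vector, b: Vector, chunk: int) -> List[float]:
    """Interleave fixed-size chunks of two feature vectors (assumed reading)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    if chunk <= 0:
        raise ValueError("chunk must be positive")
    fused: List[float] = []
    for start in range(0, len(a), chunk):
        # Take one chunk from each source in turn.
        fused.extend(a[start:start + chunk])
        fused.extend(b[start:start + chunk])
    return fused


def stride_fusion(a: Vector, b: Vector, stride: int) -> List[float]:
    """Keep every stride-th dimension of each source, then concatenate (assumed reading)."""
    if stride <= 0:
        raise ValueError("stride must be positive")
    return list(a[::stride]) + list(b[::stride])


if __name__ == "__main__":
    # Toy 4-dimensional features standing in for two pre-trained models.
    source_a = [0, 1, 2, 3]
    source_b = [10, 11, 12, 13]
    print(chunk_fusion(source_a, source_b, chunk=2))
    print(stride_fusion(source_a, source_b, stride=2))
```

Under this reading, chunk fusion keeps every dimension of both sources and doubles the vector length, whereas stride fusion trades some dimensions of each source for a shorter fused vector.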

ASJC Scopus subject areas

Cite this

Unsupervised Video Summarization via Multi-source Features. / Kanafani, Hussain; Ghauri, Junaid Ahmed; Hakimov, Sherzod et al.
ICMR 2021: Proceedings of the 2021 International Conference on Multimedia Retrieval. 2021. pp. 466-470 (ICMR 2021 - Proceedings of the 2021 International Conference on Multimedia Retrieval).

Kanafani, H, Ghauri, JA, Hakimov, S & Ewerth, R 2021, Unsupervised Video Summarization via Multi-source Features. in ICMR 2021: Proceedings of the 2021 International Conference on Multimedia Retrieval. ICMR 2021 - Proceedings of the 2021 International Conference on Multimedia Retrieval, pp. 466-470, 11th ACM International Conference on Multimedia Retrieval, ICMR 2021, Taipei, Taiwan, 16 Nov. 2021. https://doi.org/10.48550/arXiv.2105.12532, https://doi.org/10.1145/3460426.3463597
Kanafani, H., Ghauri, J. A., Hakimov, S., & Ewerth, R. (2021). Unsupervised Video Summarization via Multi-source Features. In ICMR 2021: Proceedings of the 2021 International Conference on Multimedia Retrieval (pp. 466-470). (ICMR 2021 - Proceedings of the 2021 International Conference on Multimedia Retrieval). https://doi.org/10.48550/arXiv.2105.12532, https://doi.org/10.1145/3460426.3463597
Kanafani H, Ghauri JA, Hakimov S, Ewerth R. Unsupervised Video Summarization via Multi-source Features. in ICMR 2021: Proceedings of the 2021 International Conference on Multimedia Retrieval. 2021. pp. 466-470. (ICMR 2021 - Proceedings of the 2021 International Conference on Multimedia Retrieval). doi: 10.48550/arXiv.2105.12532, 10.1145/3460426.3463597
Kanafani, Hussain ; Ghauri, Junaid Ahmed ; Hakimov, Sherzod et al. / Unsupervised Video Summarization via Multi-source Features. ICMR 2021: Proceedings of the 2021 International Conference on Multimedia Retrieval. 2021. pp. 466-470 (ICMR 2021 - Proceedings of the 2021 International Conference on Multimedia Retrieval).
@inproceedings{0c8637f956bc49e0bba75d033c4a3edf,
title = "Unsupervised Video Summarization via Multi-source Features",
abstract = "Video summarization aims at generating a compact yet representative visual summary that conveys the essence of the original video. The advantage of unsupervised approaches is that they do not require human annotations to learn the summarization capability and generalize to a wider range of domains. Previous work relies on the same type of deep features, typically based on a model pre-trained on ImageNet data. Therefore, we propose to incorporate multiple feature sources with chunk and stride fusion to provide more information about the visual content. For a comprehensive evaluation on the two benchmarks TVSum and SumMe, we compare our method with four state-of-the-art approaches. Two of these approaches were implemented by ourselves to reproduce the reported results. Our evaluation shows that we obtain state-of-the-art results on both datasets while also highlighting the shortcomings of previous work with regard to the evaluation methodology. Finally, we perform error analysis on videos for the two benchmark datasets to summarize and spot the factors that lead to misclassifications.",
keywords = "Deep learning, Multi-source combination, Multi-source fusion, Unsupervised video summarization, Video analysis",
author = "Hussain Kanafani and Ghauri, {Junaid Ahmed} and Sherzod Hakimov and Ralph Ewerth",
year = "2021",
month = sep,
day = "1",
doi = "10.48550/arXiv.2105.12532",
language = "English",
series = "ICMR 2021 - Proceedings of the 2021 International Conference on Multimedia Retrieval",
pages = "466--470",
booktitle = "ICMR 2021",
note = "11th ACM International Conference on Multimedia Retrieval, ICMR 2021 ; Conference date: 16-11-2021 Through 19-11-2021",

}

TY - GEN

T1 - Unsupervised Video Summarization via Multi-source Features

AU - Kanafani, Hussain

AU - Ghauri, Junaid Ahmed

AU - Hakimov, Sherzod

AU - Ewerth, Ralph

PY - 2021/9/1

Y1 - 2021/9/1

AB - Video summarization aims at generating a compact yet representative visual summary that conveys the essence of the original video. The advantage of unsupervised approaches is that they do not require human annotations to learn the summarization capability and generalize to a wider range of domains. Previous work relies on the same type of deep features, typically based on a model pre-trained on ImageNet data. Therefore, we propose to incorporate multiple feature sources with chunk and stride fusion to provide more information about the visual content. For a comprehensive evaluation on the two benchmarks TVSum and SumMe, we compare our method with four state-of-the-art approaches. Two of these approaches were implemented by ourselves to reproduce the reported results. Our evaluation shows that we obtain state-of-the-art results on both datasets while also highlighting the shortcomings of previous work with regard to the evaluation methodology. Finally, we perform error analysis on videos for the two benchmark datasets to summarize and spot the factors that lead to misclassifications.

KW - Deep learning

KW - Multi-source combination

KW - Multi-source fusion

KW - Unsupervised video summarization

KW - Video analysis

UR - http://www.scopus.com/inward/record.url?scp=85114886998&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2105.12532

DO - 10.48550/arXiv.2105.12532

M3 - Conference contribution

AN - SCOPUS:85114886998

T3 - ICMR 2021 - Proceedings of the 2021 International Conference on Multimedia Retrieval

SP - 466

EP - 470

BT - ICMR 2021

T2 - 11th ACM International Conference on Multimedia Retrieval, ICMR 2021

Y2 - 16 November 2021 through 19 November 2021

ER -