Details
Original language | English |
---|---|
Pages (from-to) | 22169-22194 |
Number of pages | 26 |
Journal | Multimedia Tools and Applications |
Volume | 76 |
Issue number | 21 |
Publication status | Published - 5 Jul 2017 |
Abstract
While digitization has changed the workflow of professional media production, the content-based labeling of image sequences and video footage, necessary for all subsequent stages of film and television production, archival, or marketing, is typically still performed manually and is thus quite time-consuming. In this paper, we present deep learning approaches to support professional media production. In particular, novel algorithms for visual concept detection, similarity search, face detection, face recognition and face clustering are combined in a multimedia tool for effective video inspection and retrieval. The analysis algorithms for concept detection and similarity search are combined in a multi-task learning approach to share network weights, saving almost half of the computation time. Furthermore, a new visual concept lexicon tailored to fast video retrieval for media production and novel visualization components are introduced. Experimental results show the quality of the proposed approaches. For example, concept detection achieves a mean average precision of approximately 90% on the top-100 video shots, and face recognition clearly outperforms the baseline on the public Movie Trailers Face Dataset.
Keywords
- Deep learning, Face recognition, Image and video analysis, Media production, Similarity search, Visual concept detection
ASJC Scopus subject areas
- Computer Science(all)
  - Software
  - Hardware and Architecture
  - Computer Networks and Communications
- Engineering(all)
  - Media Technology
Cite this
In: Multimedia Tools and Applications, Vol. 76, No. 21, 05.07.2017, p. 22169-22194.
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - Deep learning for content-based video retrieval in film and television production
AU - Mühling, Markus
AU - Korfhage, Nikolaus
AU - Otto, Christian
AU - Springstein, Matthias
AU - Langelage, Thomas
AU - Veith, Uli
AU - Ewerth, Ralph
AU - Freisleben, Bernd
AU - Müller-Budack, Eric
N1 - Funding information: This work is financially supported by the German Federal Ministry for Economic Affairs and Energy (BMWi) in the ZIM Programme.
PY - 2017/7/5
Y1 - 2017/7/5
KW - Deep learning
KW - Face recognition
KW - Image and video analysis
KW - Media production
KW - Similarity search
KW - Visual concept detection
UR - http://www.scopus.com/inward/record.url?scp=85021790014&partnerID=8YFLogxK
U2 - 10.1007/s11042-017-4962-9
DO - 10.1007/s11042-017-4962-9
M3 - Article
AN - SCOPUS:85021790014
VL - 76
SP - 22169
EP - 22194
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
SN - 1380-7501
IS - 21
ER -