Details
Original language | English |
---|---|
Pages (from - to) | 22169-22194 |
Number of pages | 26 |
Journal | Multimedia tools and applications |
Volume | 76 |
Issue number | 21 |
Publication status | Published - 5 July 2017 |
Abstract
While digitization has changed the workflow of professional media production, the content-based labeling of image sequences and video footage, necessary for all subsequent stages of film and television production, archival, or marketing, is typically still performed manually and is thus quite time-consuming. In this paper, we present deep learning approaches to support professional media production. In particular, novel algorithms for visual concept detection, similarity search, face detection, face recognition and face clustering are combined in a multimedia tool for effective video inspection and retrieval. The analysis algorithms for concept detection and similarity search are combined in a multi-task learning approach to share network weights, saving almost half of the computation time. Furthermore, a new visual concept lexicon tailored to fast video retrieval for media production and novel visualization components are introduced. Experimental results show the quality of the proposed approaches. For example, concept detection achieves a mean average precision of approximately 90% on the top-100 video shots, and face recognition clearly outperforms the baseline on the public Movie Trailers Face Dataset.
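The abstract reports mean average precision on the top-100 video shots but does not spell out the exact formula. As a rough illustration only, the sketch below computes one common variant of average precision over the top-k results of a ranked list and averages it over several concepts. The function names, the boolean relevance lists, and the choice to normalize by the number of relevant shots found within the top k are all assumptions made for this example, not the authors' evaluation code.

```python
from typing import Sequence


def average_precision_at_k(relevance: Sequence[bool], k: int = 100) -> float:
    """Average precision over the top-k entries of one ranked result list.

    `relevance[i]` says whether the shot at rank i + 1 is relevant.
    Assumption: the sum of precisions is divided by the number of relevant
    shots found within the top k (other normalizations exist).
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance[:k], start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[bool]], k: int = 100) -> float:
    """Mean of the per-concept average precision values."""
    scores = [average_precision_at_k(r, k) for r in rankings]
    return sum(scores) / len(scores) if scores else 0.0
```

For instance, a ranking with relevant shots at ranks 1 and 3 scores (1/1 + 2/3) / 2 under this definition, so a value near 90% implies that most relevant shots sit near the top of each list.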
ASJC Scopus subject areas
- Computer Science (all)
- Software
- Engineering (all)
- Media Technology
- Computer Science (all)
- Hardware and Architecture
- Computer Science (all)
- Computer Networks and Communications
Cite this
In: Multimedia tools and applications, Vol. 76, No. 21, 5 July 2017, pp. 22169-22194.
Publication: Contribution to journal › Article › Research › Peer-reviewed
TY - JOUR
T1 - Deep learning for content-based video retrieval in film and television production
AU - Mühling, Markus
AU - Korfhage, Nikolaus
AU - Otto, Christian
AU - Springstein, Matthias
AU - Langelage, Thomas
AU - Veith, Uli
AU - Ewerth, Ralph
AU - Freisleben, Bernd
AU - Müller-Budack, Eric
N1 - Funding information: This work is financially supported by the German Federal Ministry for Economic Affairs and Energy (BMWi) in the ZIM Programme.
PY - 2017/7/5
Y1 - 2017/7/5
N2 - While digitization has changed the workflow of professional media production, the content-based labeling of image sequences and video footage, necessary for all subsequent stages of film and television production, archival or marketing is typically still performed manually and thus quite time-consuming. In this paper, we present deep learning approaches to support professional media production. In particular, novel algorithms for visual concept detection, similarity search, face detection, face recognition and face clustering are combined in a multimedia tool for effective video inspection and retrieval. The analysis algorithms for concept detection and similarity search are combined in a multi-task learning approach to share network weights, saving almost half of the computation time. Furthermore, a new visual concept lexicon tailored to fast video retrieval for media production and novel visualization components are introduced. Experimental results show the quality of the proposed approaches. For example, concept detection achieves a mean average precision of approximately 90% on the top-100 video shots, and face recognition clearly outperforms the baseline on the public Movie Trailers Face Dataset.
KW - Deep learning
KW - Face recognition
KW - Image and video analysis
KW - Media production
KW - Similarity search
KW - Visual concept detection
UR - http://www.scopus.com/inward/record.url?scp=85021790014&partnerID=8YFLogxK
U2 - 10.1007/s11042-017-4962-9
DO - 10.1007/s11042-017-4962-9
M3 - Article
AN - SCOPUS:85021790014
VL - 76
SP - 22169
EP - 22194
JO - Multimedia tools and applications
JF - Multimedia tools and applications
SN - 1380-7501
IS - 21
ER -