University of Marburg at TRECVID 2009: High-level feature extraction

Publication: Conference contribution › Paper › Research › Peer-reviewed

Authors

  • Markus Mühling
  • Ralph Ewerth
  • Thilo Stadelmann
  • Bing Shi
  • Bernd Freisleben

External organisations

  • Universität Siegen
  • Philipps-Universität Marburg

Details

Original language: English
Publication status: Published - 2009
Externally published: Yes
Event: TREC Video Retrieval Evaluation, TRECVID 2009 - Gaithersburg, MD, United States
Duration: 16 Nov 2009 – 17 Nov 2009

Conference

Conference: TREC Video Retrieval Evaluation, TRECVID 2009
Country/Territory: United States
City: Gaithersburg, MD
Period: 16 Nov 2009 – 17 Nov 2009

Abstract

In this paper, we summarize our results for the high-level feature extraction task at TRECVID 2009. Last year, our high-level feature extraction system relied on low-level features as well as on state-of-the-art approaches for camera motion estimation, text detection, face detection and audio segmentation. Based on the observation that the use of face detection results improved the performance of several face-related concepts, we have incorporated further specialized object detectors. Using specialized object detectors trained on separate public data sets, object-based features are generated by assembling detection results into object sequences. A shot-based confidence score and additional features, such as position, frame coverage and movement, are computed for each object class. The object detectors are used for two purposes: (a) to provide retrieval results for concepts directly related to the object class (such as using the boat detector for the concept boat), and (b) to provide object-based features as additional input for the SVM-based concept classifiers. Thus, other related concepts can also benefit from object-based features. Furthermore, we investigated the use of SURF (Speeded Up Robust Features). The use of object-based features improved the high-level feature extraction results significantly. Our best run achieved a mean inferred average precision of 9.53%.
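The aggregation step described in the abstract — assembling per-frame detector hits into an object sequence, then computing a shot-level descriptor (confidence, position, frame coverage, movement) to feed the SVM concept classifiers — can be sketched in a few lines. The Python below is an illustrative reconstruction, not the authors' code: the class names, normalization, and exact feature definitions are assumptions made for this example.

# Illustrative sketch (not the authors' implementation): aggregate one
# object class's per-frame detections within a shot into the
# object-based features named in the abstract.
from dataclasses import dataclass
from typing import List

@dataclass
class Detection:
    frame: int         # frame index within the shot
    confidence: float  # detector score
    cx: float          # normalized bounding-box center x in [0, 1]
    cy: float          # normalized bounding-box center y in [0, 1]

def shot_features(dets: List[Detection], n_frames: int) -> List[float]:
    """Aggregate one object class's detection sequence over a shot."""
    if not dets:
        return [0.0] * 5
    ordered = sorted(dets, key=lambda d: d.frame)
    confidence = max(d.confidence for d in ordered)        # shot-based confidence score
    mean_cx = sum(d.cx for d in ordered) / len(ordered)    # average position
    mean_cy = sum(d.cy for d in ordered) / len(ordered)
    coverage = len({d.frame for d in ordered}) / n_frames  # frame coverage
    movement = sum(abs(b.cx - a.cx) + abs(b.cy - a.cy)     # total center displacement
                   for a, b in zip(ordered, ordered[1:]))
    return [confidence, mean_cx, mean_cy, coverage, movement]

# Such per-class vectors would then be concatenated with the low-level
# features and passed to an SVM concept classifier, e.g. with
# scikit-learn (an assumed choice): SVC(probability=True).fit(X, y).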

ASJC Scopus subject areas

Cite this

University of Marburg at TRECVID 2009: High-level feature extraction. / Mühling, Markus; Ewerth, Ralph; Stadelmann, Thilo et al.
2009. Paper presented at TREC Video Retrieval Evaluation, TRECVID 2009, Gaithersburg, MD, United States.

Publication: Conference contribution › Paper › Research › Peer-reviewed

Mühling, M, Ewerth, R, Stadelmann, T, Shi, B & Freisleben, B 2009, 'University of Marburg at TRECVID 2009: High-level feature extraction', paper presented at TREC Video Retrieval Evaluation, TRECVID 2009, Gaithersburg, MD, United States, 16 Nov 2009 - 17 Nov 2009.
Mühling, M., Ewerth, R., Stadelmann, T., Shi, B., & Freisleben, B. (2009). University of Marburg at TRECVID 2009: High-level feature extraction. Paper presented at TREC Video Retrieval Evaluation, TRECVID 2009, Gaithersburg, MD, United States.
Mühling M, Ewerth R, Stadelmann T, Shi B, Freisleben B. University of Marburg at TRECVID 2009: High-level feature extraction. 2009. Paper presented at TREC Video Retrieval Evaluation, TRECVID 2009, Gaithersburg, MD, United States.
Mühling, Markus ; Ewerth, Ralph ; Stadelmann, Thilo et al. / University of Marburg at TRECVID 2009 : High-level feature extraction. Paper presented at TREC Video Retrieval Evaluation, TRECVID 2009, Gaithersburg, MD, United States.
BibTeX
@conference{3aa8f35efd8b4f9fbcddc2eab7ac3542,
title = "University of Marburg at TRECVID 2009: High-level feature extraction",
abstract = "In this paper, we summarize our results for the high-level feature extraction task at TRECVID 2009. Last year, our high-level feature extraction system relied on low-level features as well as on state-of-the-art approaches for camera motion estimation, text detection, face detection and audio segmentation. Based on the observation that the use of face detection results improved the performance of several face-related concepts, we have incorporated further specialized object detectors. Using specialized object detectors trained on separate public data sets, object-based features are generated by assembling detection results into object sequences. A shot-based confidence score and additional features, such as position, frame coverage and movement, are computed for each object class. The object detectors are used for two purposes: (a) to provide retrieval results for concepts directly related to the object class (such as using the boat detector for the concept boat), and (b) to provide object-based features as additional input for the SVM-based concept classifiers. Thus, other related concepts can also benefit from object-based features. Furthermore, we investigated the use of SURF (Speeded Up Robust Features). The use of object-based features improved the high-level feature extraction results significantly. Our best run achieved a mean inferred average precision of 9.53%.",
author = "Markus M{\"u}hling and Ralph Ewerth and Thilo Stadelmann and Bing Shi and Bernd Freisleben",
year = "2009",
language = "English",
note = "TREC Video Retrieval Evaluation, TRECVID 2009 ; Conference date: 16-11-2009 Through 17-11-2009",

}

RIS

TY - CONF

T1 - University of Marburg at TRECVID 2009

T2 - TREC Video Retrieval Evaluation, TRECVID 2009

AU - Mühling, Markus

AU - Ewerth, Ralph

AU - Stadelmann, Thilo

AU - Shi, Bing

AU - Freisleben, Bernd

PY - 2009

Y1 - 2009

N2 - In this paper, we summarize our results for the high-level feature extraction task at TRECVID 2009. Last year, our high-level feature extraction system relied on low-level features as well as on state-of-the-art approaches for camera motion estimation, text detection, face detection and audio segmentation. Based on the observation that the use of face detection results improved the performance of several face-related concepts, we have incorporated further specialized object detectors. Using specialized object detectors trained on separate public data sets, object-based features are generated by assembling detection results into object sequences. A shot-based confidence score and additional features, such as position, frame coverage and movement, are computed for each object class. The object detectors are used for two purposes: (a) to provide retrieval results for concepts directly related to the object class (such as using the boat detector for the concept boat), and (b) to provide object-based features as additional input for the SVM-based concept classifiers. Thus, other related concepts can also benefit from object-based features. Furthermore, we investigated the use of SURF (Speeded Up Robust Features). The use of object-based features improved the high-level feature extraction results significantly. Our best run achieved a mean inferred average precision of 9.53%.

AB - In this paper, we summarize our results for the high-level feature extraction task at TRECVID 2009. Last year, our high-level feature extraction system relied on low-level features as well as on state-of-the-art approaches for camera motion estimation, text detection, face detection and audio segmentation. Based on the observation that the use of face detection results improved the performance of several face-related concepts, we have incorporated further specialized object detectors. Using specialized object detectors trained on separate public data sets, object-based features are generated by assembling detection results into object sequences. A shot-based confidence score and additional features, such as position, frame coverage and movement, are computed for each object class. The object detectors are used for two purposes: (a) to provide retrieval results for concepts directly related to the object class (such as using the boat detector for the concept boat), and (b) to provide object-based features as additional input for the SVM-based concept classifiers. Thus, other related concepts can also benefit from object-based features. Furthermore, we investigated the use of SURF (Speeded Up Robust Features). The use of object-based features improved the high-level feature extraction results significantly. Our best run achieved a mean inferred average precision of 9.53%.

UR - http://www.scopus.com/inward/record.url?scp=84905686331&partnerID=8YFLogxK

M3 - Paper

Y2 - 16 November 2009 through 17 November 2009

ER -