Details
Original language | English |
---|---|
Title of host publication | 2008 TREC Video Retrieval Evaluation Notebook Papers and Slides |
Publication status | Published - 2008 |
Externally published | Yes |
Event | TREC Video Retrieval Evaluation, TRECVID 2008 - Gaithersburg, MD, United States |
Duration | 17 Nov 2008 → 18 Nov 2008 |
Abstract
In this paper, we summarize our results for the high-level feature extraction task at TRECVID 2008. Last year's high-level feature extraction system was based on low-level features as well as on state-of-the-art approaches for camera motion estimation, text detection, face detection and audio segmentation. This system served as the basis for this year's experiments and was extended in several ways. First, we addressed the fact that most concepts offer only a small number of positive training samples but a huge number of negative ones. We reduced this imbalance between positive and negative training samples by sub-sampling the negative instances, and we increased the number of positive training samples by creating image variations. Both methods improved the detection results significantly, with the sub-sampling approach achieving our best result (8.27% mean inferred average precision). Second, we incorporated two further feature types: Hough features and audio low-level features. Finally, we supplemented our approach with cross-validation to improve the high-level feature extraction results: on the one hand, we applied it for feature selection; on the other hand, we used it to find the best sampling rate of negative instances for each concept.
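The sketch below illustrates two ideas from the abstract: sub-sampling negative training instances to reduce class imbalance, and using cross-validation to pick the best negative sampling rate for a concept. It is not the authors' code; the SVM classifier, feature dimensionality, candidate ratios, and the use of average precision as a proxy for inferred AP are all assumptions for illustration.

```python
# Minimal sketch (assumptions, not the authors' implementation) of:
# (1) sub-sampling negative instances to reduce class imbalance, and
# (2) cross-validating candidate sampling rates to pick the best one per concept.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC


def subsample_negatives(X, y, neg_pos_ratio, rng):
    """Keep all positives and at most `neg_pos_ratio` negatives per positive."""
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    n_keep = min(len(neg_idx), int(neg_pos_ratio * len(pos_idx)))
    kept_neg = rng.choice(neg_idx, size=n_keep, replace=False)
    keep = np.concatenate([pos_idx, kept_neg])
    return X[keep], y[keep]


def best_sampling_rate(X, y, candidate_ratios, rng):
    """Return the negative:positive ratio with the highest cross-validated
    average precision (used here as a stand-in for inferred AP)."""
    scores = {}
    for ratio in candidate_ratios:
        X_bal, y_bal = subsample_negatives(X, y, ratio, rng)
        clf = SVC(kernel="rbf")  # assumed classifier, not specified in the abstract
        scores[ratio] = cross_val_score(
            clf, X_bal, y_bal, cv=3, scoring="average_precision"
        ).mean()
    return max(scores, key=scores.get), scores


# Toy data: 60 positive and 3000 negative shots described by 64-dim features.
rng = np.random.default_rng(0)
X = rng.normal(size=(3060, 64))
y = np.concatenate([np.ones(60, dtype=int), np.zeros(3000, dtype=int)])
ratio, scores = best_sampling_rate(X, y, candidate_ratios=[1, 3, 5, 10], rng=rng)
print("best negative:positive ratio =", ratio)
```

In practice the selected ratio would be determined separately for each concept, mirroring the per-concept tuning described in the abstract.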
ASJC Scopus subject areas
- Computer Science (all)
- Computer Graphics and Computer-Aided Design
- Computer Vision and Pattern Recognition
- Human-Computer Interaction
- Software
Cite this
Mühling, M., Ewerth, R., Stadelmann, T., Shi, B., & Freisleben, B. (2008). University of Marburg at TRECVID 2008. In 2008 TREC Video Retrieval Evaluation Notebook Papers and Slides.
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research
TY - GEN
T1 - University of Marburg at TRECVID 2008
T2 - TREC Video Retrieval Evaluation, TRECVID 2008
AU - Mühling, Markus
AU - Ewerth, Ralph
AU - Stadelmann, Thilo
AU - Shi, Bing
AU - Freisleben, Bernd
PY - 2008
Y1 - 2008
N2 - In this paper, we summarize our results for the high-level feature extraction task at TRECVID 2008. Last year's high-level feature extraction system was based on low-level features as well as on state-of-the-art approaches for camera motion estimation, text detection, face detection and audio segmentation. This system served as the basis for this year's experiments and was extended in several ways. First, we addressed the fact that most concepts offer only a small number of positive training samples but a huge number of negative ones. We reduced this imbalance between positive and negative training samples by sub-sampling the negative instances, and we increased the number of positive training samples by creating image variations. Both methods improved the detection results significantly, with the sub-sampling approach achieving our best result (8.27% mean inferred average precision). Second, we incorporated two further feature types: Hough features and audio low-level features. Finally, we supplemented our approach with cross-validation to improve the high-level feature extraction results: on the one hand, we applied it for feature selection; on the other hand, we used it to find the best sampling rate of negative instances for each concept.
AB - In this paper, we summarize our results for the high-level feature extraction task at TRECVID 2008. Last year's high-level feature extraction system was based on low-level features as well as on state-of-the-art approaches for camera motion estimation, text detection, face detection and audio segmentation. This system served as the basis for this year's experiments and was extended in several ways. First, we addressed the fact that most concepts offer only a small number of positive training samples but a huge number of negative ones. We reduced this imbalance between positive and negative training samples by sub-sampling the negative instances, and we increased the number of positive training samples by creating image variations. Both methods improved the detection results significantly, with the sub-sampling approach achieving our best result (8.27% mean inferred average precision). Second, we incorporated two further feature types: Hough features and audio low-level features. Finally, we supplemented our approach with cross-validation to improve the high-level feature extraction results: on the one hand, we applied it for feature selection; on the other hand, we used it to find the best sampling rate of negative instances for each concept.
UR - http://www.scopus.com/inward/record.url?scp=84905178027&partnerID=8YFLogxK
M3 - Conference contribution
BT - 2008 TREC Video Retrieval Evaluation Notebook Papers and Slides
Y2 - 17 November 2008 through 18 November 2008
ER -