Details
Original language | English
---|---
Title of host publication | Proceedings of the 6th ACM International Conference on Image and Video Retrieval, CIVR 2007
Pages | 154-161
Number of pages | 8
Publication status | Published - 9 Jul 2007
Externally published | Yes
Event | 6th ACM International Conference on Image and Video Retrieval, CIVR 2007, Amsterdam, Netherlands, 9 Jul 2007 → 11 Jul 2007
Publication series
Name | Proceedings of the 6th ACM International Conference on Image and Video Retrieval, CIVR 2007
Abstract
The automatic understanding of audiovisual content for multimedia retrieval is a difficult task, since the meaning and appearance of a given event or concept are strongly determined by contextual information. For example, the appearance of a high-level concept such as a map or a news anchor is determined by the editing layout, which is usually characteristic of a particular broadcasting station. In this paper, we show that it is possible to adaptively learn the appearance of certain objects or events for a particular test video by utilizing unlabeled data, in order to improve a subsequent retrieval process. First, an initial model is obtained via supervised learning on a set of appropriate training videos. This initial model is then used to rank the shots of each test video v separately. The ranking is used to label the most relevant and the most irrelevant shots in video v for subsequent use as training data in a semi-supervised learning process. Based on these automatically labeled training data, relevant features are selected for the concept under consideration in video v, and two additional classifiers are trained on the automatically labeled data of this video. AdaBoost and Support Vector Machines (SVM) are employed for feature selection and ensemble classification. Finally, the newly trained classifiers and the initial model form an ensemble. Experimental results on TRECVID 2005 video data demonstrate the feasibility of the proposed learning scheme for certain high-level concepts.
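The abstract describes a per-video adaptation pipeline: rank shots with an initial supervised model, pseudo-label the top- and bottom-ranked shots, select features with AdaBoost, train additional classifiers on the pseudo-labels, and combine everything in an ensemble. The sketch below illustrates that flow with scikit-learn style estimators; it is only a minimal illustration under assumed interfaces (the function `adapt_to_video`, the feature matrix `shots`, and the cut-offs `top_k`/`bottom_k` are hypothetical and not taken from the paper).

```python
# Minimal sketch of the per-video semi-supervised adaptation outlined in the
# abstract. All names (adapt_to_video, shots, top_k, bottom_k, n_features) and
# the equal-weight ensemble are assumptions, not the paper's exact method.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import SVC


def adapt_to_video(initial_model, shots, top_k=50, bottom_k=200, n_features=20):
    """Adapt concept retrieval to one test video using its unlabeled shot features."""
    # 1. Rank the shots of this video with the initial, supervised model.
    scores = initial_model.decision_function(shots)
    order = np.argsort(scores)

    # 2. Pseudo-label the highest-ranked shots as relevant and the
    #    lowest-ranked shots as irrelevant.
    pos, neg = order[-top_k:], order[:bottom_k]
    X = shots[np.concatenate([pos, neg])]
    y = np.concatenate([np.ones(top_k), np.zeros(bottom_k)])

    # 3. Feature selection via AdaBoost: keep the most important dimensions.
    booster = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X, y)
    selected = np.argsort(booster.feature_importances_)[-n_features:]

    # 4. Train an additional SVM on the automatically labeled, reduced data.
    svm = SVC().fit(X[:, selected], y)

    # 5. Ensemble: combine the initial model with the newly trained classifiers
    #    (in practice the scores would be normalized before averaging).
    def ensemble_score(candidates):
        s_init = initial_model.decision_function(candidates)
        s_boost = booster.decision_function(candidates)
        s_svm = svm.decision_function(candidates[:, selected])
        return (s_init + s_boost + s_svm) / 3.0

    return ensemble_score
```

Here `initial_model` is assumed to expose a scikit-learn style `decision_function`; per the abstract, that initial model is first trained via supervised learning on labeled training videos before this per-video step is applied.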
Keywords
- Semantic video retrieval
- Semi-supervised learning
ASJC Scopus subject areas
- Engineering (all)
- Electrical and Electronic Engineering
- Computer Science (all)
Cite this
Ewerth, R., & Freisleben, B. (2007). Semi-supervised learning for semantic video retrieval. In Proceedings of the 6th ACM International Conference on Image and Video Retrieval, CIVR 2007 (pp. 154-161). (Proceedings of the 6th ACM International Conference on Image and Video Retrieval, CIVR 2007). https://doi.org/10.1145/1282280.1282308
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Semi-supervised learning for semantic video retrieval
AU - Ewerth, Ralph
AU - Freisleben, Bernd
PY - 2007/7/9
Y1 - 2007/7/9
N2 - The automatic understanding of audiovisual content for multimedia retrieval is a difficult task, since the meaning and appearance of a given event or concept are strongly determined by contextual information. For example, the appearance of a high-level concept such as a map or a news anchor is determined by the editing layout, which is usually characteristic of a particular broadcasting station. In this paper, we show that it is possible to adaptively learn the appearance of certain objects or events for a particular test video by utilizing unlabeled data, in order to improve a subsequent retrieval process. First, an initial model is obtained via supervised learning on a set of appropriate training videos. This initial model is then used to rank the shots of each test video v separately. The ranking is used to label the most relevant and the most irrelevant shots in video v for subsequent use as training data in a semi-supervised learning process. Based on these automatically labeled training data, relevant features are selected for the concept under consideration in video v, and two additional classifiers are trained on the automatically labeled data of this video. AdaBoost and Support Vector Machines (SVM) are employed for feature selection and ensemble classification. Finally, the newly trained classifiers and the initial model form an ensemble. Experimental results on TRECVID 2005 video data demonstrate the feasibility of the proposed learning scheme for certain high-level concepts.
AB - The automatic understanding of audiovisual content for multimedia retrieval is a difficult task, since the meaning and appearance of a given event or concept are strongly determined by contextual information. For example, the appearance of a high-level concept such as a map or a news anchor is determined by the editing layout, which is usually characteristic of a particular broadcasting station. In this paper, we show that it is possible to adaptively learn the appearance of certain objects or events for a particular test video by utilizing unlabeled data, in order to improve a subsequent retrieval process. First, an initial model is obtained via supervised learning on a set of appropriate training videos. This initial model is then used to rank the shots of each test video v separately. The ranking is used to label the most relevant and the most irrelevant shots in video v for subsequent use as training data in a semi-supervised learning process. Based on these automatically labeled training data, relevant features are selected for the concept under consideration in video v, and two additional classifiers are trained on the automatically labeled data of this video. AdaBoost and Support Vector Machines (SVM) are employed for feature selection and ensemble classification. Finally, the newly trained classifiers and the initial model form an ensemble. Experimental results on TRECVID 2005 video data demonstrate the feasibility of the proposed learning scheme for certain high-level concepts.
KW - Semantic video retrieval
KW - Semi-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=36849060178&partnerID=8YFLogxK
U2 - 10.1145/1282280.1282308
DO - 10.1145/1282280.1282308
M3 - Conference contribution
AN - SCOPUS:36849060178
SN - 1595937331
SN - 9781595937339
T3 - Proceedings of the 6th ACM International Conference on Image and Video Retrieval, CIVR 2007
SP - 154
EP - 161
BT - Proceedings of the 6th ACM International Conference on Image and Video Retrieval, CIVR 2007
T2 - 6th ACM International Conference on Image and Video Retrieval, CIVR 2007
Y2 - 9 July 2007 through 11 July 2007
ER -