Investigating Correlations of Inter-coder Agreement and Machine Annotation Performance for Historical Video Data

Research output: Chapter in book/report/conference proceeding; Conference contribution; Research; Peer-reviewed

Authors

  • Kader Pustu-Iren
  • Markus Mühling
  • Nikolaus Korfhage
  • Joanna Bars
  • Sabrina Bernhöft
  • Angelika Hörth
  • Bernd Freisleben
  • Ralph Ewerth

Research Organisations

External Research Organisations

  • German National Library of Science and Technology (TIB)
  • Philipps-Universität Marburg
  • German Broadcasting Archive (DRA)

Details

Original language: English
Title of host publication: Digital Libraries for Open Knowledge
Subtitle of host publication: 23rd International Conference on Theory and Practice of Digital Libraries, TPDL 2019, Proceedings
Editors: Antoine Doucet, Antoine Isaac, Koraljka Golub, Trond Aalberg, Adam Jatowt
Publisher: Springer Verlag
Pages: 107-114
Number of pages: 8
Edition: 1.
ISBN (electronic): 978-3-030-30760-8
ISBN (print): 978-3-030-30759-2
Publication status: Published - 30 Aug 2019
Event: 23rd International Conference on Theory and Practice of Digital Libraries, TPDL 2019 - Oslo, Norway
Duration: 9 Sept 2019 to 12 Sept 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11799 LNCS
ISSN (Print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

Video indexing approaches such as visual concept classification and person recognition are essential to enable fine-grained semantic search in large-scale video archives such as the historical video collection of the former German Democratic Republic (GDR) maintained by the German Broadcasting Archive (DRA). Typically, a lexicon of visual concepts has to be defined for semantic search. But the definition of visual concepts can be more or less subjective due to individually differing judgments of annotators, which may have an impact on training data quality for supervised machine learning methods. In this paper, we analyze the inter-coder agreement on historical TV data of the former GDR for visual concept classification and person recognition. The inter-coder agreement is evaluated for a group of expert as well as non-expert annotators. Furthermore, correlations between visual recognition performance and inter-annotator agreement are measured. In this context, information about training dataset size and agreement are used to predict average precision for concept classification. Finally, the impact of expert vs. non-expert annotations on person recognition is analyzed.
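
The abstract does not say which agreement measure or correlation coefficient the paper uses. As a minimal sketch, assuming Cohen's kappa for two annotators and Pearson's r across concepts, the two quantities it mentions could be computed as follows. The function names and all numbers below are hypothetical illustrations, not data or code from the paper.

```python
from collections import Counter
from math import sqrt


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    # Fraction of items on which both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


# Hypothetical annotations: concept -> (labels of annotator 1, annotator 2).
annotations = {
    "concept_a": ([1, 1, 0, 0, 1, 0], [1, 1, 0, 0, 1, 0]),
    "concept_b": ([1, 0, 0, 1, 1, 0], [1, 0, 1, 1, 0, 0]),
    "concept_c": ([0, 1, 1, 0, 0, 1], [1, 1, 0, 0, 0, 1]),
}
# Hypothetical average precision of a classifier for each concept.
average_precision = {"concept_a": 0.82, "concept_b": 0.55, "concept_c": 0.61}

concepts = sorted(annotations)
kappas = [cohen_kappa(*annotations[c]) for c in concepts]
aps = [average_precision[c] for c in concepts]
print("kappa per concept:", dict(zip(concepts, kappas)))
print("correlation between agreement and AP:", pearson_r(kappas, aps))
```

With more than two annotators, a multi-rater measure such as Fleiss' kappa or Krippendorff's alpha would be the usual substitute; the correlation step stays the same.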

Keywords

    Historical videos, Inter-coder agreement, Performance prediction, Person identification, Visual concept classification

Cite this

Investigating Correlations of Inter-coder Agreement and Machine Annotation Performance for Historical Video Data. / Pustu-Iren, Kader; Mühling, Markus; Korfhage, Nikolaus et al.
Digital Libraries for Open Knowledge: 23rd International Conference on Theory and Practice of Digital Libraries, TPDL 2019, Proceedings. ed. / Antoine Doucet; Antoine Isaac; Koraljka Golub; Trond Aalberg; Adam Jatowt. 1. ed. Springer Verlag, 2019. p. 107-114 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11799 LNCS).

Pustu-Iren, K, Mühling, M, Korfhage, N, Bars, J, Bernhöft, S, Hörth, A, Freisleben, B & Ewerth, R 2019, Investigating Correlations of Inter-coder Agreement and Machine Annotation Performance for Historical Video Data. in A Doucet, A Isaac, K Golub, T Aalberg & A Jatowt (eds), Digital Libraries for Open Knowledge: 23rd International Conference on Theory and Practice of Digital Libraries, TPDL 2019, Proceedings. 1. edn, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11799 LNCS, Springer Verlag, pp. 107-114, 23rd International Conference on Theory and Practice of Digital Libraries, TPDL 2019, Oslo, Norway, 9 Sept 2019. https://doi.org/10.48550/arXiv.1907.10450, https://doi.org/10.1007/978-3-030-30760-8_9
Pustu-Iren, K., Mühling, M., Korfhage, N., Bars, J., Bernhöft, S., Hörth, A., Freisleben, B., & Ewerth, R. (2019). Investigating Correlations of Inter-coder Agreement and Machine Annotation Performance for Historical Video Data. In A. Doucet, A. Isaac, K. Golub, T. Aalberg, & A. Jatowt (Eds.), Digital Libraries for Open Knowledge: 23rd International Conference on Theory and Practice of Digital Libraries, TPDL 2019, Proceedings (1. ed., pp. 107-114). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11799 LNCS). Springer Verlag. https://doi.org/10.48550/arXiv.1907.10450, https://doi.org/10.1007/978-3-030-30760-8_9
Pustu-Iren K, Mühling M, Korfhage N, Bars J, Bernhöft S, Hörth A et al. Investigating Correlations of Inter-coder Agreement and Machine Annotation Performance for Historical Video Data. In Doucet A, Isaac A, Golub K, Aalberg T, Jatowt A, editors, Digital Libraries for Open Knowledge: 23rd International Conference on Theory and Practice of Digital Libraries, TPDL 2019, Proceedings. 1. ed. Springer Verlag. 2019. p. 107-114. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.48550/arXiv.1907.10450, 10.1007/978-3-030-30760-8_9
Pustu-Iren, Kader ; Mühling, Markus ; Korfhage, Nikolaus et al. / Investigating Correlations of Inter-coder Agreement and Machine Annotation Performance for Historical Video Data. Digital Libraries for Open Knowledge: 23rd International Conference on Theory and Practice of Digital Libraries, TPDL 2019, Proceedings. editor / Antoine Doucet ; Antoine Isaac ; Koraljka Golub ; Trond Aalberg ; Adam Jatowt. 1. ed. Springer Verlag, 2019. pp. 107-114 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{7d5a62372b7743b687dcbd8442f1e5e3,
title = "Investigating Correlations of Inter-coder Agreement and Machine Annotation Performance for Historical Video Data",
abstract = "Video indexing approaches such as visual concept classification and person recognition are essential to enable fine-grained semantic search in large-scale video archives such as the historical video collection of the former German Democratic Republic (GDR) maintained by the German Broadcasting Archive (DRA). Typically, a lexicon of visual concepts has to be defined for semantic search. But the definition of visual concepts can be more or less subjective due to individually differing judgments of annotators, which may have an impact on training data quality for supervised machine learning methods. In this paper, we analyze the inter-coder agreement on historical TV data of the former GDR for visual concept classification and person recognition. The inter-coder agreement is evaluated for a group of expert as well as non-expert annotators. Furthermore, correlations between visual recognition performance and inter-annotator agreement are measured. In this context, information about training dataset size and agreement are used to predict average precision for concept classification. Finally, the impact of expert vs. non-expert annotations on person recognition is analyzed.",
keywords = "Historical videos, Inter-coder agreement, Performance prediction, Person identification, Visual concept classification",
author = "Kader Pustu-Iren and Markus M{\"u}hling and Nikolaus Korfhage and Joanna Bars and Sabrina Bernh{\"o}ft and Angelika H{\"o}rth and Bernd Freisleben and Ralph Ewerth",
note = "Funding information: This work is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)-project number 388420599.; 23rd International Conference on Theory and Practice of Digital Libraries, TPDL 2019 ; Conference date: 09-09-2019 Through 12-09-2019",
year = "2019",
month = aug,
day = "30",
doi = "10.48550/arXiv.1907.10450",
language = "English",
isbn = "978-3-030-30759-2",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "107--114",
editor = "Antoine Doucet and Antoine Isaac and Koraljka Golub and Trond Aalberg and Adam Jatowt",
booktitle = "Digital Libraries for Open Knowledge",
address = "Germany",
edition = "1.",

}

TY - GEN

T1 - Investigating Correlations of Inter-coder Agreement and Machine Annotation Performance for Historical Video Data

AU - Pustu-Iren, Kader

AU - Mühling, Markus

AU - Korfhage, Nikolaus

AU - Bars, Joanna

AU - Bernhöft, Sabrina

AU - Hörth, Angelika

AU - Freisleben, Bernd

AU - Ewerth, Ralph

N1 - Funding information: This work is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)-project number 388420599.

PY - 2019/8/30

Y1 - 2019/8/30

N2 - Video indexing approaches such as visual concept classification and person recognition are essential to enable fine-grained semantic search in large-scale video archives such as the historical video collection of the former German Democratic Republic (GDR) maintained by the German Broadcasting Archive (DRA). Typically, a lexicon of visual concepts has to be defined for semantic search. But the definition of visual concepts can be more or less subjective due to individually differing judgments of annotators, which may have an impact on training data quality for supervised machine learning methods. In this paper, we analyze the inter-coder agreement on historical TV data of the former GDR for visual concept classification and person recognition. The inter-coder agreement is evaluated for a group of expert as well as non-expert annotators. Furthermore, correlations between visual recognition performance and inter-annotator agreement are measured. In this context, information about training dataset size and agreement are used to predict average precision for concept classification. Finally, the impact of expert vs. non-expert annotations on person recognition is analyzed.

AB - Video indexing approaches such as visual concept classification and person recognition are essential to enable fine-grained semantic search in large-scale video archives such as the historical video collection of the former German Democratic Republic (GDR) maintained by the German Broadcasting Archive (DRA). Typically, a lexicon of visual concepts has to be defined for semantic search. But the definition of visual concepts can be more or less subjective due to individually differing judgments of annotators, which may have an impact on training data quality for supervised machine learning methods. In this paper, we analyze the inter-coder agreement on historical TV data of the former GDR for visual concept classification and person recognition. The inter-coder agreement is evaluated for a group of expert as well as non-expert annotators. Furthermore, correlations between visual recognition performance and inter-annotator agreement are measured. In this context, information about training dataset size and agreement are used to predict average precision for concept classification. Finally, the impact of expert vs. non-expert annotations on person recognition is analyzed.

KW - Historical videos

KW - Inter-coder agreement

KW - Performance prediction

KW - Person identification

KW - Visual concept classification

UR - http://www.scopus.com/inward/record.url?scp=85072871625&partnerID=8YFLogxK

U2 - 10.48550/arXiv.1907.10450

DO - 10.48550/arXiv.1907.10450

M3 - Conference contribution

AN - SCOPUS:85072871625

SN - 978-3-030-30759-2

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 107

EP - 114

BT - Digital Libraries for Open Knowledge

A2 - Doucet, Antoine

A2 - Isaac, Antoine

A2 - Golub, Koraljka

A2 - Aalberg, Trond

A2 - Jatowt, Adam

PB - Springer Verlag

T2 - 23rd International Conference on Theory and Practice of Digital Libraries, TPDL 2019

Y2 - 9 September 2019 through 12 September 2019

ER -