SlideImages: A dataset for educational image classification

Research output: Chapter in book/report/conference proceeding > Conference contribution > Research > peer review

Authors

  • David Morris
  • Eric Müller-Budack
  • Ralph Ewerth

Research Organisations

External Research Organisations

  • German National Library of Science and Technology (TIB)

Details

Original language: English
Title of host publication: Advances in Information Retrieval
Subtitle of host publication: 42nd European Conference on IR Research, ECIR 2020, Proceedings
Editors: Joemon M. Jose, Emine Yilmaz, João Magalhães, Flávio Martins, Pablo Castells, Nicola Ferro, Mário J. Silva
Place of Publication: Cham
Pages: 289-296
Number of pages: 8
ISBN (electronic): 978-3-030-45442-5
Publication status: Published - 8 Apr 2020
Event: 42nd European Conference on IR Research, ECIR 2020 - Lisbon, Portugal
Duration: 14 Apr 2020 to 17 Apr 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12036 LNCS
ISSN (Print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

In the past few years, convolutional neural networks (CNNs) have achieved impressive results in computer vision tasks, but these results mainly concern photos with natural scene content. In contrast, non-sensor-derived images such as illustrations, data visualizations, and figures are typically used to convey complex information or to explore large datasets. This kind of image has received little attention in computer vision. CNNs and similar techniques require large volumes of training data. Currently, many document analysis systems are trained in part on scene images because large datasets of educational image data are lacking. In this paper, we address this issue and present SlideImages, a dataset for the task of classifying educational illustrations. SlideImages contains training data collected from various sources, e.g., Wikimedia Commons and the AI2D dataset, and test data collected from educational slides. We have reserved all the actual educational images as a test dataset in order to ensure that approaches using this dataset generalize well to new educational images, and potentially to other domains. Furthermore, we present a baseline system using a standard deep neural architecture and discuss how to deal with the challenge of limited training data.

Keywords

    Classification dataset, Document figure classification, Educational documents

Cite this

SlideImages: A dataset for educational image classification. / Morris, David; Müller-Budack, Eric; Ewerth, Ralph.
Advances in Information Retrieval: 42nd European Conference on IR Research, ECIR 2020, Proceedings. ed. / Joemon M. Jose; Emine Yilmaz; João Magalhães; Flávio Martins; Pablo Castells; Nicola Ferro; Mário J. Silva. Cham, 2020. p. 289-296 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12036 LNCS).

Morris, D, Müller-Budack, E & Ewerth, R 2020, SlideImages: A dataset for educational image classification. in JM Jose, E Yilmaz, J Magalhães, F Martins, P Castells, N Ferro & MJ Silva (eds), Advances in Information Retrieval: 42nd European Conference on IR Research, ECIR 2020, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 12036 LNCS, Cham, pp. 289-296, 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, 14 Apr 2020. https://doi.org/10.1007/978-3-030-45442-5_36
Morris, D., Müller-Budack, E., & Ewerth, R. (2020). SlideImages: A dataset for educational image classification. In J. M. Jose, E. Yilmaz, J. Magalhães, F. Martins, P. Castells, N. Ferro, & M. J. Silva (Eds.), Advances in Information Retrieval: 42nd European Conference on IR Research, ECIR 2020, Proceedings (pp. 289-296). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12036 LNCS). https://doi.org/10.1007/978-3-030-45442-5_36
Morris D, Müller-Budack E, Ewerth R. SlideImages: A dataset for educational image classification. In Jose JM, Yilmaz E, Magalhães J, Martins F, Castells P, Ferro N, Silva MJ, editors, Advances in Information Retrieval: 42nd European Conference on IR Research, ECIR 2020, Proceedings. Cham. 2020. p. 289-296. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-030-45442-5_36
Morris, David ; Müller-Budack, Eric ; Ewerth, Ralph. / SlideImages : A dataset for educational image classification. Advances in Information Retrieval: 42nd European Conference on IR Research, ECIR 2020, Proceedings. editor / Joemon M. Jose ; Emine Yilmaz ; João Magalhães ; Flávio Martins ; Pablo Castells ; Nicola Ferro ; Mário J. Silva. Cham, 2020. pp. 289-296 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{301aba64716f4d3a8338f26c68c4157b,
title = "SlideImages: A dataset for educational image classification",
abstract = "In the past few years, convolutional neural networks (CNNs) have achieved impressive results in computer vision tasks, which however mainly focus on photos with natural scene content. Besides, non-sensor derived images such as illustrations, data visualizations, figures, etc. are typically used to convey complex information or to explore large datasets. However, this kind of images has received little attention in computer vision. CNNs and similar techniques use large volumes of training data. Currently, many document analysis systems are trained in part on scene images due to the lack of large datasets of educational image data. In this paper, we address this issue and present SlideImages, a dataset for the task of classifying educational illustrations. SlideImages contains training data collected from various sources, e.g., Wikimedia Commons and the AI2D dataset, and test data collected from educational slides. We have reserved all the actual educational images as a test dataset in order to ensure that the approaches using this dataset generalize well to new educational images, and potentially other domains. Furthermore, we present a baseline system using a standard deep neural architecture and discuss dealing with the challenge of limited training data.",
keywords = "Classification dataset, Document figure classification, Educational documents",
author = "David Morris and Eric M{\"u}ller-Budack and Ralph Ewerth",
note = "Funding information: Acknowledgement. This work is financially supported by the German Federal Ministry of Education and Research (BMBF) and European Social Fund (ESF) (InclusiveOCW project, no. 01PE17004).; 42nd European Conference on IR Research, ECIR 2020 ; Conference date: 14-04-2020 Through 17-04-2020",
year = "2020",
month = apr,
day = "8",
doi = "10.1007/978-3-030-45442-5_36",
language = "English",
isbn = "9783030454418",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "289--296",
editor = "Jose, {Joemon M.} and Emine Yilmaz and Jo{\~a}o Magalh{\~a}es and Fl{\'a}vio Martins and Pablo Castells and Nicola Ferro and Silva, {M{\'a}rio J.}",
booktitle = "Advances in Information Retrieval",

}

TY - GEN

T1 - SlideImages

T2 - 42nd European Conference on IR Research, ECIR 2020

AU - Morris, David

AU - Müller-Budack, Eric

AU - Ewerth, Ralph

N1 - Funding information: Acknowledgement. This work is financially supported by the German Federal Ministry of Education and Research (BMBF) and European Social Fund (ESF) (InclusiveOCW project, no. 01PE17004).

PY - 2020/4/8

Y1 - 2020/4/8

AB - In the past few years, convolutional neural networks (CNNs) have achieved impressive results in computer vision tasks, which however mainly focus on photos with natural scene content. Besides, non-sensor derived images such as illustrations, data visualizations, figures, etc. are typically used to convey complex information or to explore large datasets. However, this kind of images has received little attention in computer vision. CNNs and similar techniques use large volumes of training data. Currently, many document analysis systems are trained in part on scene images due to the lack of large datasets of educational image data. In this paper, we address this issue and present SlideImages, a dataset for the task of classifying educational illustrations. SlideImages contains training data collected from various sources, e.g., Wikimedia Commons and the AI2D dataset, and test data collected from educational slides. We have reserved all the actual educational images as a test dataset in order to ensure that the approaches using this dataset generalize well to new educational images, and potentially other domains. Furthermore, we present a baseline system using a standard deep neural architecture and discuss dealing with the challenge of limited training data.

KW - Classification dataset

KW - Document figure classification

KW - Educational documents

UR - http://www.scopus.com/inward/record.url?scp=85084183613&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-45442-5_36

DO - 10.1007/978-3-030-45442-5_36

M3 - Conference contribution

AN - SCOPUS:85084183613

SN - 9783030454418

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 289

EP - 296

BT - Advances in Information Retrieval

A2 - Jose, Joemon M.

A2 - Yilmaz, Emine

A2 - Magalhães, João

A2 - Martins, Flávio

A2 - Castells, Pablo

A2 - Ferro, Nicola

A2 - Silva, Mário J.

CY - Cham

Y2 - 14 April 2020 through 17 April 2020

ER -