Learning convolutional neural networks for object detection with very little training data

Publication: Contribution to book/report/anthology/conference proceeding; Contribution to book/anthology; Research; Peer-reviewed

Authors

External organisations

  • University of Twente

Details

Original language: English
Title of host publication: Multimodal Scene Understanding
Subtitle: Algorithms, Applications and Deep Learning
Editors: Michael Ying Yang, Bodo Rosenhahn, Vittorio Murino
Publisher: Elsevier
Pages: 65-100
Number of pages: 36
ISBN (electronic): 9780128173589
Publication status: Published - 2 Aug 2019

Abstract

In recent years, convolutional neural networks have shown great success in various computer vision tasks such as classification, object detection, and scene analysis. These algorithms are usually trained on large datasets consisting of thousands or millions of labeled training examples. The availability of sufficient data, however, limits possible applications. While large amounts of data can be quickly collected, supervised learning further requires labeled data. Labeling data, unfortunately, is usually very time-consuming and expensive. This chapter addresses the problem of learning with very little labeled data for extracting information about the infrastructure in urban areas. The aim is to recognize particular traffic signs in crowdsourced data to collect information that is of interest to cyclists. The presented system for object detection is trained with very few training examples. To achieve this, the advantages of convolutional neural networks and random forests are combined to learn a patch-wise classifier. In the next step, the random forest is mapped to a neural network and the classifier is transformed to a fully convolutional network. Thereby, the processing of full images is significantly accelerated and bounding boxes can be predicted. Finally, GPS data is integrated to localize the predictions on the map, and multiple observations are merged to further improve the localization accuracy. In comparison to Faster R-CNN and other networks for object detection or algorithms for transfer learning, the required amount of labeled data is considerably reduced.
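The step from a patch-wise classifier to a fully convolutional network can be illustrated with a small sketch. This is not the chapter's implementation: a single linear layer stands in for the combined CNN and mapped random forest, and the names `classify_patch`, `sliding_window_scores`, and `fully_convolutional_scores` are hypothetical. The sketch only shows why applying the classifier as a convolution yields the same scores as classifying every patch separately, assuming NumPy 1.20 or newer.

```python
import numpy as np


def classify_patch(patch, weights, bias):
    """Hypothetical patch-wise linear classifier: one score per class for a k x k patch."""
    return weights.reshape(weights.shape[0], -1) @ patch.ravel() + bias


def sliding_window_scores(image, weights, bias):
    """Reference version: run the patch classifier at every image position."""
    n_classes, k, _ = weights.shape
    h, w = image.shape
    out = np.empty((n_classes, h - k + 1, w - k + 1))
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            out[:, i, j] = classify_patch(image[i:i + k, j:j + k], weights, bias)
    return out


def fully_convolutional_scores(image, weights, bias):
    """Same classifier written as one convolution (cross-correlation) over the full image."""
    n_classes, k, _ = weights.shape
    # windows has shape (h - k + 1, w - k + 1, k, k): every k x k patch as a view.
    windows = np.lib.stride_tricks.sliding_window_view(image, (k, k))
    # Contract each patch with each class kernel, then add the per-class bias.
    return np.einsum("ijab,cab->cij", windows, weights) + bias[:, None, None]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.normal(size=(8, 9))        # toy single-channel image
    weights = rng.normal(size=(3, 4, 4))   # 3 classes, 4 x 4 patches (assumed sizes)
    bias = rng.normal(size=3)
    dense = fully_convolutional_scores(image, weights, bias)
    naive = sliding_window_scores(image, weights, bias)
    print(np.allclose(dense, naive))
```

Both functions return a score map with one channel per class. The convolutional form computes it in a single vectorized pass instead of one classifier call per patch, and peaks in such a map are what a detector would turn into bounding boxes.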

ASJC Scopus subject areas

Cite this

Learning convolutional neural networks for object detection with very little training data. / Reinders, Christoph; Ackermann, Hanno; Yang, Michael Ying et al.
Multimodal Scene Understanding: Algorithms, Applications and Deep Learning. Ed. / Michael Ying Yang; Bodo Rosenhahn; Vittorio Murino. Elsevier, 2019. pp. 65-100.

Reinders, C, Ackermann, H, Yang, MY & Rosenhahn, B 2019, Learning convolutional neural networks for object detection with very little training data. in M Ying Yang, B Rosenhahn & V Murino (eds), Multimodal Scene Understanding: Algorithms, Applications and Deep Learning. Elsevier, pp. 65-100. https://doi.org/10.1016/b978-0-12-817358-9.00010-x
Reinders, C., Ackermann, H., Yang, M. Y., & Rosenhahn, B. (2019). Learning convolutional neural networks for object detection with very little training data. In M. Ying Yang, B. Rosenhahn, & V. Murino (Eds.), Multimodal Scene Understanding: Algorithms, Applications and Deep Learning (pp. 65-100). Elsevier. https://doi.org/10.1016/b978-0-12-817358-9.00010-x
Reinders C, Ackermann H, Yang MY, Rosenhahn B. Learning convolutional neural networks for object detection with very little training data. in Ying Yang M, Rosenhahn B, Murino V, editors, Multimodal Scene Understanding: Algorithms, Applications and Deep Learning. Elsevier. 2019. p. 65-100 doi: 10.1016/b978-0-12-817358-9.00010-x
Reinders, Christoph ; Ackermann, Hanno ; Yang, Michael Ying et al. / Learning convolutional neural networks for object detection with very little training data. Multimodal Scene Understanding: Algorithms, Applications and Deep Learning. Ed. / Michael Ying Yang ; Bodo Rosenhahn ; Vittorio Murino. Elsevier, 2019. pp. 65-100
@inbook{52c99bf849324dbeb958e05c7fb52560,
title = "Learning convolutional neural networks for object detection with very little training data",
abstract = "In recent years, convolutional neural networks have shown great success in various computer vision tasks such as classification, object detection, and scene analysis. These algorithms are usually trained on large datasets consisting of thousands or millions of labeled training examples. The availability of sufficient data, however, limits possible applications. While large amounts of data can be quickly collected, supervised learning further requires labeled data. Labeling data, unfortunately, is usually very time-consuming and literally expensive. This chapter addresses the problem of learning with very little labeled data for extracting information about the infrastructure in urban areas. The aim is to recognize particular traffic signs in crowdsourced data to collect information which is of interest to cyclists. The presented system for object detection is trained with very few training examples. To achieve this, the advantages of convolutional neural networks and random forests are combined to learn a patch-wise classifier. In the next step, the random forest is mapped to a neural network and the classifier is transformed to a fully convolutional network. Thereby, the processing of full images is significantly accelerated and bounding boxes can be predicted. Finally, GPS-data is integrated to localize the predictions on the map and multiple observations are merged to further improve the localization accuracy. In comparison to faster R-CNN and other networks for object detection or algorithms for transfer learning, the required amount of labeled data is considerably reduced.",
keywords = "Convolutional neural networks, Localization, Object detection, Random forests",
author = "Christoph Reinders and Hanno Ackermann and Yang, {Michael Ying} and Bodo Rosenhahn",
year = "2019",
month = aug,
day = "2",
doi = "10.1016/b978-0-12-817358-9.00010-x",
language = "English",
pages = "65--100",
editor = "{Ying Yang}, Michael and Bodo Rosenhahn and Vittorio Murino",
booktitle = "Multimodal Scene Understanding",
publisher = "Elsevier",
address = "Netherlands",

}

TY - CHAP

T1 - Learning convolutional neural networks for object detection with very little training data

AU - Reinders, Christoph

AU - Ackermann, Hanno

AU - Yang, Michael Ying

AU - Rosenhahn, Bodo

PY - 2019/8/2

Y1 - 2019/8/2

N2 - In recent years, convolutional neural networks have shown great success in various computer vision tasks such as classification, object detection, and scene analysis. These algorithms are usually trained on large datasets consisting of thousands or millions of labeled training examples. The availability of sufficient data, however, limits possible applications. While large amounts of data can be quickly collected, supervised learning further requires labeled data. Labeling data, unfortunately, is usually very time-consuming and literally expensive. This chapter addresses the problem of learning with very little labeled data for extracting information about the infrastructure in urban areas. The aim is to recognize particular traffic signs in crowdsourced data to collect information which is of interest to cyclists. The presented system for object detection is trained with very few training examples. To achieve this, the advantages of convolutional neural networks and random forests are combined to learn a patch-wise classifier. In the next step, the random forest is mapped to a neural network and the classifier is transformed to a fully convolutional network. Thereby, the processing of full images is significantly accelerated and bounding boxes can be predicted. Finally, GPS-data is integrated to localize the predictions on the map and multiple observations are merged to further improve the localization accuracy. In comparison to faster R-CNN and other networks for object detection or algorithms for transfer learning, the required amount of labeled data is considerably reduced.

AB - In recent years, convolutional neural networks have shown great success in various computer vision tasks such as classification, object detection, and scene analysis. These algorithms are usually trained on large datasets consisting of thousands or millions of labeled training examples. The availability of sufficient data, however, limits possible applications. While large amounts of data can be quickly collected, supervised learning further requires labeled data. Labeling data, unfortunately, is usually very time-consuming and literally expensive. This chapter addresses the problem of learning with very little labeled data for extracting information about the infrastructure in urban areas. The aim is to recognize particular traffic signs in crowdsourced data to collect information which is of interest to cyclists. The presented system for object detection is trained with very few training examples. To achieve this, the advantages of convolutional neural networks and random forests are combined to learn a patch-wise classifier. In the next step, the random forest is mapped to a neural network and the classifier is transformed to a fully convolutional network. Thereby, the processing of full images is significantly accelerated and bounding boxes can be predicted. Finally, GPS-data is integrated to localize the predictions on the map and multiple observations are merged to further improve the localization accuracy. In comparison to faster R-CNN and other networks for object detection or algorithms for transfer learning, the required amount of labeled data is considerably reduced.

KW - Convolutional neural networks

KW - Localization

KW - Object detection

KW - Random forests

UR - http://www.scopus.com/inward/record.url?scp=85082047720&partnerID=8YFLogxK

U2 - 10.1016/b978-0-12-817358-9.00010-x

DO - 10.1016/b978-0-12-817358-9.00010-x

M3 - Contribution to book/anthology

AN - SCOPUS:85082047720

SP - 65

EP - 100

BT - Multimodal Scene Understanding

A2 - Ying Yang, Michael

A2 - Rosenhahn, Bodo

A2 - Murino, Vittorio

PB - Elsevier

ER -
