Boosted unsupervised multi-source selection for domain adaption

Publication: Contribution to journal, conference article in journal, research, peer-reviewed

Authors

K. Vogt, A. Paul, J. Ostermann, F. Rottensteiner, C. Heipke

Details

Original language: English
Pages (from-to): 229-236
Number of pages: 8
Journal: ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Volume: 4
Issue number: 1W1
Publication status: Published - 30 May 2017
Event: ISPRS Hannover Workshop 2017: HRIGI - High-Resolution Earth Imaging for Geospatial Information, CMRT - City Models, Roads and Traffic, ISA - Image Sequence Analysis, EuroCOW - European Calibration and Orientation Workshop - Hannover, Germany
Duration: 6 June 2017 to 9 June 2017

Abstract

Supervised machine learning needs high quality, densely sampled and labelled training data. Transfer learning (TL) techniques have been devised to reduce this dependency by adapting classifiers trained on different, but related, (source) training data to new (target) data sets. A problem in TL is how to quantify the relatedness of a source quickly and robustly, because transferring knowledge from unrelated data can degrade the performance of a classifier. In this paper, we propose a method that can select a nearly optimal source from a large number of candidate sources. This operation depends only on the marginal probability distributions of the data, thus allowing the use of the often abundant unlabelled data. We extend this method to multi-source selection by optimizing a weighted combination of sources. The source weights are computed using a very fast boosting-like optimization scheme. The run-time complexity of our method scales linearly in regard to the number of candidate sources and the size of the training set and is thus applicable to very large data sets. We also propose a modification of an existing TL algorithm to handle multiple weighted training sets. Our method is evaluated on five survey regions. The experiments show that our source selection method is effective in discriminating between related and unrelated sources, almost always generating results within 3% in overall accuracy of a classifier based on fully labelled training data. We also show that using the selected source as training data for a TL method will additionally result in a performance improvement.
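
The abstract does not state which distance between marginal distributions the authors use, nor the exact form of their boosting-like update. The following is therefore only a minimal sketch of the general idea, under loudly stated assumptions: each data set is summarised by its feature mean (a linear-kernel mean embedding, chosen here purely for illustration), the single best source is the one whose mean is closest to the target's, and multi-source weights are built greedily by adding, in each round, the source that moves the weighted mixture closest to the target. The function names `mean_embedding`, `select_single_source` and `greedy_source_weights` are hypothetical and do not come from the paper.

```python
import numpy as np


def mean_embedding(X):
    """Feature mean of a data set: a crude stand-in for its marginal distribution."""
    return np.asarray(X, dtype=float).mean(axis=0)


def select_single_source(sources, target):
    """Pick the source whose feature mean is closest to the target's (no labels used)."""
    mu_t = mean_embedding(target)
    dists = [float(np.linalg.norm(mean_embedding(S) - mu_t)) for S in sources]
    return int(np.argmin(dists)), dists


def greedy_source_weights(sources, target, n_rounds=20):
    """Boosting-like greedy weighting (illustrative assumption, not the paper's scheme).

    In round r, every source is tried as the next addition with step size 1/r,
    and the one that brings the mixture mean closest to the target mean is kept.
    The final weight of a source is the fraction of rounds in which it was chosen.
    """
    mu_t = mean_embedding(target)
    # One pass over each source: cost is linear in the number of sources and samples.
    mus = np.stack([mean_embedding(S) for S in sources])
    counts = np.zeros(len(sources))
    mix = np.zeros_like(mu_t)
    for r in range(1, n_rounds + 1):
        candidates = mix + (mus - mix) / r  # mixture mean after adding each source
        k = int(np.argmin(np.linalg.norm(candidates - mu_t, axis=1)))
        mix = candidates[k]
        counts[k] += 1
    return counts / counts.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.normal(0.0, 1.0, size=(500, 2))       # unlabelled target samples
    sources = [
        rng.normal(0.2, 1.0, size=(500, 2)),           # related source
        rng.normal(5.0, 1.0, size=(500, 2)),           # unrelated source
        rng.normal(-0.2, 1.0, size=(500, 2)),          # related source
    ]
    best, dists = select_single_source(sources, target)
    print("closest single source:", best)
    print("source weights:", greedy_source_weights(sources, target))
```

Because each round only compares precomputed summaries, the cost per round grows linearly with the number of candidate sources, which mirrors the linear scaling claimed in the abstract without reproducing the authors' actual algorithm. The resulting weights could then be passed as per-sample weights to any classifier that accepts them.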

Cite this

Boosted unsupervised multi-source selection for domain adaption. / Vogt, K.; Paul, A.; Ostermann, J. et al.
In: ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol. 4, No. 1W1, 30.05.2017, p. 229-236.

Vogt, K, Paul, A, Ostermann, J, Rottensteiner, F & Heipke, C 2017, 'Boosted unsupervised multi-source selection for domain adaption', ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 4, no. 1W1, pp. 229-236. https://doi.org/10.5194/isprs-annals-IV-1-W1-229-2017
Vogt, K., Paul, A., Ostermann, J., Rottensteiner, F., & Heipke, C. (2017). Boosted unsupervised multi-source selection for domain adaption. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 4(1W1), 229-236. https://doi.org/10.5194/isprs-annals-IV-1-W1-229-2017
Vogt K, Paul A, Ostermann J, Rottensteiner F, Heipke C. Boosted unsupervised multi-source selection for domain adaption. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences. 2017 May 30;4(1W1):229-236. doi: 10.5194/isprs-annals-IV-1-W1-229-2017
Vogt, K. ; Paul, A. ; Ostermann, J. et al. / Boosted unsupervised multi-source selection for domain adaption. In: ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences. 2017 ; Vol. 4, No. 1W1. pp. 229-236.
BibTeX
@article{abc0858b8eb9409da96ec9d5c6d9ddc1,
title = "Boosted unsupervised multi-source selection for domain adaption",
abstract = "Supervised machine learning needs high quality, densely sampled and labelled training data. Transfer learning (TL) techniques have been devised to reduce this dependency by adapting classifiers trained on different, but related, (source) training data to new (target) data sets. A problem in TL is how to quantify the relatedness of a source quickly and robustly, because transferring knowledge from unrelated data can degrade the performance of a classifier. In this paper, we propose a method that can select a nearly optimal source from a large number of candidate sources. This operation depends only on the marginal probability distributions of the data, thus allowing the use of the often abundant unlabelled data. We extend this method to multi-source selection by optimizing a weighted combination of sources. The source weights are computed using a very fast boosting-like optimization scheme. The run-time complexity of our method scales linearly in regard to the number of candidate sources and the size of the training set and is thus applicable to very large data sets. We also propose a modification of an existing TL algorithm to handle multiple weighted training sets. Our method is evaluated on five survey regions. The experiments show that our source selection method is effective in discriminating between related and unrelated sources, almost always generating results within 3% in overall accuracy of a classifier based on fully labelled training data. We also show that using the selected source as training data for a TL method will additionally result in a performance improvement.",
keywords = "Domain Adaptation, Machine Learning, Negative Transfer, Remote Sensing, Source Selection, Transfer Learning",
author = "K. Vogt and A. Paul and J. Ostermann and F. Rottensteiner and C. Heipke",
note = "Funding information: This work was supported by the German Science Foundation (DFG) under grants OS 295/4-1 and HE 1822/30-1 and in the context of the research training group GRK2159 (i.c.sens). The Vaihingen data set was provided by the German Society for Photogrammetry, Remote Sensing and Geoinformation (DGPF) (Cramer, 2010): http://www.ifp.uni-stuttgart.de/dgpf/DKEP-Allg.html.; ISPRS Hannover Workshop 2017 on High-Resolution Earth Imaging for Geospatial Information, HRIGI 2017, City Models, Roads and Traffic, CMRT 2017, Image Sequence Analysis, ISA 2017, European Calibration and Orientation Workshop, EuroCOW 2017 ; Conference date: 06-06-2017 Through 09-06-2017",
year = "2017",
month = may,
day = "30",
doi = "10.5194/isprs-annals-IV-1-W1-229-2017",
language = "English",
volume = "4",
pages = "229--236",
number = "1W1",
}

RIS

TY - JOUR
T1 - Boosted unsupervised multi-source selection for domain adaption
AU - Vogt, K.
AU - Paul, A.
AU - Ostermann, J.
AU - Rottensteiner, F.
AU - Heipke, C.
N1 - Funding information: This work was supported by the German Science Foundation (DFG) under grants OS 295/4-1 and HE 1822/30-1 and in the context of the research training group GRK2159 (i.c.sens). The Vaihingen data set was provided by the German Society for Photogrammetry, Remote Sensing and Geoinformation (DGPF) (Cramer, 2010): http://www.ifp.uni-stuttgart.de/dgpf/DKEP-Allg.html.
PY - 2017/5/30
Y1 - 2017/5/30
AB - Supervised machine learning needs high quality, densely sampled and labelled training data. Transfer learning (TL) techniques have been devised to reduce this dependency by adapting classifiers trained on different, but related, (source) training data to new (target) data sets. A problem in TL is how to quantify the relatedness of a source quickly and robustly, because transferring knowledge from unrelated data can degrade the performance of a classifier. In this paper, we propose a method that can select a nearly optimal source from a large number of candidate sources. This operation depends only on the marginal probability distributions of the data, thus allowing the use of the often abundant unlabelled data. We extend this method to multi-source selection by optimizing a weighted combination of sources. The source weights are computed using a very fast boosting-like optimization scheme. The run-time complexity of our method scales linearly in regard to the number of candidate sources and the size of the training set and is thus applicable to very large data sets. We also propose a modification of an existing TL algorithm to handle multiple weighted training sets. Our method is evaluated on five survey regions. The experiments show that our source selection method is effective in discriminating between related and unrelated sources, almost always generating results within 3% in overall accuracy of a classifier based on fully labelled training data. We also show that using the selected source as training data for a TL method will additionally result in a performance improvement.

KW - Domain Adaptation
KW - Machine Learning
KW - Negative Transfer
KW - Remote Sensing
KW - Source Selection
KW - Transfer Learning
UR - http://www.scopus.com/inward/record.url?scp=85026639000&partnerID=8YFLogxK
U2 - 10.5194/isprs-annals-IV-1-W1-229-2017
DO - 10.5194/isprs-annals-IV-1-W1-229-2017
M3 - Conference article
AN - SCOPUS:85026639000
VL - 4
SP - 229
EP - 236
JO - ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
JF - ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
SN - 2194-9042
IS - 1W1
T2 - ISPRS Hannover Workshop 2017 on High-Resolution Earth Imaging for Geospatial Information, HRIGI 2017, City Models, Roads and Traffic, CMRT 2017, Image Sequence Analysis, ISA 2017, European Calibration and Orientation Workshop, EuroCOW 2017
Y2 - 6 June 2017 through 9 June 2017
ER -
