Details
Original language | English |
---|---|
Pages (from-to) | 229-236 |
Number of pages | 8 |
Journal | ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences |
Volume | 4 |
Issue number | 1W1 |
Publication status | Published - 30 May 2017 |
Event | ISPRS Hannover Workshop 2017 on High-Resolution Earth Imaging for Geospatial Information, HRIGI 2017, City Models, Roads and Traffic, CMRT 2017, Image Sequence Analysis, ISA 2017, European Calibration and Orientation Workshop, EuroCOW 2017: HRIGI - High-Resolution Earth Imaging for Geospatial Information, CMRT - City Models, Roads and Traffic, ISA - Image Sequence Analysis, EuroCOW - European Calibration and Orientation Workshop - Hannover, Hannover, Germany. Duration: 6 Jun 2017 → 9 Jun 2017 |
Abstract
Supervised machine learning needs high-quality, densely sampled and labelled training data. Transfer learning (TL) techniques have been devised to reduce this dependency by adapting classifiers trained on different, but related, (source) training data to new (target) data sets. A problem in TL is how to quantify the relatedness of a source quickly and robustly, because transferring knowledge from unrelated data can degrade the performance of a classifier. In this paper, we propose a method that can select a nearly optimal source from a large number of candidate sources. This operation depends only on the marginal probability distributions of the data, thus allowing the use of the often abundant unlabelled data. We extend this method to multi-source selection by optimizing a weighted combination of sources. The source weights are computed using a very fast boosting-like optimization scheme. The run-time complexity of our method scales linearly with the number of candidate sources and the size of the training set, and the method is thus applicable to very large data sets. We also propose a modification of an existing TL algorithm to handle multiple weighted training sets. Our method is evaluated on five survey regions. The experiments show that our source selection method is effective in discriminating between related and unrelated sources, almost always generating results within 3% in overall accuracy of a classifier based on fully labelled training data. We also show that using the selected source as training data for a TL method yields an additional performance improvement.
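The abstract does not state which distance between marginal distributions is used, nor the exact form of the boosting-like optimization. The following is therefore only a minimal sketch of the general idea, under loudly stated assumptions: each data set is summarized by its mean feature vector (a crude stand-in for the marginal distribution), a single source is chosen by the smallest squared distance to the target mean, and multi-source weights are built by a greedy, Frank-Wolfe-style update that adds one source per round. The function names `mean_embedding`, `select_single_source` and `greedy_source_weights` are hypothetical and do not come from the paper. No labels are needed at any point, and the cost grows linearly with the number of sources and samples.

```python
import numpy as np


def mean_embedding(X):
    """Mean feature vector; an assumed, crude summary of the marginal p(x)."""
    return np.asarray(X, dtype=float).mean(axis=0)


def select_single_source(sources, target):
    """Return (index, distances) of the source whose mean is closest to the target's."""
    mu_t = mean_embedding(target)
    dists = [float(np.sum((mean_embedding(S) - mu_t) ** 2)) for S in sources]
    return int(np.argmin(dists)), dists


def greedy_source_weights(sources, target, n_rounds=20):
    """Greedy convex weights over sources so the mixed mean approaches the target mean.

    Illustrative assumption, not the paper's scheme: in round t the mixture
    moves a step of 1/(t+1) toward whichever source reduces the distance most.
    """
    mus = np.stack([mean_embedding(S) for S in sources])  # shape (k, d)
    mu_t = mean_embedding(target)
    w = np.zeros(len(sources))
    mix = np.zeros_like(mu_t)
    for t in range(n_rounds):
        step = 1.0 / (t + 1)                     # round 0 picks the single best source
        cand = (1.0 - step) * mix + step * mus   # mixture mean after adding each source
        j = int(np.argmin(np.sum((cand - mu_t) ** 2, axis=1)))
        w *= (1.0 - step)                        # shrink old weights, keep the sum at 1
        w[j] += step
        mix = cand[j]
    return w


if __name__ == "__main__":
    target = np.array([[0.0, 0.0], [2.0, 2.0]])       # unlabelled target samples
    sources = [
        np.array([[10.0, 10.0], [12.0, 12.0]]),       # unrelated source
        np.array([[0.0, 1.0], [1.0, 1.0]]),           # related source
        np.array([[1.0, 1.0], [3.0, 1.0]]),           # related source
    ]
    best, dists = select_single_source(sources, target)
    print("best single source:", best)
    print("source weights:", greedy_source_weights(sources, target))
```

In this toy setting the unrelated source never receives weight, which mirrors the stated goal of avoiding transfer from unrelated data. A richer distribution distance (for example a kernel-based one) could replace the mean comparison without changing the structure of the loop.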
Keywords
- Domain Adaptation, Machine Learning, Negative Transfer, Remote Sensing, Source Selection, Transfer Learning
ASJC Scopus subject areas
- Earth and Planetary Sciences (all)
- Earth and Planetary Sciences (miscellaneous)
- Environmental Science (all)
- Environmental Science (miscellaneous)
- Physics and Astronomy (all)
- Instrumentation
Cite this
In: ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol. 4, No. 1W1, 30.05.2017, p. 229-236.
Research output: Contribution to journal › Conference article › Research › peer review
TY - JOUR
T1 - Boosted unsupervised multi-source selection for domain adaptation
AU - Vogt, K.
AU - Paul, A.
AU - Ostermann, J.
AU - Rottensteiner, F.
AU - Heipke, C.
N1 - Funding information: This work was supported by the German Science Foundation (DFG) under grants OS 295/4-1 and HE 1822/30-1 and in the context of the research training group GRK2159 (i.c.sens). The Vaihingen data set was provided by the German Society for Photogrammetry, Remote Sensing and Geoinformation (DGPF) (Cramer, 2010): http://www.ifp.uni-stuttgart.de/dgpf/DKEP-Allg.html.
PY - 2017/5/30
Y1 - 2017/5/30
N2 - Supervised machine learning needs high quality, densely sampled and labelled training data. Transfer learning (TL) techniques have been devised to reduce this dependency by adapting classifiers trained on different, but related, (source) training data to new (target) data sets. A problem in TL is how to quantify the relatedness of a source quickly and robustly, because transferring knowledge from unrelated data can degrade the performance of a classifier. In this paper, we propose a method that can select a nearly optimal source from a large number of candidate sources. This operation depends only on the marginal probability distributions of the data, thus allowing the use of the often abundant unlabelled data. We extend this method to multi-source selection by optimizing a weighted combination of sources. The source weights are computed using a very fast boosting-like optimization scheme. The run-time complexity of our method scales linearly in regard to the number of candidate sources and the size of the training set and is thus applicable to very large data sets. We also propose a modification of an existing TL algorithm to handle multiple weighted training sets. Our method is evaluated on five survey regions. The experiments show that our source selection method is effective in discriminating between related and unrelated sources, almost always generating results within 3% in overall accuracy of a classifier based on fully labelled training data. We also show that using the selected source as training data for a TL method will additionally result in a performance improvement.
KW - Domain Adaptation
KW - Machine Learning
KW - Negative Transfer
KW - Remote Sensing
KW - Source Selection
KW - Transfer Learning
UR - http://www.scopus.com/inward/record.url?scp=85026639000&partnerID=8YFLogxK
U2 - 10.5194/isprs-annals-IV-1-W1-229-2017
DO - 10.5194/isprs-annals-IV-1-W1-229-2017
M3 - Conference article
AN - SCOPUS:85026639000
VL - 4
SP - 229
EP - 236
JO - ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
JF - ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
SN - 2194-9042
IS - 1W1
T2 - ISPRS Hannover Workshop 2017 on High-Resolution Earth Imaging for Geospatial Information, HRIGI 2017, City Models, Roads and Traffic, CMRT 2017, Image Sequence Analysis, ISA 2017, European Calibration and Orientation Workshop, EuroCOW 2017
Y2 - 6 June 2017 through 9 June 2017
ER -