Details
Original language | English |
---|---|
Pages (from-to) | 365-380 |
Number of pages | 16 |
Journal | PFG - Journal of Photogrammetry, Remote Sensing and Geoinformation Science |
Volume | 91 |
Issue number | 5 |
Early online date | 28 Jul 2023 |
Publication status | Published - Oct 2023 |
Abstract
Dense depth information can be reconstructed from stereo images using conventional hand-crafted as well as deep learning-based approaches. While deep learning-based methods often show superior results compared to hand-crafted ones, they commonly learn the geometric principles underlying the matching task from scratch and neglect that these principles have already been studied intensively and were considered explicitly in various models with great success in the past. As a consequence, a broad range of principles and associated features needs to be learned, which limits the capacity to focus on the details required to succeed in challenging image regions, such as those close to depth discontinuities, on thin objects, and in weakly textured areas. To overcome this limitation, this work presents a hybrid technique for dense stereo matching, i.e., a combination of conventional hand-crafted and deep learning-based methods. More precisely, the input RGB stereo images are supplemented by a fourth image channel containing feature information obtained with a method based on expert knowledge. In addition, the assumption that edges in an image and discontinuities in the corresponding depth map coincide is modeled explicitly, which allows the probability of being located next to a depth discontinuity to be predicted per pixel. This information is used to guide the matching process; it helps to sharpen correct depth discontinuities and to avoid the false prediction of such discontinuities, especially in weakly textured areas. The performance of the proposed method is investigated on three different data sets, including studies on the influence of the two methodological components as well as on the generalization capability. The results demonstrate that the presented hybrid approach can help to mitigate common limitations of deep learning-based methods and improves the quality of the estimated depth maps.
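To make the idea of a fourth input channel more concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not name the expert-knowledge feature, so a normalized image gradient magnitude is used here purely as an assumed stand-in; the function names `to_gray`, `gradient_magnitude` and `add_feature_channel` are hypothetical, and images are plain nested lists to keep the example dependency-free.

```python
from typing import List

Image = List[List[List[float]]]  # rows x columns x channels


def to_gray(rgb: Image) -> List[List[float]]:
    """Average the three color channels of every pixel."""
    return [[sum(px[:3]) / 3.0 for px in row] for row in rgb]


def gradient_magnitude(gray: List[List[float]]) -> List[List[float]]:
    """Gradient magnitude from differences with indices clamped at the border."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = (gray[y][min(x + 1, w - 1)] - gray[y][max(x - 1, 0)]) / 2.0
            gy = (gray[min(y + 1, h - 1)][x] - gray[max(y - 1, 0)][x]) / 2.0
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def add_feature_channel(rgb: Image) -> Image:
    """Append an assumed hand-crafted feature (scaled to [0, 1]) as channel 4."""
    mag = gradient_magnitude(to_gray(rgb))
    peak = max(max(row) for row in mag)
    scale = 1.0 / peak if peak > 0 else 0.0  # flat images get an all-zero channel
    return [
        [list(px) + [m * scale] for px, m in zip(row, mag_row)]
        for row, mag_row in zip(rgb, mag)
    ]


# Tiny example: a vertical step edge between a dark and a bright half.
step_image = [
    [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
    [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
]
rgbf = add_feature_channel(step_image)
```

In this sketch the extra channel responds only at the two columns adjacent to the step, which loosely mirrors the stated assumption that image edges and depth discontinuities tend to coincide. A network consuming such input would simply expect four channels instead of three.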
Keywords
- 3D reconstruction, Depth estimation, Hybrid technique, Image matching
ASJC Scopus subject areas
- Social Sciences(all)
- Geography, Planning and Development
- Physics and Astronomy(all)
- Instrumentation
- Earth and Planetary Sciences(all)
- Earth and Planetary Sciences (miscellaneous)
Cite this
In: PFG - Journal of Photogrammetry, Remote Sensing and Geoinformation Science, Vol. 91, No. 5, 10.2023, p. 365-380.
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - Guiding Deep Learning with Expert Knowledge for Dense Stereo Matching
AU - Iqbal, Waseem
AU - Paffenholz, Jens André
AU - Mehltretter, Max
N1 - Funding Information: Open Access funding enabled and organized by Projekt DEAL
PY - 2023/10
Y1 - 2023/10
AB - Dense depth information can be reconstructed from stereo images using conventional hand-crafted as well as deep learning-based approaches. While deep-learning methods often show superior results compared to hand-crafted ones, they commonly learn geometric principles underlying the matching task from scratch and neglect that these principles have already been intensively studied and were considered explicitly in various models with great success in the past. In consequence, a broad range of principles and associated features need to be learned, limiting the possibility to focus on important details to also succeed in challenging image regions, such as close to depth discontinuities, thin objects and in weakly textured areas. To overcome this limitation, in this work, a hybrid technique, i.e., a combination of conventional hand-crafted and deep learning-based methods, is presented, addressing the task of dense stereo matching. More precisely, the input RGB stereo images are supplemented by a fourth image channel containing feature information obtained with a method based on expert knowledge. In addition, the assumption that edges in an image and discontinuities in the corresponding depth map coincide is modeled explicitly, allowing to predict the probability of being located next to a depth discontinuity per pixel. This information is used to guide the matching process and helps to sharpen correct depth discontinuities and to avoid the false prediction of such discontinuities, especially in weakly textured areas. The performance of the proposed method is investigated on three different data sets, including studies on the influence of the two methodological components as well as on the generalization capability. The results demonstrate that the presented hybrid approach can help to mitigate common limitations of deep learning-based methods and improves the quality of the estimated depth maps.
KW - 3D reconstruction
KW - Depth estimation
KW - Hybrid technique
KW - Image matching
UR - http://www.scopus.com/inward/record.url?scp=85165968631&partnerID=8YFLogxK
U2 - 10.1007/s41064-023-00252-0
DO - 10.1007/s41064-023-00252-0
M3 - Article
AN - SCOPUS:85165968631
VL - 91
SP - 365
EP - 380
JO - PFG - Journal of Photogrammetry, Remote Sensing and Geoinformation Science
JF - PFG - Journal of Photogrammetry, Remote Sensing and Geoinformation Science
SN - 2512-2789
IS - 5
ER -