Details
Original language | English |
---|---|
Title of host publication | Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021 |
Publisher | IEEE Computer Society |
Pages | 14590-14599 |
Number of pages | 10 |
ISBN (electronic) | 9781665445092 |
Publication status | Published - 2021 |
Event | 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021 - Nashville, United States Duration: 20 Jun 2021 → 25 Jun 2021 |
Publication series
Name | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition |
---|---|
ISSN (Print) | 1063-6919 |
Abstract
In many real-world applications, the relative depth of objects in an image is crucial for scene understanding. Recent approaches mainly tackle the problem of depth prediction in monocular images by treating the problem as a regression task. Yet, being interested in an order relation in the first place, ranking methods suggest themselves as a natural alternative to regression, and indeed, ranking approaches leveraging pairwise comparisons as training information (“object A is closer to the camera than B”) have shown promising performance on this problem. In this paper, we elaborate on the use of so-called listwise ranking as a generalization of the pairwise approach. Our method is based on the Plackett-Luce (PL) model, a probability distribution on rankings, which we combine with a state-of-the-art neural network architecture and a simple sampling strategy to reduce training complexity. Moreover, taking advantage of the representation of PL as a random utility model, the proposed predictor offers a natural way to recover (shift-invariant) metric depth information from ranking-only data provided at training time. An empirical evaluation on several benchmark datasets in a “zero-shot” setting demonstrates the effectiveness of our approach compared to existing ranking and regression methods.
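As an illustration of the Plackett-Luce model named in the abstract, the following minimal sketch computes the log-probability of one ranking from per-item scores. It is not the authors' implementation. The function name `pl_log_likelihood` and the convention that scores are log-utilities (for example, hypothetical network outputs for sampled pixels) are assumptions made for this example, and the network architecture and sampling strategy are not shown.

```python
import math


def pl_log_likelihood(scores, ranking):
    """Log-probability of `ranking` under a Plackett-Luce model.

    scores:  real-valued log-utilities, one per item (assumed, e.g. model outputs).
    ranking: item indices ordered from first place to last place.
    """
    log_p = 0.0
    for k in range(len(ranking)):
        # Items that have not been placed before position k.
        remaining = [scores[i] for i in ranking[k:]]
        # Numerically stable log-sum-exp over the remaining items.
        m = max(remaining)
        log_norm = m + math.log(sum(math.exp(s - m) for s in remaining))
        # Probability of picking item ranking[k] among the remaining ones.
        log_p += scores[ranking[k]] - log_norm
    return log_p
```

Under these assumptions, adding the same constant to every score leaves the result unchanged, which is consistent with the abstract's remark that depth is recovered only up to a shift. With two items the expression reduces to a logistic function of the score difference, which is the pairwise comparison case that the listwise approach generalizes.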
ASJC Scopus subject areas
- Computer Science(all)
- Software
- Computer Vision and Pattern Recognition
Cite this
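Monocular Depth Estimation via Listwise Ranking using the Plackett-Luce Model. / Lienen, Julian; Hüllermeier, Eyke; Ewerth, Ralph; Nommensen, Nils. Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021. IEEE Computer Society, 2021. p. 14590-14599 (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition).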
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Monocular Depth Estimation via Listwise Ranking using the Plackett-Luce Model
AU - Lienen, Julian
AU - Hüllermeier, Eyke
AU - Ewerth, Ralph
AU - Nommensen, Nils
N1 - Funding Information: Motivated by these promising results, we plan to elaborate on further improvements of the listwise ranking approach. This includes an investigation of the effect of varying the ranking size, as well as an extension toward learning from partial rankings and incorporating equality relations. In addition, as we only applied random sampling so far, we plan to develop more sophisticated sampling strategies leading to more informative rankings to learn from. Acknowledgement. This work was supported by the German Research Foundation (DFG) under Grant 3050231323. Moreover, computational resources were provided by the Paderborn Center for Parallel Computing (PC²).
PY - 2021
Y1 - 2021
UR - http://www.scopus.com/inward/record.url?scp=85123220274&partnerID=8YFLogxK
U2 - 10.48550/arXiv.2010.13118
DO - 10.48550/arXiv.2010.13118
M3 - Conference contribution
AN - SCOPUS:85123220274
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 14590
EP - 14599
BT - Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021
PB - IEEE Computer Society
T2 - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021
Y2 - 20 June 2021 through 25 June 2021
ER -