Interpretable Semantic Photo Geolocation

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

  • Jonas Theiner
  • Eric Müller-Budack
  • Ralph Ewerth

External Research Organisations

  • German National Library of Science and Technology (TIB)

Details

Original language: English
Title of host publication: 2022 IEEE Winter Conference on Applications of Computer Vision
Subtitle of host publication: WACV 2022
Pages: 1474-1484
Number of pages: 11
ISBN (electronic): 978-1-6654-0915-5
Publication status: Published - 2022

Abstract

Planet-scale photo geolocalization is the complex task of estimating the location depicted in an image solely based on its visual content. Due to the success of convolutional neural networks (CNNs), current approaches achieve super-human performance. However, previous work has exclusively focused on optimizing geolocalization accuracy. Due to the black-box property of deep learning systems, their predictions are difficult for humans to validate. State-of-the-art methods treat the task as a classification problem, where the choice of classes, that is, the partitioning of the world map, is crucial for performance. In this paper, we present two contributions to improve the interpretability of a geolocalization model: (1) We propose a novel semantic partitioning method which intuitively leads to an improved understanding of the predictions, while achieving state-of-the-art results for geolocational accuracy on benchmark test sets; (2) We introduce a metric to assess the importance of semantic visual concepts for a certain prediction to provide additional interpretable information, which allows for a large-scale analysis of already trained models. Source code and dataset are publicly available.
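
To make the classification formulation in the abstract concrete, the following minimal sketch (not taken from the paper; the fixed latitude/longitude grid, the helper names, and the sample coordinates are hypothetical stand-ins for the authors' semantic partitioning) shows the basic pipeline: each training photo is labelled with the map cell containing its GPS tag, a classifier predicts a cell, and the prediction is mapped back to a coordinate whose great-circle distance to the ground truth is what the "accuracy within X km" benchmarks measure.

    # Illustrative sketch of geolocation-as-classification (not the authors' code):
    # the globe is split into cells, each photo is labelled with the cell
    # containing its GPS tag, a classifier predicts a cell, and the predicted
    # cell is mapped back to a coordinate (here: the mean of its training tags).
    import math
    from collections import defaultdict

    def cell_id(lat, lon, deg=5.0):
        # Stand-in partitioning: a fixed lat/lon grid with `deg`-degree cells.
        # The paper instead derives cells from semantic regions; this simple
        # grid only illustrates how classes are constructed from coordinates.
        return (int(math.floor(lat / deg)), int(math.floor(lon / deg)))

    def cell_centroids(train_coords, deg=5.0):
        # Represent each class (cell) by the mean coordinate of its training images.
        buckets = defaultdict(list)
        for lat, lon in train_coords:
            buckets[cell_id(lat, lon, deg)].append((lat, lon))
        return {c: (sum(p[0] for p in pts) / len(pts),
                    sum(p[1] for p in pts) / len(pts))
                for c, pts in buckets.items()}

    def great_circle_km(a, b, radius_km=6371.0):
        # Haversine distance, used for "accuracy within X km" evaluation.
        lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
        h = (math.sin((lat2 - lat1) / 2) ** 2
             + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
        return 2 * radius_km * math.asin(math.sqrt(h))

    # Toy usage: three training coordinates define the classes; a predicted
    # class is converted back to its centroid and scored by great-circle error.
    train = [(52.37, 9.73), (52.52, 13.40), (48.86, 2.35)]
    centroids = cell_centroids(train)
    predicted_cell = cell_id(52.37, 9.73)   # pretend the CNN chose this class
    error_km = great_circle_km(centroids[predicted_cell], (52.40, 9.70))
    print(f"geolocation error: {error_km:.1f} km")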

Keywords

    cs.CV, Evaluation and Comparison of Vision Algorithms, Large-scale Vision Applications Datasets

Cite this

Interpretable Semantic Photo Geolocation. / Theiner, Jonas; Müller-Budack, Eric; Ewerth, Ralph.
2022 IEEE Winter Conference on Applications of Computer Vision: WACV 2022. 2022. p. 1474-1484.

Theiner, J, Müller-Budack, E & Ewerth, R 2022, Interpretable Semantic Photo Geolocation. in 2022 IEEE Winter Conference on Applications of Computer Vision: WACV 2022. pp. 1474-1484. https://doi.org/10.48550/arXiv.2104.14995, https://doi.org/10.1109/WACV51458.2022.00154
Theiner, J., Müller-Budack, E., & Ewerth, R. (2022). Interpretable Semantic Photo Geolocation. In 2022 IEEE Winter Conference on Applications of Computer Vision: WACV 2022 (pp. 1474-1484) https://doi.org/10.48550/arXiv.2104.14995, https://doi.org/10.1109/WACV51458.2022.00154
Theiner J, Müller-Budack E, Ewerth R. Interpretable Semantic Photo Geolocation. In 2022 IEEE Winter Conference on Applications of Computer Vision: WACV 2022. 2022. p. 1474-1484 doi: 10.48550/arXiv.2104.14995, 10.1109/WACV51458.2022.00154
Theiner, Jonas ; Müller-Budack, Eric ; Ewerth, Ralph. / Interpretable Semantic Photo Geolocation. 2022 IEEE Winter Conference on Applications of Computer Vision: WACV 2022. 2022. pp. 1474-1484
BibTeX
@inproceedings{7ad022e076934a9ab6e47bae994a6e43,
title = "Interpretable Semantic Photo Geolocation",
abstract = "Planet-scale photo geolocalization is the complex task of estimating the location depicted in an image solely based on its visual content. Due to the success of convolutional neural networks (CNNs), current approaches achieve super-human performance. However, previous work has exclusively focused on optimizing geolocalization accuracy. Due to the black-box property of deep learning systems, their predictions are difficult to validate for humans. State-of-the-art methods treat the task as a classification problem, where the choice of the classes, that is the partitioning of the world map, is crucial for the performance. In this paper, we present two contributions to improve the interpretability of a geolocalization model: (1) We propose a novel semantic partitioning method which intuitively leads to an improved understanding of the predictions, while achieving state-of-the-art results for geolocational accuracy on benchmark test sets; (2) We introduce a metric to assess the importance of semantic visual concepts for a certain prediction to provide additional interpretable information, which allows for a large-scale analysis of already trained models. Source code and dataset are publicly available. ",
keywords = "cs.CV, Evaluation and Comparison of Vision Algorithms, Large-scale Vision Applications Datasets",
author = "Jonas Theiner and Eric M{\"u}ller-Budack and Ralph Ewerth",
note = "Funding Information: This project has partially received funding from the German Research Foundation (DFG: Deutsche Forschungsgemeinschaft, project number: 442397862).",
year = "2022",
doi = "10.48550/arXiv.2104.14995",
language = "English",
isbn = "978-1-6654-0916-2",
pages = "1474--1484",
booktitle = "2022 IEEE Winter Conference on Applications of Computer Vision",

}

RIS

TY - GEN

T1 - Interpretable Semantic Photo Geolocation

AU - Theiner, Jonas

AU - Müller-Budack, Eric

AU - Ewerth, Ralph

N1 - Funding Information: This project has partially received funding from the German Research Foundation (DFG: Deutsche Forschungsgemeinschaft, project number: 442397862).

PY - 2022

Y1 - 2022

N2 - Planet-scale photo geolocalization is the complex task of estimating the location depicted in an image solely based on its visual content. Due to the success of convolutional neural networks (CNNs), current approaches achieve super-human performance. However, previous work has exclusively focused on optimizing geolocalization accuracy. Due to the black-box property of deep learning systems, their predictions are difficult to validate for humans. State-of-the-art methods treat the task as a classification problem, where the choice of the classes, that is the partitioning of the world map, is crucial for the performance. In this paper, we present two contributions to improve the interpretability of a geolocalization model: (1) We propose a novel semantic partitioning method which intuitively leads to an improved understanding of the predictions, while achieving state-of-the-art results for geolocational accuracy on benchmark test sets; (2) We introduce a metric to assess the importance of semantic visual concepts for a certain prediction to provide additional interpretable information, which allows for a large-scale analysis of already trained models. Source code and dataset are publicly available.

AB - Planet-scale photo geolocalization is the complex task of estimating the location depicted in an image solely based on its visual content. Due to the success of convolutional neural networks (CNNs), current approaches achieve super-human performance. However, previous work has exclusively focused on optimizing geolocalization accuracy. Due to the black-box property of deep learning systems, their predictions are difficult to validate for humans. State-of-the-art methods treat the task as a classification problem, where the choice of the classes, that is the partitioning of the world map, is crucial for the performance. In this paper, we present two contributions to improve the interpretability of a geolocalization model: (1) We propose a novel semantic partitioning method which intuitively leads to an improved understanding of the predictions, while achieving state-of-the-art results for geolocational accuracy on benchmark test sets; (2) We introduce a metric to assess the importance of semantic visual concepts for a certain prediction to provide additional interpretable information, which allows for a large-scale analysis of already trained models. Source code and dataset are publicly available.

KW - cs.CV

KW - Evaluation and Comparison of Vision Algorithms

KW - Large-scale Vision Applications Datasets

UR - http://www.scopus.com/inward/record.url?scp=85126088869&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2104.14995

DO - 10.48550/arXiv.2104.14995

M3 - Conference contribution

SN - 978-1-6654-0916-2

SP - 1474

EP - 1484

BT - 2022 IEEE Winter Conference on Applications of Computer Vision

ER -