GeoVectors: A Linked Open Corpus of OpenStreetMap Embeddings on World Scale

Research output: Chapter in book/report/conference proceeding > Conference contribution > Research > peer review

Authors

  • Nicolas Tempelmeier
  • Simon Gottschalk
  • Elena Demidova

Research Organisations

External Research Organisations

  • University of Bonn

Details

Original language: English
Title of host publication: CIKM 2021 - Proceedings of the 30th ACM International Conference on Information and Knowledge Management
Pages: 4604-4612
Number of pages: 9
ISBN (electronic): 9781450384469
Publication status: Published - 30 Oct 2021
Event: 30th ACM International Conference on Information and Knowledge Management, CIKM 2021 - Virtual, Online, Australia
Duration: 1 Nov 2021 to 5 Nov 2021

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Abstract

OpenStreetMap (OSM) is currently the richest publicly available information source on geographic entities (e.g., buildings and roads) worldwide. However, using OSM entities in machine learning models and other applications is challenging due to the large scale of OSM, the extreme heterogeneity of entity annotations, and a lack of a well-defined ontology to describe entity semantics and properties. This paper presents GeoVectors - a unique, comprehensive world-scale linked open corpus of OSM entity embeddings covering the entire OSM dataset and providing latent representations of over 980 million geographic entities in 180 countries. The GeoVectors corpus captures semantic and geographic dimensions of OSM entities and makes these entities directly accessible to machine learning algorithms and semantic applications. We create a semantic description of the GeoVectors corpus, including identity links to the Wikidata and DBpedia knowledge graphs to supply context information. Furthermore, we provide a SPARQL endpoint - a semantic interface that offers direct access to the semantic and latent representations of geographic entities in OSM.
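
As a rough illustration of what "directly accessible to machine learning algorithms" can mean in practice, the sketch below ranks entities by cosine similarity of their embedding vectors. It is not taken from the paper: the entity identifiers, the three-dimensional vectors, and the helper names `cosine_similarity` and `most_similar` are all assumptions made up for this example, and the real corpus may use a different dimensionality and file layout.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def most_similar(query_id, embeddings, k=1):
    """Return the k entity ids whose vectors are most similar to query_id's."""
    query = embeddings[query_id]
    scored = [
        (cosine_similarity(query, vector), other_id)
        for other_id, vector in embeddings.items()
        if other_id != query_id
    ]
    # Highest similarity first.
    scored.sort(reverse=True)
    return [other_id for _, other_id in scored[:k]]


# Toy data: these ids and vectors are invented placeholders,
# not values from the GeoVectors corpus.
toy_embeddings = {
    "node/1": [0.9, 0.1, 0.0],
    "node/2": [0.8, 0.2, 0.1],
    "way/3": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    print(most_similar("node/1", toy_embeddings))
```

With real embeddings, the same ranking step would normally be delegated to a vectorised library or an approximate nearest-neighbour index, since a linear scan does not scale to hundreds of millions of entities.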

Keywords

    openstreetmap, OSM embeddings, semantic geographic data

ASJC Scopus subject areas

Cite this

GeoVectors: A Linked Open Corpus of OpenStreetMap Embeddings on World Scale. / Tempelmeier, Nicolas; Gottschalk, Simon; Demidova, Elena.
CIKM 2021 - Proceedings of the 30th ACM International Conference on Information and Knowledge Management. 2021. p. 4604-4612 (International Conference on Information and Knowledge Management, Proceedings).

Tempelmeier, N, Gottschalk, S & Demidova, E 2021, GeoVectors: A Linked Open Corpus of OpenStreetMap Embeddings on World Scale. in CIKM 2021 - Proceedings of the 30th ACM International Conference on Information and Knowledge Management. International Conference on Information and Knowledge Management, Proceedings, pp. 4604-4612, 30th ACM International Conference on Information and Knowledge Management, CIKM 2021, Virtual, Online, Australia, 1 Nov 2021. https://doi.org/10.48550/arXiv.2108.13092, https://doi.org/10.1145/3459637.3482004
Tempelmeier, N., Gottschalk, S., & Demidova, E. (2021). GeoVectors: A Linked Open Corpus of OpenStreetMap Embeddings on World Scale. In CIKM 2021 - Proceedings of the 30th ACM International Conference on Information and Knowledge Management (pp. 4604-4612). (International Conference on Information and Knowledge Management, Proceedings). https://doi.org/10.48550/arXiv.2108.13092, https://doi.org/10.1145/3459637.3482004
Tempelmeier N, Gottschalk S, Demidova E. GeoVectors: A Linked Open Corpus of OpenStreetMap Embeddings on World Scale. In CIKM 2021 - Proceedings of the 30th ACM International Conference on Information and Knowledge Management. 2021. p. 4604-4612. (International Conference on Information and Knowledge Management, Proceedings). doi: 10.48550/arXiv.2108.13092, 10.1145/3459637.3482004
Tempelmeier, Nicolas ; Gottschalk, Simon ; Demidova, Elena. / GeoVectors : A Linked Open Corpus of OpenStreetMap Embeddings on World Scale. CIKM 2021 - Proceedings of the 30th ACM International Conference on Information and Knowledge Management. 2021. pp. 4604-4612 (International Conference on Information and Knowledge Management, Proceedings).
@inproceedings{f2acacb073334c02959c9e2119e1f31e,
title = "GeoVectors: A Linked Open Corpus of OpenStreetMap Embeddings on World Scale",
abstract = "OpenStreetMap (OSM) is currently the richest publicly available information source on geographic entities (e.g., buildings and roads) worldwide. However, using OSM entities in machine learning models and other applications is challenging due to the large scale of OSM, the extreme heterogeneity of entity annotations, and a lack of a well-defined ontology to describe entity semantics and properties. This paper presents GeoVectors - a unique, comprehensive world-scale linked open corpus of OSM entity embeddings covering the entire OSM dataset and providing latent representations of over 980 million geographic entities in 180 countries. The GeoVectors corpus captures semantic and geographic dimensions of OSM entities and makes these entities directly accessible to machine learning algorithms and semantic applications. We create a semantic description of the GeoVectors corpus, including identity links to the Wikidata and DBpedia knowledge graphs to supply context information. Furthermore, we provide a SPARQL endpoint - a semantic interface that offers direct access to the semantic and latent representations of geographic entities in OSM.",
keywords = "openstreetmap, OSM embeddings, semantic geographic data",
author = "Nicolas Tempelmeier and Simon Gottschalk and Elena Demidova",
note = "Funding Information: Acknowledgements. This work was partially funded by DFG, German Research Foundation (“WorldKG”, 424985896), the Federal Ministry of Education and Research (BMBF), Germany (“Simple-ML”, 01IS18054), the Federal Ministry for Economic Affairs and Energy (BMWi), Germany (“d-E-mand”, 01ME19009B), and the European Commission (EU H2020, “smashHit”, grant-ID 871477). ; 30th ACM International Conference on Information and Knowledge Management, CIKM 2021 ; Conference date: 01-11-2021 Through 05-11-2021",
year = "2021",
month = oct,
day = "30",
doi = "10.48550/arXiv.2108.13092",
language = "English",
series = "International Conference on Information and Knowledge Management, Proceedings",
pages = "4604--4612",
booktitle = "CIKM 2021 - Proceedings of the 30th ACM International Conference on Information and Knowledge Management",
}

TY  - GEN
T1  - GeoVectors
T2  - 30th ACM International Conference on Information and Knowledge Management, CIKM 2021
AU  - Tempelmeier, Nicolas
AU  - Gottschalk, Simon
AU  - Demidova, Elena
N1  - Funding Information: Acknowledgements. This work was partially funded by DFG, German Research Foundation (“WorldKG”, 424985896), the Federal Ministry of Education and Research (BMBF), Germany (“Simple-ML”, 01IS18054), the Federal Ministry for Economic Affairs and Energy (BMWi), Germany (“d-E-mand”, 01ME19009B), and the European Commission (EU H2020, “smashHit”, grant-ID 871477).
PY  - 2021/10/30
Y1  - 2021/10/30
AB  - OpenStreetMap (OSM) is currently the richest publicly available information source on geographic entities (e.g., buildings and roads) worldwide. However, using OSM entities in machine learning models and other applications is challenging due to the large scale of OSM, the extreme heterogeneity of entity annotations, and a lack of a well-defined ontology to describe entity semantics and properties. This paper presents GeoVectors - a unique, comprehensive world-scale linked open corpus of OSM entity embeddings covering the entire OSM dataset and providing latent representations of over 980 million geographic entities in 180 countries. The GeoVectors corpus captures semantic and geographic dimensions of OSM entities and makes these entities directly accessible to machine learning algorithms and semantic applications. We create a semantic description of the GeoVectors corpus, including identity links to the Wikidata and DBpedia knowledge graphs to supply context information. Furthermore, we provide a SPARQL endpoint - a semantic interface that offers direct access to the semantic and latent representations of geographic entities in OSM.
KW  - openstreetmap
KW  - OSM embeddings
KW  - semantic geographic data
UR  - http://www.scopus.com/inward/record.url?scp=85119213256&partnerID=8YFLogxK
U2  - 10.48550/arXiv.2108.13092
DO  - 10.48550/arXiv.2108.13092
M3  - Conference contribution
AN  - SCOPUS:85119213256
T3  - International Conference on Information and Knowledge Management, Proceedings
SP  - 4604
EP  - 4612
BT  - CIKM 2021 - Proceedings of the 30th ACM International Conference on Information and Knowledge Management
Y2  - 1 November 2021 through 5 November 2021
ER  -
