SimMatching: Adaptable Road Network Matching for Efficient and Scalable Spatial Data Integration

Research output: Chapter in book/report/conference proceeding; Conference contribution; Research; Peer-reviewed

Authors

  • Michael Schäfers
  • Udo W. Lipeck

Details

Original language: English
Title of host publication: SIGSPATIAL PhD '14
Subtitle of host publication: Proceedings of the 1st ACM SIGSPATIAL PhD Workshop
Editors: Ugur Demiryurek, Mohamed Sarwat
Number of pages: 5
ISBN (electronic): 9781450331586
Publication status: Published - 4 Nov 2014
Event: 2014 1st ACM SIGSPATIAL PhD Workshop, SIGSPATIAL PhD 2014 - In Conjunction with 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL 2014 - Dallas, United States
Duration: 4 Nov 2014 to 7 Nov 2014

Abstract

Spatial data integration is a challenging task due to the high degree of diversity between different geodata sources, the inherent complexity of objects, and the large size of datasets. To avoid duplicates in an integrated dataset, input sources have to be linked on the instance level. By matching spatial objects, multiple representations of the same real-world entity shall be identified based on similarity computation. In this paper, we present an approach for similarity-based spatial matching of road networks. Our SimMatching algorithm adapts to a variety of input data characteristics by using weighted similarity measures. Geometric and semantic attributes are considered as well as the dataset topology to enhance similarity computations with relational measures. We use a greedy approach and optimizations to keep the number of match candidates minimal all the time. This allows very low runtimes while giving high quality matching results. Supported by a partitioning framework and parallel processing, it also guarantees scalability to large datasets.
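
The abstract names two ideas that a small example can make concrete: combining several similarity measures with weights, and selecting matches greedily. The following Python sketch illustrates only that general idea and is not the authors' SimMatching implementation. Everything in it is an assumption made for illustration: the `Road` record, the midpoint-and-length geometric measure, the name-based semantic measure, the weights, and the threshold. It scores all pairs and enforces one-to-one matches; it does not model the topology-based relational measures, candidate pruning, partitioning, or parallel processing mentioned in the abstract.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from math import hypot


@dataclass(frozen=True)
class Road:
    """Hypothetical road record: an id, a name, and one straight segment."""
    rid: str
    name: str
    start: tuple[float, float]
    end: tuple[float, float]

    @property
    def length(self) -> float:
        return hypot(self.end[0] - self.start[0], self.end[1] - self.start[1])

    @property
    def midpoint(self) -> tuple[float, float]:
        return ((self.start[0] + self.end[0]) / 2, (self.start[1] + self.end[1]) / 2)


def geometric_similarity(a: Road, b: Road, max_dist: float = 50.0) -> float:
    """Assumed measure in [0, 1]: midpoint proximity times length ratio."""
    dist = hypot(a.midpoint[0] - b.midpoint[0], a.midpoint[1] - b.midpoint[1])
    proximity = max(0.0, 1.0 - dist / max_dist)
    longest = max(a.length, b.length)
    length_ratio = min(a.length, b.length) / longest if longest > 0 else 1.0
    return proximity * length_ratio


def semantic_similarity(a: Road, b: Road) -> float:
    """Assumed measure in [0, 1]: case-insensitive string similarity of names."""
    return SequenceMatcher(None, a.name.lower(), b.name.lower()).ratio()


def weighted_similarity(a: Road, b: Road, w_geo: float = 0.6, w_sem: float = 0.4) -> float:
    """Weighted average of the two measures; the weights are illustrative."""
    total = w_geo * geometric_similarity(a, b) + w_sem * semantic_similarity(a, b)
    return total / (w_geo + w_sem)


def greedy_match(source: list[Road], target: list[Road], threshold: float = 0.7):
    """Accept the best-scoring pairs first; each road is matched at most once."""
    scored = [(weighted_similarity(a, b), a.rid, b.rid) for a in source for b in target]
    scored = [s for s in scored if s[0] >= threshold]
    scored.sort(key=lambda s: s[0], reverse=True)

    used_source, used_target, matches = set(), set(), []
    for score, source_id, target_id in scored:
        if source_id not in used_source and target_id not in used_target:
            matches.append((source_id, target_id, score))
            used_source.add(source_id)
            used_target.add(target_id)
    return matches
```

Changing `w_geo` and `w_sem` shifts the emphasis between geometry and names, which is one simple way a matcher could be tuned to datasets whose attributes differ in reliability.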

Keywords

    Data Matching, Road Networks, Scalability, Similarity, Spatial Data Integration, Spatial Databases

Cite this

SimMatching: Adaptable Road Network Matching for Efficient and Scalable Spatial Data Integration. / Schäfers, Michael; Lipeck, Udo W.
SIGSPATIAL PhD '14: Proceedings of the 1st ACM SIGSPATIAL PhD Workshop. ed. / Ugur Demiryurek; Mohamed Sarwat. 2014. 2694866.

Schäfers, M & Lipeck, UW 2014, SimMatching: Adaptable Road Network Matching for Efficient and Scalable Spatial Data Integration. in U Demiryurek & M Sarwat (eds), SIGSPATIAL PhD '14: Proceedings of the 1st ACM SIGSPATIAL PhD Workshop, 2694866, 2014 1st ACM SIGSPATIAL PhD Workshop, SIGSPATIAL PhD 2014 - In Conjunction with 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL 2014, Dallas, Texas, United States, 4 Nov 2014. https://doi.org/10.1145/2694859.2694866
Schäfers, M., & Lipeck, U. W. (2014). SimMatching: Adaptable Road Network Matching for Efficient and Scalable Spatial Data Integration. In U. Demiryurek, & M. Sarwat (Eds.), SIGSPATIAL PhD '14: Proceedings of the 1st ACM SIGSPATIAL PhD Workshop, Article 2694866. https://doi.org/10.1145/2694859.2694866
Schäfers M, Lipeck UW. SimMatching: Adaptable Road Network Matching for Efficient and Scalable Spatial Data Integration. In Demiryurek U, Sarwat M, editors, SIGSPATIAL PhD '14: Proceedings of the 1st ACM SIGSPATIAL PhD Workshop. 2014. 2694866. doi: 10.1145/2694859.2694866
Schäfers, Michael ; Lipeck, Udo W. / SimMatching : Adaptable Road Network Matching for Efficient and Scalable Spatial Data Integration. SIGSPATIAL PhD '14: Proceedings of the 1st ACM SIGSPATIAL PhD Workshop. editor / Ugur Demiryurek ; Mohamed Sarwat. 2014.
@inproceedings{fde8ba0f2176431e886b4280275baa11,
title = "SimMatching: Adaptable Road Network Matching for Efficient and Scalable Spatial Data Integration",
abstract = "Spatial data integration is a challenging task due to the high degree of diversity between different geodata sources, the inherent complexity of objects, and the large size of datasets. To avoid duplicates in an integrated dataset, input sources have to be linked on the instance level. By matching spatial objects, multiple representations of the same real-world entity shall be identified based on similarity computation. In this paper, we present an approach for similarity-based spatial matching of road networks. Our SimMatching algorithm adapts to a variety of input data characteristics by using weighted similarity measures. Geometric and semantic attributes are considered as well as the dataset topology to enhance similarity computations with relational measures. We use a greedy approach and optimizations to keep the number of match candidates minimal all the time. This allows very low runtimes while giving high quality matching results. Supported by a partitioning framework and parallel processing, it also guarantees scalability to large datasets.",
keywords = "Data Matching, Road Networks, Scalability, Similarity, Spatial Data Integration, Spatial Databases",
author = "Sch{\"a}fers, Michael and Lipeck, {Udo W.}",
year = "2014",
month = nov,
day = "4",
doi = "10.1145/2694859.2694866",
language = "English",
editor = "Ugur Demiryurek and Mohamed Sarwat",
booktitle = "SIGSPATIAL PhD '14",
note = "2014 1st ACM SIGSPATIAL PhD Workshop, SIGSPATIAL PhD 2014 - In Conjunction with 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL 2014 ; Conference date: 04-11-2014 Through 07-11-2014",

}

TY - GEN

T1 - SimMatching

T2 - 2014 1st ACM SIGSPATIAL PhD Workshop, SIGSPATIAL PhD 2014 - In Conjunction with 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL 2014

AU - Schäfers, Michael

AU - Lipeck, Udo W.

PY - 2014/11/4

Y1 - 2014/11/4

AB - Spatial data integration is a challenging task due to the high degree of diversity between different geodata sources, the inherent complexity of objects, and the large size of datasets. To avoid duplicates in an integrated dataset, input sources have to be linked on the instance level. By matching spatial objects, multiple representations of the same real-world entity shall be identified based on similarity computation. In this paper, we present an approach for similarity-based spatial matching of road networks. Our SimMatching algorithm adapts to a variety of input data characteristics by using weighted similarity measures. Geometric and semantic attributes are considered as well as the dataset topology to enhance similarity computations with relational measures. We use a greedy approach and optimizations to keep the number of match candidates minimal all the time. This allows very low runtimes while giving high quality matching results. Supported by a partitioning framework and parallel processing, it also guarantees scalability to large datasets.

KW - Data Matching

KW - Road Networks

KW - Scalability

KW - Similarity

KW - Spatial Data Integration

KW - Spatial Databases

UR - http://www.scopus.com/inward/record.url?scp=84928024782&partnerID=8YFLogxK

U2 - 10.1145/2694859.2694866

DO - 10.1145/2694859.2694866

M3 - Conference contribution

AN - SCOPUS:84928024782

BT - SIGSPATIAL PhD '14

A2 - Demiryurek, Ugur

A2 - Sarwat, Mohamed

Y2 - 4 November 2014 through 7 November 2014

ER -