Combining Programming-by-Example with Transformation Discovery from large Databases

Publication: Contribution to book/report/anthology/conference proceedings; Paper in conference proceedings; Research; Peer-reviewed

Authors

  • Aslihan Özmen
  • Mahdi Esmailoghli
  • Ziawasch Abedjan

External organizations

  • Technische Universität Berlin

Details

Original language: English
Title of host publication: Datenbanksysteme für Business, Technologie und Web (BTW 2021)
Subtitle: 19. Fachtagung des GI-Fachbereichs „Datenbanken und Informationssysteme“ (DBIS), 13.-17. September 2021, Dresden, Germany, Proceedings
Editors: Kai-Uwe Sattler, Melanie Herschel, Wolfgang Lehner
Publisher: Gesellschaft für Informatik (GI)
Pages: 313-324
Number of pages: 12
ISBN (electronic): 978-3-88579-705-0
Publication status: Published - 2021

Publication series

Name: Lecture Notes in Informatics (LNI), Proceedings - Series of the Gesellschaft für Informatik (GI)
Volume: P-311
ISSN (Print): 1617-5468

Abstract

Data transformation discovery is one of the most tedious tasks in data preparation. In particular, the generation of transformation programs for semantic transformations is tricky because additional sources for look-up operations are necessary. Current systems for semantic transformation discovery face two major problems: either they follow a program synthesis approach that only scales to a small set of input tables, or they rely on extraction of transformation functions from large corpora, which requires the identification of exact transformations in those resources and is prone to noisy data. In this paper, we try to combine both approaches to benefit from large corpora and the sophistication of program synthesis. To do so, we devise a retrieval and pruning strategy ensemble that extracts the most relevant tables for a given transformation task. The extracted resources can then be processed by a program synthesis engine to generate more accurate transformation results than state-of-the-art approaches.
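
The retrieval-and-pruning idea in the abstract can be illustrated with a small Python sketch. It is not the authors' method: the corpus layout, the function names `coverage` and `retrieve_and_prune`, the coverage score, and the `min_coverage` and `top_k` parameters are all assumptions made for illustration. The sketch scores each candidate table by the fraction of input/output examples that appear together in some ordered column pair, drops low-scoring tables, and keeps the best few as look-up sources for a later program synthesis step, which is not shown.

```python
from itertools import permutations
from typing import Dict, List, Tuple

# Hypothetical corpus format: table name -> rows, each row a tuple of cell values.
Table = List[Tuple[str, ...]]
Example = Tuple[str, str]  # (input value, expected output value)


def coverage(table: Table, examples: List[Example]) -> float:
    """Best fraction of examples found in any ordered column pair of the table."""
    if not table or not examples:
        return 0.0
    n_cols = len(table[0])  # assumes a rectangular table
    best = 0.0
    for src, dst in permutations(range(n_cols), 2):
        pairs = {(row[src], row[dst]) for row in table}
        hits = sum(1 for ex in examples if ex in pairs)
        best = max(best, hits / len(examples))
    return best


def retrieve_and_prune(
    corpus: Dict[str, Table],
    examples: List[Example],
    min_coverage: float = 0.5,  # assumed pruning threshold
    top_k: int = 2,             # assumed number of tables passed on
) -> List[str]:
    """Return the names of the top_k tables whose coverage meets the threshold."""
    scored = [(coverage(table, examples), name) for name, table in corpus.items()]
    kept = [(score, name) for score, name in scored if score >= min_coverage]
    kept.sort(key=lambda item: (-item[0], item[1]))  # best score first, ties by name
    return [name for _, name in kept[:top_k]]


# Toy data, invented for this sketch.
corpus = {
    "countries": [("Germany", "Berlin"), ("France", "Paris"), ("Italy", "Rome")],
    "airports": [("BER", "Berlin", "Germany"), ("CDG", "Paris", "France")],
    "noise": [("a", "b"), ("c", "d")],
}
examples = [("Germany", "Berlin"), ("France", "Paris")]

print(retrieve_and_prune(corpus, examples))
```

Exact-match coverage is the simplest possible relevance signal; a real system working on noisy corpora would likely need fuzzy matching and an index rather than a full scan of every table.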

Cite this

Combining Programming-by-Example with Transformation Discovery from large Databases. / Özmen, Aslihan; Esmailoghli, Mahdi; Abedjan, Ziawasch.
Datenbanksysteme für Business, Technologie und Web (BTW 2021): 19. Fachtagung des GI-Fachbereichs „Datenbanken und Informationssysteme“ (DBIS), 13.-17. September 2021, Dresden, Germany, Proceedings. Ed. / Kai-Uwe Sattler; Melanie Herschel; Wolfgang Lehner. Gesellschaft für Informatik (GI), 2021. pp. 313-324 (Lecture Notes in Informatics (LNI), Proceedings - Series of the Gesellschaft für Informatik (GI); Vol. P-311).

Özmen, A, Esmailoghli, M & Abedjan, Z 2021, Combining Programming-by-Example with Transformation Discovery from large Databases. in K-U Sattler, M Herschel & W Lehner (eds), Datenbanksysteme für Business, Technologie und Web (BTW 2021): 19. Fachtagung des GI-Fachbereichs „Datenbanken und Informationssysteme“ (DBIS), 13.-17. September 2021, Dresden, Germany, Proceedings. Lecture Notes in Informatics (LNI), Proceedings - Series of the Gesellschaft für Informatik (GI), vol. P-311, Gesellschaft für Informatik (GI), pp. 313-324. https://doi.org/10.18420/BTW2021-16
Özmen, A., Esmailoghli, M., & Abedjan, Z. (2021). Combining Programming-by-Example with Transformation Discovery from large Databases. In K.-U. Sattler, M. Herschel, & W. Lehner (Eds.), Datenbanksysteme für Business, Technologie und Web (BTW 2021): 19. Fachtagung des GI-Fachbereichs „Datenbanken und Informationssysteme“ (DBIS), 13.-17. September 2021, Dresden, Germany, Proceedings (pp. 313-324). (Lecture Notes in Informatics (LNI), Proceedings - Series of the Gesellschaft für Informatik (GI); Vol. P-311). Gesellschaft für Informatik (GI). https://doi.org/10.18420/BTW2021-16
Özmen A, Esmailoghli M, Abedjan Z. Combining Programming-by-Example with Transformation Discovery from large Databases. In Sattler KU, Herschel M, Lehner W, editors, Datenbanksysteme für Business, Technologie und Web (BTW 2021): 19. Fachtagung des GI-Fachbereichs „Datenbanken und Informationssysteme“ (DBIS), 13.-17. September 2021, Dresden, Germany, Proceedings. Gesellschaft für Informatik (GI). 2021. p. 313-324. (Lecture Notes in Informatics (LNI), Proceedings - Series of the Gesellschaft für Informatik (GI)). doi: 10.18420/BTW2021-16
Özmen, Aslihan ; Esmailoghli, Mahdi ; Abedjan, Ziawasch. / Combining Programming-by-Example with Transformation Discovery from large Databases. Datenbanksysteme für Business, Technologie und Web (BTW 2021): 19. Fachtagung des GI-Fachbereichs „Datenbanken und Informationssysteme“ (DBIS), 13.-17. September 2021, Dresden, Germany, Proceedings. Ed. / Kai-Uwe Sattler ; Melanie Herschel ; Wolfgang Lehner. Gesellschaft für Informatik (GI), 2021. pp. 313-324 (Lecture Notes in Informatics (LNI), Proceedings - Series of the Gesellschaft für Informatik (GI)).
@inproceedings{b09f6f4600b34fffb579ec2ffa531f61,
title = "Combining Programming-by-Example with Transformation Discovery from large Databases",
abstract = "Data transformation discovery is one of the most tedious tasks in data preparation. In particular, the generation of transformation programs for semantic transformations is tricky because additional sources for look-up operations are necessary. Current systems for semantic transformation discovery face two major problems: either they follow a program synthesis approach that only scales to a small set of input tables, or they rely on extraction of transformation functions from large corpora, which requires the identification of exact transformations in those resources and is prone to noisy data. In this paper, we try to combine both approaches to benefit from large corpora and the sophistication of program synthesis. To do so, we devise a retrieval and pruning strategy ensemble that extracts the most relevant tables for a given transformation task. The extracted resources can then be processed by a program synthesis engine to generate more accurate transformation results than state-of-the-art approaches.",
author = "Aslihan {\"O}zmen and Mahdi Esmailoghli and Ziawasch Abedjan",
note = "Funding information: This project has been supported by the German Research Foundation (DFG) under grant agreement 387872445.",
year = "2021",
doi = "10.18420/BTW2021-16",
language = "English",
series = "Lecture Notes in Informatics (LNI), Proceedings - Series of the Gesellschaft f{\"u}r Informatik (GI)",
publisher = "Gesellschaft f{\"u}r Informatik (GI)",
pages = "313--324",
editor = "Kai-Uwe Sattler and Melanie Herschel and Wolfgang Lehner",
booktitle = "Datenbanksysteme f{\"u}r Business, Technologie und Web (BTW 2021)",
address = "Germany",

}

TY - GEN

T1 - Combining Programming-by-Example with Transformation Discovery from large Databases

AU - Özmen, Aslihan

AU - Esmailoghli, Mahdi

AU - Abedjan, Ziawasch

N1 - Funding information: This project has been supported by the German Research Foundation (DFG) under grant agreement 387872445.

PY - 2021

Y1 - 2021

AB - Data transformation discovery is one of the most tedious tasks in data preparation. In particular, the generation of transformation programs for semantic transformations is tricky because additional sources for look-up operations are necessary. Current systems for semantic transformation discovery face two major problems: either they follow a program synthesis approach that only scales to a small set of input tables, or they rely on extraction of transformation functions from large corpora, which requires the identification of exact transformations in those resources and is prone to noisy data. In this paper, we try to combine both approaches to benefit from large corpora and the sophistication of program synthesis. To do so, we devise a retrieval and pruning strategy ensemble that extracts the most relevant tables for a given transformation task. The extracted resources can then be processed by a program synthesis engine to generate more accurate transformation results than state-of-the-art approaches.

UR - http://www.scopus.com/inward/record.url?scp=85130137666&partnerID=8YFLogxK

U2 - 10.18420/BTW2021-16

DO - 10.18420/BTW2021-16

M3 - Conference contribution

T3 - Lecture Notes in Informatics (LNI), Proceedings - Series of the Gesellschaft für Informatik (GI)

SP - 313

EP - 324

BT - Datenbanksysteme für Business, Technologie und Web (BTW 2021)

A2 - Sattler, Kai-Uwe

A2 - Herschel, Melanie

A2 - Lehner, Wolfgang

PB - Gesellschaft für Informatik (GI)

ER -