Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines

Publication: Contribution to book/report/anthology/conference proceedings; conference paper; research; peer-reviewed

Authors

  • Mohamad Yaser Jaradeh
  • Kuldeep Singh
  • Markus Stocker
  • Andreas Both
  • Sören Auer

External organisations

  • Technische Informationsbibliothek (TIB) Leibniz-Informationszentrum Technik und Naturwissenschaften und Universitätsbibliothek
  • Hochschule Anhalt
  • Zerotha-Research and Cerence GmbH

Details

Original language: English
Title of host publication: Web Engineering
Subtitle: 21st International Conference, ICWE 2021, Biarritz, France, May 18–21, 2021, Proceedings
Editors: Marco Brambilla, Richard Chbeir, Flavius Frasincar, Ioana Manolescu
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 240-254
Number of pages: 15
ISBN (print): 9783030742959
Publication status: Published - 2021
Event: 21st International Conference on Web Engineering, ICWE 2021 - Virtual, Online
Duration: 18 May 2021 to 21 May 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12706 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

We propose Plumber, the first framework that brings together the research community’s disjoint information extraction (IE) efforts. The Plumber architecture comprises 33 reusable components for various Knowledge Graph (KG) information extraction subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable information extraction pipelines and offers 264 distinct pipelines overall. We study the optimization problem of choosing suitable pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting KG triples using standard datasets over two KGs: DBpedia and the Open Research Knowledge Graph (ORKG). Our results demonstrate the effectiveness of Plumber in dynamically generating KG information extraction pipelines, outperforming all baselines regardless of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components, and discuss their limitations.
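
The idea in the abstract of composing pipelines from reusable components and then choosing one per input sentence can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the component names are placeholders, the three-stage layout is only one possible arrangement, and `toy_score` is a hand-written heuristic standing in for the trained transformer-based classifier. It is not Plumber's actual registry, API, or selection model, and the pipeline count it produces has no relation to the 264 reported above.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

# Hypothetical component registry: placeholder names, not Plumber's real components.
COMPONENTS: Dict[str, List[str]] = {
    "coreference_resolution": ["coref_a", "coref_b"],
    "entity_linking": ["linker_a", "linker_b", "linker_c"],
    "relation_extraction": ["relex_a", "relex_b"],
}

Pipeline = Tuple[str, ...]


def enumerate_pipelines(components: Dict[str, List[str]]) -> List[Pipeline]:
    """Return every pipeline that picks exactly one component per subtask."""
    return list(product(*components.values()))


def select_pipeline(
    sentence: str,
    pipelines: List[Pipeline],
    score: Callable[[str, Pipeline], float],
) -> Pipeline:
    """Pick the highest-scoring pipeline; `score` stands in for a trained classifier."""
    return max(pipelines, key=lambda p: score(sentence, p))


def toy_score(sentence: str, pipeline: Pipeline) -> float:
    # Toy heuristic (assumption): prefer "coref_b" only when a pronoun is present.
    pronouns = {"he", "she", "it", "they"}
    has_pronoun = any(tok.lower() in pronouns for tok in sentence.split())
    return 1.0 if (pipeline[0] == "coref_b") == has_pronoun else 0.0


pipelines = enumerate_pipelines(COMPONENTS)
print(len(pipelines))  # 2 * 3 * 2 = 12 candidate pipelines
print(select_pipeline("She founded the institute.", pipelines, toy_score))
```

In this sketch the candidate space is a plain Cartesian product over the subtasks, so adding one component to any stage multiplies the number of pipelines. That growth is why choosing a pipeline per sentence is framed as an optimization problem instead of running every combination.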

Cite this

Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines. / Jaradeh, Mohamad Yaser; Singh, Kuldeep; Stocker, Markus et al.
Web Engineering: 21st International Conference, ICWE 2021, Biarritz, France, May 18–21, 2021, Proceedings. ed. / Marco Brambilla; Richard Chbeir; Flavius Frasincar; Ioana Manolescu. Springer Science and Business Media Deutschland GmbH, 2021. p. 240-254 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12706 LNCS).

Jaradeh, MY, Singh, K, Stocker, M, Both, A & Auer, S 2021, Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines. in M Brambilla, R Chbeir, F Frasincar & I Manolescu (eds), Web Engineering: 21st International Conference, ICWE 2021, Biarritz, France, May 18–21, 2021, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 12706 LNCS, Springer Science and Business Media Deutschland GmbH, pp. 240-254, 21st International Conference on Web Engineering, ICWE 2021, Virtual, Online, 18 May 2021. https://doi.org/10.1007/978-3-030-74296-6_19
Jaradeh, M. Y., Singh, K., Stocker, M., Both, A., & Auer, S. (2021). Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines. In M. Brambilla, R. Chbeir, F. Frasincar, & I. Manolescu (Eds.), Web Engineering: 21st International Conference, ICWE 2021, Biarritz, France, May 18–21, 2021, Proceedings (pp. 240-254). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12706 LNCS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-74296-6_19
Jaradeh MY, Singh K, Stocker M, Both A, Auer S. Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines. in Brambilla M, Chbeir R, Frasincar F, Manolescu I, editors, Web Engineering: 21st International Conference, ICWE 2021, Biarritz, France, May 18–21, 2021, Proceedings. Springer Science and Business Media Deutschland GmbH. 2021. p. 240-254. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). Epub 2021 May 11. doi: 10.1007/978-3-030-74296-6_19
Jaradeh, Mohamad Yaser; Singh, Kuldeep; Stocker, Markus et al. / Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines. Web Engineering: 21st International Conference, ICWE 2021, Biarritz, France, May 18–21, 2021, Proceedings. ed. / Marco Brambilla; Richard Chbeir; Flavius Frasincar; Ioana Manolescu. Springer Science and Business Media Deutschland GmbH, 2021. p. 240-254 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{6df4b5e334ab48aaabb0bf8323059def,
title = "Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines",
abstract = "We propose Plumber, the first framework that brings together the research community{\textquoteright}s disjoint information extraction (IE) efforts. The Plumber architecture comprises 33 reusable components for various Knowledge Graphs (KG) information extraction subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable information extraction pipelines and offers overall 264 distinct pipelines. We study the optimization problem of choosing suitable pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting the KG triples using standard datasets over two KGs: DBpedia, and Open Research Knowledge Graph (ORKG). Our results demonstrate the effectiveness of Plumber in dynamically generating KG information extraction pipelines, outperforming all baselines agnostics of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components, and discuss their limitations.",
keywords = "Information extraction, NLP pipelines, Semantic search, Semantic Web, Software reusability",
author = "Jaradeh, {Mohamad Yaser} and Kuldeep Singh and Markus Stocker and Andreas Both and S{\"o}ren Auer",
note = "Funding Information: Acknowledgements. This work was co-funded by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536) and the TIB Leibniz Information Centre for Science and Technology. ; 21st International Conference on Web Engineering, ICWE 2021 ; Conference date: 18-05-2021 Through 21-05-2021",
year = "2021",
doi = "10.1007/978-3-030-74296-6_19",
language = "English",
isbn = "9783030742959",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Science and Business Media Deutschland GmbH",
pages = "240--254",
editor = "Marco Brambilla and Richard Chbeir and Flavius Frasincar and Ioana Manolescu",
booktitle = "Web Engineering",
address = "Germany",

}

TY - GEN

T1 - Better Call the Plumber

T2 - 21st International Conference on Web Engineering, ICWE 2021

AU - Jaradeh, Mohamad Yaser

AU - Singh, Kuldeep

AU - Stocker, Markus

AU - Both, Andreas

AU - Auer, Sören

N1 - Funding Information: Acknowledgements. This work was co-funded by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536) and the TIB Leibniz Information Centre for Science and Technology.

PY - 2021

Y1 - 2021

AB - We propose Plumber, the first framework that brings together the research community’s disjoint information extraction (IE) efforts. The Plumber architecture comprises 33 reusable components for various Knowledge Graphs (KG) information extraction subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable information extraction pipelines and offers overall 264 distinct pipelines. We study the optimization problem of choosing suitable pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting the KG triples using standard datasets over two KGs: DBpedia, and Open Research Knowledge Graph (ORKG). Our results demonstrate the effectiveness of Plumber in dynamically generating KG information extraction pipelines, outperforming all baselines agnostics of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components, and discuss their limitations.

KW - Information extraction

KW - NLP pipelines

KW - Semantic search

KW - Semantic Web

KW - Software reusability

UR - http://www.scopus.com/inward/record.url?scp=85111157954&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-74296-6_19

DO - 10.1007/978-3-030-74296-6_19

M3 - Conference contribution

AN - SCOPUS:85111157954

SN - 9783030742959

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 240

EP - 254

BT - Web Engineering

A2 - Brambilla, Marco

A2 - Chbeir, Richard

A2 - Frasincar, Flavius

A2 - Manolescu, Ioana

PB - Springer Science and Business Media Deutschland GmbH

Y2 - 18 May 2021 through 21 May 2021

ER -
