Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines

Mohamad Yaser Jaradeh; Kuldeep Singh; Markus Stocker; Andreas Both; Sören Auer

doi:10.1007/978-3-030-74296-6_19

Details

Original language	English
Title of host publication	Web Engineering
Subtitle	21st International Conference, ICWE 2021, Biarritz, France, May 18–21, 2021, Proceedings
Editors	Marco Brambilla, Richard Chbeir, Flavius Frasincar, Ioana Manolescu
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	240-254
Number of pages	15
ISBN (Print)	9783030742959
Publication status	Published - 2021
Event	21st International Conference on Web Engineering, ICWE 2021 - Virtual, Online. Duration: 18 May 2021 → 21 May 2021

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	12706 LNCS
ISSN (Print)	0302-9743
ISSN (electronic)	1611-3349

Abstract

We propose Plumber, the first framework that brings together the research community’s disjoint information extraction (IE) efforts. The Plumber architecture comprises 33 reusable components for various knowledge graph (KG) information extraction subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable information extraction pipelines and offers 264 distinct pipelines overall. We study the optimization problem of choosing suitable pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting KG triples using standard datasets over two KGs: DBpedia and the Open Research Knowledge Graph (ORKG). Our results demonstrate the effectiveness of Plumber in dynamically generating KG information extraction pipelines, outperforming all baselines agnostic of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components, and discuss their limitations.

ASJC Scopus subject areas

Mathematics (all)
Theoretical Computer Science
Computer Science (all)
General Computer Science

Cite this

Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines. / Jaradeh, Mohamad Yaser; Singh, Kuldeep; Stocker, Markus et al.
Web Engineering: 21st International Conference, ICWE 2021, Biarritz, France, May 18–21, 2021, Proceedings. ed. / Marco Brambilla; Richard Chbeir; Flavius Frasincar; Ioana Manolescu. Springer Science and Business Media Deutschland GmbH, 2021. p. 240-254 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12706 LNCS).

Publication: Chapter in book/report/conference proceeding › Conference contribution › Research › Peer review

Jaradeh, MY, Singh, K, Stocker, M, Both, A & Auer, S 2021, Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines. in M Brambilla, R Chbeir, F Frasincar & I Manolescu (eds), Web Engineering: 21st International Conference, ICWE 2021, Biarritz, France, May 18–21, 2021, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 12706 LNCS, Springer Science and Business Media Deutschland GmbH, pp. 240-254, 21st International Conference on Web Engineering, ICWE 2021, Virtual, Online, 18 May 2021. https://doi.org/10.1007/978-3-030-74296-6_19

Jaradeh, M. Y., Singh, K., Stocker, M., Both, A., & Auer, S. (2021). Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines. In M. Brambilla, R. Chbeir, F. Frasincar, & I. Manolescu (Eds.), Web Engineering: 21st International Conference, ICWE 2021, Biarritz, France, May 18–21, 2021, Proceedings (pp. 240-254). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12706 LNCS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-74296-6_19

Jaradeh MY, Singh K, Stocker M, Both A, Auer S. Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines. In Brambilla M, Chbeir R, Frasincar F, Manolescu I, editors, Web Engineering: 21st International Conference, ICWE 2021, Biarritz, France, May 18–21, 2021, Proceedings. Springer Science and Business Media Deutschland GmbH. 2021. p. 240-254. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). Epub 2021 May 11. doi: 10.1007/978-3-030-74296-6_19

Jaradeh, Mohamad Yaser; Singh, Kuldeep; Stocker, Markus et al. / Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines. Web Engineering: 21st International Conference, ICWE 2021, Biarritz, France, May 18–21, 2021, Proceedings. ed. / Marco Brambilla; Richard Chbeir; Flavius Frasincar; Ioana Manolescu. Springer Science and Business Media Deutschland GmbH, 2021. p. 240-254 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{6df4b5e334ab48aaabb0bf8323059def,
title = "Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines",
abstract = "We propose Plumber, the first framework that brings together the research community{\textquoteright}s disjoint information extraction (IE) efforts. The Plumber architecture comprises 33 reusable components for various knowledge graph (KG) information extraction subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable information extraction pipelines and offers 264 distinct pipelines overall. We study the optimization problem of choosing suitable pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting KG triples using standard datasets over two KGs: DBpedia and the Open Research Knowledge Graph (ORKG). Our results demonstrate the effectiveness of Plumber in dynamically generating KG information extraction pipelines, outperforming all baselines agnostic of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components, and discuss their limitations.",
keywords = "Information extraction, NLP pipelines, Semantic search, Semantic Web, Software reusability",
author = "Jaradeh, {Mohamad Yaser} and Kuldeep Singh and Markus Stocker and Andreas Both and S{\"o}ren Auer",
note = "Funding Information: Acknowledgements. This work was co-funded by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536) and the TIB Leibniz Information Centre for Science and Technology. ; 21st International Conference on Web Engineering, ICWE 2021 ; Conference date: 18-05-2021 Through 21-05-2021",
year = "2021",
doi = "10.1007/978-3-030-74296-6_19",
language = "English",
isbn = "9783030742959",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Science and Business Media Deutschland GmbH",
pages = "240--254",
editor = "Marco Brambilla and Richard Chbeir and Flavius Frasincar and Ioana Manolescu",
booktitle = "Web Engineering",
address = "Germany",
}

TY  - GEN
T1  - Better Call the Plumber
T2  - 21st International Conference on Web Engineering, ICWE 2021
AU  - Jaradeh, Mohamad Yaser
AU  - Singh, Kuldeep
AU  - Stocker, Markus
AU  - Both, Andreas
AU  - Auer, Sören
N1  - Funding Information: Acknowledgements. This work was co-funded by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536) and the TIB Leibniz Information Centre for Science and Technology.
PY  - 2021
Y1  - 2021
N2  - We propose Plumber, the first framework that brings together the research community’s disjoint information extraction (IE) efforts. The Plumber architecture comprises 33 reusable components for various knowledge graph (KG) information extraction subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable information extraction pipelines and offers 264 distinct pipelines overall. We study the optimization problem of choosing suitable pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting KG triples using standard datasets over two KGs: DBpedia and the Open Research Knowledge Graph (ORKG). Our results demonstrate the effectiveness of Plumber in dynamically generating KG information extraction pipelines, outperforming all baselines agnostic of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components, and discuss their limitations.
AB  - We propose Plumber, the first framework that brings together the research community’s disjoint information extraction (IE) efforts. The Plumber architecture comprises 33 reusable components for various knowledge graph (KG) information extraction subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable information extraction pipelines and offers 264 distinct pipelines overall. We study the optimization problem of choosing suitable pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting KG triples using standard datasets over two KGs: DBpedia and the Open Research Knowledge Graph (ORKG). Our results demonstrate the effectiveness of Plumber in dynamically generating KG information extraction pipelines, outperforming all baselines agnostic of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components, and discuss their limitations.
KW  - Information extraction
KW  - NLP pipelines
KW  - Semantic search
KW  - Semantic Web
KW  - Software reusability
UR  - http://www.scopus.com/inward/record.url?scp=85111157954&partnerID=8YFLogxK
U2  - 10.1007/978-3-030-74296-6_19
DO  - 10.1007/978-3-030-74296-6_19
M3  - Conference contribution
AN  - SCOPUS:85111157954
SN  - 9783030742959
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 240
EP  - 254
BT  - Web Engineering
A2  - Brambilla, Marco
A2  - Chbeir, Richard
A2  - Frasincar, Flavius
A2  - Manolescu, Ioana
PB  - Springer Science and Business Media Deutschland GmbH
Y2  - 18 May 2021 through 21 May 2021
ER  -
