Constructing Efficient Information Extraction Pipelines

Henning Wachsmuth; Benno Stein; Gregor Engels

doi:10.1145/2063576.2063935

Details

Original language	English
Title of host publication	CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management
Place of publication	New York
Publisher	Association for Computing Machinery (ACM)
Pages	2237-2240
Number of pages	4
ISBN (Print)	9781450307178
Publication status	Published - Oct. 2011
Externally published	Yes
Event	20th ACM Conference on Information and Knowledge Management, CIKM'11 - Glasgow, United Kingdom. Duration: 24 Oct. 2011 → 28 Oct. 2011

Abstract

Information Extraction (IE) pipelines analyze text through several stages. The pipeline's algorithms determine both its effectiveness and its run-time efficiency. In real-world tasks, however, IE pipelines often fail to achieve acceptable run-times because they analyze too much task-irrelevant text. This raises two interesting questions: 1) How much "efficiency potential" depends on the scheduling of a pipeline's algorithms? 2) Is it possible to devise a reliable method to construct efficient IE pipelines? Both questions are addressed in this paper. In particular, we show how to optimize the run-time efficiency of IE pipelines under a given set of algorithms. We evaluate pipelines for three algorithm sets on an industrially relevant task: the extraction of market forecasts from news articles. Using a system-independent measure, we demonstrate that efficiency gains of up to one order of magnitude are possible without compromising a pipeline's original effectiveness.
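
The abstract's point that scheduling alone can change run-time while leaving the output untouched can be illustrated with a toy model. The following is a minimal sketch, not the method from the paper: the stage names, the per-sentence cost units, the keyword filters, and the sample sentences are all hypothetical, and the stages are assumed to be independent of each other, so any order is admissible. Each stage only analyzes sentences that earlier stages kept, so running cheap, selective stages first reduces the total cost.

```python
from itertools import permutations
from typing import Callable, List, NamedTuple, Sequence, Tuple


class Stage(NamedTuple):
    name: str
    cost: int                    # hypothetical cost units per analyzed sentence
    keep: Callable[[str], bool]  # True if the sentence stays task-relevant


def run(schedule: Sequence[Stage], sentences: Sequence[str]) -> Tuple[List[str], int]:
    """Run the stages in order; each stage sees only the surviving sentences."""
    remaining, total_cost = list(sentences), 0
    for stage in schedule:
        total_cost += stage.cost * len(remaining)
        remaining = [s for s in remaining if stage.keep(s)]
    return remaining, total_cost


def best_schedule(stages: Sequence[Stage], sentences: Sequence[str]) -> Tuple[Stage, ...]:
    """Brute-force the cheapest order (feasible only for a handful of stages)."""
    return min(permutations(stages), key=lambda order: run(order, sentences)[1])


# Hypothetical stages: simple keyword filters standing in for real extractors.
STAGES = [
    Stage("time", 1, lambda s: "2012" in s),
    Stage("money", 2, lambda s: "$" in s),
    Stage("forecast", 10, lambda s: "expected" in s),
]

SENTENCES = [
    "Revenue is expected to reach $5 billion in 2012.",
    "The company was founded in Glasgow.",
    "Shares closed at $12 yesterday.",
    "Analysts met in 2012 to discuss the market.",
]

if __name__ == "__main__":
    for order in permutations(STAGES):
        kept, cost = run(order, SENTENCES)
        print([stage.name for stage in order], "cost:", cost, "kept:", len(kept))
```

In this toy setting every order keeps the same single sentence, because the filters are applied conjunctively, but the cost differs: running the expensive "forecast" stage first costs 43 units, while the order time, money, forecast costs 18. Real pipelines have dependencies between algorithms that restrict the admissible orders, which this sketch ignores.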

ASJC Scopus subject areas

Decision Sciences (all)
General Decision Sciences
Business, Management and Accounting (all)
General Business, Management and Accounting

Cite

Constructing Efficient Information Extraction Pipelines. / Wachsmuth, Henning; Stein, Benno; Engels, Gregor.
CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management. New York: Association for Computing Machinery (ACM), 2011. pp. 2237-2240.

Publication: Chapter in book/report/conference proceedings › Conference contribution › Research

Wachsmuth, H, Stein, B & Engels, G 2011, Constructing Efficient Information Extraction Pipelines. in CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management. Association for Computing Machinery (ACM), New York, pp. 2237-2240, 20th ACM Conference on Information and Knowledge Management, CIKM'11, Glasgow, United Kingdom, 24 Oct. 2011. https://doi.org/10.1145/2063576.2063935

Wachsmuth, H., Stein, B., & Engels, G. (2011). Constructing Efficient Information Extraction Pipelines. In CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management (pp. 2237-2240). Association for Computing Machinery (ACM). https://doi.org/10.1145/2063576.2063935

Wachsmuth H, Stein B, Engels G. Constructing Efficient Information Extraction Pipelines. in CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management. New York: Association for Computing Machinery (ACM). 2011. pp. 2237-2240. doi: 10.1145/2063576.2063935

Wachsmuth, Henning ; Stein, Benno ; Engels, Gregor. / Constructing Efficient Information Extraction Pipelines. CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management. New York : Association for Computing Machinery (ACM), 2011. pp. 2237-2240

@inproceedings{546437cc6b654257b4d4c722f3a093ea,

title = "Constructing Efficient Information Extraction Pipelines",

abstract = "Information Extraction (IE) pipelines analyze text through several stages. The pipeline's algorithms determine both its effectiveness and its run-time efficiency. In real-world tasks, however, IE pipelines often fail acceptable run-times because they analyze too much task-irrelevant text. This raises two interesting questions: 1) How much {"}efficiency potential{"} depends on the scheduling of a pipeline's algorithms? 2) Is it possible to devise a reliable method to construct efficient IE pipelines? Both questions are addressed in this paper. In particular, we show how to optimize the run-time efficiency of IE pipelines under a given set of algorithms. We evaluate pipelines for three algorithm sets on an industrially relevant task: the extraction of market forecasts from news articles. Using a system-independent measure, we demonstrate that efficiency gains of up to one order of magnitude are possible without compromising a pipeline's original effectiveness.",

keywords = "information extraction, run-time efficiency",

author = "Henning Wachsmuth and Benno Stein and Gregor Engels",

year = "2011",

month = oct,

doi = "10.1145/2063576.2063935",

language = "English",

isbn = "9781450307178",

pages = "2237--2240",

booktitle = "CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management",

publisher = "Association for Computing Machinery (ACM)",

address = "United States",

note = "20th ACM Conference on Information and Knowledge Management, CIKM'11 ; Conference date: 24-10-2011 Through 28-10-2011",

}

TY - GEN

T1 - Constructing Efficient Information Extraction Pipelines

AU - Wachsmuth, Henning

AU - Stein, Benno

AU - Engels, Gregor

PY - 2011/10

Y1 - 2011/10

AB - Information Extraction (IE) pipelines analyze text through several stages. The pipeline's algorithms determine both its effectiveness and its run-time efficiency. In real-world tasks, however, IE pipelines often fail acceptable run-times because they analyze too much task-irrelevant text. This raises two interesting questions: 1) How much "efficiency potential" depends on the scheduling of a pipeline's algorithms? 2) Is it possible to devise a reliable method to construct efficient IE pipelines? Both questions are addressed in this paper. In particular, we show how to optimize the run-time efficiency of IE pipelines under a given set of algorithms. We evaluate pipelines for three algorithm sets on an industrially relevant task: the extraction of market forecasts from news articles. Using a system-independent measure, we demonstrate that efficiency gains of up to one order of magnitude are possible without compromising a pipeline's original effectiveness.

KW - information extraction

KW - run-time efficiency

UR - http://www.scopus.com/inward/record.url?scp=83055186740&partnerID=8YFLogxK

U2 - 10.1145/2063576.2063935

DO - 10.1145/2063576.2063935

M3 - Conference contribution

AN - SCOPUS:83055186740

SN - 9781450307178

SP - 2237

EP - 2240

BT - CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management

PB - Association for Computing Machinery (ACM)

CY - New York

T2 - 20th ACM Conference on Information and Knowledge Management, CIKM'11

Y2 - 24 October 2011 through 28 October 2011

ER -
