Details
Original language | English |
---|---|
Title of host publication | CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management |
Place of publication | New York |
Publisher | Association for Computing Machinery (ACM) |
Pages | 2237-2240 |
Number of pages | 4 |
ISBN (Print) | 9781450307178 |
Publication status | Published - Oct 2011 |
Externally published | Yes |
Event | 20th ACM Conference on Information and Knowledge Management, CIKM'11 - Glasgow, United Kingdom. Duration: 24 Oct 2011 → 28 Oct 2011 |
Abstract
Information Extraction (IE) pipelines analyze text through several stages. The pipeline's algorithms determine both its effectiveness and its run-time efficiency. In real-world tasks, however, IE pipelines often fail to meet acceptable run-times because they analyze too much task-irrelevant text. This raises two interesting questions: 1) How much "efficiency potential" depends on the scheduling of a pipeline's algorithms? 2) Is it possible to devise a reliable method to construct efficient IE pipelines? Both questions are addressed in this paper. In particular, we show how to optimize the run-time efficiency of IE pipelines under a given set of algorithms. We evaluate pipelines for three algorithm sets on an industrially relevant task: the extraction of market forecasts from news articles. Using a system-independent measure, we demonstrate that efficiency gains of up to one order of magnitude are possible without compromising a pipeline's original effectiveness.
ASJC Scopus subject areas
- Decision Sciences (all)
- General Decision Sciences
- Business, Management and Accounting (all)
- General Business, Management and Accounting
Cite this
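Wachsmuth, Henning; Stein, Benno; Engels, Gregor. Constructing Efficient Information Extraction Pipelines. CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management. New York: Association for Computing Machinery (ACM), 2011. pp. 2237-2240.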
Publication: Contribution to book/report/anthology/conference proceedings › Conference paper › Research
TY - GEN
T1 - Constructing Efficient Information Extraction Pipelines
AU - Wachsmuth, Henning
AU - Stein, Benno
AU - Engels, Gregor
PY - 2011/10
Y1 - 2011/10
AB - Information Extraction (IE) pipelines analyze text through several stages. The pipeline's algorithms determine both its effectiveness and its run-time efficiency. In real-world tasks, however, IE pipelines often fail to meet acceptable run-times because they analyze too much task-irrelevant text. This raises two interesting questions: 1) How much "efficiency potential" depends on the scheduling of a pipeline's algorithms? 2) Is it possible to devise a reliable method to construct efficient IE pipelines? Both questions are addressed in this paper. In particular, we show how to optimize the run-time efficiency of IE pipelines under a given set of algorithms. We evaluate pipelines for three algorithm sets on an industrially relevant task: the extraction of market forecasts from news articles. Using a system-independent measure, we demonstrate that efficiency gains of up to one order of magnitude are possible without compromising a pipeline's original effectiveness.
KW - information extraction
KW - run-time efficiency
UR - http://www.scopus.com/inward/record.url?scp=83055186740&partnerID=8YFLogxK
U2 - 10.1145/2063576.2063935
DO - 10.1145/2063576.2063935
M3 - Conference contribution
AN - SCOPUS:83055186740
SN - 9781450307178
SP - 2237
EP - 2240
BT - CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management
PB - Association for Computing Machinery (ACM)
CY - New York
T2 - 20th ACM Conference on Information and Knowledge Management, CIKM'11
Y2 - 24 October 2011 through 28 October 2011
ER -