Information Extraction as a Filtering Task

Publication: Contribution to book/report/anthology/conference proceedings › Conference paper › Research

Authors

  • Henning Wachsmuth
  • Benno Stein
  • Gregor Engels

External organisations

  • Universität Paderborn
  • Bauhaus-Universität Weimar

Details

Original language: English
Title of host publication: CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management
Place of publication: New York
Publisher: Association for Computing Machinery (ACM)
Pages: 2049-2058
Number of pages: 10
ISBN (Print): 9781450322638
Publication status: Published - 27 Oct 2013
Externally published: Yes
Event: 22nd ACM International Conference on Information and Knowledge Management - San Francisco, CA, USA
Duration: 27 Oct 2013 to 1 Nov 2013

Abstract

Information extraction is usually approached as an annotation task: Input texts run through several analysis steps of an extraction process in which different semantic concepts are annotated and matched against the slots of templates. We argue that such an approach lacks an efficient control of the input of the analysis steps. In this paper, we hence propose and evaluate a model and a formal approach that consistently put the filtering view in the focus: Before spending annotation effort, filter those portions of the input texts that may contain relevant information for filling a template and discard the others. We model all dependencies between the semantic concepts sought for with a truth maintenance system, which then efficiently infers the portions of text to be annotated in each analysis step. The filtering view enables an information extraction system (1) to annotate only relevant portions of input texts and (2) to easily trade its run-time efficiency for its recall. We provide our approach as an open-source extension of Apache UIMA and we show the potential of our approach in a number of experiments. Copyright is held by the owner/author(s).
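The filtering view described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use the Apache UIMA API. The regex-based annotators, the concept names (MONEY, TIME, ORG), the sentence-level text portions and the `extract` function are all hypothetical, and a simple rule that a template needs every concept stands in for the truth maintenance system.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical toy annotators: each returns the matches of one concept in a sentence.
Annotator = Callable[[str], List[str]]

ANNOTATORS: Dict[str, Annotator] = {
    "MONEY": lambda s: re.findall(r"\$\d+(?:\.\d+)?\s?(?:million|billion)?", s),
    "TIME": lambda s: re.findall(r"\b(?:19|20)\d{2}\b", s),
    "ORG": lambda s: re.findall(r"\b[A-Z][a-z]+ (?:Inc|Corp|AG)\b", s),
}


def extract(sentences: List[str], pipeline: List[str]) -> Tuple[List[dict], int]:
    """Run the annotators in order, keeping only sentences that can still fill the template.

    Assumption: the template needs every concept in `pipeline` within one sentence,
    so a sentence is discarded as soon as one concept is missing.
    """
    relevant: Dict[int, Dict[str, List[str]]] = {i: {} for i in range(len(sentences))}
    calls = 0  # number of annotator invocations, a rough proxy for run-time
    for concept in pipeline:
        annotate = ANNOTATORS[concept]
        for i in list(relevant):  # copy the keys, since entries are deleted below
            calls += 1
            found = annotate(sentences[i])
            if found:
                relevant[i][concept] = found
            else:
                del relevant[i]  # cannot fill the template any more: filter it out
    templates = [{"sentence": sentences[i], **slots} for i, slots in relevant.items()]
    return templates, calls


SENTENCES = [
    "Acme Corp expects revenues of $5 million in 2014.",
    "The weather was nice in 2012.",
    "Nothing relevant here.",
]

templates, calls = extract(SENTENCES, ["MONEY", "TIME", "ORG"])
print(templates)
print(calls, "annotator calls instead of", len(SENTENCES) * len(ANNOTATORS))
```

In this toy setting, the later annotators only see the one sentence that survived the MONEY step. Keeping sentences that miss a concept would be one way to trade run-time efficiency for recall.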

ASJC Scopus subject areas

Cite this

Information Extraction as a Filtering Task. / Wachsmuth, Henning; Stein, Benno; Engels, Gregor.
CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management. New York: Association for Computing Machinery (ACM), 2013. pp. 2049-2058.

Wachsmuth, H, Stein, B & Engels, G 2013, Information Extraction as a Filtering Task. in CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management. Association for Computing Machinery (ACM), New York, pp. 2049-2058, 22nd ACM International Conference on Information and Knowledge Management, San Francisco, CA, USA, 27 Oct 2013. https://doi.org/10.1145/2505515.2505557
Wachsmuth, H., Stein, B., & Engels, G. (2013). Information Extraction as a Filtering Task. In CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management (pp. 2049-2058). Association for Computing Machinery (ACM). https://doi.org/10.1145/2505515.2505557
Wachsmuth H, Stein B, Engels G. Information Extraction as a Filtering Task. in CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management. New York: Association for Computing Machinery (ACM). 2013. pp. 2049-2058. doi: 10.1145/2505515.2505557
Wachsmuth, Henning ; Stein, Benno ; Engels, Gregor. / Information Extraction as a Filtering Task. CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management. New York : Association for Computing Machinery (ACM), 2013. pp. 2049-2058
@inproceedings{d4f0797e60284448ae48e67f8398743e,
title = "Information Extraction as a Filtering Task",
abstract = "Information extraction is usually approached as an annotation task: Input texts run through several analysis steps of an extraction process in which different semantic concepts are annotated and matched against the slots of templates. We argue that such an approach lacks an efficient control of the input of the analysis steps. In this paper, we hence propose and evaluate a model and a formal approach that consistently put the filtering view in the focus: Before spending annotation effort, filter those portions of the input texts that may contain relevant information for filling a template and discard the others. We model all dependencies between the semantic concepts sought for with a truth maintenance system, which then efficiently infers the portions of text to be annotated in each analysis step. The filtering view enables an information extraction system (1) to annotate only relevant portions of input texts and (2) to easily trade its run-time efficiency for its recall. We provide our approach as an open-source extension of Apache UIMA and we show the potential of our approach in a number of experiments. Copyright is held by the owner/author(s).",
keywords = "Filtering, Information extraction, Relevance, Run-time efficiency, Truth maintenance",
author = "Henning Wachsmuth and Benno Stein and Gregor Engels",
year = "2013",
month = oct,
day = "27",
doi = "10.1145/2505515.2505557",
language = "English",
isbn = "9781450322638",
pages = "2049--2058",
booktitle = "CIKM '13: Proceedings of the 22nd ACM international conference on Information \& Knowledge Management",
publisher = "Association for Computing Machinery (ACM)",
address = "New York",
note = "22nd ACM International Conference on Information and Knowledge Management, CIKM 2013 ; Conference date: 27 October 2013 through 1 November 2013",
}

TY - GEN
T1 - Information Extraction as a Filtering Task
AU - Wachsmuth, Henning
AU - Stein, Benno
AU - Engels, Gregor
PY - 2013/10/27
Y1 - 2013/10/27

N2 - Information extraction is usually approached as an annotation task: Input texts run through several analysis steps of an extraction process in which different semantic concepts are annotated and matched against the slots of templates. We argue that such an approach lacks an efficient control of the input of the analysis steps. In this paper, we hence propose and evaluate a model and a formal approach that consistently put the filtering view in the focus: Before spending annotation effort, filter those portions of the input texts that may contain relevant information for filling a template and discard the others. We model all dependencies between the semantic concepts sought for with a truth maintenance system, which then efficiently infers the portions of text to be annotated in each analysis step. The filtering view enables an information extraction system (1) to annotate only relevant portions of input texts and (2) to easily trade its run-time efficiency for its recall. We provide our approach as an open-source extension of Apache UIMA and we show the potential of our approach in a number of experiments. Copyright is held by the owner/author(s).

AB - Information extraction is usually approached as an annotation task: Input texts run through several analysis steps of an extraction process in which different semantic concepts are annotated and matched against the slots of templates. We argue that such an approach lacks an efficient control of the input of the analysis steps. In this paper, we hence propose and evaluate a model and a formal approach that consistently put the filtering view in the focus: Before spending annotation effort, filter those portions of the input texts that may contain relevant information for filling a template and discard the others. We model all dependencies between the semantic concepts sought for with a truth maintenance system, which then efficiently infers the portions of text to be annotated in each analysis step. The filtering view enables an information extraction system (1) to annotate only relevant portions of input texts and (2) to easily trade its run-time efficiency for its recall. We provide our approach as an open-source extension of Apache UIMA and we show the potential of our approach in a number of experiments. Copyright is held by the owner/author(s).

KW - Filtering
KW - Information extraction
KW - Relevance
KW - Run-time efficiency
KW - Truth maintenance
UR - http://www.scopus.com/inward/record.url?scp=84889566679&partnerID=8YFLogxK
U2 - 10.1145/2505515.2505557
DO - 10.1145/2505515.2505557
M3 - Conference contribution
AN - SCOPUS:84889566679
SN - 9781450322638
SP - 2049
EP - 2058
BT - CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management
PB - Association for Computing Machinery (ACM)
CY - New York
T2 - 22nd ACM International Conference on Information and Knowledge Management, CIKM 2013
Y2 - 27 October 2013 through 1 November 2013
ER -
