Details
Original language | English |
---|---|
Title of host publication | CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management |
Place of publication | New York |
Publisher | Association for Computing Machinery (ACM) |
Pages | 2049-2058 |
Number of pages | 10 |
ISBN (Print) | 9781450322638 |
Publication status | Published - 27 Oct 2013 |
Externally published | Yes |
Event | 22nd ACM International Conference on Information and Knowledge Management - San Francisco, CA, United States. Duration: 27 Oct 2013 → 1 Nov 2013 |
Abstract
Information extraction is usually approached as an annotation task: Input texts run through several analysis steps of an extraction process in which different semantic concepts are annotated and matched against the slots of templates. We argue that such an approach lacks an efficient control of the input of the analysis steps. In this paper, we hence propose and evaluate a model and a formal approach that consistently put the filtering view in the focus: Before spending annotation effort, filter those portions of the input texts that may contain relevant information for filling a template and discard the others. We model all dependencies between the semantic concepts sought for with a truth maintenance system, which then efficiently infers the portions of text to be annotated in each analysis step. The filtering view enables an information extraction system (1) to annotate only relevant portions of input texts and (2) to easily trade its run-time efficiency for its recall. We provide our approach as an open-source extension of Apache UIMA and we show the potential of our approach in a number of experiments. Copyright is held by the owner/author(s).
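The filtering view described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the regex-based annotators, the `extract` function, the sentence-level granularity, and the rule that a sentence stays relevant only while every concept sought so far has been found are all assumptions made for illustration. A simple conjunctive check stands in for the truth maintenance system, and Apache UIMA is not used.

```python
import re
from typing import Callable, Dict, List, Optional

# Hypothetical toy annotators: each maps a sentence to the matches of one concept.
ANNOTATORS: Dict[str, Callable[[str], List[str]]] = {
    "Money": lambda s: re.findall(r"\$\d+(?:\.\d+)?(?: (?:million|billion))?", s),
    "Date": lambda s: re.findall(r"\b(?:19|20)\d{2}\b", s),
    "Organization": lambda s: re.findall(r"\b[A-Z][a-z]+ (?:Inc|Corp|Ltd)\b", s),
}


def extract(
    sentences: List[str],
    template: List[str],
    calls: Optional[Dict[str, int]] = None,
) -> List[Dict[str, List[str]]]:
    """Fill `template` (an ordered list of concept names) per sentence.

    After each analysis step, sentences that lack the concept just sought
    are discarded, so later steps only annotate the remaining portions.
    `calls` optionally records how many sentences each step processed.
    """
    relevant = [(sentence, {}) for sentence in sentences]
    for concept in template:
        survivors = []
        for sentence, slots in relevant:
            if calls is not None:
                calls[concept] = calls.get(concept, 0) + 1
            matches = ANNOTATORS[concept](sentence)
            if matches:  # the sentence can still fill the template
                survivors.append((sentence, {**slots, concept: matches}))
        relevant = survivors  # filter before the next analysis step
    return [slots for _, slots in relevant]


sentences = [
    "Acme Corp earned $5 million in 2012.",
    "The weather was nice in 2012.",
    "Shares rose to $3.",
    "Nothing relevant here.",
]
calls: Dict[str, int] = {}
filled = extract(sentences, ["Money", "Date", "Organization"], calls)
```

In this toy run, the first step sees all four sentences, the second sees two, and the third sees one, which is the kind of saving the filtering view aims for. Relaxing the relevance rule (for example, keeping a sentence when a neighbouring sentence contains the missing concept) would be one way to trade run-time efficiency for recall.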
ASJC Scopus subject areas
- Decision Sciences (all)
- General Decision Sciences
- Business, Management and Accounting (all)
- General Business, Management and Accounting
Cite this
Information Extraction as a Filtering Task. / Wachsmuth, Henning; Stein, Benno; Engels, Gregor. CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management. New York: Association for Computing Machinery (ACM), 2013. pp. 2049-2058.
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research
TY - GEN
T1 - Information Extraction as a Filtering Task
AU - Wachsmuth, Henning
AU - Stein, Benno
AU - Engels, Gregor
PY - 2013/10/27
Y1 - 2013/10/27
N2 - Information extraction is usually approached as an annotation task: Input texts run through several analysis steps of an extraction process in which different semantic concepts are annotated and matched against the slots of templates. We argue that such an approach lacks an efficient control of the input of the analysis steps. In this paper, we hence propose and evaluate a model and a formal approach that consistently put the filtering view in the focus: Before spending annotation effort, filter those portions of the input texts that may contain relevant information for filling a template and discard the others. We model all dependencies between the semantic concepts sought for with a truth maintenance system, which then efficiently infers the portions of text to be annotated in each analysis step. The filtering view enables an information extraction system (1) to annotate only relevant portions of input texts and (2) to easily trade its run-time efficiency for its recall. We provide our approach as an open-source extension of Apache UIMA and we show the potential of our approach in a number of experiments. Copyright is held by the owner/author(s).
KW - Filtering
KW - Information extraction
KW - Relevance
KW - Run-time efficiency
KW - Truth maintenance
UR - http://www.scopus.com/inward/record.url?scp=84889566679&partnerID=8YFLogxK
U2 - 10.1145/2505515.2505557
DO - 10.1145/2505515.2505557
M3 - Conference contribution
AN - SCOPUS:84889566679
SN - 9781450322638
SP - 2049
EP - 2058
BT - CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management
PB - Association for Computing Machinery (ACM)
CY - New York
T2 - 22nd ACM International Conference on Information and Knowledge Management, CIKM 2013
Y2 - 27 October 2013 through 1 November 2013
ER -