Details
Original language | English |
---|---|
Title of host publication | Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010 |
Editors | Daniel Tapias, Irene Russo, Olivier Hamon, Stelios Piperidis, Nicoletta Calzolari, Khalid Choukri, Joseph Mariani, Helene Mazo, Bente Maegaard, Jan Odijk, Mike Rosner |
Pages | 2657-2661 |
Number of pages | 5 |
ISBN (electronic) | 2951740867, 9782951740860 |
Publication status | Published - 1 Jan 2010 |
Event | 7th International Conference on Language Resources and Evaluation, LREC 2010 - Valletta, Malta; Duration: 17 May 2010 → 23 May 2010 |
Publication series
Name | Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010 |
---|
Abstract
Textual entailment has been recognized as a generic task that captures major semantic inference needs across many natural language processing applications. To date, textual entailment has not been considered in a cross-corpus setting, nor for user-generated content. The emergence of Medicine 2.0 has made medical blogs an increasingly accepted source of information; but given the characteristics of blogs (which tend to be noisy and informal, or to contain an interspersing of subjective and factual sentences), a potentially large amount of irrelevant information may be present. Considering this potential noise, the overarching problem with respect to information extraction from social media for medical intelligence gathering is achieving the correct level of filtering: the sentence level, as opposed to the document or blog post level. In this paper, we propose an approach to textual entailment which uses the text from one source of user-generated content (T text) for sentence-level filtering within a new and less amenable one (H text), when the underlying domain, tasks or semantic information is the same, or overlaps.
ASJC Scopus subject areas
- Social Sciences(all)
  - Education
  - Library and Information Sciences
  - Linguistics and Language
- Arts and Humanities(all)
  - Language and Linguistics
Cite this
Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010. ed. / Daniel Tapias; Irene Russo; Olivier Hamon; Stelios Piperidis; Nicoletta Calzolari; Khalid Choukri; Joseph Mariani; Helene Mazo; Bente Maegaard; Jan Odijk; Mike Rosner. 2010. p. 2657-2661 (Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010).
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Cross-Corpus Textual Entailment for Sublanguage Analysis in Epidemic Intelligence
AU - Stewart, Avaré
AU - Denecke, Kerstin
AU - Nejdl, Wolfgang
PY - 2010/1/1
Y1 - 2010/1/1
UR - http://www.scopus.com/inward/record.url?scp=85037528088&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85037528088
T3 - Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010
SP - 2657
EP - 2661
BT - Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010
A2 - Tapias, Daniel
A2 - Russo, Irene
A2 - Hamon, Olivier
A2 - Piperidis, Stelios
A2 - Calzolari, Nicoletta
A2 - Choukri, Khalid
A2 - Mariani, Joseph
A2 - Mazo, Helene
A2 - Maegaard, Bente
A2 - Odijk, Jan
A2 - Rosner, Mike
T2 - 7th International Conference on Language Resources and Evaluation, LREC 2010
Y2 - 17 May 2010 through 23 May 2010
ER -