Back to the Roots of Genres: Text Classification by Language Function

Publication: Contribution to book/report/anthology/conference proceedings › Conference paper › Research

Authors

  • Henning Wachsmuth
  • Kathrin Bujna

External organisations

  • Universität Paderborn

Details

Original language: English
Title of host publication: Proceedings of the 5th International Joint Conference on Natural Language Processing
Editors: Haifeng Wang, David Yarowsky
Pages: 632-640
Number of pages: 9
ISBN (electronic): 9789744665645
Publication status: Published - Nov. 2011
Externally published: Yes
Event: 5th International Joint Conference on Natural Language Processing - Chiang Mai, Thailand
Duration: 8 Nov. 2011 to 13 Nov. 2011

Abstract

The term “genre” covers different aspects of both texts and documents, and it has led to many classification schemes. This makes different approaches to genre identification incomparable and the task itself unclear. We introduce the linguistically motivated text classification task language function analysis, LFA, which focuses on one well-defined aspect of genres. The aim of LFA is to determine whether a text is predominantly expressive, appellative, or informative. LFA can be used in search and mining applications to efficiently filter documents of interest. Our approach to LFA relies on fast machine learning classifiers with features from different research areas. We evaluate this approach on a new corpus with 4,806 product texts from two domains. Within one domain, we correctly classify up to 82% of the texts, but differences in feature distribution limit accuracy on out-of-domain data.

Cite this

Back to the Roots of Genres: Text Classification by Language Function. / Wachsmuth, Henning; Bujna, Kathrin.
Proceedings of the 5th International Joint Conference on Natural Language Processing. Ed. / Haifeng Wang; David Yarowsky. 2011. pp. 632-640.

Wachsmuth, H & Bujna, K 2011, Back to the Roots of Genres: Text Classification by Language Function. in H Wang & D Yarowsky (eds), Proceedings of the 5th International Joint Conference on Natural Language Processing. pp. 632-640, 5th International Joint Conference on Natural Language Processing, Chiang Mai, Thailand, 8 Nov. 2011. <https://aclanthology.org/I11-1071.pdf>
Wachsmuth, H., & Bujna, K. (2011). Back to the Roots of Genres: Text Classification by Language Function. In H. Wang, & D. Yarowsky (Eds.), Proceedings of the 5th International Joint Conference on Natural Language Processing (pp. 632-640). https://aclanthology.org/I11-1071.pdf
Wachsmuth H, Bujna K. Back to the Roots of Genres: Text Classification by Language Function. In Wang H, Yarowsky D, editors, Proceedings of the 5th International Joint Conference on Natural Language Processing. 2011. p. 632-640
Wachsmuth, Henning ; Bujna, Kathrin. / Back to the Roots of Genres : Text Classification by Language Function. Proceedings of the 5th International Joint Conference on Natural Language Processing. Ed. / Haifeng Wang ; David Yarowsky. 2011. pp. 632-640
@inproceedings{41358c80f81f46a6a6af36f02ec78b04,
title = "Back to the Roots of Genres: Text Classification by Language Function",
abstract = "The term “genre” covers different aspects of both texts and documents, and it has led to many classification schemes. This makes different approaches to genre identification incomparable and the task itself unclear. We introduce the linguistically motivated text classification task language function analysis, LFA, which focuses on one well-defined aspect of genres. The aim of LFA is to determine whether a text is predominantly expressive, appellative, or informative. LFA can be used in search and mining applications to efficiently filter documents of interest. Our approach to LFA relies on fast machine learning classifiers with features from different research areas. We evaluate this approach on a new corpus with 4,806 product texts from two domains. Within one domain, we correctly classify up to 82% of the texts, but differences in feature distribution limit accuracy on out-of-domain data.",
author = "Henning Wachsmuth and Kathrin Bujna",
note = "Funding Information: This work was partly funded by the German Federal Ministry of Education and Research (BMBF) under contract number 01IS08007A.; 5th International Joint Conference on Natural Language Processing, IJCNLP 2011 ; Conference date: 08-11-2011 Through 13-11-2011",
year = "2011",
month = nov,
language = "English",
pages = "632--640",
editor = "Haifeng Wang and David Yarowsky",
booktitle = "Proceedings of the 5th International Joint Conference on Natural Language Processing",
}

TY  - GEN
T1  - Back to the Roots of Genres
T2  - 5th International Joint Conference on Natural Language Processing, IJCNLP 2011
AU  - Wachsmuth, Henning
AU  - Bujna, Kathrin
N1  - Funding Information: This work was partly funded by the German Federal Ministry of Education and Research (BMBF) under contract number 01IS08007A.
PY  - 2011/11
Y1  - 2011/11
N2  - The term “genre” covers different aspects of both texts and documents, and it has led to many classification schemes. This makes different approaches to genre identification incomparable and the task itself unclear. We introduce the linguistically motivated text classification task language function analysis, LFA, which focuses on one well-defined aspect of genres. The aim of LFA is to determine whether a text is predominantly expressive, appellative, or informative. LFA can be used in search and mining applications to efficiently filter documents of interest. Our approach to LFA relies on fast machine learning classifiers with features from different research areas. We evaluate this approach on a new corpus with 4,806 product texts from two domains. Within one domain, we correctly classify up to 82% of the texts, but differences in feature distribution limit accuracy on out-of-domain data.
AB  - The term “genre” covers different aspects of both texts and documents, and it has led to many classification schemes. This makes different approaches to genre identification incomparable and the task itself unclear. We introduce the linguistically motivated text classification task language function analysis, LFA, which focuses on one well-defined aspect of genres. The aim of LFA is to determine whether a text is predominantly expressive, appellative, or informative. LFA can be used in search and mining applications to efficiently filter documents of interest. Our approach to LFA relies on fast machine learning classifiers with features from different research areas. We evaluate this approach on a new corpus with 4,806 product texts from two domains. Within one domain, we correctly classify up to 82% of the texts, but differences in feature distribution limit accuracy on out-of-domain data.
UR  - http://www.scopus.com/inward/record.url?scp=85041137690&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:85041137690
SP  - 632
EP  - 640
BT  - Proceedings of the 5th International Joint Conference on Natural Language Processing
A2  - Wang, Haifeng
A2  - Yarowsky, David
Y2  - 8 November 2011 through 13 November 2011
ER  -
