Back to the Roots of Genres: Text Classification by Language Function

Henning Wachsmuth; Kathrin Bujna

Details

Original language	English
Title of host publication	Proceedings of the 5th International Joint Conference on Natural Language Processing
Editors	Haifeng Wang, David Yarowsky
Pages	632-640
Number of pages	9
ISBN (electronic)	9789744665645
Publication status	Published - Nov 2011
Published externally	Yes
Event	5th International Joint Conference on Natural Language Processing - Chiang Mai, Thailand. Duration: 8 Nov 2011 → 13 Nov 2011

Abstract

The term “genre” covers different aspects of both texts and documents, and it has led to many classification schemes. This makes different approaches to genre identification incomparable and the task itself unclear. We introduce the linguistically motivated text classification task language function analysis, LFA, which focuses on one well-defined aspect of genres. The aim of LFA is to determine whether a text is predominantly expressive, appellative, or informative. LFA can be used in search and mining applications to efficiently filter documents of interest. Our approach to LFA relies on fast machine learning classifiers with features from different research areas. We evaluate this approach on a new corpus with 4,806 product texts from two domains. Within one domain, we correctly classify up to 82% of the texts, but differences in feature distribution limit accuracy on out-of-domain data.
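To make the three-way task concrete, the following is a minimal sketch of a classifier that assigns a text to one of the classes expressive, appellative, or informative. It is not the authors' method: the abstract only states that fast machine learning classifiers with features from different research areas are used, so the multinomial Naive Bayes model, the bag-of-words features, the function names `tokenize`, `train`, and `classify`, and the toy training texts are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy data; the corpus from the paper is NOT reproduced here.
TOY_EXAMPLES = [
    ("I love this camera, it is wonderful", "expressive"),
    ("I hate the awful battery, terrible", "expressive"),
    ("Buy now and save money, order today", "appellative"),
    ("Do not miss this offer, order now", "appellative"),
    ("The camera has a 12 megapixel sensor and weighs 300 grams", "informative"),
    ("The battery capacity is 2000 mAh and the display measures 3 inches", "informative"),
]


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Collect per-class word counts, class frequencies, and the vocabulary."""
    word_counts = {}
    doc_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts.setdefault(label, Counter()).update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def classify(model, text):
    """Return the class with the highest log-probability (add-one smoothing)."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        counts = word_counts[label]
        denom = sum(counts.values()) + len(vocab)
        score = math.log(doc_counts[label] / total_docs)
        for tok in tokenize(text):
            if tok in vocab:  # tokens never seen in training are skipped
                score += math.log((counts[tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TOY_EXAMPLES)
print(classify(model, "order today and save"))
```

A filter of this kind could, for example, keep only the texts labeled informative before further mining. On real data, the accuracy would depend heavily on the features and the domain, as the abstract notes for out-of-domain texts.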

ASJC Scopus subject areas

Arts and Humanities (all)
Language and Linguistics
Computer Science (all)
Artificial Intelligence
Computer Science (all)
Software
Social Sciences (all)
Linguistics and Language

Cite this

Back to the Roots of Genres: Text Classification by Language Function. / Wachsmuth, Henning; Bujna, Kathrin.
Proceedings of the 5th International Joint Conference on Natural Language Processing. ed. / Haifeng Wang; David Yarowsky. 2011. p. 632-640.

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research

Wachsmuth, H & Bujna, K 2011, Back to the Roots of Genres: Text Classification by Language Function. in H Wang & D Yarowsky (eds), Proceedings of the 5th International Joint Conference on Natural Language Processing. pp. 632-640, 5th International Joint Conference on Natural Language Processing, Chiang Mai, Thailand, 8 Nov 2011. <https://aclanthology.org/I11-1071.pdf>

Wachsmuth, H., & Bujna, K. (2011). Back to the Roots of Genres: Text Classification by Language Function. In H. Wang, & D. Yarowsky (Eds.), Proceedings of the 5th International Joint Conference on Natural Language Processing (pp. 632-640). https://aclanthology.org/I11-1071.pdf

Wachsmuth H, Bujna K. Back to the Roots of Genres: Text Classification by Language Function. In Wang H, Yarowsky D, editors, Proceedings of the 5th International Joint Conference on Natural Language Processing. 2011. p. 632-640

Wachsmuth, Henning ; Bujna, Kathrin. / Back to the Roots of Genres : Text Classification by Language Function. Proceedings of the 5th International Joint Conference on Natural Language Processing. ed. / Haifeng Wang ; David Yarowsky. 2011. p. 632-640

@inproceedings{41358c80f81f46a6a6af36f02ec78b04,

title = "Back to the Roots of Genres: Text Classification by Language Function",

abstract = "The term “genre” covers different aspects of both texts and documents, and it has led to many classification schemes. This makes different approaches to genre identification incomparable and the task itself unclear. We introduce the linguistically motivated text classification task language function analysis, LFA, which focuses on one well-defined aspect of genres. The aim of LFA is to determine whether a text is predominantly expressive, appellative, or informative. LFA can be used in search and mining applications to efficiently filter documents of interest. Our approach to LFA relies on fast machine learning classifiers with features from different research areas. We evaluate this approach on a new corpus with 4,806 product texts from two domains. Within one domain, we correctly classify up to 82% of the texts, but differences in feature distribution limit accuracy on out-of-domain data.",

author = "Henning Wachsmuth and Kathrin Bujna",

note = "Funding Information: This work was partly funded by the German Federal Ministry of Education and Research (BMBF) under contract number 01IS08007A.; 5th International Joint Conference on Natural Language Processing, IJCNLP 2011 ; Conference date: 08-11-2011 Through 13-11-2011",

year = "2011",

month = nov,

language = "English",

pages = "632--640",

editor = "Haifeng Wang and David Yarowsky",

booktitle = "Proceedings of the 5th International Joint Conference on Natural Language Processing",

}

TY - GEN

T1 - Back to the Roots of Genres: Text Classification by Language Function

T2 - 5th International Joint Conference on Natural Language Processing, IJCNLP 2011

AU - Wachsmuth, Henning

AU - Bujna, Kathrin

N1 - Funding Information: This work was partly funded by the German Federal Ministry of Education and Research (BMBF) under contract number 01IS08007A.

PY - 2011/11

Y1 - 2011/11

AB - The term “genre” covers different aspects of both texts and documents, and it has led to many classification schemes. This makes different approaches to genre identification incomparable and the task itself unclear. We introduce the linguistically motivated text classification task language function analysis, LFA, which focuses on one well-defined aspect of genres. The aim of LFA is to determine whether a text is predominantly expressive, appellative, or informative. LFA can be used in search and mining applications to efficiently filter documents of interest. Our approach to LFA relies on fast machine learning classifiers with features from different research areas. We evaluate this approach on a new corpus with 4,806 product texts from two domains. Within one domain, we correctly classify up to 82% of the texts, but differences in feature distribution limit accuracy on out-of-domain data.

UR - http://www.scopus.com/inward/record.url?scp=85041137690&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85041137690

SP - 632

EP - 640

BT - Proceedings of the 5th International Joint Conference on Natural Language Processing

A2 - Wang, Haifeng

A2 - Yarowsky, David

Y2 - 8 November 2011 through 13 November 2011

ER -
