Back to the Roots of Genres: Text Classification by Language Function

Henning Wachsmuth; Kathrin Bujna

Details

Original language	English
Title of host publication	Proceedings of the 5th International Joint Conference on Natural Language Processing
Editors	Haifeng Wang, David Yarowsky
Pages	632-640
Number of pages	9
ISBN (electronic)	9789744665645
Publication status	Published - Nov 2011
Externally published	Yes
Event	5th International Joint Conference on Natural Language Processing, IJCNLP 2011 - Chiang Mai, Thailand
Duration	8 Nov 2011 → 13 Nov 2011

Abstract

The term “genre” covers different aspects of both texts and documents, and it has led to many classification schemes. This makes different approaches to genre identification incomparable and the task itself unclear. We introduce the linguistically motivated text classification task language function analysis, LFA, which focuses on one well-defined aspect of genres. The aim of LFA is to determine whether a text is predominantly expressive, appellative, or informative. LFA can be used in search and mining applications to efficiently filter documents of interest. Our approach to LFA relies on fast machine learning classifiers with features from different research areas. We evaluate this approach on a new corpus with 4,806 product texts from two domains. Within one domain, we correctly classify up to 82% of the texts, but differences in feature distribution limit accuracy on out-of-domain data.
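
To make the task framing concrete, here is a minimal sketch of LFA as a three-class text classification problem. It is not the authors' approach: the abstract only mentions fast machine learning classifiers with features from different research areas, so the multinomial Naive Bayes model, the bag-of-words features, and the toy training texts below are all assumptions chosen for illustration.

```python
import math
from collections import Counter

# The three language functions named in the abstract.
LABELS = ("expressive", "appellative", "informative")

# Hypothetical toy data, invented for this sketch (not from the paper's corpus).
TOY_EXAMPLES = [
    ("i love this camera it is wonderful", "expressive"),
    ("i hate the awful battery", "expressive"),
    ("buy now and save money today", "appellative"),
    ("order today and get free shipping", "appellative"),
    ("the camera has a 12 megapixel sensor", "informative"),
    ("the battery lasts ten hours", "informative"),
]


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, very simple tokenizer)."""
    return text.lower().split()


def train(examples):
    """Collect per-label word counts, per-label document counts, and the vocabulary."""
    word_counts = {label: Counter() for label in LABELS}
    doc_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def classify(model, text):
    """Return the label with the highest Naive Bayes log score (add-one smoothing)."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in LABELS:
        if doc_counts[label] == 0:
            continue  # no training texts for this label
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen in training
            score += math.log((word_counts[label][token] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TOY_EXAMPLES)
print(classify(model, "buy today"))
```

A real system would need a labeled corpus and richer features. The sketch only shows the input and output of the task: a text goes in, and the predominant language function comes out.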

ASJC Scopus subject areas

Arts and Humanities(all) › Language and Linguistics
Computer Science(all) › Artificial Intelligence
Computer Science(all) › Software
Social Sciences(all) › Linguistics and Language

Cite this

Back to the Roots of Genres: Text Classification by Language Function. / Wachsmuth, Henning; Bujna, Kathrin.
Proceedings of the 5th International Joint Conference on Natural Language Processing. ed. / Haifeng Wang; David Yarowsky. 2011. p. 632-640.

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research

Wachsmuth, H & Bujna, K 2011, Back to the Roots of Genres: Text Classification by Language Function. in H Wang & D Yarowsky (eds), Proceedings of the 5th International Joint Conference on Natural Language Processing. pp. 632-640, 5th International Joint Conference on Natural Language Processing, IJCNLP 2011, Chiang Mai, Thailand, 8 Nov 2011. <https://aclanthology.org/I11-1071.pdf>

Wachsmuth, H., & Bujna, K. (2011). Back to the Roots of Genres: Text Classification by Language Function. In H. Wang, & D. Yarowsky (Eds.), Proceedings of the 5th International Joint Conference on Natural Language Processing (pp. 632-640). https://aclanthology.org/I11-1071.pdf

Wachsmuth H, Bujna K. Back to the Roots of Genres: Text Classification by Language Function. In Wang H, Yarowsky D, editors, Proceedings of the 5th International Joint Conference on Natural Language Processing. 2011. p. 632-640

Wachsmuth, Henning ; Bujna, Kathrin. / Back to the Roots of Genres : Text Classification by Language Function. Proceedings of the 5th International Joint Conference on Natural Language Processing. editor / Haifeng Wang ; David Yarowsky. 2011. pp. 632-640


@inproceedings{41358c80f81f46a6a6af36f02ec78b04,
  title = "Back to the Roots of Genres: Text Classification by Language Function",
  abstract = "The term “genre” covers different aspects of both texts and documents, and it has led to many classification schemes. This makes different approaches to genre identification incomparable and the task itself unclear. We introduce the linguistically motivated text classification task language function analysis, LFA, which focuses on one well-defined aspect of genres. The aim of LFA is to determine whether a text is predominantly expressive, appellative, or informative. LFA can be used in search and mining applications to efficiently filter documents of interest. Our approach to LFA relies on fast machine learning classifiers with features from different research areas. We evaluate this approach on a new corpus with 4,806 product texts from two domains. Within one domain, we correctly classify up to 82% of the texts, but differences in feature distribution limit accuracy on out-of-domain data.",
  author = "Henning Wachsmuth and Kathrin Bujna",
  note = "Funding Information: This work was partly funded by the German Federal Ministry of Education and Research (BMBF) under contract number 01IS08007A.; 5th International Joint Conference on Natural Language Processing, IJCNLP 2011 ; Conference date: 08-11-2011 Through 13-11-2011",
  year = "2011",
  month = nov,
  language = "English",
  pages = "632--640",
  editor = "Haifeng Wang and David Yarowsky",
  booktitle = "Proceedings of the 5th International Joint Conference on Natural Language Processing",
}


TY - GEN
T1 - Back to the Roots of Genres: Text Classification by Language Function
T2 - 5th International Joint Conference on Natural Language Processing, IJCNLP 2011
AU - Wachsmuth, Henning
AU - Bujna, Kathrin
N1 - Funding Information: This work was partly funded by the German Federal Ministry of Education and Research (BMBF) under contract number 01IS08007A.
PY - 2011/11
Y1 - 2011/11
AB - The term “genre” covers different aspects of both texts and documents, and it has led to many classification schemes. This makes different approaches to genre identification incomparable and the task itself unclear. We introduce the linguistically motivated text classification task language function analysis, LFA, which focuses on one well-defined aspect of genres. The aim of LFA is to determine whether a text is predominantly expressive, appellative, or informative. LFA can be used in search and mining applications to efficiently filter documents of interest. Our approach to LFA relies on fast machine learning classifiers with features from different research areas. We evaluate this approach on a new corpus with 4,806 product texts from two domains. Within one domain, we correctly classify up to 82% of the texts, but differences in feature distribution limit accuracy on out-of-domain data.
UR - http://www.scopus.com/inward/record.url?scp=85041137690&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85041137690
SP - 632
EP - 640
BT - Proceedings of the 5th International Joint Conference on Natural Language Processing
A2 - Wang, Haifeng
A2 - Yarowsky, David
Y2 - 8 November 2011 through 13 November 2011
ER -
