Towards Bias Detection in Online Text Corpora

Christoph Hube; Robert Jäschke; Besnik Fetahu

Details

Original language	English
Title of host publication	Bias in Information, Algorithms, and Systems
Subtitle of host publication	Proceedings of the International Workshop on Bias in Information, Algorithms, and Systems co-located with 13th International Conference on Transforming Digital Worlds (iConference 2018)
Pages	19-23
Number of pages	5
Publication status	Published - 2018
Event	2018 International Workshop on Bias in Information, Algorithms, and Systems, BIAS 2018 - Sheffield, United Kingdom (UK) Duration: 25 Mar 2018 → 25 Mar 2018

Publication series

Name	CEUR Workshop Proceedings
Publisher	CEUR Workshop Proceedings
Volume	2103
ISSN (Print)	1613-0073

Abstract

Natural language text corpora, depending on their genre, often contain bias that reflects the original content creator's point of view towards a subject. Even sources like Wikipedia, a collaboratively created encyclopedia that follows a Neutral Point of View (NPOV) policy, contain pages prone to such violations, due to either (i) Wikipedia contributors not being aware of NPOV policies or (ii) an intentional push towards specific points of view. We present an approach for identifying bias words in online textual corpora using semantic relations of word vectors created through word2vec. The bias word lists created by our approach help in identifying biased language in online texts.
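
The abstract gives the approach only in outline. One plausible reading is that a few known bias words are expanded with their nearest neighbours in the embedding space. The sketch below illustrates that idea and is not the authors' published procedure. The three-dimensional vectors, the seed word, the 0.9 threshold, and the helper names `cosine` and `expand_seeds` are all invented for illustration. In practice the vectors would come from a word2vec model trained on a corpus.

```python
import math

# Toy 3-dimensional "embeddings", invented purely for illustration.
# Real word2vec vectors are learned from text and have many more dimensions.
TOY_VECTORS = {
    "regime": (0.9, 0.1, 0.0),
    "dictatorship": (0.85, 0.2, 0.05),
    "junta": (0.8, 0.15, 0.1),
    "government": (0.3, 0.8, 0.1),
    "administration": (0.25, 0.85, 0.05),
    "table": (0.0, 0.1, 0.95),
}


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def expand_seeds(seeds, vectors, threshold=0.9):
    """Return non-seed words whose similarity to any known seed meets the
    threshold, ordered from most to least similar (hypothetical helper)."""
    known_seeds = [s for s in seeds if s in vectors]
    if not known_seeds:
        return []
    scores = {}
    for word, vec in vectors.items():
        if word in seeds:
            continue
        best = max(cosine(vec, vectors[s]) for s in known_seeds)
        if best >= threshold:
            scores[word] = best
    return sorted(scores, key=scores.get, reverse=True)


# Expand a single assumed seed word into a candidate bias word list.
bias_candidates = expand_seeds({"regime"}, TOY_VECTORS)
print(bias_candidates)
```

With these toy vectors, the words placed close to the seed are returned as candidates, while more neutral terms such as "government" fall below the threshold. A real word list would still need manual review, since embedding similarity alone does not establish that a word is biased.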

ASJC Scopus subject areas

Computer Science(all)
General Computer Science

Cite this

Towards Bias Detection in Online Text Corpora. / Hube, Christoph; Jäschke, Robert; Fetahu, Besnik.
Bias in Information, Algorithms, and Systems: Proceedings of the International Workshop on Bias in Information, Algorithms, and Systems co-located with 13th International Conference on Transforming Digital Worlds (iConference 2018). 2018. p. 19-23 (CEUR Workshop Proceedings; Vol. 2103).

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Hube, C, Jäschke, R & Fetahu, B 2018, Towards Bias Detection in Online Text Corpora. in Bias in Information, Algorithms, and Systems: Proceedings of the International Workshop on Bias in Information, Algorithms, and Systems co-located with 13th International Conference on Transforming Digital Worlds (iConference 2018). CEUR Workshop Proceedings, vol. 2103, pp. 19-23, 2018 International Workshop on Bias in Information, Algorithms, and Systems, BIAS 2018, Sheffield, United Kingdom (UK), 25 Mar 2018. <https://ceur-ws.org/Vol-2103/paper_4.pdf>

Hube, C., Jäschke, R., & Fetahu, B. (2018). Towards Bias Detection in Online Text Corpora. In Bias in Information, Algorithms, and Systems: Proceedings of the International Workshop on Bias in Information, Algorithms, and Systems co-located with 13th International Conference on Transforming Digital Worlds (iConference 2018) (pp. 19-23). (CEUR Workshop Proceedings; Vol. 2103). https://ceur-ws.org/Vol-2103/paper_4.pdf

Hube C, Jäschke R, Fetahu B. Towards Bias Detection in Online Text Corpora. In Bias in Information, Algorithms, and Systems: Proceedings of the International Workshop on Bias in Information, Algorithms, and Systems co-located with 13th International Conference on Transforming Digital Worlds (iConference 2018). 2018. p. 19-23. (CEUR Workshop Proceedings).

Hube, Christoph ; Jäschke, Robert ; Fetahu, Besnik. / Towards Bias Detection in Online Text Corpora. Bias in Information, Algorithms, and Systems: Proceedings of the International Workshop on Bias in Information, Algorithms, and Systems co-located with 13th International Conference on Transforming Digital Worlds (iConference 2018). 2018. pp. 19-23 (CEUR Workshop Proceedings).

@inproceedings{6277a8f8d7e74ef0a4b96e837428aae0,

title = "Towards Bias Detection in Online Text Corpora",

abstract = "Natural language text corpora, depending on their genre, often contain bias that reflects the original content creator's point of view towards a subject. Even sources like Wikipedia, a collaboratively created encyclopedia that follows a Neutral Point of View (NPOV) policy, contain pages prone to such violations, due to either (i) Wikipedia contributors not being aware of NPOV policies or (ii) an intentional push towards specific points of view. We present an approach for identifying bias words in online textual corpora using semantic relations of word vectors created through word2vec. The bias word lists created by our approach help in identifying biased language in online texts.",

author = "Christoph Hube and Robert J{\"a}schke and Besnik Fetahu",

note = "Funding information: Acknowledgments This work is funded by the ERC Advanced Grant ALEXANDRIA (grant no. 339233), DESIR (grant no. 31081), and H2020 AFEL project (grant no. 687916).; 2018 International Workshop on Bias in Information, Algorithms, and Systems, BIAS 2018 ; Conference date: 25-03-2018 Through 25-03-2018",

year = "2018",

language = "English",

series = "CEUR Workshop Proceedings",

publisher = "CEUR Workshop Proceedings",

pages = "19--23",

booktitle = "Bias in Information, Algorithms, and Systems",

}

TY - GEN

T1 - Towards Bias Detection in Online Text Corpora

AU - Hube, Christoph

AU - Jäschke, Robert

AU - Fetahu, Besnik

N1 - Funding information: Acknowledgments This work is funded by the ERC Advanced Grant ALEXANDRIA (grant no. 339233), DESIR (grant no. 31081), and H2020 AFEL project (grant no. 687916).

PY - 2018

Y1 - 2018

N2 - Natural language text corpora, depending on their genre, often contain bias that reflects the original content creator's point of view towards a subject. Even sources like Wikipedia, a collaboratively created encyclopedia that follows a Neutral Point of View (NPOV) policy, contain pages prone to such violations, due to either (i) Wikipedia contributors not being aware of NPOV policies or (ii) an intentional push towards specific points of view. We present an approach for identifying bias words in online textual corpora using semantic relations of word vectors created through word2vec. The bias word lists created by our approach help in identifying biased language in online texts.

AB - Natural language text corpora, depending on their genre, often contain bias that reflects the original content creator's point of view towards a subject. Even sources like Wikipedia, a collaboratively created encyclopedia that follows a Neutral Point of View (NPOV) policy, contain pages prone to such violations, due to either (i) Wikipedia contributors not being aware of NPOV policies or (ii) an intentional push towards specific points of view. We present an approach for identifying bias words in online textual corpora using semantic relations of word vectors created through word2vec. The bias word lists created by our approach help in identifying biased language in online texts.

UR - http://www.scopus.com/inward/record.url?scp=85048312610&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85048312610

T3 - CEUR Workshop Proceedings

SP - 19

EP - 23

BT - Bias in Information, Algorithms, and Systems

T2 - 2018 International Workshop on Bias in Information, Algorithms, and Systems, BIAS 2018

Y2 - 25 March 2018 through 25 March 2018

ER -
