Towards Bias Detection in Online Text Corpora

Research output: Chapter in book/report/conference proceeding; Conference contribution; Research; peer review

Authors

  • Christoph Hube
  • Robert Jäschke
  • Besnik Fetahu

Research Organisations

External Research Organisations

  • Humboldt-Universität zu Berlin (HU Berlin)

Details

Original language: English
Title of host publication: Bias in Information, Algorithms, and Systems
Subtitle of host publication: Proceedings of the International Workshop on Bias in Information, Algorithms, and Systems co-located with 13th International Conference on Transforming Digital Worlds (iConference 2018)
Pages: 19-23
Number of pages: 5
Publication status: Published - 2018
Event: 2018 International Workshop on Bias in Information, Algorithms, and Systems, BIAS 2018 - Sheffield, United Kingdom (UK)
Duration: 25 Mar 2018 - 25 Mar 2018

Publication series

Name: CEUR Workshop Proceedings
Publisher: CEUR Workshop Proceedings
Volume: 2103
ISSN (Print): 1613-0073

Abstract

Natural language textual corpora, depending on their genre, often contain bias which reflects the point of view of the original content creator towards a subject. Even sources like Wikipedia, a collaboratively created encyclopedia which follows a Neutral Point of View (NPOV) policy, contain pages that are prone to such violations, due to either (i) Wikipedia contributors not being aware of NPOV policies or (ii) an intentional push towards specific points of view. We present an approach for identifying bias words in online textual corpora using semantic relations of word vectors created through word2Vec. The bias word lists created by our approach help in identifying biased language in online texts.
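The abstract does not spell out the procedure, so the following is only a rough sketch of one way word-vector relations could be used to build a bias word list: starting from a few seed bias words and ranking the remaining vocabulary by cosine similarity to their centroid. The function names, the seed-expansion strategy, and the tiny hand-written vectors are all assumptions made for illustration; they are not taken from the paper, and real vectors would come from a trained word2vec model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_seed_words(seeds, vectors, top_k=2):
    """Rank non-seed words by similarity to the centroid of the seed vectors.

    Hypothetical helper: an illustrative seed-expansion step,
    not the method described in the paper.
    """
    known = [w for w in seeds if w in vectors]
    if not known:
        return []
    dim = len(vectors[known[0]])
    centroid = [sum(vectors[w][i] for w in known) / len(known) for i in range(dim)]
    candidates = [
        (word, cosine(centroid, vec))
        for word, vec in vectors.items()
        if word not in seeds
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates[:top_k]


# Toy 3-dimensional vectors, invented for this example (assumption).
TOY_VECTORS = {
    "notorious": [0.9, 0.1, 0.0],
    "infamous": [0.85, 0.15, 0.05],
    "so-called": [0.8, 0.2, 0.1],
    "table": [0.0, 0.1, 0.9],
    "river": [0.1, 0.0, 0.95],
}

ranked = expand_seed_words(["notorious"], TOY_VECTORS, top_k=2)
for word, score in ranked:
    print(f"{word}\t{score:.3f}")
```

With these toy vectors the two words closest to the seed are "infamous" and "so-called", while "table" and "river" rank far lower. In practice the candidate list would still need manual or automatic filtering before being treated as a bias lexicon.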

Cite this

Towards Bias Detection in Online Text Corpora. / Hube, Christoph; Jäschke, Robert; Fetahu, Besnik.
Bias in Information, Algorithms, and Systems: Proceedings of the International Workshop on Bias in Information, Algorithms, and Systems co-located with 13th International Conference on Transforming Digital Worlds (iConference 2018). 2018. p. 19-23 (CEUR Workshop Proceedings; Vol. 2103).

Hube, C, Jäschke, R & Fetahu, B 2018, Towards Bias Detection in Online Text Corpora. in Bias in Information, Algorithms, and Systems: Proceedings of the International Workshop on Bias in Information, Algorithms, and Systems co-located with 13th International Conference on Transforming Digital Worlds (iConference 2018). CEUR Workshop Proceedings, vol. 2103, pp. 19-23, 2018 International Workshop on Bias in Information, Algorithms, and Systems, BIAS 2018, Sheffield, United Kingdom (UK), 25 Mar 2018. <https://ceur-ws.org/Vol-2103/paper_4.pdf>
Hube, C., Jäschke, R., & Fetahu, B. (2018). Towards Bias Detection in Online Text Corpora. In Bias in Information, Algorithms, and Systems: Proceedings of the International Workshop on Bias in Information, Algorithms, and Systems co-located with 13th International Conference on Transforming Digital Worlds (iConference 2018) (pp. 19-23). (CEUR Workshop Proceedings; Vol. 2103). https://ceur-ws.org/Vol-2103/paper_4.pdf
Hube C, Jäschke R, Fetahu B. Towards Bias Detection in Online Text Corpora. In Bias in Information, Algorithms, and Systems: Proceedings of the International Workshop on Bias in Information, Algorithms, and Systems co-located with 13th International Conference on Transforming Digital Worlds (iConference 2018). 2018. p. 19-23. (CEUR Workshop Proceedings).
Hube, Christoph ; Jäschke, Robert ; Fetahu, Besnik. / Towards Bias Detection in Online Text Corpora. Bias in Information, Algorithms, and Systems: Proceedings of the International Workshop on Bias in Information, Algorithms, and Systems co-located with 13th International Conference on Transforming Digital Worlds (iConference 2018). 2018. pp. 19-23 (CEUR Workshop Proceedings).
@inproceedings{6277a8f8d7e74ef0a4b96e837428aae0,
title = "Towards Bias Detection in Online Text Corpora",
abstract = "Natural language textual corpora, depending on their genre, often contain bias which reflects the point of view of the original content creator towards a subject. Even sources like Wikipedia, a collaboratively created encyclopedia which follows a Neutral Point of View (NPOV) policy, contain pages that are prone to such violations, due to either (i) Wikipedia contributors not being aware of NPOV policies or (ii) an intentional push towards specific points of view. We present an approach for identifying bias words in online textual corpora using semantic relations of word vectors created through word2Vec. The bias word lists created by our approach help in identifying biased language in online texts.",
author = "Christoph Hube and Robert J{\"a}schke and Besnik Fetahu",
note = "Funding information: Acknowledgments This work is funded by the ERC Advanced Grant ALEXANDRIA (grant no. 339233), DESIR (grant no. 31081), and H2020 AFEL project (grant no. 687916).; 2018 International Workshop on Bias in Information, Algorithms, and Systems, BIAS 2018 ; Conference date: 25-03-2018 Through 25-03-2018",
year = "2018",
language = "English",
series = "CEUR Workshop Proceedings",
publisher = "CEUR Workshop Proceedings",
pages = "19--23",
booktitle = "Bias in Information, Algorithms, and Systems",

}

TY - GEN

T1 - Towards Bias Detection in Online Text Corpora

AU - Hube, Christoph

AU - Jäschke, Robert

AU - Fetahu, Besnik

N1 - Funding information: Acknowledgments This work is funded by the ERC Advanced Grant ALEXANDRIA (grant no. 339233), DESIR (grant no. 31081), and H2020 AFEL project (grant no. 687916).

PY - 2018

Y1 - 2018

N2 - Natural language textual corpora, depending on their genre, often contain bias which reflects the point of view of the original content creator towards a subject. Even sources like Wikipedia, a collaboratively created encyclopedia which follows a Neutral Point of View (NPOV) policy, contain pages that are prone to such violations, due to either (i) Wikipedia contributors not being aware of NPOV policies or (ii) an intentional push towards specific points of view. We present an approach for identifying bias words in online textual corpora using semantic relations of word vectors created through word2Vec. The bias word lists created by our approach help in identifying biased language in online texts.

AB - Natural language textual corpora, depending on their genre, often contain bias which reflects the point of view of the original content creator towards a subject. Even sources like Wikipedia, a collaboratively created encyclopedia which follows a Neutral Point of View (NPOV) policy, contain pages that are prone to such violations, due to either (i) Wikipedia contributors not being aware of NPOV policies or (ii) an intentional push towards specific points of view. We present an approach for identifying bias words in online textual corpora using semantic relations of word vectors created through word2Vec. The bias word lists created by our approach help in identifying biased language in online texts.

UR - http://www.scopus.com/inward/record.url?scp=85048312610&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85048312610

T3 - CEUR Workshop Proceedings

SP - 19

EP - 23

BT - Bias in Information, Algorithms, and Systems

T2 - 2018 International Workshop on Bias in Information, Algorithms, and Systems, BIAS 2018

Y2 - 25 March 2018 through 25 March 2018

ER -