StereoKG: Data-Driven Knowledge Graph Construction for Cultural Knowledge and Stereotypes

Deshpande, Awantee; Ruiter, Dana; Mosbach, Marius; Klakow, Dietrich

arXiv:2205.14036 (cs)

[Submitted on 27 May 2022]

Abstract: Analyzing ethnic or religious bias is important for improving fairness, accountability, and transparency of natural language processing models. However, many techniques rely on human-compiled lists of bias terms, which are expensive to create and are limited in coverage. In this study, we present a fully data-driven pipeline for generating a knowledge graph (KG) of cultural knowledge and stereotypes. Our resulting KG covers 5 religious groups and 5 nationalities and can easily be extended to include more entities. Our human evaluation shows that the majority (59.2%) of non-singleton entries are coherent and complete stereotypes. We further show that performing intermediate masked language model training on the verbalized KG leads to a higher level of cultural awareness in the model and has the potential to increase classification performance on knowledge-crucial samples on a related task, i.e., hate speech detection.

Comments:	12 pages, 2 figures, accepted as a long paper at WOAH at NAACL 2022
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2205.14036 [cs.CL]
	(or arXiv:2205.14036v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2205.14036

Submission history

From: Awantee Deshpande
[v1] Fri, 27 May 2022 15:09:56 UTC (95 KB)
