Extracting Cultural Commonsense Knowledge at Scale

Nguyen, Tuan-Phong; Razniewski, Simon; Varde, Aparna; Weikum, Gerhard

doi:10.1145/3543507.3583535

arXiv:2210.07763 (cs)

[Submitted on 14 Oct 2022 (v1), last revised 10 May 2023 (this version, v3)]

Title: Extracting Cultural Commonsense Knowledge at Scale

Authors: Tuan-Phong Nguyen, Simon Razniewski, Aparna Varde, Gerhard Weikum

Abstract: Structured knowledge is important for many AI applications. Commonsense knowledge, which is crucial for robust human-centric AI, is covered by a small number of structured knowledge projects. However, they lack knowledge about human traits and behaviors conditioned on socio-cultural contexts, which is crucial for situative AI. This paper presents CANDLE, an end-to-end methodology for extracting high-quality cultural commonsense knowledge (CCSK) at scale. CANDLE extracts CCSK assertions from a huge web corpus and organizes them into coherent clusters, for three domains of subjects (geography, religion, occupation) and several cultural facets (food, drinks, clothing, traditions, rituals, behaviors). CANDLE includes judicious techniques for classification-based filtering and scoring of interestingness. Experimental evaluations show the superiority of the CANDLE CCSK collection over prior works, and an extrinsic use case demonstrates the benefits of CCSK for the GPT-3 language model. Code and data can be accessed at this https URL.

Comments:	11 pages, 6 figures, 10 tables
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2210.07763 [cs.CL]
	(or arXiv:2210.07763v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2210.07763
Journal reference:	ACM Web Conference 2023
Related DOI:	https://doi.org/10.1145/3543507.3583535

Submission history

From: Tuan-Phong Nguyen
[v1] Fri, 14 Oct 2022 12:53:57 UTC (431 KB)
[v2] Mon, 13 Feb 2023 08:00:22 UTC (556 KB)
[v3] Wed, 10 May 2023 12:35:06 UTC (415 KB)
