Learning Visually Grounded Domain Ontologies via Embodied Conversation and Explanation

Park, Jonghyuk; Lascarides, Alex; Ramamoorthy, Subramanian


arXiv:2412.09770 (cs)

[Submitted on 13 Dec 2024]



Abstract: In this paper, we offer a learning framework in which an agent's knowledge gaps are overcome through corrective feedback from a teacher whenever the agent explains its (incorrect) predictions. We test it in a low-resource visual processing scenario in which the agent must learn to recognize distinct types of toy truck. The agent starts the learning process with no ontology of what types of trucks exist or which parts they have, and with a deficient model for recognizing those parts from visual input. The teacher's feedback on the agent's explanations addresses missing ontological knowledge via a generic rule (e.g., "dump trucks have dumpers"), whereas inaccurate part recognition is corrected by a deictic statement (e.g., "this is not a dumper"). The learner uses this feedback not only to improve its estimate of the hypothesis space of possible domain ontologies and the probability distributions over them, but also to update its visual interpretation of the scene using those estimates. Our experiments demonstrate that teacher-learner pairs that use explanations and corrections are more data-efficient than those without such a faculty.
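The two feedback types described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class `ToyLearner` and all of its method names are hypothetical, and the sketch replaces the paper's probability distributions over candidate ontologies with a single deterministic ontology, purely to show how a generic rule and a deictic correction might update different parts of a learner's state.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLearner:
    """Hypothetical learner; a deterministic stand-in for the paper's probabilistic one."""

    # Ontology: truck type -> set of part names it is believed to have (starts empty).
    ontology: dict = field(default_factory=dict)
    # Labelled exemplars for a part recogniser: (object_id, part_name, is_positive).
    part_exemplars: list = field(default_factory=list)

    def predict(self, detected_parts):
        """Return (truck_type, explanation): the first known type whose parts were all detected."""
        for truck_type, parts in self.ontology.items():
            if parts and parts <= detected_parts:
                return truck_type, sorted(parts)
        return None, []

    def receive_generic_rule(self, truck_type, part):
        """Generic rule such as "dump trucks have dumpers": extend the ontology."""
        self.ontology.setdefault(truck_type, set()).add(part)

    def receive_deictic_correction(self, object_id, part, is_positive):
        """Deictic statement such as "this is not a dumper": store a labelled exemplar."""
        self.part_exemplars.append((object_id, part, is_positive))


learner = ToyLearner()
before = learner.predict({"dumper", "cabin"})   # no ontology yet, so no prediction
learner.receive_generic_rule("dump truck", "dumper")
after = learner.predict({"dumper", "cabin"})    # the new rule now supports a prediction
learner.receive_deictic_correction("obj_3", "dumper", False)
print(before, after, learner.part_exemplars)
```

In this sketch the generic rule changes only the symbolic ontology, while the deictic correction only adds training data for part recognition; how the two interact in the actual framework is described in the paper itself.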

Comments:	Accepted to, and to appear in, the Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI-25)
Subjects:	Artificial Intelligence (cs.AI)
Cite as:	arXiv:2412.09770 [cs.AI]
	(or arXiv:2412.09770v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2412.09770

Submission history

From: Jonghyuk Park
[v1] Fri, 13 Dec 2024 00:28:21 UTC (921 KB)
