Towards Adaptable and Interactive Image Captioning with Data Augmentation and Episodic Memory

Anagnostopoulou, Aliki; Hartmann, Mareike; Sonntag, Daniel

arXiv:2306.03500 (cs)

[Submitted on 6 Jun 2023]

Abstract: Interactive machine learning (IML) is a beneficial learning paradigm in cases of limited data availability, as human feedback is incrementally integrated into the training process. In this paper, we present an IML pipeline for image captioning which allows us to incrementally adapt a pre-trained image captioning model to a new data distribution based on user input. In order to incorporate user input into the model, we explore the use of a combination of simple data augmentation methods to obtain larger data batches for each newly annotated data instance and implement continual learning methods to prevent catastrophic forgetting from repeated updates. For our experiments, we split a domain-specific image captioning dataset, namely VizWiz, into non-overlapping parts to simulate an incremental input flow for continually adapting the model to new data. We find that, while data augmentation worsens results, even when relatively small amounts of data are available, episodic memory is an effective strategy to retain knowledge from previously seen clusters.
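
The abstract describes adapting a model over non-overlapping data parts while replaying stored examples to limit catastrophic forgetting. The following is a minimal sketch of that general idea, not the authors' implementation. The names `EpisodicMemory`, `adapt_incrementally`, `update_fn`, `per_part`, and `replay_size` are hypothetical, examples are assumed to be simple `(image_id, caption)` pairs, and the model update itself is left to a caller-supplied function.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical example representation: (image_id, caption).
Example = Tuple[str, str]


class EpisodicMemory:
    """Keeps a bounded random sample of examples from each data part seen so far."""

    def __init__(self, per_part: int, seed: int = 0) -> None:
        self.per_part = per_part          # max examples stored per data part
        self.buffer: List[Example] = []   # all stored examples
        self._rng = random.Random(seed)   # local RNG for reproducibility

    def store(self, part: Sequence[Example]) -> None:
        """Add up to `per_part` randomly chosen examples from `part`."""
        k = min(self.per_part, len(part))
        self.buffer.extend(self._rng.sample(list(part), k))

    def replay(self, n: int) -> List[Example]:
        """Return up to `n` stored examples, sampled without replacement."""
        k = min(n, len(self.buffer))
        return self._rng.sample(self.buffer, k)


def adapt_incrementally(
    parts: Sequence[Sequence[Example]],
    memory: EpisodicMemory,
    replay_size: int,
    update_fn: Callable[[List[Example]], None],
) -> None:
    """Update on each part in order, mixing in replayed examples from earlier parts."""
    for part in parts:
        batch = list(part) + memory.replay(replay_size)
        update_fn(batch)      # assumed to perform one model update on the batch
        memory.store(part)    # remember a sample of this part for later replay
```

In this sketch, `update_fn` stands in for whatever training step the pipeline uses; the first part is trained without replay because the memory is still empty, and every later part is trained together with a sample of earlier data.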

Subjects:	Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2306.03500 [cs.CL]
	(or arXiv:2306.03500v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2306.03500

Submission history

From: Aliki Anagnostopoulou
[v1] Tue, 6 Jun 2023 08:38:10 UTC (2,369 KB)
