TSynD: Targeted Synthetic Data Generation for Enhanced Medical Image Classification

Niemeijer, Joshua; Ehrhardt, Jan; Uzunova, Hristina; Handels, Heinz

Computer Science > Computer Vision and Pattern Recognition

arXiv:2406.17473 (cs)

[Submitted on 25 Jun 2024]

Abstract: Using medical image data to train large-scale machine learning approaches is particularly challenging because such data is scarce and annotations are costly to produce, typically requiring the engagement of medical professionals. The rapid development of generative models makes it possible to tackle this problem by leveraging large amounts of realistic, synthetically generated data in the training process. However, choosing synthetic samples at random might not be an optimal strategy.
In this work, we investigate the targeted generation of synthetic training data in order to improve the accuracy and robustness of image classification. To this end, our approach guides the generative model to synthesize data with high epistemic uncertainty, since high epistemic uncertainty indicates data points that are underrepresented in the training set. During image generation, we feed images reconstructed by an autoencoder into the classifier and compute the mutual information over the class-probability distribution as a measure of uncertainty. We alter the feature space of the autoencoder through an optimization process with the objective of maximizing the classifier uncertainty on the decoded image. By training on such data, we improve the performance and robustness against test-time data augmentations and adversarial attacks on several classification tasks.
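
The uncertainty measure and the latent-space optimization described in the abstract can be illustrated with a small sketch. The abstract does not state how the set of class-probability predictions is obtained or which optimizer is used, so everything below is an assumption made for illustration: mutual information is estimated from several sampled predictions (here a two-member toy ensemble) as the entropy of the mean prediction minus the mean entropy of the individual predictions, the decoder is an identity placeholder, and the latent code is updated by finite-difference gradient ascent rather than automatic differentiation. The names `decode`, `ENSEMBLE_WEIGHTS`, `uncertainty`, and `ascend_latent` are hypothetical and are not taken from the paper.

```python
import numpy as np


def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy in nats along `axis`; clipping avoids log(0)."""
    p = np.clip(p, eps, 1.0)
    return -np.sum(p * np.log(p), axis=axis)


def mutual_information(probs):
    """Mutual information estimate for one input.

    probs: array of shape (T, C) with T sampled class-probability vectors
    (assumed here to come from ensemble members) over C classes.
    Returns H(mean_t p_t) - mean_t H(p_t).
    """
    probs = np.asarray(probs, dtype=float)
    predictive_entropy = entropy(probs.mean(axis=0))
    expected_entropy = entropy(probs).mean()
    return float(predictive_entropy - expected_entropy)


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


# Hypothetical stand-in for a classifier ensemble: two linear members whose
# predictions disagree more as the first latent coordinate grows.
ENSEMBLE_WEIGHTS = np.array([
    [[1.0, 0.0], [-1.0, 0.0]],   # member A: logits = (z0, -z0)
    [[-1.0, 0.0], [1.0, 0.0]],   # member B: logits = (-z0, z0)
])


def decode(z):
    """Placeholder for the autoencoder's decoder (identity in this toy)."""
    return z


def uncertainty(z):
    """Mutual information of the toy ensemble on the decoded latent `z`."""
    x = decode(z)
    logits = ENSEMBLE_WEIGHTS @ x            # shape (T, C)
    return mutual_information(softmax(logits))


def ascend_latent(z, steps=50, lr=0.5, h=1e-4):
    """Gradient ascent on `uncertainty` using central finite differences."""
    z = np.array(z, dtype=float)
    for _ in range(steps):
        grad = np.zeros_like(z)
        for i in range(z.size):
            step = np.zeros_like(z)
            step[i] = h
            grad[i] = (uncertainty(z + step) - uncertainty(z - step)) / (2 * h)
        z += lr * grad                       # move toward higher uncertainty
    return z


z_start = np.array([0.1, 0.0])
z_opt = ascend_latent(z_start)
print(uncertainty(z_start), uncertainty(z_opt))
```

In this toy setting the mutual information is zero when all members agree and approaches log 2 when the two members confidently predict different classes, so the ascent pushes the latent code toward inputs on which the members disagree. With a real autoencoder and classifier, the finite-difference loop would normally be replaced by backpropagation through the decoder and classifier.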

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2406.17473 [cs.CV]
	(or arXiv:2406.17473v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2406.17473

Submission history

From: Joshua Niemeijer
[v1] Tue, 25 Jun 2024 11:38:46 UTC (7,792 KB)
