Dataset Size Recovery from LoRA Weights

Salama, Mohammad; Kahana, Jonathan; Horwitz, Eliahu; Hoshen, Yedid

Computer Science > Computer Vision and Pattern Recognition

arXiv:2406.19395 (cs)

[Submitted on 27 Jun 2024]

Abstract: Model inversion and membership inference attacks aim to reconstruct and verify the data on which a model was trained. However, they are not guaranteed to find all training samples, because they do not know the size of the training set. In this paper, we introduce a new task, dataset size recovery, which aims to determine the number of samples used to train a model directly from its weights. We then propose DSiRe, a method for recovering the number of images used to fine-tune a model in the common case where fine-tuning uses LoRA. We discover that both the norm and the spectrum of the LoRA matrices are closely linked to the fine-tuning dataset size; we leverage this finding to propose a simple yet effective prediction algorithm. To evaluate dataset size recovery from LoRA weights, we develop and release a new benchmark, LoRA-WiSE, consisting of over 25,000 weight snapshots from more than 2,000 diverse LoRA fine-tuned models. Our best classifier can predict the number of fine-tuning images with a mean absolute error of 0.36 images, establishing the feasibility of this attack.
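The abstract names the norm and the spectrum of the LoRA matrices as the signals tied to dataset size, but it does not spell out the features or the classifier. The following is a minimal sketch of how such a pipeline could look, not the DSiRe implementation: it assumes the features are the singular values and Frobenius norm of the low-rank update B @ A, and it uses a 1-nearest-neighbor lookup as a stand-in predictor. The function names, the `toy_lora` generator, and the rule that makes weight scale depend on dataset size are all hypothetical and exist only to make the example runnable on synthetic data.

```python
import numpy as np


def lora_features(A, B):
    """Return the top-r singular values and Frobenius norm of delta = B @ A."""
    delta = B @ A  # low-rank weight update, shape (d_out, d_in), rank <= r
    r = A.shape[0]
    spectrum = np.linalg.svd(delta, compute_uv=False)[:r]  # descending order
    fro_norm = np.linalg.norm(delta)  # Frobenius norm for a 2-D array
    return np.concatenate([spectrum, [fro_norm]])


def predict_size(query, train_feats, train_sizes):
    """Hypothetical 1-nearest-neighbor predictor over feature vectors."""
    dists = np.linalg.norm(train_feats - query, axis=1)
    return int(train_sizes[np.argmin(dists)])


def toy_lora(size, rng, d_out=16, d_in=12, r=4):
    """Synthetic LoRA pair whose scale depends on `size` (toy assumption only)."""
    scale = 1.0 + 0.5 * size
    A = rng.normal(size=(r, d_in))
    B = rng.normal(size=(d_out, r)) * scale
    return A, B


# Build a small labeled set of synthetic "models", then query a new one.
rng = np.random.default_rng(0)
sizes = np.array([1, 2, 4, 8] * 10)
feats = np.stack([lora_features(*toy_lora(s, rng)) for s in sizes])
query = lora_features(*toy_lora(4, rng))
print("predicted size:", predict_size(query, feats, sizes))
```

In a real setting, the features would be computed per LoRA layer from actual fine-tuned checkpoints, and the labeled set would come from models whose fine-tuning dataset sizes are known.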

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2406.19395 [cs.CV]
	(or arXiv:2406.19395v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2406.19395

Submission history

From: Mohammad Salama
[v1] Thu, 27 Jun 2024 17:59:53 UTC (727 KB)
