DistDD: Distributed Data Distillation Aggregation through Gradient Matching

Wang, Peiran; Wang, Haohan

Abstract:In this paper, we introduce DistDD, a novel approach within the federated learning framework that reduces the need for repetitive communication by distilling data directly on clients' devices. Unlike traditional federated learning that requires iterative model updates across nodes, DistDD facilitates a one-time distillation process that extracts a global distilled dataset, maintaining the privacy standards of federated learning while significantly cutting down communication costs. By leveraging the DistDD's distilled dataset, the developers of the FL can achieve just-in-time parameter tuning and neural architecture search over FL without repeating the whole FL process multiple times. We provide a detailed convergence proof of the DistDD algorithm, reinforcing its mathematical stability and reliability for practical applications. Our experiments demonstrate the effectiveness and robustness of DistDD, particularly in non-i.i.d. and mislabeled data scenarios, showcasing its potential to handle complex real-world data challenges distinctively from conventional federated learning methods. We also evaluate DistDD's application in the use case and prove its effectiveness and communication-savings in the NAS use case.

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2410.08665 [cs.LG]
	(or arXiv:2410.08665v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2410.08665

Computer Science > Machine Learning

Title:DistDD: Distributed Data Distillation Aggregation through Gradient Matching

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators