On Explaining Knowledge Distillation: Measuring and Visualising the Knowledge Transfer Process

Adhane, Gereziher; Dehshibi, Mohammad Mahdi; Vetter, Dennis; Masip, David; Roig, Gemma

arXiv:2412.13943 (cs)

[Submitted on 18 Dec 2024]

Abstract: Knowledge distillation (KD) remains challenging due to the opaque nature of the knowledge transfer process from a Teacher to a Student, making it difficult to address certain issues related to KD. To address this, we propose UniCAM, a novel gradient-based visual explanation method, which effectively interprets the knowledge learned during KD. Our experimental results demonstrate that with the guidance of the Teacher's knowledge, the Student model becomes more efficient, learning more relevant features while discarding those that are not relevant. We refer to the features learned with the Teacher's guidance as distilled features and the features irrelevant to the task and ignored by the Student as residual features. Distilled features focus on key aspects of the input, such as textures and parts of objects. In contrast, residual features demonstrate more diffuse attention, often targeting irrelevant areas, including the backgrounds of the target objects. In addition, we propose two novel metrics: the feature similarity score (FSS) and the relevance score (RS), which quantify the relevance of the distilled knowledge. Experiments on the CIFAR10, ASIRRA, and Plant Disease datasets demonstrate that UniCAM and the two metrics offer valuable insights to explain the KD process.

Comments:	Accepted to 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV'25). Includes 5 pages of supplementary material
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2412.13943 [cs.CV]
	(or arXiv:2412.13943v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2412.13943

Submission history

From: Mohammad Mahdi Dehshibi Dr.
[v1] Wed, 18 Dec 2024 15:25:36 UTC (6,379 KB)
