Interpreting recurrent neural networks behaviour via excitable network attractors

Ceni, Andrea; Ashwin, Peter; Livi, Lorenzo

doi:10.1007/s12559-019-09634-2

Computer Science > Machine Learning

arXiv:1807.10478 (cs)

[Submitted on 27 Jul 2018 (v1), last revised 10 Mar 2019 (this version, v6)]

Title: Interpreting recurrent neural networks behaviour via excitable network attractors

Authors: Andrea Ceni, Peter Ashwin, Lorenzo Livi

Abstract: Introduction: Machine learning provides fundamental tools both for scientific research and for the development of technologies with significant impact on society. It provides methods that facilitate the discovery of regularities in data and that give predictions without explicit knowledge of the rules governing a system. However, a price is paid for exploiting such flexibility: machine learning methods are typically black boxes, where it is difficult to fully understand what the machine is doing or how it is operating. This poses constraints on the applicability and explainability of such methods. Methods: Our research aims to open the black box of recurrent neural networks, an important family of neural networks used for processing sequential data. We propose a novel methodology that provides a mechanistic interpretation of their behaviour when solving a computational task. Our methodology uses mathematical constructs called excitable network attractors, which are invariant sets in phase space composed of stable attractors and excitable connections between them. Results and Discussion: As the behaviour of recurrent neural networks depends both on training and on inputs to the system, we introduce an algorithm to extract network attractors directly from the trajectory of a neural network while solving tasks. Simulations conducted on a controlled benchmark task confirm the relevance of these attractors for interpreting the behaviour of recurrent neural networks, at least for tasks that involve learning a finite number of stable states and transitions between them.
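
The idea of stable attractors joined by excitable connections can be illustrated with a toy example. The sketch below is not the extraction algorithm from the paper. It assumes a hypothetical one-unit network x_{t+1} = tanh(W x_t + u_t) with a hand-picked weight, and the names step, relax, find_attractors and excitable_connections are illustrative only. It relaxes visited states under zero input to locate stable fixed points, then checks which one-step input pulses carry the state from one fixed point into the basin of another.

```python
import numpy as np

def step(x, u, W):
    """One update of the toy network: x_{t+1} = tanh(W x_t + u_t)."""
    return np.tanh(W @ x + u)

def relax(x, W, n_steps=500):
    """Iterate the input-free map so the state settles onto an attractor."""
    zero = np.zeros_like(x)
    for _ in range(n_steps):
        x = step(x, zero, W)
    return x

def find_attractors(states, W, tol=1e-3):
    """Relax each visited state and keep the distinct end points."""
    attractors = []
    for x in states:
        end = relax(x, W)
        if not any(np.linalg.norm(end - a) < tol for a in attractors):
            attractors.append(end)
    return attractors

def excitable_connections(attractors, W, pulses, tol=1e-3):
    """List edges (i, j, k): pulse k applied for one step at attractor i
    leads, after relaxation, to a different attractor j."""
    edges = []
    for i, a in enumerate(attractors):
        for k, u in enumerate(pulses):
            end = relax(step(a, u, W), W)
            for j, b in enumerate(attractors):
                if j != i and np.linalg.norm(end - b) < tol:
                    edges.append((i, j, k))
    return edges

# Assumed toy setup: a single unit with self-weight 2 is bistable.
W = np.array([[2.0]])
states = [np.array([0.3]), np.array([-0.3]), np.array([0.8]), np.array([-0.1])]
attractors = find_attractors(states, W)

# Two strong pulses of opposite sign and one weak pulse.
pulses = [np.array([3.0]), np.array([-3.0]), np.array([0.5])]
edges = excitable_connections(attractors, W, pulses)

print("attractors:", [float(a[0]) for a in attractors])
print("edges (from, to, pulse index):", edges)
```

In this toy setting the weak pulse produces no edge, because the state falls back to the fixed point it started from, while a strong pulse of the right sign produces a transition. This is the qualitative picture of a finite number of stable states with input-driven transitions between them.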

Comments:	revised version
Subjects:	Machine Learning (cs.LG); Dynamical Systems (math.DS); Machine Learning (stat.ML)
Cite as:	arXiv:1807.10478 [cs.LG]
	(or arXiv:1807.10478v6 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1807.10478
Related DOI:	https://doi.org/10.1007/s12559-019-09634-2

Submission history

From: Lorenzo Livi
[v1] Fri, 27 Jul 2018 08:02:45 UTC (6,302 KB)
[v2] Tue, 31 Jul 2018 10:02:41 UTC (6,302 KB)
[v3] Sat, 10 Nov 2018 16:24:50 UTC (5,833 KB)
[v4] Thu, 29 Nov 2018 11:38:47 UTC (5,798 KB)
[v5] Tue, 19 Feb 2019 16:21:30 UTC (5,909 KB)
[v6] Sun, 10 Mar 2019 09:35:27 UTC (5,820 KB)
