Ensemble Deep Learning on Large, Mixed-Site fMRI Datasets in Autism and Other Tasks

Leming, Matthew; Górriz, Juan Manuel; Suckling, John

doi:10.1142/S0129065720500124


arXiv:2002.07874v1 (q-bio)

[Submitted on 14 Feb 2020 (this version), latest version 27 May 2020 (v2)]

Abstract: Deep learning models for MRI classification face two recurring problems: they are typically limited by small sample sizes, and their complexity makes them hard to interpret (the "black box problem"). In this paper, we train a convolutional neural network (CNN) on the largest multi-source functional MRI (fMRI) connectomic dataset ever compiled, consisting of 43,858 datapoints. We apply this model to a cross-sectional comparison of autism (ASD) vs typically developing (TD) controls, a comparison that has proved difficult to characterise with inferential statistics. To contextualise these findings, we additionally perform classifications of gender and of task vs rest. Employing class-balancing to build a training set, we trained 3$\times$300 modified CNNs in an ensemble model to classify fMRI connectivity matrices, with overall AUROCs of 0.6774, 0.7680, and 0.9222 for ASD vs TD, gender, and task vs rest, respectively. We also address the black box problem in this context using two visualization methods. First, class activation maps show which functional connections of the brain our models focus on when performing classification. Second, by analyzing maximal activations of the hidden layers, we explore how the model organizes a large, mixed-centre dataset: it dedicates specific areas of its hidden layers to processing different covariates of the data (depending on the independent variable analyzed), and other areas to mixing data from different sources. Our study finds that deep learning models that distinguish ASD from TD controls focus broadly on temporal and cerebellar connections, with a particularly high focus on the right caudate nucleus and paracentral sulcus.
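The abstract mentions three ingredients: class balancing, an ensemble of models, and AUROC as the score. The sketch below shows one simple way these could fit together. It is not the authors' procedure. The abstract does not say how the training set was balanced or how the ensemble members were combined, so undersampling the larger class and averaging per-model scores are both assumptions, and the names `balanced_sample`, `ensemble_scores`, and `auroc` are hypothetical helpers.

```python
import random
from statistics import mean


def balanced_sample(labels, rng):
    """Return indices with equal counts of class 1 and class 0.

    Assumption: balance by undersampling the larger class. The paper's
    own balancing scheme is not described in the abstract.
    """
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(pos), len(neg))
    return rng.sample(pos, n) + rng.sample(neg, n)


def ensemble_scores(per_model_scores):
    """Combine models by averaging their scores for each datapoint.

    Assumption: plain averaging; the abstract does not state the rule.
    per_model_scores is a list of score lists, one list per model.
    """
    return [mean(column) for column in zip(*per_model_scores)]


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive scores
    higher than a randomly chosen negative (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    rng = random.Random(0)
    labels = [1, 0, 0, 0, 1, 0]          # toy labels, not real data
    train_idx = balanced_sample(labels, rng)
    # Toy scores from three hypothetical models on four test datapoints.
    test_labels = [1, 1, 0, 0]
    model_outputs = [
        [0.9, 0.3, 0.6, 0.1],
        [0.8, 0.5, 0.4, 0.2],
        [1.0, 0.4, 0.5, 0.0],
    ]
    combined = ensemble_scores(model_outputs)
    print(len(train_idx), auroc(test_labels, combined))
```

The pairwise definition of AUROC used here is quadratic in the number of datapoints. That is acceptable for a toy example, but a rank-based implementation would be preferable at the scale of tens of thousands of datapoints.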

Subjects:	Quantitative Methods (q-bio.QM); Machine Learning (cs.LG); Image and Video Processing (eess.IV); Neurons and Cognition (q-bio.NC); Machine Learning (stat.ML)
Cite as:	arXiv:2002.07874 [q-bio.QM]
	(or arXiv:2002.07874v1 [q-bio.QM] for this version)
	https://doi.org/10.48550/arXiv.2002.07874
Related DOI:	https://doi.org/10.1142/S0129065720500124

Submission history

From: Matthew Leming
[v1] Fri, 14 Feb 2020 17:28:16 UTC (3,820 KB)
[v2] Wed, 27 May 2020 16:31:37 UTC (3,820 KB)
