Learning Sparse Structured Ensembles with SG-MCMC and Network Pruning

Zhang, Yichi; Ou, Zhijian

arXiv:1803.00184 (stat)

[Submitted on 1 Mar 2018 (v1), last revised 23 May 2018 (this version, v3)]

Abstract: An ensemble of neural networks is known to be more robust and accurate than an individual network, but usually at a linearly increasing cost in both training and testing. In this work, we propose a two-stage method to learn Sparse Structured Ensembles (SSEs) for neural networks. In the first stage, we run SG-MCMC with group sparse priors to draw an ensemble of samples from the posterior distribution of network parameters. In the second stage, we apply weight pruning to each sampled network and then retrain over the remaining connections. By learning SSEs with SG-MCMC and pruning in this way, we not only achieve high prediction accuracy, since SG-MCMC enhances exploration of the model-parameter space, but also significantly reduce memory and computation cost in both training and testing of NN ensembles. This is thoroughly evaluated in experiments on learning SSEs of both FNNs and LSTMs. For example, in LSTM-based language modeling (LM), we obtain a 21% relative reduction in LM perplexity by learning an SSE of 4 large LSTM models, which in total has only 30% of the model parameters and 70% of the computations of the baseline large LSTM LM. To the best of our knowledge, this work represents the first methodology and empirical study of integrating SG-MCMC, group sparse priors, and network pruning together for learning NN ensembles.
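
The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes stochastic gradient Langevin dynamics (SGLD) as the SG-MCMC sampler, a group-lasso form for the group sparse prior, simple magnitude-based pruning, and a toy linear regression model standing in for the FNNs and LSTMs of the paper. All function names, hyperparameters, and data below are hypothetical and chosen only for illustration.

```python
import math
import random

def group_lasso_grad(w, groups, lam, tiny=1e-8):
    """Gradient of the assumed group sparse log-prior -lam * sum_g ||w_g||_2."""
    grad = [0.0] * len(w)
    for idx in groups:
        norm = math.sqrt(sum(w[i] ** 2 for i in idx)) + tiny
        for i in idx:
            grad[i] = -lam * w[i] / norm
    return grad

def loglik_grad(w, batch):
    """Gradient of a unit-variance Gaussian log-likelihood, summed over the batch."""
    grad = [0.0] * len(w)
    for x, y in batch:
        err = y - sum(wi * xi for wi, xi in zip(w, x))
        for i, xi in enumerate(x):
            grad[i] += err * xi
    return grad

def sgld_samples(data, groups, dim, rng, steps=2000, batch_size=10,
                 step=1e-3, lam=1.0, keep_every=500):
    """Stage 1 (assumed SGLD): keep a thinned set of posterior samples."""
    w = [0.0] * dim
    scale = len(data) / batch_size  # rescale minibatch gradient to the full data set
    samples = []
    for t in range(1, steps + 1):
        batch = rng.sample(data, batch_size)
        gl = loglik_grad(w, batch)
        gp = group_lasso_grad(w, groups, lam)
        # Langevin step: half-step along the stochastic gradient plus Gaussian noise.
        w = [wi + 0.5 * step * (p + scale * l) + rng.gauss(0.0, math.sqrt(step))
             for wi, p, l in zip(w, gp, gl)]
        if t % keep_every == 0:
            samples.append(list(w))
    return samples

def magnitude_mask(w, prune_frac):
    """Return a 0/1 mask that drops the prune_frac smallest-magnitude weights."""
    k = int(round(prune_frac * len(w)))
    mask = [1.0] * len(w)
    for i in sorted(range(len(w)), key=lambda i: abs(w[i]))[:k]:
        mask[i] = 0.0
    return mask

def retrain(w, mask, data, epochs=50, lr=0.1):
    """Stage 2: gradient ascent on the log-likelihood over unpruned weights only."""
    w = [wi * mi for wi, mi in zip(w, mask)]
    for _ in range(epochs):
        g = loglik_grad(w, data)
        w = [wi + lr * gi / len(data) * mi for wi, gi, mi in zip(w, g, mask)]
    return w

def ensemble_predict(weights, x):
    """Average the predictions of all ensemble members."""
    preds = [sum(wi * xi for wi, xi in zip(w, x)) for w in weights]
    return sum(preds) / len(preds)

# Hypothetical toy data: only the first group of inputs is relevant.
rng = random.Random(0)
true_w = [2.0, -1.0, 0.0, 0.0, 0.0, 0.0]
groups = [[0, 1], [2, 3], [4, 5]]
data = []
for _ in range(200):
    x = [rng.gauss(0.0, 1.0) for _ in range(6)]
    y = sum(wi * xi for wi, xi in zip(true_w, x)) + rng.gauss(0.0, 1.0)
    data.append((x, y))

samples = sgld_samples(data, groups, dim=6, rng=rng)          # stage 1
sparse = [retrain(w, magnitude_mask(w, 4 / 6), data) for w in samples]  # stage 2
print(ensemble_predict(sparse, [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]))
```

In this sketch the prior acts on whole groups of weights while the pruning step acts on individual weights, so any structured sparsity comes from the prior pulling entire groups toward zero before pruning. The ensemble members share the same sampling run, which is what keeps the training cost below that of training each member independently.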

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:1803.00184 [stat.ML]
	(or arXiv:1803.00184v3 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1803.00184

Submission history

From: Yichi Zhang
[v1] Thu, 1 Mar 2018 03:03:53 UTC (818 KB)
[v2] Fri, 2 Mar 2018 09:43:05 UTC (818 KB)
[v3] Wed, 23 May 2018 08:28:20 UTC (818 KB)
