An Effective Training Method For Deep Convolutional Neural Network

Jiang, Yang; Dou, Zeyang; Hao, Qun; Cao, Jie; Gao, Kun; Chen, Xi

arXiv:1708.01666 (cs)

[Submitted on 31 Jul 2017 (v1), last revised 17 Oct 2017 (this version, v5)]

Abstract: In this paper, we propose the nonlinearity generation method to speed up and stabilize the training of deep convolutional neural networks. The proposed method modifies a family of activation functions to act as nonlinearity generators (NGs). NGs make the activation functions linear symmetric with respect to their inputs, which lowers model capacity, and automatically introduce nonlinearity during training to enhance the capacity of the model. The proposed method can be considered an unusual form of regularization: the model parameters are obtained by training a relatively low-capacity model, which is relatively easy to optimize at the beginning, for only a few iterations, and these parameters are then reused to initialize a higher-capacity model. We derive upper and lower bounds on the variance of the weight variation and show that the initial symmetric structure of NGs helps stabilize training. We evaluate the proposed method on different frameworks of convolutional neural networks over two object recognition benchmark tasks (CIFAR-10 and CIFAR-100). Experimental results show that the proposed method (1) speeds up the convergence of training, (2) allows for less careful weight initialization, (3) improves or at least maintains the performance of the model at negligible extra computational cost, and (4) makes it easy to train a very deep model.
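
The abstract does not give the functional form of an NG, so the following is only an illustrative sketch of the general idea, not the authors' definition. It assumes a hypothetical shifted ReLU, `max(x, t)`, whose threshold `t` (an assumed name) starts far below the inputs, so the activation is initially the identity (linear), and becomes an ordinary ReLU once `t` has moved up to 0. In the paper's setting such a parameter would be learned during training; here it is set by hand to show the two regimes.

```python
def shifted_relu(x, t):
    """Hypothetical NG-style activation (an assumption, not the paper's formula).

    Returns max(x, t). When t lies below every input, the function is the
    identity on those inputs; when t = 0 it is the standard ReLU.
    """
    return max(x, t)


inputs = [-3.0, -1.0, 0.0, 1.0, 3.0]

# Early regime: threshold far below all inputs, so the activation is linear.
linear_phase = [shifted_relu(x, -10.0) for x in inputs]

# Later regime: threshold raised to 0, so negative inputs are clipped (ReLU).
nonlinear_phase = [shifted_relu(x, 0.0) for x in inputs]
```

With these values, `linear_phase` equals `inputs` unchanged, while `nonlinear_phase` clips the negative entries to zero. This mirrors the low-capacity-then-higher-capacity progression described in the abstract.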

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as:	arXiv:1708.01666 [cs.LG]
	(or arXiv:1708.01666v5 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1708.01666

Submission history

From: Jie Cao
[v1] Mon, 31 Jul 2017 23:19:03 UTC (268 KB)
[v2] Sun, 13 Aug 2017 14:41:04 UTC (268 KB)
[v3] Mon, 21 Aug 2017 15:45:11 UTC (268 KB)
[v4] Tue, 10 Oct 2017 08:58:03 UTC (570 KB)
[v5] Tue, 17 Oct 2017 15:53:20 UTC (767 KB)
