Meta-Learning with Hessian-Free Approach in Deep Neural Nets Training

Chen, Boyu; Lu, Wenlian; Fokoue, Ernest

Computer Science > Machine Learning

arXiv:1805.08462 (cs)

[Submitted on 22 May 2018 (v1), last revised 7 Sep 2018 (this version, v2)]

Abstract: Meta-learning is a promising approach to training deep neural networks efficiently and has attracted increasing interest in recent years. However, most current methods still cannot train complex neural network models over a long training process. In this paper, a novel second-order meta-optimizer, named the Meta-learning with Hessian-Free (MLHF) approach, is proposed based on the Hessian-Free approach. Two recurrent neural networks are established to generate the damping and the preconditioning matrix of this Hessian-Free framework. A series of techniques is introduced to stabilize and reinforce the meta-training of this optimizer, including the gradient calculation of $H$. Numerical experiments on deep convolutional neural networks, including CUDA-convnet and ResNet18(v2), with the CIFAR10 and ILSVRC2012 datasets, indicate that MLHF shows good and sustained training performance throughout the whole long training process, i.e., in both the rapidly decreasing early stage and the steadily decreasing later stage, and so is a promising meta-learning framework for improving training efficiency in real-world deep neural networks.
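
For readers unfamiliar with the Hessian-Free framework the abstract builds on, the following is a minimal sketch of one damped, preconditioned Hessian-Free step. It is not the authors' implementation. It assumes a fixed scalar damping and a fixed diagonal preconditioner, where MLHF generates these quantities with two recurrent neural networks, and it approximates Hessian-vector products by finite differences of the gradient purely to stay self-contained. All function names (`hessian_vector_product`, `hessian_free_step`, `toy_grad`) and the toy quadratic objective are hypothetical.

```python
from typing import Callable, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def axpy(alpha: float, x: Vector, y: Vector) -> Vector:
    """Return alpha * x + y, element-wise."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def hessian_vector_product(grad_fn: Callable[[Vector], Vector],
                           w: Vector, v: Vector, eps: float = 1e-5) -> Vector:
    """Approximate H v by a central difference of gradients.

    The Hessian H is never formed explicitly, which is the point of
    the Hessian-Free approach. (Assumption: finite differences are
    used here only for simplicity.)
    """
    g_plus = grad_fn(axpy(eps, v, w))
    g_minus = grad_fn(axpy(-eps, v, w))
    return [(p - m) / (2.0 * eps) for p, m in zip(g_plus, g_minus)]


def hessian_free_step(grad_fn: Callable[[Vector], Vector], w: Vector,
                      damping: float, precond_diag: Vector,
                      iters: int = 50, tol: float = 1e-10) -> Vector:
    """Solve (H + damping * I) d = -g by preconditioned conjugate gradient.

    `damping` and `precond_diag` are fixed inputs in this sketch; they
    stand in for the quantities MLHF produces with recurrent networks.
    """
    g = grad_fn(w)
    d = [0.0] * len(w)
    r = [-gi for gi in g]  # residual of the linear system at d = 0
    z = [ri / pi for ri, pi in zip(r, precond_diag)]
    p = list(z)
    rz = dot(r, z)
    for _ in range(iters):
        if dot(r, r) < tol:
            break
        # Apply the damped curvature matrix: (H + damping * I) p
        Ap = axpy(damping, p, hessian_vector_product(grad_fn, w, p))
        alpha = rz / dot(p, Ap)
        d = axpy(alpha, p, d)
        r = axpy(-alpha, Ap, r)
        z = [ri / pi for ri, pi in zip(r, precond_diag)]
        rz_new = dot(r, z)
        p = axpy(rz_new / rz, p, z)
        rz = rz_new
    return d


def toy_grad(w: Vector) -> Vector:
    """Gradient of the toy objective 0.5 * w^T H w - b^T w,
    with H = [[3, 1], [1, 2]] and b = [1, -1]."""
    return [3.0 * w[0] + w[1] - 1.0, w[0] + 2.0 * w[1] + 1.0]


# One undamped step from the origin on the toy quadratic.
step = hessian_free_step(toy_grad, [0.0, 0.0], damping=0.0,
                         precond_diag=[3.0, 2.0])
```

On this toy quadratic, a zero-damping step is the exact Newton step. Increasing `damping` shrinks the step toward a scaled gradient direction, which is the trade-off the damping term controls.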

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1805.08462 [cs.LG]
	(or arXiv:1805.08462v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1805.08462

Submission history

From: Boyu Chen
[v1] Tue, 22 May 2018 09:04:52 UTC (789 KB)
[v2] Fri, 7 Sep 2018 06:14:05 UTC (1,070 KB)
