Training Deep Neural Networks with 8-bit Floating Point Numbers

Wang, Naigang; Choi, Jungwook; Brand, Daniel; Chen, Chia-Yu; Gopalakrishnan, Kailash

Computer Science > Machine Learning

arXiv:1812.08011 (cs)

[Submitted on 19 Dec 2018]

Abstract: The state-of-the-art hardware platforms for training Deep Neural Networks (DNNs) are moving from traditional single precision (32-bit) computations towards 16 bits of precision, in large part due to the high energy efficiency and smaller bit storage associated with using reduced-precision representations. However, unlike inference, training with numbers represented with fewer than 16 bits has been challenging due to the need to maintain fidelity of the gradient computations during back-propagation. Here we demonstrate, for the first time, the successful training of DNNs using 8-bit floating point numbers while fully maintaining the accuracy on a spectrum of Deep Learning models and datasets. In addition to reducing the data and computation precision to 8 bits, we also successfully reduce the arithmetic precision for additions (used in partial product accumulation and weight updates) from 32 bits to 16 bits through the introduction of a number of key ideas, including chunk-based accumulation and floating point stochastic rounding. The use of these novel techniques lays the foundation for a new generation of hardware training platforms with the potential for 2-4x improved throughput over today's systems.
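The two ideas named in the abstract, chunk-based accumulation and floating point stochastic rounding, can be illustrated with a small sketch. This is not the authors' implementation: it emulates 16-bit additions in software using IEEE half precision (via Python's `struct` format `e`), which is an assumption and may differ from the 16-bit format used in the paper. The function names, the chunk size of 64, and the test values are all hypothetical choices made for illustration.

```python
import math
import random
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def naive_sum_fp16(values):
    """Accumulate sequentially, rounding the running sum to fp16 each step."""
    acc = 0.0
    for v in values:
        acc = to_fp16(acc + to_fp16(v))
    return acc


def chunked_sum_fp16(values, chunk_size=64):
    """Sum fixed-size chunks in fp16, then sum the chunk partials in fp16.

    Each running sum stays closer in magnitude to its addends, so fewer
    low-order bits are rounded away than in one long accumulation.
    """
    partials = [
        naive_sum_fp16(values[i:i + chunk_size])
        for i in range(0, len(values), chunk_size)
    ]
    return naive_sum_fp16(partials)


def stochastic_round_fp16(x: float, rng: random.Random) -> float:
    """Round x to one of its two neighbouring fp16 values at random.

    The upper neighbour is chosen with probability proportional to how
    close x is to it, so the rounding is unbiased in expectation.
    Overflow past the fp16 range is not handled in this sketch.
    """
    if x == 0.0:
        return 0.0
    _, exp = math.frexp(abs(x))   # abs(x) = m * 2**exp with 0.5 <= m < 1
    e = max(exp - 1, -14)         # clamp to the fp16 subnormal exponent
    ulp = 2.0 ** (e - 10)         # spacing of fp16 values around x
    lo = math.floor(x / ulp) * ulp
    frac = (x - lo) / ulp         # in [0, 1); 0 when x is already fp16
    return lo + ulp if rng.random() < frac else lo


# Accumulation: many small addends into one 16-bit running sum.
values = [0.01] * 4096
exact = math.fsum(to_fp16(v) for v in values)
naive_err = abs(naive_sum_fp16(values) - exact)
chunked_err = abs(chunked_sum_fp16(values) - exact)

# Weight update: repeatedly add an update smaller than half the fp16 spacing.
rng = random.Random(0)
w_nearest = w_stochastic = 1.0
for _ in range(1000):
    w_nearest = to_fp16(w_nearest + 1e-4)
    w_stochastic = stochastic_round_fp16(w_stochastic + 1e-4, rng)
```

In this sketch, `w_nearest` never leaves 1.0, because each update of 0.0001 is smaller than half the fp16 spacing near 1.0 (2^-10, about 0.00098) and is rounded away every time, whereas `w_stochastic` drifts upward on average by the intended amount. Likewise, `chunked_err` comes out smaller than `naive_err` here, since the long sequential sum loses small addends once the running total grows large.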

Comments:	NeurIPS 2018 (12 pages)
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1812.08011 [cs.LG]
	(or arXiv:1812.08011v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1812.08011

Submission history

From: Naigang Wang
[v1] Wed, 19 Dec 2018 15:15:55 UTC (750 KB)
