Quantization for Rapid Deployment of Deep Neural Networks

Lee, Jun Haeng; Ha, Sangwon; Choi, Saerom; Lee, Won-Jo; Lee, Seungwon

arXiv:1810.05488 (cs)

[Submitted on 12 Oct 2018]

Abstract: This paper aims at rapid deployment of state-of-the-art deep neural networks (DNNs) to energy-efficient accelerators without time-consuming fine-tuning or access to the full datasets. Converting full-precision DNNs to limited precision is essential to benefit from accelerators with reduced memory footprint and computation power. However, the task is not trivial, since it often requires the full training and validation datasets both to profile the network statistics and to fine-tune the networks to recover the accuracy lost after quantization. To address these issues, we propose a simple method that recognizes channel-level distributions to reduce the quantization-induced accuracy loss and minimize the number of image samples required for profiling. We evaluated our method on eleven networks trained on the ImageNet classification benchmark and one network trained on the Pascal VOC object detection benchmark. The results show that the networks can be quantized to 8-bit integer precision without fine-tuning.
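
The abstract does not spell out the algorithm, so the following is only a generic illustration of why channel-level ranges can matter for 8-bit quantization, not the authors' method. It assumes symmetric max-abs scaling and uses made-up synthetic data; every function name here is hypothetical. It compares one shared scale for the whole layer against one scale per channel.

```python
import random

def quantize_int8(values, scale):
    """Symmetric quantization: round to integers in [-127, 127], then map back to floats."""
    out = []
    for v in values:
        q = max(-127, min(127, round(v / scale)))
        out.append(q * scale)
    return out

def max_abs_scale(values):
    """Scale chosen so the largest magnitude maps to 127 (an assumed scheme)."""
    m = max(abs(v) for v in values)
    return m / 127 if m > 0 else 1.0

def mean_abs_error(original, restored):
    """Average absolute difference between original and de-quantized values."""
    return sum(abs(a - b) for a, b in zip(original, restored)) / len(original)

def per_layer_error(channels):
    """Quantize every channel with a single scale taken from the whole layer."""
    flat = [v for ch in channels for v in ch]
    scale = max_abs_scale(flat)
    return [mean_abs_error(ch, quantize_int8(ch, scale)) for ch in channels]

def per_channel_error(channels):
    """Quantize each channel with its own scale."""
    return [mean_abs_error(ch, quantize_int8(ch, max_abs_scale(ch))) for ch in channels]

rng = random.Random(0)
# Two synthetic channels with very different value ranges (made-up data).
channels = [
    [rng.uniform(-0.05, 0.05) for _ in range(1000)],
    [rng.uniform(-8.0, 8.0) for _ in range(1000)],
]
layer_err = per_layer_error(channels)
channel_err = per_channel_error(channels)
print("per-layer error:  ", layer_err)
print("per-channel error:", channel_err)
```

In this toy setup the narrow-range channel loses most of its resolution under the shared scale, while the wide-range channel is unaffected because it determines the shared scale anyway. Real networks and the paper's profiling procedure may behave differently.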

Subjects:	Neural and Evolutionary Computing (cs.NE)
Cite as:	arXiv:1810.05488 [cs.NE]
	(or arXiv:1810.05488v1 [cs.NE] for this version)
	https://doi.org/10.48550/arXiv.1810.05488

Submission history

From: Jun Haeng Lee
[v1] Fri, 12 Oct 2018 13:06:49 UTC (217 KB)
