AdderNet: Do We Really Need Multiplications in Deep Learning?

Chen, Hanting; Wang, Yunhe; Xu, Chunjing; Shi, Boxin; Xu, Chao; Tian, Qi; Xu, Chang


arXiv:1912.13200 (cs)

[Submitted on 31 Dec 2019 (v1), last revised 1 Jul 2021 (this version, v6)]


Abstract: Compared with the cheap addition operation, multiplication has much higher computational complexity. The widely used convolutions in deep neural networks are in fact cross-correlations that measure the similarity between input features and convolution filters, which involves a massive number of multiplications between floating-point values. In this paper, we present adder networks (AdderNets), which trade these multiplications in deep neural networks, especially convolutional neural networks (CNNs), for much cheaper additions to reduce computation costs. In AdderNets, we take the $\ell_1$-norm distance between filters and input features as the output response. The influence of this new similarity measure on the optimization of neural networks has been thoroughly analyzed. To achieve better performance, we develop a special back-propagation approach for AdderNets by investigating the full-precision gradient. We then propose an adaptive learning rate strategy that enhances the training of AdderNets according to the magnitude of each neuron's gradient. As a result, the proposed AdderNets achieve 74.9% Top-1 accuracy and 91.7% Top-5 accuracy using ResNet-50 on the ImageNet dataset without any multiplication in the convolution layers. The code is publicly available at: this https URL.
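The $\ell_1$-based output response described in the abstract can be illustrated with a minimal sketch. This is not the authors' released code: the function name `adder_response`, the single-channel nested-list layout, stride 1 and the absence of padding are all assumptions made for illustration. The sketch also assumes the distance is negated, so that a larger response means a closer match, as with cross-correlation.

```python
def adder_response(x, f):
    """Negated l1 distance between filter f and every patch of x.

    Hypothetical helper for illustration only.
    x: H x W list of lists (one channel); f: d x d list of lists.
    Stride 1, no padding; returns an (H-d+1) x (W-d+1) list of lists.
    Only subtraction, absolute value and summation are used.
    """
    H, W, d = len(x), len(x[0]), len(f)
    return [
        [
            -sum(abs(x[m + i][n + j] - f[i][j])
                 for i in range(d) for j in range(d))
            for n in range(W - d + 1)
        ]
        for m in range(H - d + 1)
    ]


# A 4 x 4 input whose entries are 0..15 in row-major order.
x = [[float(4 * r + c) for c in range(4)] for r in range(4)]
# Filter identical to the top-left 2 x 2 patch of x.
f = [row[:2] for row in x[:2]]
y = adder_response(x, f)
```

Here `y[0][0]` is zero because the filter matches the top-left patch exactly, and every other entry is negative. The maximum response therefore marks the best-matching location without a single multiplication in the inner loop.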

Comments:	New version in arXiv:2105.14202
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1912.13200 [cs.CV]
	(or arXiv:1912.13200v6 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1912.13200
Journal reference:	CVPR 2020

Submission history

From: Hanting Chen
[v1] Tue, 31 Dec 2019 06:56:47 UTC (298 KB)
[v2] Thu, 2 Jan 2020 06:26:02 UTC (298 KB)
[v3] Thu, 9 Jan 2020 02:31:03 UTC (298 KB)
[v4] Fri, 28 May 2021 03:25:46 UTC (536 KB)
[v5] Tue, 29 Jun 2021 09:53:28 UTC (536 KB)
[v6] Thu, 1 Jul 2021 04:46:58 UTC (536 KB)
