Learning Algorithm for LesserDNN, a DNN with Quantized Weights
Abstract
This paper presents LesserDNN, a model that uses the set of floating-point values {-1.0, -0.5, -0.25, -0.125, -0.0625, 0.0625, 0.125, 0.25, 0.5, 1.0} as quantized weights, together with a new learning algorithm for the proposed model. In previous studies on deep neural networks (DNNs) with quantized weights, quantized weights were applied only during the inference stage, because DNNs employ gradient descent as their learning algorithm. Gradient descent relies on differentiability, so quantized weights cannot be used directly while it is applied during training. To address this issue, we devised an algorithm based on simulated annealing. Since simulated annealing has no differentiability requirement, LesserDNN can use quantized weights during training. With quantized weights and this simulated annealing-based algorithm, the learning process becomes a combinatorial optimization problem. The proposed algorithm was applied to train networks on the MNIST handwritten-digit dataset. The tested models, trained with the simulated annealing-based algorithm and quantized weights, achieved the same level of accuracy as gradient descent-based comparison methods. Additional tests on the CIFAR-10 dataset also gave good results, further demonstrating the algorithm. Because backpropagation is not used, LesserDNN has a simple design and a small implementation scale, while still achieving high accuracy.
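The abstract does not detail the algorithm, so the following is only a minimal sketch of the general idea, assuming a single linear unit, a misclassification count as the cost, and a geometric cooling schedule. The names `LEVELS`, `predict`, `error_count`, `anneal`, and `make_toy_data` are hypothetical and are not taken from the paper; only the weight set itself comes from the abstract. The sketch shows how a simulated annealing step can draw a replacement weight from the quantized set and accept or reject it with the Metropolis rule, without computing any gradient.

```python
import math
import random

# Weight set quoted in the abstract.
LEVELS = [-1.0, -0.5, -0.25, -0.125, -0.0625, 0.0625, 0.125, 0.25, 0.5, 1.0]


def predict(weights, x):
    """Sign of a dot product; a stand-in for a full network's forward pass."""
    s = sum(w * xi for w, xi in zip(weights, x))
    return 1 if s >= 0 else -1


def error_count(weights, data):
    """Number of misclassified samples; evaluating it needs no gradient."""
    return sum(1 for x, y in data if predict(weights, x) != y)


def anneal(data, n_weights, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Search over weights drawn from LEVELS with simulated annealing."""
    rng = random.Random(seed)
    weights = [rng.choice(LEVELS) for _ in range(n_weights)]
    cost = error_count(weights, data)
    best, best_cost = list(weights), cost
    for step in range(steps):
        # Geometric cooling from t_start to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        # Propose a move: re-draw one randomly chosen weight from the set.
        i = rng.randrange(n_weights)
        old = weights[i]
        weights[i] = rng.choice(LEVELS)
        new_cost = error_count(weights, data)
        delta = new_cost - cost
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = list(weights), cost
        else:
            weights[i] = old  # reject the move and restore the old weight
    return best, best_cost


def make_toy_data(n=40, seed=1):
    """Toy two-feature problem whose boundary is representable in LEVELS."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        y = 1 if x[0] - 0.5 * x[1] >= 0 else -1
        data.append((x, y))
    return data


if __name__ == "__main__":
    toy = make_toy_data()
    w, errors = anneal(toy, n_weights=2)
    print("weights:", w, "errors:", errors)
```

Each move needs only a cost evaluation, so the same loop would apply to a model with no usable gradient; a real network would replace `predict` with its own forward pass.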
DOI: https://doi.org/10.31449/inf.v49i1.7145
Journal: Informatica, An International Journal of Computing and Informatics
Copyright © Slovenian Society Informatika







