Learning Algorithm for LesserDNN, a DNN with Quantized Weights
Abstract
This paper presents LesserDNN, a model that uses the set of floating-point values {-1.0, -0.5, -0.25, -0.125, -0.0625, 0.0625, 0.125, 0.25, 0.5, 1.0} as quantized weights, together with a new learning algorithm for the proposed model. In previous studies on deep neural networks (DNNs) with quantized weights, quantization was applied only at the inference stage because the networks were trained with gradient descent. Since quantized weights are not differentiable, they cannot be used directly when gradient descent is applied during training. To address this issue, we devised a learning algorithm based on simulated annealing. Because simulated annealing imposes no differentiability requirement, LesserDNN can use quantized weights during training as well. With quantized weights and this simulated annealing-based algorithm, learning becomes a combinatorial optimization problem. The proposed algorithm was applied to train networks on the MNIST handwriting dataset; the tested models, trained with the simulated annealing-based algorithm and quantized weights, achieved the same level of accuracy as gradient descent-based comparison methods. We also conducted tests on the CIFAR-10 dataset and obtained good results, further demonstrating the algorithm. Thus, LesserDNN has a simple design and a small implementation scale because backpropagation is not applied, while still achieving high accuracy.
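The abstract frames training as a combinatorial search over a fixed set of quantized weights rather than gradient descent. The sketch below is a minimal, illustrative Python example of simulated annealing over that weight set for a toy one-layer network; the network, loss function, neighbour move, and cooling schedule are assumptions chosen for illustration and are not the authors' exact procedure.

```python
import math
import random

import numpy as np

# Quantized weight set listed in the abstract.
WEIGHT_SET = [-1.0, -0.5, -0.25, -0.125, -0.0625, 0.0625, 0.125, 0.25, 0.5, 1.0]

def loss(weights, x, y):
    """Mean squared error of a tiny one-layer network (illustrative only)."""
    pred = np.tanh(x @ weights)
    return float(np.mean((pred - y) ** 2))

def anneal(x, y, n_in, n_out, steps=5000, t0=1.0, cooling=0.999, seed=0):
    """Search the combinatorial space of quantized weights with simulated annealing."""
    rng = random.Random(seed)
    w = np.array([[rng.choice(WEIGHT_SET) for _ in range(n_out)] for _ in range(n_in)])
    best_w, best_e = w.copy(), loss(w, x, y)
    cur_e, t = best_e, t0
    for _ in range(steps):
        # Propose a neighbour: re-draw one randomly chosen weight from the set.
        i, j = rng.randrange(n_in), rng.randrange(n_out)
        old = w[i, j]
        w[i, j] = rng.choice(WEIGHT_SET)
        new_e = loss(w, x, y)
        # Metropolis acceptance: always take improvements, sometimes take worse moves.
        if new_e <= cur_e or rng.random() < math.exp((cur_e - new_e) / max(t, 1e-12)):
            cur_e = new_e
            if new_e < best_e:
                best_w, best_e = w.copy(), new_e
        else:
            w[i, j] = old  # reject the move and restore the previous weight
        t *= cooling  # cool the temperature
    return best_w, best_e
```

Because every proposal simply re-draws one weight from the quantized set, no gradients or backpropagation are needed and the weights remain quantized throughout training, which is the property the abstract emphasizes.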
DOI: https://doi.org/10.31449/inf.v49i1.7145