Hyperparameter Optimization for Convolutional Neural Networks using the Salp Swarm Algorithm

Entesar Abdulsaed, Maytham Alabbas, Raidah Khudeyer

Abstract


Convolutional neural networks (CNNs) have performed exceptionally well across a variety of computer vision tasks. However, their effectiveness depends heavily on the careful selection of hyperparameters. Optimizing these hyperparameters can be challenging and time-consuming, especially with large datasets and complex network architectures. In response, we propose a novel approach to hyperparameter optimization in CNNs using the Salp Swarm Algorithm (SSA). Inspired by the natural swarming behavior of salps (free-floating marine tunicates), SSA mimics the collective intelligence that governs their feeding and navigation. Taking advantage of these properties, our approach explores the hyperparameter space to identify the configuration that maximizes CNN performance. This paper presents the architecture of the SSA-based framework for hyperparameter optimization and compares it with established optimization techniques such as Particle Swarm Optimization (PSO) and the Genetic Algorithm (GA). We also present experimental results on the MNIST dataset, where the approach achieves a classification accuracy of 99.46%. This case study contributes to the fields of deep learning and hyperparameter optimization by demonstrating the effectiveness of SSA in optimizing CNNs, and it benefits researchers and practitioners seeking optimal hyperparameter configurations for CNNs in a variety of computer vision applications. We also evaluate the scalability and robustness of the proposed method across different CNN structures. The insights gained highlight SSA's potential for addressing the challenges of hyperparameter optimization.
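As a rough illustration of the kind of search the abstract describes, the following minimal sketch applies the commonly published SSA update rules (a single leader salp that moves around the best position found so far, and followers that each move halfway toward their predecessor in the chain) to a two-dimensional search space. It is not the authors' implementation. The function names, the population size, the iteration count, and the search bounds are assumptions made for illustration, and `toy_validation_error` is a hypothetical stand-in for the validation error that would be obtained by training a CNN with the given hyperparameters.

```python
import math
import random


def salp_swarm_minimize(objective, lower, upper, n_salps=20, n_iters=100, seed=0):
    """Minimize `objective` over the box [lower, upper] with a basic SSA sketch."""
    rng = random.Random(seed)
    dim = len(lower)

    # Random initial positions inside the search box.
    salps = [[rng.uniform(lower[j], upper[j]) for j in range(dim)]
             for _ in range(n_salps)]
    scores = [objective(s) for s in salps]
    best_idx = min(range(n_salps), key=scores.__getitem__)
    # The "food source" is the best position found so far.
    food, food_score = list(salps[best_idx]), scores[best_idx]

    for it in range(1, n_iters + 1):
        # c1 decays over time, shifting the leader from exploration to exploitation.
        c1 = 2.0 * math.exp(-((4.0 * it / n_iters) ** 2))
        for i in range(n_salps):
            for j in range(dim):
                if i == 0:
                    # Leader: step around the food source in a random direction.
                    step = c1 * ((upper[j] - lower[j]) * rng.random() + lower[j])
                    if rng.random() >= 0.5:
                        salps[i][j] = food[j] + step
                    else:
                        salps[i][j] = food[j] - step
                else:
                    # Follower: move halfway toward the previous salp in the chain.
                    salps[i][j] = 0.5 * (salps[i][j] + salps[i - 1][j])
                # Keep every coordinate inside the search box.
                salps[i][j] = min(max(salps[i][j], lower[j]), upper[j])
            score = objective(salps[i])
            if score < food_score:
                food, food_score = list(salps[i]), score
    return food, food_score


def toy_validation_error(params):
    # Hypothetical surrogate: pretend the best setting is a learning rate of
    # 1e-3 (log10 = -3) and a dropout rate of 0.3. A real run would train a
    # CNN here and return its validation error.
    log_lr, dropout = params
    return (log_lr + 3.0) ** 2 + (dropout - 0.3) ** 2


# Search over log10(learning rate) in [-5, -1] and dropout in [0, 0.9].
best, err = salp_swarm_minimize(toy_validation_error,
                                lower=[-5.0, 0.0], upper=[-1.0, 0.9])
print(best, err)
```

In a real setting each objective evaluation is a full training run, so the population size and iteration count would be chosen far more conservatively than in this toy example.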






DOI: https://doi.org/10.31449/inf.v47i9.5148

This work is licensed under a Creative Commons Attribution 3.0 License.