SA-FGSM: A Simulated Annealing-Enhanced Hybrid White-Box Adversarial Attack Framework

Abstract

The reliable evaluation of adversarial defenses is a critical challenge in deep learning security, often hindered by evaluation methods, such as Projected Gradient Descent (PGD), that can become trapped in local optima. This limitation can lead to a significant overestimation of a model’s true robustness. In this work, we introduce the Simulated Annealing-Fast Gradient Sign Method (SA-FGSM), a novel two-phase hybrid attack designed to overcome this specific weakness. SA-FGSM first employs Simulated Annealing to perform a global, stochastic exploration of the perturbation space and identify promising attack regions, then applies a gradient-based refinement step to finalize the perturbation. We conduct a comprehensive evaluation on CIFAR-10 and CIFAR-100 against state-of-the-art adversarially trained models (ResNet-18 and WideResNet-28-10), showing that SA-FGSM achieves a mean attack success rate of 83.6% compared to 51.9% for a suite of strong baselines including FGSM, MI-FGSM, PGD, and APGD. Furthermore, we demonstrate that SA-FGSM finds qualitatively superior perturbations, evidenced by a statistically significant reduction in both average ℓ2 norm and perceptual distortion as measured by LPIPS (Learned Perceptual Image Patch Similarity), achieving 58.3% lower perceptual distance than gradient-based baselines. Analysis of the proposed attack variants identifies SA-FGSM-Swift as a particularly compelling option, offering state-of-the-art success rates at a fraction of the computational cost of stronger baselines. Our findings suggest that the robustness of even top-tier defenses may be overestimated and highlight the necessity of incorporating global search heuristics into standard evaluation protocols.
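The two-phase scheme described above (a Simulated Annealing exploration of the ℓ∞ perturbation ball, followed by a single gradient-sign refinement step) can be sketched as follows. This is an illustrative toy implementation against a linear surrogate loss, not the authors' exact SA-FGSM algorithm; the proposal width, step size, and geometric cooling schedule are all assumptions made for the sketch.

```python
import numpy as np

def sa_fgsm(x, grad_fn, loss_fn, eps=0.3, sa_steps=200, t0=1.0, cooling=0.97, seed=0):
    """Toy two-phase attack: SA exploration, then one FGSM-style step.

    Maximizes loss_fn over perturbations delta with ||delta||_inf <= eps.
    """
    rng = np.random.default_rng(seed)
    delta = np.zeros_like(x)
    best_delta, best_loss = delta.copy(), loss_fn(x + delta)
    cur_loss, t = best_loss, t0
    for _ in range(sa_steps):
        # Phase 1: propose a random neighbour inside the eps-ball.
        cand = np.clip(delta + rng.uniform(-0.1, 0.1, size=x.shape), -eps, eps)
        cand_loss = loss_fn(x + cand)
        # Metropolis acceptance: always take improvements (higher loss),
        # occasionally accept worse moves to escape local optima.
        if cand_loss > cur_loss or rng.random() < np.exp((cand_loss - cur_loss) / t):
            delta, cur_loss = cand, cand_loss
            if cur_loss > best_loss:
                best_delta, best_loss = delta.copy(), cur_loss
        t *= cooling  # geometric cooling schedule
    # Phase 2: one gradient-sign step from the best SA solution, reprojected.
    step = 0.1 * eps * np.sign(grad_fn(x + best_delta))
    return np.clip(best_delta + step, -eps, eps)

# Toy target: a fixed linear "model" whose loss rises along w.
w = np.array([1.0, -2.0, 0.5])
loss = lambda z: float(w @ z)   # surrogate loss to maximize
grad = lambda z: w              # its gradient w.r.t. the input
x0 = np.zeros(3)
delta = sa_fgsm(x0, grad, loss, eps=0.3)
print(np.abs(delta).max() <= 0.3)  # perturbation stays in the eps-ball
```

On this linear toy the optimum is simply `eps * sign(w)`; the point of the sketch is the control flow, i.e. how a temperature-driven acceptance rule lets the search leave local optima before the cheap gradient step polishes the best candidate found.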

References

Takuya Akiba, Shotaro Sano, Toshihiko Yanase, Takeru Ohta, and Masanori Koyama. Optuna: A next-generation hyperparameter optimization framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 2623–2631, 2019.

Moustafa Alzantot, Yash Sharma, Supriyo Chakraborty, Huan Zhang, Cho-Jui Hsieh, and Mani B. Srivastava. GenAttack: Practical black-box attacks with gradient-free optimization. In Proceedings of the Genetic and Evolutionary Computation Conference, pages 1111–1119, 2019.

Wieland Brendel, Jonas Rauber, and Matthias Bethge. Decision-based adversarial attacks: Reliable attacks against black-box machine learning models. arXiv preprint arXiv:1712.04248, 2017.

Anh Bui, Trung Le, He Zhao, Quan Tran, Paul Montague, and Dinh Phung. Generating adversarial examples with task oriented multi-objective optimization. arXiv preprint arXiv:2304.13229, 2023.

Nicholas Carlini and David Wagner. Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), pages 39–57. IEEE, 2017.

Jianbo Chen, Michael I. Jordan, and Martin J. Wainwright. HopSkipJumpAttack: A query-efficient decision-based attack. In 2020 IEEE Symposium on Security and Privacy (SP), pages 1277–1294. IEEE, 2020.

Pin-Yu Chen, Huan Zhang, Yash Sharma, Jinfeng Yi, and Cho-Jui Hsieh. ZOO: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pages 15–26, 2017.

Shuyu Cheng, Yibo Miao, Yinpeng Dong, Xiao Yang, Xiao-Shan Gao, and Jun Zhu. Efficient black-box adversarial attacks via Bayesian optimization guided by a function prior. arXiv preprint arXiv:2405.19098, 2024.

Alvaro HC Correia, Daniel E Worrall, and Roberto Bondesan. Neural simulated annealing. In International Conference on Artificial Intelligence and Statistics, pages 4946–4962. PMLR, 2023.

Francesco Croce, Maksym Andriushchenko, Vikash Sehwag, Edoardo Debenedetti, Nicolas Flammarion, Mung Chiang, Prateek Mittal, and Matthias Hein. RobustBench: A standardized adversarial robustness benchmark. arXiv preprint arXiv:2010.09670, 2020.

Yinpeng Dong, Fangzhou Liao, Tianyu Pang, Hang Su, Jun Zhu, Xiaolin Hu, and Jianguo Li. Boosting adversarial attacks with momentum. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 9185–9193, 2018.

Mingyuan Fan, Cen Chen, Wenmeng Zhou, and Yinggui Wang. Transferable adversarial examples with Bayesian approach. In Proceedings of the 20th ACM Asia Conference on Computer and Communications Security, ASIA CCS ’25, pages 517–529, New York, NY, USA, 2025. Association for Computing Machinery.

Zhengwei Fang, Rui Wang, Tao Huang, and Liping Jing. Strong transferable adversarial attacks via ensembled asymptotically normal distribution learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 24841–24850, 2024.

Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.

Keyizhi Xu, Yajuan Lu, Zhongyuan Wang, et al. A survey of adversarial examples in computer vision: Attack, defense, and beyond. Wuhan University Journal of Natural Sciences, 30(1):1–20, 2025.

Hoki Kim. Torchattacks: A pytorch repository for adversarial attacks, 2021.

Scott Kirkpatrick, C. Daniel Gelatt Jr., and Mario P. Vecchi. Optimization by simulated annealing. Science, 220(4598):671–680, 1983.

Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. Master’s thesis, University of Toronto, ON, Canada, 2009.

Hongying Liu, Zhijin Ge, Zhenyu Zhou, Fanhua Shang, Yuanyuan Liu, and Licheng Jiao. Gradient correction for white-box adversarial attacks. IEEE Transactions on Neural Networks and Learning Systems, 35(12):18419–18430, 2024.

Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083, 2017.

Rayan Mosli, Matthew Wright, Bo Yuan, and Yin Pan. They might not be giants: Crafting black-box adversarial examples with fewer queries using particle swarm optimization. arXiv preprint arXiv:1909.07490, 2019.

Sylvestre-Alvise Rebuffi, Sven Gowal, Dan A. Calian, Florian Stimberg, Olivia Wiles, and Timothy Mann. Fixing data augmentation to improve adversarial robustness. arXiv preprint arXiv:2103.01946, 2021.

Naufal Suryanto, Hyoeun Kang, Yongsu Kim, Youngyeo Yun, Harashta Tatimma Larasati, and Howon Kim. A distributed black-box adversarial attack based on multi-group particle swarm optimization. Sensors, 20(24), 2020.

Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013.

Ayane Tajima and Satoshi Ono. Evoiba: Evolutionary boundary attack under hard-label black box condition. In 2024 IEEE Congress on Evolutionary Computation (CEC), pages 1–8. IEEE, 2024.

Jiafeng Wang, Zhaoyu Chen, Kaixun Jiang, Dingkang Yang, Lingyi Hong, Pinxue Guo, Haijing Guo, and Wenqiang Zhang. Boosting the transferability of adversarial attacks with global momentum initialization. Expert Systems with Applications, 255:124757, 2024.

Yixiang Wang, Jiqiang Liu, Xiaolin Chang, Jelena Mišić, and Vojislav B. Mišić. IWA: Integrated gradient based white-box attacks for fooling deep neural networks, 2021.

Phoenix Neale Williams and Ke Li. Black-box sparse adversarial attack via multi-objective optimisation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12291–12301, 2023.

Chenwang Wu, Wenjian Luo, Nan Zhou, Peilan Xu, and Tao Zhu. Genetic algorithm with multiple fitness functions for generating adversarial examples. In 2021 IEEE Congress on Evolutionary Computation (CEC), pages 1792–1799. IEEE, 2021.

Keiichiro Yamamura, Issa Oe, Hiroki Ishikura, and Katsuki Fujisawa. Enhancing output diversity improves conjugate gradient-based adversarial attacks. In International Conference on Pattern Recognition and Artificial Intelligence, pages 47–61. Springer, 2024.

Keiichiro Yamamura, Haruki Sato, Nariaki Tateiwa, Nozomi Hata, Toru Mitsutake, Issa Oe, Hiroki Ishikura, and Katsuki Fujisawa. Diversified adversarial attacks based on conjugate gradient method. In International Conference on Machine Learning, pages 24872–24894. PMLR, 2022.

Xinghao Yang, Weifeng Liu, and Dacheng Tao. BESA: BERT-based simulated annealing for adversarial text attacks. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence. International Joint Conferences on Artificial Intelligence, 2021.

Shuai Zhou, Chi Liu, Dayong Ye, Tianqing Zhu, Wanlei Zhou, and Philip S. Yu. Adversarial attacks and defenses in deep learning: From a perspective of cybersecurity. ACM Comput. Surv., 55(8), December 2022.

Xianyu Zuo, Xiangyu Wang, Wenbo Zhang, and Yadi Wang. Mispso-attack: An efficient adversarial watermarking attack based on multiple initial solution particle swarm optimization. Applied Soft Computing, 147:110777, 2023.

Authors

  • Djawhara Benchaira LESIA Laboratory, Department of Computer Science, University of Biskra
  • Foudil Cherif LESIA Laboratory, Department of Computer Science, University of Biskra

DOI:

https://doi.org/10.31449/inf.v50i7.10877

Published

02/23/2026

How to Cite

Benchaira, D., & Cherif, F. (2026). SA-FGSM: A Simulated Annealing-Enhanced Hybrid White-Box Adversarial Attack Framework. Informatica, 50(7). https://doi.org/10.31449/inf.v50i7.10877