Improved YOLOv8s with Swin Transformer and Depthwise Convolutions for Small-Target Pepper Detection and Localization in Agricultural Robotics

Abstract

A recognition and localization system for chili-picking robots was developed based on an improved YOLOv8s model and a RealSense depth camera. The proposed model integrates the Swin Transformer, depthwise convolution (DWConv), and C2 modules into the YOLOv8s framework to enhance small-target detection and reduce computational complexity. A dataset of 2,000 field images of Chaotian pepper (Capsicum frutescens L.) was collected under varying lighting and occlusion conditions and divided into training, validation, and test sets at a 7:2:1 ratio. To validate the proposed approach, comparative experiments were conducted against YOLOv5, YOLOv6, YOLOv7, and the original YOLOv8s model. Ablation studies showed that each added component improved model performance, with the combination of all three achieving the best results. The improved YOLOv8s model reached a mean Average Precision (mAP) of 82.7%, Recall (R) of 93.0%, and Precision (P) of 79.0%, representing respective increases of 3.4%, 3.0%, and 5.7% over the baseline YOLOv8s. These results indicate that the improved YOLOv8s model achieves accurate and efficient chili recognition and localization suitable for robotic harvesting applications.
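The 7:2:1 division of the 2,000 images can be illustrated with a minimal Python sketch. The abstract does not say how the split was performed, so the random shuffle, the fixed seed, the function name `split_dataset`, and the file names below are all assumptions made for illustration only.

```python
import random


def split_dataset(items, ratios=(7, 2, 1), seed=0):
    """Shuffle items and split them into train/val/test by integer ratios.

    Hypothetical helper: the shuffle and seed are assumptions, not the
    authors' documented procedure.
    """
    items = list(items)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(items)

    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total

    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Placeholder file names standing in for the 2,000 field images.
image_names = [f"pepper_{i:04d}.jpg" for i in range(2000)]
train, val, test = split_dataset(image_names)
print(len(train), len(val), len(test))  # 1400 400 200
```

Under these assumptions the split yields 1,400 training, 400 validation, and 200 test images, with no image appearing in more than one subset.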

Authors

  • Zhiyuan Tan, 1. College of Mechanical Engineering, Zhejiang Sci-Tech University; 2. Key Laboratory of Planting Equipment Technology of Zhejiang Province
  • Jianneng Chen, 1. College of Mechanical Engineering, Zhejiang Sci-Tech University; 2. Key Laboratory of Planting Equipment Technology of Zhejiang Province
  • Chuanyu Wu, 1. College of Mechanical Engineering, Zhejiang Sci-Tech University; 2. Key Laboratory of Planting Equipment Technology of Zhejiang Province
  • Leiying He, 1. College of Mechanical Engineering, Zhejiang Sci-Tech University; 2. Key Laboratory of Planting Equipment Technology of Zhejiang Province
  • Kun Yao, 1. College of Mechanical Engineering, Zhejiang Sci-Tech University; 2. Key Laboratory of Planting Equipment Technology of Zhejiang Province

DOI:

https://doi.org/10.31449/inf.v50i5.11556

Published

02/02/2026

How to Cite

Tan, Z., Chen, J., Wu, C., He, L., & Yao, K. (2026). Improved YOLOv8s with Swin Transformer and Depthwise Convolutions for Small-Target Pepper Detection and Localization in Agricultural Robotics. Informatica, 50(5). https://doi.org/10.31449/inf.v50i5.11556