A ViT-DQN-Based Real-Time Martial Arts Training System with Multimodal Fusion for Action Recognition and Optimization
Abstract
This paper presents an intelligent martial arts training system that integrates computer vision and reinforcement learning to address the inefficiency, lack of personalization, and delayed feedback of traditional martial arts instruction. The system uses a Vision Transformer (ViT) for real-time action recognition and a Deep Q-Network (DQN) for training strategy optimization, giving athletes precise, adaptive feedback. By combining deep learning with IoT sensor data, the system analyzes posture, movement accuracy, and exercise intensity in real time to improve training effectiveness. In an experiment with 200 martial arts practitioners across multiple age groups, the system achieved recognition accuracies of 96.8% for chopping, 98.1% for kicking, and 96.8% for grappling, significantly outperforming traditional CNN- and LSTM-based models. For fluency optimization, the DQN model surpassed PPO and A3C, reaching near-perfect fluency scores for the chopping and side kick movements. Athletes using the system also improved their competitive outcomes: the under-18 group’s win rate rose from 65% to 85%, and the 23–27 age group’s rose from 75% to 90%. These findings support the system’s effectiveness in improving training efficiency and technical precision, and they demonstrate the potential of artificial intelligence for martial arts instruction and for sports training more broadly.
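The abstract does not describe the DQN's state, action, or reward design, so the following is only a minimal sketch of the generic Q-learning target that any DQN regresses toward. The action names, reward value, and discount factor are hypothetical placeholders, not the authors' configuration, and plain lists stand in for the outputs of a Q-network.

```python
# Minimal sketch of the generic DQN temporal-difference target.
# Everything named here (actions, reward, gamma) is an illustrative
# assumption; the abstract does not specify the actual design.

# Hypothetical discrete training adjustments the agent could choose from.
ACTIONS = ["lower_intensity", "keep_plan", "raise_intensity"]


def td_target(reward, next_q_values, gamma=0.9, done=False):
    """Return r + gamma * max_a' Q(s', a'), or just r at episode end."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def greedy_action(q_values):
    """Return the index of the action with the highest Q-value."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


# Q-values that a network might output for the next state (made-up numbers).
next_q = [0.5, 2.0, 1.2]
target = td_target(reward=1.0, next_q_values=next_q, gamma=0.9)
best = ACTIONS[greedy_action(next_q)]
```

In a full DQN the network's prediction for the chosen action would be trained toward `target`, typically with a replay buffer and a separate target network; those parts are omitted here.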
DOI: https://doi.org/10.31449/inf.v49i28.8606

This work is licensed under a Creative Commons Attribution 3.0 License.