Improved Multi-Target Athlete Tracking in Sports Videos Using IYOLOv8-MTD and Enhanced DeepSORT with Hybrid Attention and IMM

Abstract

The data-driven development of competitive sports places higher demands on the precise capture and analysis of athletes' movements. To improve the accuracy and continuity of multi-target detection and tracking in sports scenes, this article constructs a multi-target detection model based on an improved YOLOv8 (IYOLOv8-MTD) and a multi-target tracking model based on an improved DeepSORT (IDeepSORT-MTT), and improves performance through the joint optimization of several modules. In the detection model (IYOLOv8-MTD), the convolutional block attention module (CBAM) is optimized with the global context transformer (GCT) to strengthen key feature responses, the large selective kernel (LSK) module is introduced to rebuild the C2f module so that it adapts to multi-scale targets, and the loss function combines Inner Intersection over Union (IIoU) and minimum point distance IoU (MPDIoU) to improve bounding box regression accuracy. In the tracking model (IDeepSORT-MTT), an interacting multiple model (IMM) Kalman filter fuses constant-velocity and constant-acceleration motion models to handle the nonlinear motion of targets, a hybrid attention mechanism (weighted fusion of channel and spatial features) is designed to make appearance features more discriminative, and a heat map detector assists localization to reduce positioning deviation. Experiments are conducted on the SportsMOT dataset (240 videos with about 150,000 frames in total, split 8:1:1 into a training set of 192 videos with 120,000 frames, a validation set of 24 videos with 15,000 frames, and a test set of 24 videos with 15,000 frames). The hardware platform is an NVIDIA GeForce RTX 2080Ti GPU and an Intel i5-10400F CPU, and evaluation uses the standard MOTEval Tool.
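As an illustration of the MPDIoU term mentioned above, the following is a minimal sketch that assumes the commonly cited definition: IoU minus the squared distances between the two boxes' top-left corners and bottom-right corners, each normalized by the squared diagonal of the input image. The function name `mpdiou`, the `(x1, y1, x2, y2)` box layout, and the example values are assumptions for illustration. The abstract does not say how the article combines this term with Inner-IoU, so that combination is not shown.

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """MPDIoU between two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumed definition: IoU - d1^2 / (w^2 + h^2) - d2^2 / (w^2 + h^2),
    where d1 and d2 are the distances between the top-left corners and
    the bottom-right corners, and w, h are the input image size.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared corner distances, normalized by the squared image diagonal.
    d1_sq = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2
    d2_sq = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2
    norm = img_w ** 2 + img_h ** 2
    return iou - d1_sq / norm - d2_sq / norm


def mpdiou_loss(box_a, box_b, img_w, img_h):
    """Regression loss under the same assumption: 1 - MPDIoU."""
    return 1.0 - mpdiou(box_a, box_b, img_w, img_h)


# Hypothetical boxes on a 100 x 100 image.
pred = (0.0, 0.0, 10.0, 10.0)
target = (5.0, 0.0, 15.0, 10.0)
print(mpdiou_loss(pred, target, 100, 100))
```

Unlike plain IoU, the corner-distance penalty still varies when the boxes do not overlap, which is the usual motivation given for this kind of loss.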
The results show that the detection model IYOLOv8-MTD reaches a mAP50 of 97.14% and a mAP50-95 of 92.22%, both higher than the original YOLOv8 (mAP50 92.78%, mAP50-95 81.67%). The tracking model IDeepSORT-MTT reaches an average multi-target tracking accuracy (MOTA) of 92.81% and an average identity F1 score (IDF1) of 77.56%, reduces the number of identity switches by 66.3% compared with the original DeepSORT, and maintains a processing speed of 7.1-8.0 frames per second (FPS). The overall performance of the model is superior to that of traditional methods and of the compared studies. The model effectively improves the accuracy and continuity of multi-target detection and tracking in complex sports scenarios, and provides a reliable technical solution for athlete trajectory analysis, tactical review, and physical fitness assessment.
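For readers unfamiliar with the MOTA figure reported above, the sketch below uses the standard definition: one minus the sum of false negatives, false positives, and identity switches, divided by the number of ground-truth objects. The function name `mota` and the counts are hypothetical and are not taken from the article's experiments.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


# Hypothetical counts: 1000 ground-truth boxes, 100 errors in total.
score = mota(false_negatives=50, false_positives=30, id_switches=20, num_gt=1000)
print(f"MOTA = {score:.2%}")
```

Because identity switches enter the numerator directly, a tracker that reduces them raises MOTA even when its detection errors are unchanged. MOTA can also be negative when the total number of errors exceeds the number of ground-truth objects.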

Authors

  • Shanqing Wan
  • Wei Chen

DOI:

https://doi.org/10.31449/inf.v49i35.10859

Published

12/16/2025

How to Cite

Wan, S., & Chen, W. (2025). Improved Multi-Target Athlete Tracking in Sports Videos Using IYOLOv8-MTD and Enhanced DeepSORT with Hybrid Attention and IMM. Informatica, 49(35). https://doi.org/10.31449/inf.v49i35.10859