A Proximal Policy Optimization-Based Reinforcement Learning Framework for Real-Time Personalized Endurance Training

Abstract

Personalized sports training routines account for individual physiology, fatigue, and recovery to maximize performance. This work uses Proximal Policy Optimization (PPO)-based reinforcement learning to adjust training intensity, duration, and rest in a simulated endurance-training environment for runners, driven by real-time wearable and performance data. The environment models athlete status using heart rate variability, VO₂ max, fatigue ratings, and injury-risk indicators, and the PPO agent is trained to maximize performance gains, recovery quality, and safety over repeated sessions. In simulation, the learned policy improves performance by 18.6% and reduces injury risk by 22.4%, while achieving 91.3% recovery compliance, ±7.2% training-load variability, a 41.7% gain in reward-signal convergence, a 94.6% session completion rate, an 87.5% personalized adaptation score, and 94.3% fatigue-index stability. The results show that a PPO-based RL setup, defined by its state design, reward shaping, and multi-episode training, can deliver adaptive, data-driven, personalized sports training.
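The abstract names a state built from heart rate variability, VO₂ max, fatigue, and injury-risk indicators, actions over intensity, duration, and rest, and a reward balancing performance gains, recovery quality, and safety. The following is a minimal sketch of such an interface as a Gymnasium-style environment; the `EnduranceTrainingEnv` class, its transition dynamics, and all coefficients are illustrative assumptions rather than the authors' implementation, and any PPO implementation (e.g. Stable-Baselines3) could be trained against it.

```python
# Minimal sketch (assumptions, not the paper's code): a Gymnasium-style
# endurance-training environment whose state, action, and reward mirror the
# quantities named in the abstract. Dynamics and coefficients are placeholders.
import numpy as np
import gymnasium as gym
from gymnasium import spaces


class EnduranceTrainingEnv(gym.Env):
    """Hypothetical athlete environment: observation = [HRV, VO2max, fatigue,
    injury risk]; action = [intensity, duration, rest], all normalized to [0, 1]."""

    def __init__(self, max_sessions: int = 50):
        super().__init__()
        self.observation_space = spaces.Box(low=0.0, high=1.0, shape=(4,), dtype=np.float32)
        self.action_space = spaces.Box(low=0.0, high=1.0, shape=(3,), dtype=np.float32)
        self.max_sessions = max_sessions

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.session = 0
        # Initial normalized [HRV, VO2max, fatigue, injury_risk].
        self.state = np.array([0.6, 0.5, 0.2, 0.1], dtype=np.float32)
        return self.state, {}

    def step(self, action):
        intensity, duration, rest = np.clip(action, 0.0, 1.0)
        hrv, vo2, fatigue, risk = self.state
        load = intensity * duration                              # simplified session load
        vo2 = np.clip(vo2 + 0.02 * load - 0.005, 0.0, 1.0)       # fitness gain vs. detraining
        fatigue = np.clip(fatigue + 0.3 * load - 0.25 * rest, 0.0, 1.0)
        hrv = np.clip(hrv - 0.2 * fatigue + 0.15 * rest, 0.0, 1.0)
        risk = np.clip(risk + 0.2 * max(fatigue - 0.7, 0.0) - 0.05 * rest, 0.0, 1.0)
        self.state = np.array([hrv, vo2, fatigue, risk], dtype=np.float32)

        # Reward shaping as described: performance gain plus recovery quality,
        # penalized by injury risk and accumulated fatigue.
        reward = 1.0 * vo2 + 0.5 * hrv - 1.5 * risk - 0.3 * fatigue
        self.session += 1
        terminated = bool(risk > 0.9)                # unsafe state ends the episode
        truncated = self.session >= self.max_sessions
        return self.state, float(reward), terminated, truncated, {}


# Any PPO implementation accepting this interface can be used, for example:
# from stable_baselines3 import PPO
# model = PPO("MlpPolicy", EnduranceTrainingEnv(), verbose=0).learn(total_timesteps=100_000)
```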

Authors

  • Chuanzhong Wu Guangzhou College of Commerce
  • Danqing Liang Xianda College of Economics and Humanities, Shanghai International Studies University
  • Bo Yang Xianda College of Economics and Humanities, Shanghai International Studies University
  • Li Xu Guangzhou College of Commerce
  • Yunlong Li Xianda College of Economics and Humanities, Shanghai International Studies University

DOI:

https://doi.org/10.31449/inf.v50i8.10131

Published

02/21/2026

How to Cite

Wu, C., Liang, D., Yang, B., Xu, L., & Li, Y. (2026). A Proximal Policy Optimization-Based Reinforcement Learning Framework for Real-Time Personalized Endurance Training. Informatica, 50(8). https://doi.org/10.31449/inf.v50i8.10131