How to run a world record? A Reinforcement Learning approach

dc.contributor.author: Shahsavari Sajad
dc.contributor.author: Immonen Eero
dc.contributor.author: Karami Masoomeh
dc.contributor.author: Haghbayan Mohammadhashem
dc.contributor.author: Plosila Juha
dc.contributor.organization: Robotics and Autonomous Systems
dc.contributor.organization-code: 1.2.246.10.2458963.20.72785230805
dc.converis.publication-id: 175657877
dc.converis.url: https://research.utu.fi/converis/portal/Publication/175657877
dc.date.accessioned: 2022-10-28T13:41:40Z
dc.date.available: 2022-10-28T13:41:40Z
dc.description.abstract: <p>Finding the optimal distribution of effort exerted by an athlete in competitive sports has been widely investigated in the fields of sport science, applied mathematics, and optimal control. In this article, we propose a reinforcement learning-based solution to the optimal control problem in the running race application. The well-known mathematical model of Keller is used to numerically simulate the dynamics of the runner's energy storage and motion. A feed-forward neural network is employed as the probabilistic controller model in continuous action space, transforming the current state of the runner (position, velocity, and available energy) into the predicted optimal propulsive force that the runner should apply in the next time step. A logarithmic barrier reward function is designed to evaluate the performance of simulated races as a continuous, smooth function of the runner's position and time. The neural network parameters are then identified by maximizing the expected reward using an on-policy actor-critic policy-gradient RL algorithm. We trained the controller model for three race lengths (400, 1500, and 10000 meters) and found the force and velocity profiles that produce a near-optimal solution to the runner's problem. The results conform to Keller's theoretical findings with a relative percent error of 0.59% and are comparable to real-world records with a relative percent error of 2.38%, while the same error for Keller's findings is 2.82%.<br></p>
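The dynamics described in the abstract can be illustrated with a minimal forward-Euler simulation of Keller's runner model; the state is position, velocity, and available energy, and the propulsive force is the control input. The parameter values below (`tau`, `sigma`, `e0`, `F_MAX`) are illustrative assumptions for this sketch, not necessarily the values used in the paper, and the constant-force policy stands in for the learned controller.

```python
# Minimal sketch of Keller-style runner dynamics (illustrative parameters).
# dv/dt = f - v/tau     (propulsion minus internal resistance)
# de/dt = sigma - f*v   (oxygen replenishment minus work rate)
# dx/dt = v
def simulate(force_fn, T=4.0, dt=0.01, tau=0.892, sigma=9.93, e0=575.0):
    x, v, e = 0.0, 0.0, e0
    for step in range(int(T / dt)):
        t = step * dt
        # No remaining energy -> no propulsive force can be applied.
        f = force_fn(t, x, v, e) if e > 0.0 else 0.0
        v += dt * (f - v / tau)      # velocity update
        e += dt * (sigma - f * v)    # energy-budget update
        e = max(e, 0.0)              # energy cannot go negative
        x += dt * v                  # position update
    return x, v, e

# Example policy: all-out effort at a constant maximum force.
F_MAX = 12.2
x, v, e = simulate(lambda t, x, v, e: F_MAX)
```

Under a constant force `f = F_MAX`, the velocity approaches the asymptote `F_MAX * tau` from below while the energy store is gradually depleted; an RL controller as in the paper would instead modulate `f` over the race to maximize the barrier reward.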
dc.format.pagerange: 159-166
dc.identifier.isbn: 978-3-937436-77-7
dc.identifier.issn: 2522-2414
dc.identifier.jour-issn: 2522-2414
dc.identifier.olddbid: 183674
dc.identifier.oldhandle: 10024/166768
dc.identifier.uri: https://www.utupub.fi/handle/11111/40958
dc.identifier.url: https://www.scs-europe.net/dlib/2022/2022-0159.html
dc.identifier.urn: URN:NBN:fi-fe2022081154615
dc.language.iso: en
dc.okm.affiliatedauthor: Karami, Masoomeh
dc.okm.affiliatedauthor: Haghbayan, Hashem
dc.okm.affiliatedauthor: Plosila, Juha
dc.okm.discipline: 113 Computer and information sciences
dc.okm.discipline: 213 Electronic, automation and communications engineering, electronics
dc.okm.internationalcopublication: not an international co-publication
dc.okm.internationality: International publication
dc.okm.type: A4 Conference Article
dc.publisher.country: United Kingdom
dc.publisher.country-code: GB
dc.relation.conference: European Conference on Modelling and Simulation
dc.relation.ispartofjournal: Proceedings: European Conference for Modelling and Simulation
dc.relation.ispartofseries: Proceedings: European Conference for Modelling and Simulation
dc.relation.volume: 1
dc.relation.volume: 36
dc.source.identifier: https://www.utupub.fi/handle/10024/166768
dc.title: How to run a world record? A Reinforcement Learning approach
dc.title.book: Proceedings of the 36th ECMS International Conference on Modelling and Simulation ECMS 2022, May 30th – June 3rd, 2022, Ålesund, Norway
dc.year.issued: 2022

Files

Name: 0159_simo_ecms2022_0049.pdf
Size: 1.05 MB
Format: Adobe Portable Document Format