Reinforcement Learning based positive phototaxis of mobile robot with minimal sensing

Wuoti, Onni

dc.contributor.author	Wuoti, Onni
dc.contributor.department	Department of Mechanical and Materials Engineering
dc.contributor.faculty	Faculty of Technology
dc.contributor.studysubject	Mechanical Engineering
dc.date.accessioned	2026-06-03T19:31:46Z
dc.date.issued	2026-05-26
dc.description.abstract	In this thesis, a Q-learning-based reinforcement learning algorithm was created to produce gradient-following behavior, in the form of phototaxis, on a mobile robot. The framework was designed to work under minimal sensing, using a single sensor to track the reward value. The algorithm used a four-state space to learn preferences over active and passive phases of movement. The proposed methodology contrasts with previous approaches by using a single sensor instead of several and by relying on learned control instead of direct motor control. The algorithm was tested both in simulation and in an experimental setup to determine its robustness and its ability to generalize. The results were evaluated based on the robot’s movement towards the highest reward point and a one-meter goal area around the peak. After arrival, the robot spent the majority of its time near the peak, and it learned to move along its current heading when oriented towards the highest reward area. The results demonstrated consistent gradient-following behavior for phototaxis with only a single sensor in both simulated and experimental environments. Future work should investigate filtering the sensor readings, trying different parameter combinations, and using multiple agents or reward sources.
dc.description.abstract	This master's thesis tested a Q-learning-based reinforcement learning algorithm for gradient following, in the form of phototaxis, on a mobile robot. The framework operates with minimal sensing, using a single sensor to track the reward value. The algorithm used a four-state space to learn preferences for when to choose an active or a passive phase of movement. The proposed method differs from previous approaches by using one sensor instead of several and by using learned control instead of direct motor control. To determine the algorithm's robustness to disturbances and to test its generalizability to other environments, it was tested both in simulation and in an experimental environment. The results were evaluated based on the robot's movement towards the highest reward point and a one-meter goal area around the peak. After arriving, the robot spent most of its time near the peak and learned to move along its current heading when it was facing the highest reward area. The results showed that phototactic gradient following is possible with a single sensor in both simulated and experimental environments. In the future, the algorithm should be studied by filtering the sensor readings, testing different parameter combinations, and experimenting with multiple agents or reward sources.
dc.format.extent	77
dc.identifier.uri	https://www.utupub.fi/handle/11111/61582
dc.identifier.urn	URN:NBN:fi-fe2026060362863
dc.language.iso	eng
dc.rights	This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
dc.rights.accessrights	open
dc.subject	phototaxis
dc.subject	reinforcement learning
dc.subject	mobile robot
dc.subject	Q-learning
dc.subject	Markov switching
dc.subject	gradient following
dc.title	Reinforcement Learning based positive phototaxis of mobile robot with minimal sensing
dc.type.ontasot	Master's thesis

Files

Name: Masters_thesis_Onni_Wuoti.pdf
Size: 3.35 MB
Format: Adobe Portable Document Format

Collections

Master's theses, diploma theses, and advanced-studies theses (full texts)