Multi-objective reinforcement learning framework for unknown stochastic & uncertain environments
Authors
JM Pinder
Contributors
S Nefti-Meziani (Supervisor) S.Nefti-Meziani@salford.ac.uk
Dr Theodoros Theodoridis (Supervisor) T.Theodoridis@salford.ac.uk
Abstract
This dissertation addresses uncertainty handling during learning by agents operating in stochastic environments, using Multi-Objective Reinforcement Learning (MORL). Most previous investigations into multi-objective reinforcement learning have proposed algorithms that address learning performance but have neglected the uncertainty present in stochastic environments. The principal motivation of this research is the realisation that many risky and uncertain real-world decision-making problems involve multiple long-term objectives.
This dissertation proposes a novel modification to the single-objective GPFRL algorithm (Hinojosa et al., 2008), in which a linear scalarisation methodology provides a way to automatically find an optimal policy for multiple objectives under different kinds of uncertainty. The proposed Generalised Probabilistic Fuzzy Multi-Objective Reinforcement Learning (GPFMORL) algorithm is further enhanced by the introduction of prospect theory to guarantee convergence by means of risk evaluation. The simulated grid world was increased in complexity by specifying a further two complementary and conflicting objectives and by introducing uncertainty in the form of stochastic crosswinds.
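As a rough illustration of linear scalarisation in general (not the GPFMORL implementation from the thesis), the sketch below collapses a vector of per-objective action values into a single score with a weighted sum and then picks the best-scoring action. The function names, the weights, and the example values are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def scalarise(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of per-objective values (linear scalarisation)."""
    if len(values) != len(weights):
        raise ValueError("one weight is required per objective")
    return sum(w * v for w, v in zip(weights, values))


def greedy_action(q_vectors: Sequence[Sequence[float]],
                  weights: Sequence[float]) -> int:
    """Index of the action whose scalarised value is largest."""
    scores = [scalarise(q, weights) for q in q_vectors]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical values for 3 actions and 2 objectives (assumed, not from the thesis).
q = [[1.0, 0.0], [0.5, 0.8], [0.0, 1.0]]

balanced = greedy_action(q, [0.5, 0.5])         # equal weight on both objectives
first_only = greedy_action(q, [1.0, 0.0])       # only the first objective matters
```

Changing the weight vector changes which action is preferred, which is how a single scalar learner can be steered between conflicting objectives under this scheme.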
Results obtained from the GPFMORL grid world simulations were compared against two more classical multi-objective algorithms, MOQ and MOSARSA, and showed both stronger and much faster convergence. Experiments performed on an actual quadcopter drone demonstrated that the proposed algorithm and the developed framework are both feasible and promising for the control of artificially intelligent (AI) unmanned aerial vehicles (UAVs) in a variety of real-world multi-objective applications, such as autonomous landing and delivery or search and rescue.
Furthermore, the observed results showed that the GPFMORL method can find its major real-world application in the uncalibrated control of non-linear, multiple-input, multiple-output systems, especially in multi-objective situations with high uncertainty. Two novel case study research prototypes are proposed. The first, the “Automated Solar Powered Environmental Controller” (ASPEC), targets Controlled Environment Agriculture by optimising hydroponic crop growth. The second, the “Robotic Dementia Medication Administration System” (RDMAS), attempts to optimise liquid medication dispensing by intelligently scheduling it for times of day when the patient is more likely to remember to take their medication, based on previously learned knowledge and experience.
Citation
Pinder, J. Multi-objective reinforcement learning framework for unknown stochastic & uncertain environments. (Thesis). University of Salford
| Field | Value |
|---|---|
| Thesis Type | Thesis |
| Deposit Date | Dec 8, 2016 |
| Publicly Available Date | Dec 8, 2016 |
| Additional Information | Funders: Engineering and Physical Sciences Research Council (EPSRC) |
| Award Date | Aug 14, 2016 |
Files
John Pinder PhD Thesis Complete.pdf
(5.9 Mb)
PDF