Stefan is a PhD candidate at the Centre for Doctoral Training in Interactive Artificial Intelligence at the University of Bristol. His research investigates data-driven modelling of dynamical systems for control and model-based reinforcement learning, with a focus on uncertainty quantification. A learnt probabilistic dynamics model can simulate trajectories, which can be used to guide exploration for sample-efficient learning, to satisfy constraints for safety guarantees, and to explain agent behaviour. Stefan is exploring each of these themes in his research. He graduated in chemical engineering from the University of Birmingham, has industrial experience in the energy and chemical industries, and recently completed an internship as a Data Scientist.
Stefan’s research is in the field of reinforcement learning, a machine learning paradigm for autonomous sequential decision-making. By interacting with an unknown environment, an agent learns how to act to solve a task from a reward signal it receives as feedback. Reinforcement learning is data-inefficient, frequently requiring millions of samples to learn a policy. Model-based reinforcement learning has been shown to improve efficiency by using the collected data to learn a model of the environment dynamics, which then serves as a simulator for planning. Stefan’s research investigates uncertainty quantification when planning with a neural network or Gaussian process dynamics model. Uncertainty-guided exploration directs the agent towards informative areas of the unknown environment, which reduces modelling errors and improves performance. Furthermore, the ability to simulate trajectories with a model before executing a policy in the real environment has been used to keep an agent safe by checking that constraints are satisfied, and to explain its behaviour for enhanced human-AI interaction.