Human behaviour in urban environments is mediated by a combination of habitual and deliberative decision-making, influenced by prior experiences and current perceptions. The outcomes of these decisions are the dynamics of flow and activity we observe in cities every day, and understanding and modelling these behaviours is central to predicting how cities will evolve. This project aims to advance the integration of reinforcement learning and agent-based simulation to improve predictions of future cities.

Explaining the science

Reinforcement learning (RL) is an algorithmic approach to computational learning. By building a relationship between external observations and rewards, the algorithm constructs a 'policy', a mapping from observations to actions, for optimising the completion of a given task. Via an iterative learning process, an RL 'agent' builds an understanding of how to complete a task, allowing it to apply the same learning in future unseen environments.
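As a rough illustration of this iterative process, the sketch below uses tabular Q-learning, one common RL algorithm and not necessarily the one used in this project. The corridor environment, the reward of 1 at the goal, and the names `train_q_learning` and `greedy_policy` are all assumptions made for the example.

```python
import random


def train_q_learning(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical corridor of n_states cells.

    The agent starts in cell 0 and is rewarded only on reaching the last cell.
    """
    rng = random.Random(seed)
    moves = (-1, 1)  # action 0 steps left, action 1 steps right
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    for _ in range(episodes):
        state = 0
        while state != goal:
            left, right = q[state]
            # Explore with probability epsilon, or when the estimates are tied.
            if rng.random() < epsilon or left == right:
                action = rng.randrange(2)
            else:
                action = 0 if left > right else 1
            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0
            # Move the estimate towards reward plus discounted future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Read the learned policy off the table for every non-goal state."""
    return ["right" if right > left else "left" for left, right in q[:-1]]
```

Calling `greedy_policy(train_q_learning())` gives the action the agent prefers in each cell after training. The table `q` is the learned relationship between observations (here, the cell index) and expected reward.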

The specification of the RL algorithm has been linked to different forms of human decision-making. Planning with a model of the environment, known as 'model-based' RL, may be representative of deliberative planning, whereas learning without a pre-defined model, 'model-free' RL, may relate more to simple or habitual actions, executed step by step. Aside from this conceptual similarity, these approaches have theoretical links to neuroscience: model-based RL is analogous to place cell activity during spatial navigation, and model-free RL to dopamine firing responses.
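To make the distinction concrete, the sketch below shows a minimal model-based planner. It assumes the agent is handed a hypothetical transition and reward model, `corridor_model`, for a small corridor, and plans with value iteration, which is one standard planning method chosen here for illustration. A model-free learner would instead have to discover the same values by trial and error, without ever calling the model.

```python
def corridor_model(state, move, goal):
    """Hypothetical known model: returns (next_state, reward) for a move."""
    next_state = min(max(state + move, 0), goal)
    reward = 1.0 if next_state == goal else 0.0
    return next_state, reward


def value_iteration(n_states=6, gamma=0.9, tol=1e-9):
    """Plan by repeatedly backing up values through the known model."""
    goal = n_states - 1
    moves = (-1, 1)  # step left or step right
    values = [0.0] * n_states  # the goal is terminal, so its value stays 0

    def backup(state, move):
        next_state, reward = corridor_model(state, move, goal)
        return reward + gamma * values[next_state]

    while True:
        delta = 0.0
        for state in range(goal):
            best = max(backup(state, move) for move in moves)
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            break

    # The plan picks, in each non-goal state, the move with the best backup.
    plan = [max(moves, key=lambda move: backup(state, move))
            for state in range(goal)]
    return values, plan
```

No interaction with the environment is needed here: the plan is computed entirely from the model, which is the sense in which this style of RL resembles deliberation.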

Another important development has been the emergence of deep learning, and with it deep RL. Deep RL allows large vectors of input data to be integrated in predicting rewards, enabling the production of complex and strategic behaviour. As has been demonstrated elsewhere, deep RL models are able to exceed the performance of human players of chess, Go, and other complex games, but the potential of these approaches extends well beyond games.

The frontier for this project is the building of RL and deep RL models of human behaviour within an urban setting. These models, integrated within a multi-agent context, allow for replication and future prediction of complex social and spatial systems, and can therefore, in theory, be used to better reflect the evolution of urban systems undergoing change.

Project aims

The aim of the project is to produce a set of novel reinforcement learning (RL)-based approaches for modelling human learning and decision-making in urban environments, and to apply these in predicting urban phenomena. Both model-based and model-free RL will be applied, reproducing different aspects of human behaviour under deliberative and habitual decision-making scenarios.

The RL models will integrate both static and dynamic spatial data in recreating the use of urban features during movement. Furthermore, the research will explore how RL can be applied in reproducing the variability of human experience and performance.

The implementation of these behavioural models within an 'agent-based' simulation environment allows for the prediction of emergent urban phenomena. The agent-based approach allows for the simulation of multiple autonomous 'agents', each acting on their perceptions and prior knowledge.

One application case study relates to the prediction of crowd dynamics resulting from an emergency or disruption, where individuals are required to switch from habitual to deliberative decision-making. By modelling each agent adjusting their decision-making, it is possible to predict the emergence of new patterns of behaviour.
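The switch from habitual to deliberative behaviour can be sketched in a toy agent-based simulation. Everything in the example below is an assumption made for illustration: the five-node building layout in `GRAPH`, the `Agent` class, and the use of breadth-first search as a stand-in for deliberative planning. In the project itself, these behaviours would be driven by RL models and not by hand-coded rules.

```python
from collections import deque

# Hypothetical building layout: two routes from the corridor to the exit.
GRAPH = {
    "entrance": ["corridor"],
    "corridor": ["entrance", "hall", "stairs"],
    "hall": ["corridor", "exit"],
    "stairs": ["corridor", "exit"],
    "exit": ["hall", "stairs"],
}


def plan_route(start, goal, blocked):
    """Breadth-first search for a shortest route avoiding blocked nodes."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in GRAPH[path[-1]]:
            if nxt not in seen and nxt not in blocked:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no route is available


class Agent:
    def __init__(self, name, habitual_route):
        self.name = name
        self.route = list(habitual_route)  # remaining nodes, current one first
        self.mode = "habitual"

    @property
    def position(self):
        return self.route[0]

    def step(self, blocked):
        if len(self.route) == 1:
            return  # already at the destination
        if self.route[1] in blocked:
            # The agent perceives the disruption and deliberates on a new route.
            new_route = plan_route(self.position, self.route[-1], blocked)
            if new_route is None:
                return  # wait in place until a route opens
            self.route = new_route
            self.mode = "deliberative"
        self.route.pop(0)  # move one node along the route


def simulate(agents, disruption_step, blocked, n_steps):
    """Advance all agents; the blocked set applies from disruption_step on."""
    history = []
    for t in range(n_steps):
        active = blocked if t >= disruption_step else set()
        for agent in agents:
            agent.step(active)
        history.append({agent.name: agent.position for agent in agents})
    return history
```

For example, with one agent whose habitual route runs through the hall and another whose route uses the stairs, blocking `"hall"` from step 1 causes the first agent to re-plan via the stairs, so both agents occupy the stairs at the same step. This is a small instance of a new crowding pattern emerging from individual decisions.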


There are a variety of urban systems and processes that could be better understood and predicted through the integration of RL and agent-based modelling (ABM). A clear application area relates to the prediction of pedestrian flow under normal and abnormal or disrupted conditions. In many conventional simulations, navigating agents are assumed to be homogeneous, wholly rational individuals, with limited attention to varying experiences and preferences. The integration of RL and ABM allows heterogeneous learning, knowledge and decision-making to be represented within the simulation space, better reflecting and predicting real-world populations. Understanding pedestrian movements is important in planning new infrastructure design, particularly in public spaces such as airports, transport hubs, and hospitals, and in planning for resilient infrastructure.

More broadly, the development of RL-driven agents has a variety of potential applications in the urban setting. Agent learning models could be developed for predicting crime patterns, public transportation flows, road traffic congestion, economic activity, and health behaviours. Some of these wider applications will be explored later in the project.


Contact info

[email protected]