Introduction
Dynamic graphs make it possible to study how relationships form and change over time. Although they have many applications, there are no readily available tools or systems that put them into practice. Raphtory is a new distributed system that enables dynamic graph analysis starting from very large real-time datasets. This project will improve the functionality and usability of Raphtory so that domain-specific researchers can readily use it. The project will develop a set of use cases for dynamic graphs, starting with urban analytics for mobility incentives.
Explaining the science
A graph models information as a set of nodes that can be connected by edges. A temporal graph extends this idea by adding a temporal dimension: it records the point in time at which each of these elements was created, changed, or removed from the network.
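As a rough illustration of this idea, the sketch below stores every edge as a list of timestamped events and reconstructs the graph as it stood at a chosen time. The class and method names (TemporalGraph, add_edge, remove_edge, edges_at) are hypothetical and chosen for this example only; they are not Raphtory's API, and a real system would index events far more efficiently.

```python
from dataclasses import dataclass, field


@dataclass
class TemporalGraph:
    """Toy temporal graph: each edge keeps its full history of events."""

    # Maps (src, dst) -> list of (time, alive) events. Hypothetical layout.
    edge_events: dict = field(default_factory=dict)

    def add_edge(self, src, dst, time):
        # Record that the edge exists from `time` onwards.
        self.edge_events.setdefault((src, dst), []).append((time, True))

    def remove_edge(self, src, dst, time):
        # Record that the edge disappears at `time`.
        self.edge_events.setdefault((src, dst), []).append((time, False))

    def edges_at(self, time):
        """Return the set of edges that are alive at `time`."""
        alive = set()
        for edge, events in self.edge_events.items():
            # Keep only events that happened at or before the query time.
            past = [event for event in sorted(events) if event[0] <= time]
            # The most recent past event decides whether the edge exists.
            if past and past[-1][1]:
                alive.add(edge)
        return alive


graph = TemporalGraph()
graph.add_edge("a", "b", time=1)
graph.add_edge("b", "c", time=3)
graph.remove_edge("a", "b", time=5)

for t in (2, 4, 6):
    print(t, sorted(graph.edges_at(t)))
```

Because the history is never overwritten, the same structure can answer questions about any past moment, which is what distinguishes a temporal graph from a single static snapshot.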
This novel form of analysis can extract new insights from existing data by looking at how the relationships in the data change over time. For example, a typical graph analysis question might use the PageRank algorithm to work out which nodes in a network are the most important. A temporal version of PageRank allows us to investigate which nodes are rising in importance over a variety of timescales (e.g. days, weeks).
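One simple way to approximate a temporal PageRank, sketched here under stated assumptions, is to run ordinary PageRank on only the edges observed inside a sliding time window and compare the scores between windows. This is an illustrative simplification rather than the algorithm Raphtory implements; the function names, the damping factor of 0.85, and the sample data are all assumptions made for the example.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a set of (src, dst) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    if not nodes:
        return {}
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1 - damping) / n + damping * dangling / n
                    for node in nodes}
        for node in nodes:
            for dst in out_links[node]:
                new_rank[dst] += damping * rank[node] / len(out_links[node])
        rank = new_rank
    return rank


def windowed_pagerank(timed_edges, end, window):
    """PageRank over edges whose timestamp t satisfies end - window < t <= end."""
    edges = {(src, dst) for src, dst, t in timed_edges
             if end - window < t <= end}
    return pagerank(edges)


# Hypothetical (src, dst, time) interactions.
timed_edges = [("a", "b", 1), ("c", "b", 2),
               ("a", "c", 8), ("b", "c", 9), ("d", "c", 9)]

early = windowed_pagerank(timed_edges, end=5, window=5)
late = windowed_pagerank(timed_edges, end=10, window=5)

# Nodes whose score grew between the two windows are rising in importance.
rising = [node for node in late if late[node] > early.get(node, 0.0)]
print(sorted(rising))
```

Changing the window argument corresponds to changing the timescale of the question: a short window highlights recent shifts, while a long one smooths them out.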
Raphtory is a distributed system that takes any source of data (either previously stored or a real-time stream) and creates a dynamic graph that is partitioned over multiple machines. In addition to maintaining this model, Raphtory lets users define graph analysis functions that are executed across the cluster nodes and have access to the full history of the graph. Raphtory is designed with extensibility in mind: new types of data, as well as new analysis algorithms, can be added to the base project in order to support additional use cases.
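To give a feel for what partitioning a graph over multiple machines can involve, the sketch below routes each incoming edge update to a partition chosen by hashing its source vertex. The text does not say which partitioning strategy Raphtory uses, so hash partitioning is an assumption made purely for illustration, and the names partition_of and route_updates are hypothetical.

```python
import zlib
from collections import defaultdict


def partition_of(vertex, num_partitions):
    """Map a vertex to a partition with a hash that is stable across runs."""
    # zlib.crc32 is used instead of hash() because Python randomises
    # string hashes per process, which would break consistent routing.
    return zlib.crc32(str(vertex).encode("utf-8")) % num_partitions


def route_updates(updates, num_partitions):
    """Group (src, dst, time) updates by the partition that owns src."""
    partitions = defaultdict(list)
    for src, dst, time in updates:
        partitions[partition_of(src, num_partitions)].append((src, dst, time))
    return dict(partitions)


# Hypothetical stream of edge updates.
updates = [("a", "b", 1), ("a", "c", 2), ("b", "c", 3)]
for partition, owned in sorted(route_updates(updates, num_partitions=4).items()):
    print(partition, owned)
```

With this scheme, every update for a given source vertex lands on the same machine, so that machine can hold the vertex's complete history locally.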
Project aims
The project will develop the existing Raphtory software prototype into a system usable by the wider community of data science researchers. It will collaborate with existing projects from multiple domains across the Turing to define novel applications of temporal graph analysis that can extract new insight about the dynamics of these environments.
The development of Raphtory will be guided by use cases. Raphtory's functionality will be improved to better support the types of computation required by these scenarios. The documentation and usability of the tool will also be improved so that it is readily accessible to researchers from different domains.
By the end of the project, the aim is to have created a community within the Turing of dynamic graph practitioners, as well as users of and contributors to Raphtory.
Applications
The urban analytics domain is the first target use case for Raphtory. In collaboration with the 'New data forms for transport policies' project, users' commuting trips will be modelled as temporal links in a dynamic graph. This will make it possible to study changes in behaviour and abnormal 'routes' chosen by users, which might lead to a better understanding of the effectiveness of public transport policies.
Additional applications will be explored during the project.