Turing GeoVisualization Engine

Developing a web-based visual analytics tool for geo-spatial data science

Project status

Ongoing

Introduction

Tools for flexibly loading, representing and querying spatio-temporal data are in increasing demand in urban planning. The 'Turing GeoVisualization Engine' is a web-based visualization tool that allows users to connect to geospatial datasets of varying structure and complexity and to explore patterns flexibly by performing spatial, temporal or attribute-level queries through interaction. By making exploratory geospatial data analysis accessible to non-technical users, the Turing GeoVisualization Engine supports planning experts in making evidence-based decisions and, when used by the public, contributes to a more informed citizenry.

Explaining the science

This is a software-development project: the primary output is the tool itself. The visual views and interaction mechanisms designed into the tool will be underpinned by empirically informed guidelines on visual perception from cognitive science and the information visualization domain. Additionally, methods from geographic information science (GIScience) and related domains will be used when implementing automatic aggregation of temporal and spatial data.

For example, clutter and occlusion will be inevitable when representing data observations on a map, and density-based clustering algorithms may be used to enable zoom-dependent aggregation. Additionally, when representing data items with estimates of spatial and temporal uncertainty, insights from recent perception-based research in cartography and GIScience will be used, for example new visual primitives such as blur and sketchiness to represent this uncertainty.
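As an illustration of zoom-dependent aggregation, the sketch below runs a small DBSCAN-style density-based clustering over projected point coordinates and shrinks the neighbourhood radius as the user zooms in. It is a minimal sketch and not the project's implementation: the function names, the `base_eps` value and the rule that the radius halves with each zoom level are all assumptions made for the example.

```python
from math import hypot


def eps_for_zoom(zoom, base_eps=64.0):
    """Neighbourhood radius in map units; assumed to halve with each zoom level."""
    return base_eps / (2 ** zoom)


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # Brute-force radius search; includes the point itself.
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if hypot(xi - xj, yi - yj) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point: keep expanding
        cluster += 1
    return labels


def aggregate(points, zoom, min_pts=3):
    """Return one (x, y, count) marker per cluster, plus one per unclustered point."""
    labels = dbscan(points, eps_for_zoom(zoom), min_pts)
    groups, markers = {}, []
    for (x, y), label in zip(points, labels):
        if label == -1:
            markers.append((x, y, 1))
        else:
            groups.setdefault(label, []).append((x, y))
    for members in groups.values():
        n = len(members)
        cx = sum(x for x, _ in members) / n
        cy = sum(y for _, y in members) / n
        markers.append((cx, cy, n))
    return markers


points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11), (50, 50)]
for zoom in (0, 2, 5):
    print(zoom, aggregate(points, zoom))
```

At low zoom the observations collapse into a few markers that carry a count, and at high zoom they separate into local groups and individual points. The brute-force neighbour search is quadratic in the number of points, so a spatial index would likely be needed for large datasets.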

Project aims

A consequence of the data revolution is a surfeit of geospatial datasets of varying resolution and complexity. These data offer much potential, but require time and technical expertise to process and analyse. Through this project, a web-based visual data analysis tool will be developed that enables geospatial datasets to be flexibly loaded, represented and queried.

On connecting to a dataset, the tool will reason over the data, organising fields according to their data properties. The tool will then suggest visual summaries, or views, based on information visualization guidelines. Users may also generate views of their own and then compose and link multiple views on a single screen, for example a map view, a temporal view and a thematic view. From here, users will be able to filter by specified temporal and spatial extents and by thematic categories via interaction. In this way, users will be able to query geospatial datasets rapidly and flexibly.
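The idea of organising fields by their data properties and then suggesting views can be sketched as follows. This is an illustrative heuristic only: the role names, the `SPATIAL_NAMES` set, the role-to-view mapping and the sample records are assumptions for the example, not the rules the tool uses.

```python
from datetime import datetime

# Hypothetical column names treated as coordinates.
SPATIAL_NAMES = {"lat", "latitude", "lon", "lng", "longitude", "x", "y"}

# Hypothetical mapping from a field's role to a default view.
VIEW_FOR_ROLE = {
    "spatial": "map",
    "temporal": "timeline",
    "categorical": "bar chart",
    "quantitative": "histogram",
}


def _all_parse(values, parser):
    """True if every value can be converted by the given parser."""
    try:
        for value in values:
            parser(value)
    except (TypeError, ValueError):
        return False
    return True


def classify_field(name, values):
    """Assign a role to one field from its name and its raw string values."""
    if _all_parse(values, float):
        return "spatial" if name.lower() in SPATIAL_NAMES else "quantitative"
    if _all_parse(values, datetime.fromisoformat):
        return "temporal"
    return "categorical"


def classify_fields(records):
    """Classify every field of a list of dict records (all with the same keys)."""
    return {name: classify_field(name, [r[name] for r in records])
            for name in records[0]}


def suggest_views(roles):
    """Suggest one view per role present, in field order, without duplicates."""
    views = []
    for role in roles.values():
        view = VIEW_FOR_ROLE[role]
        if view not in views:
            views.append(view)
    return views


records = [
    {"lat": "51.53", "lon": "-0.12", "time": "2017-03-01T08:00:00",
     "mode": "rail", "count": "120"},
    {"lat": "51.50", "lon": "-0.09", "time": "2017-03-01T09:00:00",
     "mode": "metro", "count": "340"},
]
roles = classify_fields(records)
print(roles)
print(suggest_views(roles))
```

A real classifier would need to handle missing values, mixed types and geometry columns; the point here is only that a small set of field roles is enough to drive a first set of suggested views.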

Engineering this level of flexibility presents many challenges. For example, because datasets will vary in size and density, a fixed representation will lead to either visual sparsity or clutter, so techniques for automatically aggregating spatial and temporal data will be required. Data may also be recorded with temporal and spatial uncertainty, and this uncertainty will need to be reflected within the visualization. These sorts of challenges are not unique to this project, and a secondary aim is to propose and implement solutions that are likely to generalise.
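One simple way to picture automatic temporal aggregation is to choose the finest time granularity that keeps the number of bins manageable, then count observations per bin. The sketch below assumes a fixed list of candidate granularities and a hypothetical `max_bins` threshold; both are illustrative choices rather than part of the project's design.

```python
from collections import Counter
from datetime import datetime, timedelta

# Candidate bin widths in seconds, finest first (assumed list).
GRANULARITIES = [("minute", 60), ("hour", 3600), ("day", 86400), ("week", 604800)]


def choose_granularity(timestamps, max_bins=100):
    """Pick the finest granularity whose bin count over the data span fits max_bins."""
    span = (max(timestamps) - min(timestamps)).total_seconds()
    for name, seconds in GRANULARITIES:
        if span / seconds <= max_bins:
            return name, seconds
    return GRANULARITIES[-1]  # fall back to the coarsest candidate


def bin_counts(timestamps, max_bins=100):
    """Return the chosen granularity and a {bin index: count} mapping."""
    name, seconds = choose_granularity(timestamps, max_bins)
    start = min(timestamps)
    counts = Counter(int((t - start).total_seconds() // seconds)
                     for t in timestamps)
    return name, dict(sorted(counts.items()))


# Two days of synthetic observations, one every ten minutes.
start = datetime(2017, 3, 1)
timestamps = [start + timedelta(minutes=10 * i) for i in range(288)]

name, counts = bin_counts(timestamps)
print(name, len(counts))
```

A dense dataset covering a short period would be binned by minute or hour, while a sparse one covering months would fall through to days or weeks, which addresses both the clutter and the sparsity cases with the same rule.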

The primary output of the project - the tool - will be used by Turing partners, business and government (planning domains) and the public. A secondary output - generalisable techniques for automatic aggregation and uncertainty representation - will bring methodological contributions to the GIScience and information visualization domains.

Applications

Since the Turing GeoVisualization Engine will allow datasets of varying structure and format to be loaded and queried, the tool has the potential to impact a range of domains. 

For example, those working in transport planning might connect to large transport datasets, such as automatic traffic counts or rail and metro usage data. The tool might be used to explore spatio-temporal variation in demand at a city-wide scale; alternatively, users may focus on a particular event, for example a failure in the transport network, and generate sets of visual summaries to characterise the effects of, and behavioural responses to, this failure elsewhere on the network.

In the environmental monitoring domain, sensor data recording air pollution might be connected and spatio-temporal variation in pollutants studied over a given period of time. By representing temporal pollution signatures at their geospatial positions, anomalies might be detected. In this use case, the tool serves as a quick and low-cost means of data cleaning and validation.

Funders