Introduction

The detection and recognition of human actions from real-time CCTV video data streams is a popular challenge, with the potential to aid in video surveillance and anomaly detection of, for example, potentially hazardous scenarios in factories. This project aims to efficiently and effectively address this challenge by developing a generalised framework for interpreting human actions, combining cutting-edge deep learning technologies with ‘path signatures’. The expected outcome is a high-performance, real-time human action recognition and detection system.

Explaining the science

Video-based human action classification is one of the most challenging tasks in computer vision. In this project, the ‘path signature’ technique is used to represent streamed body pose data as a time-evolving tree structure, as seen in the diagram below.

See the related project ‘Capturing complex data streams’ for more information about path signatures.

The path signature provides an effective description of video data over a particular time interval: it captures the trajectories of actions and does not vary with the speed at which a video plays. This invariance leads to a significant dimension reduction in representations of video data, so that when the path signature is incorporated into deep learning algorithms, actions can be classified more quickly and accurately.
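The speed invariance described above can be illustrated with a small sketch. It is not the project's implementation: it assumes a single joint's trajectory is given as a piecewise-linear path of 2D points, truncates the signature at level 2, and uses only NumPy. The function names `signature_level2` and `slow_down`, and the sample coordinates, are hypothetical and chosen purely for illustration.

```python
import numpy as np


def signature_level2(path):
    """Return the level-1 and level-2 signature terms of a piecewise-linear path.

    path: array of shape (n_points, dim), e.g. one joint's positions over time.
    """
    path = np.asarray(path, dtype=float)
    dim = path.shape[1]
    s1 = np.zeros(dim)          # level 1: total displacement
    s2 = np.zeros((dim, dim))   # level 2: iterated integrals
    for delta in np.diff(path, axis=0):
        # Chen's identity for appending one straight segment with increment
        # `delta`; s2 must be updated before s1 so it uses the old s1.
        s2 += np.outer(s1, delta) + np.outer(delta, delta) / 2.0
        s1 += delta
    return s1, s2


def slow_down(path, repeats):
    """Hypothetical helper: trace the same path with `repeats` sub-steps per segment."""
    path = np.asarray(path, dtype=float)
    pieces = [path[:1]]
    fractions = np.linspace(0.0, 1.0, repeats + 1)[1:, None]
    for start, end in zip(path[:-1], path[1:]):
        pieces.append(start + fractions * (end - start))
    return np.vstack(pieces)


# Illustrative (made-up) 2D positions of one joint over four frames.
joint = np.array([[0.0, 0.0], [1.0, 0.5], [1.5, 2.0], [0.5, 2.5]])
slow = slow_down(joint, repeats=4)  # same trajectory, four times as many frames

s1_fast, s2_fast = signature_level2(joint)
s1_slow, s2_slow = signature_level2(slow)

# Both comparisons should agree up to floating-point error.
print(np.allclose(s1_fast, s1_slow), np.allclose(s2_fast, s2_slow))
```

Because the two sequences trace the same trajectory at different frame rates, their truncated signatures coincide, which is the sense in which the representation does not depend on playback speed. A practical system would more likely rely on a dedicated signature library and higher truncation levels.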

Initial work has already contributed to state-of-the-art action recognition accuracy by combining the path signature with shallow neural networks.

Diagram showing how human joint positions are processed

Project aims

The ultimate goal of this project is to develop a generalised framework that combines ‘path signatures’ with deep learning to interpret complex, multi-dimensional streamed data of human actions.

The underpinning challenges of the project are:

  • Efficient and effective representation of streamed body pose data which can be used generically for understanding and analysing the actions of the body.
  • Efficient and robust ways to combine path signatures with cutting-edge deep learning models to produce state-of-the-art results in action classification.

Incorporating path signatures into deep learning models is expected to extract rich prior knowledge and further boost system performance. The overall proposed framework will be flexible enough to extend to other applications related to motion analysis.

Applications

In a wide range of public and workplace settings, such as subway stations, street crossings, supermarkets and factories, anomaly detection based on human action recognition is crucial for reducing risks to both personal and property safety. Automatic human action detection and recognition systems can aid in real-time CCTV video surveillance and reduce reliance on costly, labour-intensive manual analysis.

In addition, human-machine interaction (HMI) could benefit greatly from human action recognition. With the rapid development and growing popularity of motion sensors, smart devices can capture a wealth of multi-modal streamed data such as colour values, infrared depth and motion acceleration. Interpreting this data using the techniques from this project could greatly improve user experiences in HMI.

Furthermore, human action recognition has the potential to assist in behaviour analysis and athletic rehabilitation. During rehabilitation training for injured people, automatic human action analysis can provide auxiliary guidance, evaluate rehabilitation progress, and help prevent secondary injury.

Organisers

Researchers and collaborators

Contact info

For more information, please contact The Alan Turing Institute

[email protected]

Funders