Privacy-aware neural network classification and training

Project goal

Inventing new methods for encrypted neural network training and classification.

People

Interns

Ali Shahin Shamsabadi and Nitin Agrawal

Supervisors

Matt Kusner (Turing Research Fellow, University of Warwick)
Adria Gascon (Turing Research Fellow, University of Warwick)
Varun Kanade (Turing Fellow, University of Oxford)

Project detail

Neural networks crucially rely on significant amounts of data to achieve state-of-the-art accuracy. This makes paradigms such as cloud computing and learning on distributed datasets appealing. In the former setting, computation and storage are efficiently outsourced to a trusted computing party, e.g. Azure, while in the latter, the computation of accurate models is enabled by aggregating data from several sources.

However, for regulatory and/or ethical reasons, data cannot always be shared. For instance, many hospitals hold overlapping patient statistics which, if aggregated, could produce highly accurate classifiers; doing so, however, may compromise highly personal data. This kind of privacy concern prevents useful analysis of sensitive data. To tackle this issue, privacy-preserving data analysis has emerged as an area involving several disciplines, such as statistics, computer science, cryptography, and systems security.

Although privacy in data analysis is not a solved problem, many theoretical and engineering breakthroughs have turned privacy-enhancing technologies such as homomorphic encryption, multi-party computation, and differential privacy into approaches of practical interest.

However, such generic techniques do not scale to the input sizes required for training accurate deep learning models, and custom approaches that carefully combine them are needed to overcome these scalability issues. Recent work on sparsifying neural networks and discretising the weights used during training offers promising avenues for applying modern encryption techniques. However, issues such as highly non-linear activation functions, and the need for current methods to keep track of some high-precision parameters, may inhibit direct application.
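To make the second issue concrete, the toy Python/NumPy sketch below (illustrative only, not the project's method) trains a single linear layer with binarised weights using a straight-through estimator and a smooth surrogate activation. The variable names and the squared-error loss are assumptions for the example. Note the high-precision latent weights (w_real) that must be kept alongside the binary weights: exactly the kind of parameter that complicates a direct translation to encrypted computation.

# Toy sketch: binarised-weight training with a straight-through estimator.
# Assumes a squared-error loss and a tanh surrogate for a hard sign activation.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((256, 10))           # toy inputs
y = np.sign(X @ rng.standard_normal(10))     # toy +/-1 labels

w_real = 0.1 * rng.standard_normal(10)       # high-precision latent weights
lr = 0.01

for step in range(200):
    w_bin = np.sign(w_real)                  # discretised weights used in the forward pass
    pred = np.tanh(X @ w_bin)                # smooth surrogate for a hard sign activation
    grad_pred = 2.0 * (pred - y) / len(y)    # d(loss)/d(pred) for squared error
    grad_w = X.T @ (grad_pred * (1.0 - pred ** 2))
    w_real -= lr * grad_w                    # straight-through: gradient w.r.t. w_bin updates w_real

print("final binary weights:", np.sign(w_real))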

The project will focus on both these aspects:

  • Designing training procedures that use only low-precision weights and simple activation functions.
  • Adapting cryptographic primitives, such as those used in homomorphic encryption and multi-party computation, to enable private training with these modified training procedures (a toy secret-sharing example is sketched after this list).
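As a minimal illustration of the cryptographic side (a sketch under simplifying assumptions, not the project's protocol), the following Python/NumPy snippet additively secret-shares a client's input between two hypothetical non-colluding servers and evaluates one linear layer directly on the shares. The modulus, the integer weights, and the two-server setting are assumptions made for the example. Linear operations act on shares without interaction; a non-linear activation would require extra protocol machinery (e.g. garbled circuits or a polynomial approximation), which is why simple activation functions matter for this project.

# Toy sketch: additive secret sharing and a private linear layer over two servers.
import numpy as np

P = 2**31 - 1                                 # public prime modulus (assumed for the example)
rng = np.random.default_rng(1)

x = np.array([3, 7, 2, 9], dtype=np.int64)    # client's private integer input
w = np.array([1, 4, 2, 5], dtype=np.int64)    # low-precision integer weights, known to both servers

# Client splits x into two random-looking additive shares modulo P.
x_share0 = rng.integers(0, P, size=x.shape, dtype=np.int64)
x_share1 = (x - x_share0) % P

# Each server computes its share of the dot product locally,
# learning nothing about x on its own.
y_share0 = int(w @ x_share0) % P
y_share1 = int(w @ x_share1) % P

# Recombining the two result shares recovers the true dot product.
assert (y_share0 + y_share1) % P == int(w @ x) % P
print("private dot product:", (y_share0 + y_share1) % P)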

The ultimate goal of the project is to integrate both of these aspects into an implementation of a provably privacy-preserving system for neural network classification and training.