Protocols for private data analysis are known to require large amounts of data in order to provide reasonable utility. This can be alleviated in various ways by asking data providers to extend more trust. This project is exploring novel trust models and how much extra utility can be gained within them. This should make more accurate private analysis available, especially to smaller entities that lack very large datasets.
Explaining the science
The primary formalisation of privacy currently considered is 'differential privacy'. This is a formal guarantee that, by contributing your data to an analysis, you do not cause the result of that analysis to leak much information about you.
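To make this concrete, here is a minimal sketch (the function names and parameters are illustrative, not taken from the project) of randomised response, one of the simplest mechanisms satisfying differential privacy: each user reports their true bit only with a calibrated probability, so any single report reveals little about that user, yet an analyst can still estimate the population-wide rate.

```python
import math
import random


def randomized_response(value: bool, epsilon: float, rng: random.Random) -> bool:
    """Report the true bit with probability e^eps / (e^eps + 1), else its flip.

    The ratio between the probabilities of any output under value=True versus
    value=False is at most e^eps, which is the epsilon-differential-privacy
    guarantee for a single bit.
    """
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return value if rng.random() < p_truth else not value


def estimate_rate(reports, epsilon: float) -> float:
    """Debias the noisy reports to estimate the true fraction of True bits."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    # E[observed] = p * rate + (1 - p) * (1 - rate); solve for rate.
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


rng = random.Random(0)
true_rate = 0.3
data = [rng.random() < true_rate for _ in range(20000)]
reports = [randomized_response(bit, epsilon=1.0, rng=rng) for bit in data]
estimate = estimate_rate(reports, epsilon=1.0)
```

Note how the estimate is only accurate when many users contribute: the noise added for privacy averages out slowly, which is exactly the large-data requirement mentioned above.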
There are known limits on how accurate an estimate with this property can be if the data providers are unwilling to trust a third party to handle their data, or to rely on cryptography to protect it. However, with a trusted third party or a large amount of cryptographic computation, this accuracy can be substantially improved in many cases. This project is interested in intermediate possibilities.
For example, it is known (partly due to early work from this project) that high levels of accuracy can be achieved if you can trust that most other users will follow the protocol properly and that all the data being submitted is shuffled before the analyser sees it. To picture this shuffling, imagine each contributor (with identical handwriting) writing a message on a piece of paper and dropping it into a large bin. The bin is then thoroughly shaken before being opened.
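A tiny simulation of the shuffling step (a sketch with assumed names, not the project's actual protocol) makes the point: because the analyser receives the messages in a uniformly random order, it learns only the multiset of messages, never which contributor sent which one.

```python
import random
from collections import Counter


def shuffle_messages(messages, rng: random.Random):
    """The 'bin of paper slips': return the messages in a uniformly random
    order, discarding any link between a message and its sender."""
    shuffled = list(messages)
    rng.shuffle(shuffled)
    return shuffled


# Each contributor submits one (already locally randomised) message.
submissions = ["yes", "no", "yes", "yes", "no"]
received = shuffle_messages(submissions, random.Random(42))

# The analyser can still compute aggregate statistics from what it receives...
tally = Counter(received)
# ...but the ordering no longer identifies contributors: every permutation of
# the same multiset is an equally likely observation.
```

In the real protocols, the local randomisation applied before submission combines with this anonymisation to give a much better privacy–accuracy trade-off than local randomisation alone.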
This project hopes to create cryptographic protocols that simulate this process with much less cryptographic work than the general solutions mentioned above. It is also developing more protocols that could be run in this trust model.
This project should result in multiple papers contributing to the study of different trust models and their implementation. This can mean the development of new trust models, the development and analysis of new protocols within these models, and/or improved analyses of existing protocols. It can also mean the realisation of new trust models under established assumptions, e.g. by using cryptography to enforce trustworthiness.
This work has the potential to make the gathering of insights from data more privacy-preserving in any situation where the data is currently held locally and an analysis of the whole dataset is wanted.