Statistical and computational challenges in large-scale data analysis

Cambridge, 28th-30th September 2015

Main organisers: Graham Cormode, Anthony Lee, Richard Samworth, Rajen Shah, Yee Whye Teh, Patrick Wolfe, Yi Yu

Event Web site:

We are in the midst of a data-driven revolution, with data being collected at an unprecedented rate across the sciences and industry. The scale and complexity of these modern datasets often render classical techniques infeasible, and several new methods have been developed within the fields of Statistics and Computer Science to address the challenges posed by the large-scale (and often non-standard) nature of the data. The approaches taken by researchers in these fields are often rather different: statisticians are typically more concerned with extracting the greatest amount of information from limited data, whereas computer scientists tend to treat the computational budget as the primary constraint. In order to develop successful methodology for large-scale data, it is often necessary to draw on ideas from both of these approaches and to balance statistical efficiency with computational speed.

This workshop will bring together statisticians and computer scientists working on methodology for large-scale data, as well as researchers working on applications in the sciences and industry. The aim is to map out the Big Data landscape by setting out the challenges faced by practitioners and charting the most promising directions in Statistics and Computer Science. The ultimate goal is to foster collaboration and to identify new research directions that require a symbiosis of these two fields.