About the event
Edinburgh, 25th-27th November 2015

Organizers: Peter Richtarik, Ilias Diakonikolas, Raphael Hauser, Mark Girolami, John Shawe-Taylor, Artur Czumaj

Modern data sets are often too large to be stored on a single machine and are typically distributed among many servers. Surprisingly little is known about how to properly approach massive-scale data analysis tasks in distributed environments. The current practice is either to rely on heuristics with no theoretical guarantees or to apply classical methods known to work well at medium scale. However, heuristics often exhibit unpredictable behaviour, while classical algorithms are often impractical or impossible to apply in a massive-scale setting. A successful approach to "making sense from data" problems at massive scale requires merging ideas from multiple scientific disciplines. The "Theoretical Foundations of Big Data Analysis" programme held at the Simons Institute for the Theory of Computing in 2013 (Aug-Dec), in which some of the organizers participated as long-term invited visiting scientists, identified three pillars of big data analysis: computer science, optimization and statistics. Our workshop brings together researchers from these three disciplines as well as industry practitioners, with the aim of reviewing the state of the art, identifying key theoretical and practical challenges, and outlining a programme of future work to be conducted by the Alan Turing Institute in this key area of big data analytics.
Goals:
- Review state-of-the-art algorithms for distributed optimization
- Review state-of-the-art distributed algorithms for key machine learning tasks
- Identify key theoretical challenges in the field, important open problems, and the most promising avenues for future research and progress
- Identify the most pressing issues faced by industry and the most promising solutions academia can offer
Topics: algorithms, modelling, applications, systems, complexity, scalability, big data