Introduction
Speaker 1: Tim Harris - Oracle Laboratories
Speaker 2: Ralf Herbrich - Amazon
About the event
Speaker 1: Tim Harris, Oracle Laboratories
Systems Challenges in Graph Analytics
Graphs are at the core of many data processing problems, whether that is searching through billions of records for suspicious interactions, ranking the importance of web pages based on their connectivity, or identifying possible "missing" friends on a social network. This talk will discuss the challenges in building large, scalable, in-memory graph analytics systems. Many of these challenges come from the way that graph algorithms behave differently depending on the structure of the input graph: a planar road network graph can produce a significantly different load on the machine's memory system from a low-diameter social network graph. It can be necessary to select particular algorithms for these different cases, and to make contrasting decisions over how the machine's resources are allocated. Finally, we face challenges simply from the scale at which we operate: making efficient use of the hardware in new SPARC machines with over 4000 threads.
Biography
Tim Harris leads the Oracle Labs group in Cambridge, UK. His research interests span multiple layers of the stack, including parallel programming, VMM / OS / runtime-system interaction, and opportunities for specialized architecture support for particular workloads. He has also worked on the implementation of software transactional memory for multi-core computers, and the design of programming language features based on it. Tim has a BA and PhD in computer science from the Cambridge University Computer Laboratory. He was on the faculty at the Computer Laboratory from 2000 to 2004, where he led the department's research on concurrent data structures and contributed to the Xen virtual machine monitor project. He was at Microsoft Research from 2004, and then joined Oracle Labs to found the Cambridge office in 2012.
Speaker 2: Ralf Herbrich, Amazon
Learning Real-World Probabilistic Models with Approximate Message Passing
Over the past few years, we have entered the world of big and structured data, a trend largely driven by the exponential growth of Internet-based online services such as Search, e-Commerce and Social Networking, as well as the ubiquity of smart devices with sensors in everyday life. This poses new challenges for statistical inference and decision-making as some of the basic assumptions are shifting: (1) the ability to store the parameters of (data) models, (2) the level of granularity and ‘building blocks’ in the data modeling phase, and (3) the interplay of computation, storage, communication, and inference and decision-making techniques. In this talk, I will discuss the implications of big and structured data for Statistics and the convergence of statistical models and distributed systems. I will present one of the most versatile modeling techniques that combines systems and statistical properties, factor graphs, and review a series of approximate inference techniques such as distributed message passing. The talk will conclude with an overview of real-world problems at Amazon.
Biography
Ralf has worked at Amazon as Director of Machine Learning since November 2012; until August 2013 he worked in Seattle and then in Berlin, Germany. The team works in the areas of Forecasting, Content Linkage, Scalable Machine Learning Services and Vision-Assisted Technologies. From October 2011 to November 2012, Ralf worked at Facebook in Palo Alto & Menlo Park, leading the Unified Ranking and Allocation team, which built horizontal large-scale machine learning infrastructure for learning user-action-rate predictors that enabled unified value experiences across the products. From 2009 to 2011, he was Director of Microsoft’s Future Social Experiences (FUSE) Lab UK, demonstrating and enabling new social experiences through the development of computational intelligence technologies on large online data collections.
From 2006 to 2009, together with Thore Graepel, Ralf led the Applied Games and the Online Services and Advertising (OSA) research groups, which engaged in research at the intersection of machine learning and computer games as well as research in search and online advertising, combining insights from machine learning, information retrieval, game theory, artificial intelligence and social network analysis. Ralf joined Microsoft Research in 2000. He obtained both a diploma degree in Computer Science and a Ph.D. degree in Statistics from the Technical University Berlin. Ralf’s research interests include Bayesian inference and decision making, computer games, kernel methods and statistical learning theory. He is one of the inventors of the Drivatars™ system in the Forza Motorsport series as well as the TrueSkill™ ranking and matchmaking system in Xbox 360 Live. Ralf also co-invented the adPredictor click-prediction technology used in Bing’s online advertising system.