Turing data science class: Scalable regression: old tricks, new era

Speaker: Ioannis Kosmidis (Turing Fellow, University of Warwick)

Date: 25 May 2018

Time: 10:00 – 14:30

Registration: Online registration is compulsory

This class will not be livestreamed or recorded


Regression is at the core of the data scientist's toolbox, and understanding how it can be performed in a scalable manner matters both as general background knowledge and for adapting these methods to the more complicated modelling scenarios that research students and staff at the Turing may encounter.

This class aims to present core theory, methods and algorithms for tackling regression problems involving data sets that are large in the number of observations, in the number of explanatory variables, or both. A key learning objective is to identify least squares as the core optimisation problem for fitting linear and generalised linear models, and to classify the various methods for solving it, or regularised versions of it, in terms of complexity, memory usage and accuracy.
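For orientation, the core optimisation problem can be written down briefly (the notation below is illustrative, not taken from the class materials). Given a response vector $y \in \mathbb{R}^n$ and a model matrix $X \in \mathbb{R}^{n \times p}$, least squares solves

\[
\hat{\beta} = \operatorname*{arg\,min}_{\beta \in \mathbb{R}^{p}} \lVert y - X\beta \rVert_2^2 ,
\]

and a regularised version, such as ridge regression, instead minimises $\lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2$ for some $\lambda > 0$. Fitting a generalised linear model by iteratively reweighted least squares reduces to a sequence of weighted problems of this form, which is why methods for solving least squares at scale carry over.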

Upon completion of the class, participants will have gained an understanding of a range of incremental, bounded-memory and stochastic gradient descent algorithms for large regression problems, and will be able to contrast their relative merits and to adapt and implement them in various modelling scenarios.
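As a small taste of the algorithmic content, here is a minimal Python sketch of stochastic gradient descent for least squares over a stream of observations. It is an illustration under assumptions (a fixed learning rate, a toy in-memory "stream", illustrative function names), not the implementations that will be covered in the class.

```python
import numpy as np

# Synthetic data, standing in for observations read from disk
# (all names here are illustrative assumptions, not class code).
rng = np.random.default_rng(0)
n, p = 10_000, 5
X = rng.normal(size=(n, p))
beta_true = rng.normal(size=p)
y = X @ beta_true + rng.normal(scale=0.1, size=n)

def stream():
    # Yield one observation at a time, so memory use stays
    # bounded regardless of the number of observations.
    for i in range(n):
        yield X[i], y[i]

def sgd_least_squares(stream, p, lr=0.01, n_epochs=5):
    # Plain SGD: for each observation, take a gradient step on
    # the squared-error loss (1/2) * (y_i - x_i' beta)^2.
    beta = np.zeros(p)
    for _ in range(n_epochs):
        for x, y_i in stream():
            resid = y_i - x @ beta   # scalar residual for this observation
            beta += lr * resid * x   # gradient step; the gradient is -resid * x
    return beta

beta_hat = sgd_least_squares(stream, p)
print(np.max(np.abs(beta_hat - beta_true)))  # small for this toy problem
```

Each update touches a single observation, so memory use is bounded in the number of observations; the incremental bounded-memory methods mentioned above achieve the same by accumulating summaries such as $X^\top X$ and $X^\top y$, or a QR factorisation, one chunk of data at a time.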

This is an advanced class. Direct pointers to, and a review of, the required background concepts will be provided, but a fair background in regression modelling, least squares and, more generally, linear algebra will be expected.

There will be time for breaks during the session.