Artificial intelligence for data analytics (AIDA)
Date: 20 March 2018
Time: 15:30 – 18:00
Venue: The Alan Turing Institute
Organisers: James Geddes, Zoubin Ghahramani, Ian Horrocks, Charles Sutton, Chris Williams
Registration for the event is now closed, but you can watch the talks live online.
The goal of the Artificial Intelligence for Data Analytics (AIDA) project at The Alan Turing Institute is to develop methods and tools to guide a data analyst through a semi-automated process of acquiring, preparing, integrating, transforming, cleaning and understanding data for analysis. The key insight is that every task in the data science process is a potential application area for artificial intelligence.
A limited number of tickets are available for a set of two public talks.
15:45 – 16:45 Ontology-based data access: moving beyond relational data: Diego Calvanese (Free University of Bozen-Bolzano)
Ontology-based data access (OBDA) is by now a well-established paradigm in which users are provided access to a data source through a conceptual view that abstracts away details about how the data is organized and stored. This conceptual view is realized through an ontology that is connected to the data source through declarative mappings. In the last decade, OBDA has been studied extensively in the prominent case where the data source being queried is a standard relational database. Advanced tools are available that support query processing in OBDA systems, and OBDA has been applied in many real-world scenarios in industry and in public administrations. However, the need has emerged for extending OBDA to novel types of data sources. In this talk, we deal with three such extensions, namely geo-spatial data, temporal data, and tree-structured data. We discuss the challenges that these extensions pose for OBDA, and we present recent developments on how to incorporate such novel data sources.
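The core OBDA mechanism described above can be sketched in a few lines of Python. This is a toy illustration only: the tables, vocabulary, and mapping functions are all invented for the example, and a real OBDA system (as covered in the talk) rewrites queries into SQL rather than materialising the view.

```python
# A minimal sketch of the OBDA idea: declarative mappings expose
# relational rows as ontology-level facts, which are then queried
# at the conceptual level. All names here are illustrative.

# A toy relational source: two tables.
employees = [
    {"id": 1, "name": "Ada",  "dept": 10},
    {"id": 2, "name": "Alan", "dept": 20},
]
departments = [
    {"id": 10, "label": "Research"},
    {"id": 20, "label": "Engineering"},
]

# Declarative mappings: each turns rows of a source table into
# (subject, predicate, object) facts over the ontology vocabulary.
mappings = [
    (employees,   lambda r: (f"emp:{r['id']}", "rdf:type", "ont:Employee")),
    (employees,   lambda r: (f"emp:{r['id']}", "ont:worksIn", f"dept:{r['dept']}")),
    (departments, lambda r: (f"dept:{r['id']}", "ont:label", r["label"])),
]

def virtual_graph():
    """Materialise the conceptual view as a set of triples.
    (Real OBDA systems instead rewrite queries into SQL.)"""
    return {m(row) for table, m in mappings for row in table}

def who_works_in(label):
    """Answer a conceptual-level query without knowing table layout."""
    g = virtual_graph()
    depts = {s for (s, p, o) in g if p == "ont:label" and o == label}
    return sorted(s for (s, p, o) in g if p == "ont:worksIn" and o in depts)

print(who_works_in("Research"))  # ['emp:1']
```

The user's query mentions only ontology vocabulary (`ont:worksIn`, `ont:label`); the mappings alone know how that vocabulary is stored in the tables, which is exactly the separation the conceptual view provides.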
17:00 – 18:00 AI assisted data science via probabilistic programming: Vikash Mansinghka (MIT)
The recent successes of artificial intelligence have centered on machine learning problems where data is clean and abundant and where there are unambiguous right answers and wrong answers. But many data science problems involve sparse, messy datasets and questions with intrinsically uncertain answers. This talk will describe new approaches to data science based on probabilistic programming, an emerging field at the intersection of probabilistic modeling and programming languages.
Probabilistic programming is based on the insight that probabilistic models and inference algorithms are a new kind of software, amenable to radical improvements in accessibility, productivity, and scale. This talk will describe probabilistic programming research that aims to (i) improve the productivity of data scientists by automating the process of inferring probable models from data, and (ii) improve the accessibility of data science by making it easy to pose and solve predictive modeling and inferential statistics problems using a simple, SQL-like language. These capabilities will be illustrated using applications to real-world databases of Earth satellites and mental health survey questionnaires.
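The idea of treating a probabilistic model plus generic inference as ordinary software can be illustrated with a deliberately tiny example. The model (a coin with unknown bias), the prior, and the data below are all invented for illustration; they are not from the speaker's systems, which automate inference for far richer models and expose it through a SQL-like query language.

```python
# A minimal sketch of Bayesian inference as software: a model with
# an unknown parameter, sparse data, and an answer that remains
# explicitly uncertain rather than a hard point estimate.

from fractions import Fraction

def posterior_heads_prob(observations, prior_a=1, prior_b=1):
    """Beta-Bernoulli conjugate update: return the posterior mean
    probability of heads given 0/1 observations and a Beta prior."""
    heads = sum(observations)
    tails = len(observations) - heads
    a, b = prior_a + heads, prior_b + tails
    return Fraction(a, a + b)

# Sparse, messy data: only three flips observed.
data = [1, 1, 0]
print(posterior_heads_prob(data))  # 3/5
```

With only three observations the posterior mean is 3/5, pulled toward the uniform prior rather than the raw frequency 2/3; as more data arrives the estimate sharpens, which is the kind of intrinsically uncertain answer the abstract describes.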