Introduction
As data science continues to grow as an industry and research sector, data-driven algorithms take up an increasing amount of valuable time and energy in data centres. These include the algorithms required by deep learning, which uses multi-level networks that gradually identify things at higher levels of detail. This creates a need for computing companies to rethink how they manage the technical challenges posed by this emerging science.
Explaining the science
In a high-performance computing environment, such as a data centre with hundreds or thousands of interconnected computers, well-designed algorithms (often complex sequences of repeatable steps) allow huge data analysis tasks to be performed. One example is classifying millions of images of tissue samples to identify whether they contain anomalous features that should be examined by a doctor, giving an effective yes/no output for each image.
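As a rough illustration of the yes/no output described above, the sketch below turns per-image anomaly scores into flags for a doctor's review. It is a minimal sketch under stated assumptions, not the project's method: the scores, the 0.5 threshold and the function name flag_for_review are all hypothetical, and in practice the scores would come from a trained model running across many machines.

```python
from typing import Iterable, List

# Hypothetical cut-off: scores at or above this value are flagged for review.
THRESHOLD = 0.5


def flag_for_review(scores: Iterable[float], threshold: float = THRESHOLD) -> List[bool]:
    """Turn per-image anomaly scores into yes/no flags (True means 'review')."""
    return [score >= threshold for score in scores]


if __name__ == "__main__":
    # Made-up scores standing in for the output of an image classifier.
    example_scores = [0.02, 0.91, 0.47, 0.5]
    print(flag_for_review(example_scores))
```

Each image is decided independently of the others, which is what allows a task like this to be split across hundreds or thousands of computers.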
While these high-performance systems operate well for some computing needs, they often run at less than half their full capacity for many data science tasks. This is because new computer systems are typically designed to perform well on a number of model algorithms, which do not include some of the most important data science algorithms used today. At the same time, data science algorithms are designed to perform well on existing hardware, rather than on potential future hardware. The absence of feedback between these two design tasks is impeding progress in high-performance, large-scale data analysis.
Project aims
In this five-year project funded by Intel, scientists at the Turing are working to address the technical challenges of data science by co-designing computer hardware and software. That is, hardware will be designed to suit the needs of data science algorithms, and the algorithms will in turn be designed to suit the capabilities of the hardware. The research, once complete, promises to dramatically increase the speed and efficiency of data-driven computing tasks, and to provide Intel with the tools to build the next generation of computer processors and high-performance systems.