Dstl challenge: machine learning for understanding code quality


As systems get bigger, software becomes more complicated. As an illustration of this, the Boeing 787 aircraft carries 7 million lines of code. Managing this and checking for code errors has become increasingly difficult, and the Defence Science and Technology Laboratory (Dstl)'s challenge for Turing researchers was to find a machine learning solution that can help improve tools for understanding code quality. The direct benefit would be greater control over the growth of bugs in large new systems which, applied to the aircraft example above, would mean a lower risk of a failure with potentially catastrophic results.

Current static analysis and formal methods solutions are ‘noisy’, in the sense that when applied, up to 100,000 errors may be identified, which then need to be sifted through manually to find the most serious ones. They are also costly. Machine learning tools, on the other hand, are more sensitive and able to prioritise problems in order of severity.
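As a rough illustration of the prioritisation idea (this is a hypothetical sketch, not Dstl's or the Data Study Group's actual approach), a model can be trained on warnings that were previously confirmed as real bugs or dismissed as false alarms, then used to rank a fresh batch of static-analysis output by predicted severity. The feature names and data below are invented for the example:

```python
import math
import random

random.seed(0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Plain gradient-descent logistic regression, no external libraries."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log-loss for this example
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

# Invented training data. Each warning is a feature vector:
# [is_null_dereference, is_style_issue, recent_file_churn].
# Label 1 = later confirmed as a real bug, 0 = dismissed as a false alarm.
X = [[1, 0, 0.9], [1, 0, 0.7], [0, 1, 0.2],
     [0, 1, 0.1], [1, 0, 0.8], [0, 1, 0.3]]
y = [1, 1, 0, 0, 1, 0]
w, b = train_logistic(X, y)

def severity(features):
    """Predicted probability that a warning is a serious, real bug."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, features)) + b)

# Rank a new batch of warnings, most severe first, so analysts
# triage the likely real bugs before the noise.
new_warnings = {"style nit in utils": [0, 1, 0.1],
                "null deref in parser": [1, 0, 0.85]}
ranked = sorted(new_warnings,
                key=lambda k: severity(new_warnings[k]), reverse=True)
print(ranked)
```

In a real system the features would come from the analyser's own metadata and the project's history, and the labels from analysts' past triage decisions, so the ranking improves as more warnings are resolved.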


Before coming to the Data Study Group, John, from Dstl, already had ideas about the potential of what machine learning could achieve. He says: “I knew that machine-learning systems have the capability to detect and learn from patterns in data and that this could be applied to detecting bugs in code. I was hoping that this idea could be built on and that Turing researchers would help me to take it further.”

John says that what at the beginning seemed impossible now appears to be a real possibility. He says: “I didn’t really come with any expectations. I just hoped the researchers would be enthusiastic about engaging with the problem, pull it apart and tackle it with different approaches.”

"I don’t think one person or discipline in isolation could have achieved this [solution]"

'John', Defence Science and Technology Laboratory (Dstl)

The researchers were drawn from backgrounds including machine learning, computer science and deep learning, which meant they all had different ideas about how the challenge could be tackled. John says, “It was great to get that range of experience all in one place. They bounced ideas off each other and this led to the beginnings of a solution. I don’t think one person or discipline in isolation could have achieved this.”

He explains, “This is an area where research is sparse. There is a lot of potential in this field but not much to see across the academic literature, so we are not yet in a position to expect a full solution. But if we can begin to drive the research, we can begin to drive the solutions.”

Reflecting on the highlights over the five days, he says, “Being able to sit in with the team as they worked was a privilege. Also the speed with which they could jump from problems to solutions was great to see.”

He adds: “I am confident that machine learning and data science will eventually provide the tools to solve the problem of code quality. We knew that technically it is a difficult problem to solve and that it would require a lot of work. However, machine learning tools have the capacity to learn over time, which would then enable us to build better future systems more quickly.”

In terms of advice to those who may be interested in collaborating with the Institute in future, John says, “In advance of a Data Study Group prepare the data and the techniques as best you can. It was astonishing to see what’s been achieved, the hours put in and the progress the researchers were able to make, but they only really had three full days to work on the challenge. Good preparation of the data in advance would have allowed more time.”