Abstract

Capturing the purpose of websites

This challenge, provided by the National Cyber Security Centre (NCSC), which is tasked with protecting the UK public sector, aims to improve existing internet search technologies.

As the internet grows, industry, government and academia need better ways to support website recommendation, semantic search and domain discovery, and to improve the security of the web. For instance, it is not possible to easily find all UK public sector domains with a simple search, nor is it possible for commercial organisations to easily generate a list of potential competitors, suppliers or customers using current search engines. An exciting potential approach, which would enable the NCSC and other organisations to apply the latest machine learning algorithms to these challenges, is to automatically learn a purposeful, compact vector representation of every website or domain on the Web.

Citation information

Data Study Group team. (2019, August 13). Data Study Group Final Report: NCSC. Zenodo. http://doi.org/10.5281/zenodo.3367414

Additional information

Abeer ElBahrawy, University of London
Aldo Glielmo, King’s College London
Benjamin Sach, The Alan Turing Institute
Daniel Martin, University of Bristol
Giovanni Colavizza, University of Amsterdam
Lindon Roberts, University of Oxford
Lucas Deecke, University of Edinburgh
Mihai Cucuringu, University of Oxford
Mridul Seth, UCLouvain
Paul Jones, National Cyber Security Centre
Prateek Gupta, University of Oxford
Spiros Denaxas, University College London
Yiliu Wang, London School of Economics and Political Science
Yonatan (Yoni) Dukler, UCLA

Turing-affiliated authors