A key task for data science is to develop systems through which diverse types of data can be aligned to provide common ground for discovery. These systems determine how data is incorporated into machine learning algorithms and whose perspective is incorporated into or excluded from data-driven knowledge systems. Through sustained engagement with stakeholders involved in the development and use of plant data infrastructures, the project investigates current models and future prospects for managing environmental and biological data collected from experiments and field trials around the world.

Explaining the science

The intersection of historically and sociologically informed philosophy of science and data science, including existing work on bio-ontologies and data visualisation tools, is uniquely equipped to provide understanding of both the technical and social conditions under which data can be mined and reused. This approach will help to develop semantics for the dissemination and linkage of phenotypic data that support a global, context-sensitive and sustainable knowledge base for research on food security and related environmental challenges. The overarching goal of the project will be achieved in the following three ways.

Building an international collaborative network

The network will facilitate innovative thinking and the pursuit of both technical and regulatory solutions for effective data linkage. It will bring together data scientists at the Turing with experts in numerous other fields, including:

  • Plant scientists and bioinformaticians at the Earlham Institute and other key biology institutes in the UK, as well as in CGIAR centres for research in tropical agriculture (such as the International Institute of Tropical Agriculture in Ibadan, Nigeria), will provide understanding of specific data resources and local variations in data sharing and reuse.
  • Data curators responsible for the development of the key global infrastructures for the sharing and mining of plant data, such as the Crop Ontology, can provide an overarching view of the challenges involved in setting up data linkage strategies and tools.
  • Representatives of international agencies such as the United Nations (through the Food and Agriculture Organisation) and the CGIAR Big Data Platform, which are responsible for the governance of data semantic systems, can help to relate the technical challenges of data linkage to the overarching social, economic and political challenges of increasing food security in sustainable ways.

Identifying salient differences between existing initiatives

The project will provide a preliminary analysis of the conceptual, historical and sociological reasons for the differences between existing initiatives, so as to develop an understanding of the current state of play that can inform analysis and technical solutions.

Developing a framework

The project will develop a framework for data linkage in this area that can be applied to produce innovative technical solutions for improving current data linkage strategies and tools.

Project aims

The project aims to provide the building blocks for investigating the conditions under which plant data can be efficiently and reliably linked across data platforms and infrastructures around the world, in ways that could serve the development of global indicators for the United Nations’ Sustainable Development Goals (particularly SDG 2: Zero Hunger; SDG 3: Good Health and Well-being; SDG 10: Reduced Inequalities; and SDG 15: Life on Land).


Having usable, meaningful and reliable ways to link plant data from different sources, species and approaches is essential to data analysis and interpretation, and thus to the development of evidence-based agriculture and governance strategies for food security, as well as to an improved translation between basic and applied research in the plant sciences.  

The project will thus benefit agricultural and farming policies, farming practices, biotechnology R&D, the development of policy over the international transfer and conservation of biological materials and related data, and research addressing food security challenges within the plant sciences. 


Recent updates

16 December: AI between Plant and Agricultural Science: Green Paths towards Environmental Intelligence

This one-day workshop will bring together two groups: experts in the plant and agricultural sciences who work with complex datasets spanning genomic, physiological and environmental data and with computational methods of analysis, and data scientists interested in applying cutting-edge technologies to this field. The workshop aims to map future directions for:

  1. Developing and consolidating the Turing Institute’s capabilities in the area of data science for plant science
  2. Integrating plant science within emerging networks of Environmental Intelligence

Particular emphasis will be given to mapping the current needs of the plant and agricultural science community, in order to establish a guiding framework for the efficient and responsive deployment of data science and artificial intelligence resources in those fields. Short research presentations will be followed by an extended discussion format, which will provide a forum for identifying possible collaborations and developing proposals for project applications.

Plant Data Semantics and Food Security: Incorporating Local Imperatives into FAIR Data Linkage Tools 

Towards Responsible Plant Data Linkage: This workshop series brought together leading researchers from the plant and agricultural sciences with scholars from the history, philosophy and social studies of science to discuss the challenges of plant data linkage.


