A key task for data science is to develop systems through which diverse types of data can be aligned to provide common ground for discovery. These systems determine how data is incorporated into machine learning algorithms and whose perspectives are incorporated into, or excluded from, data-driven knowledge systems. Through sustained engagement with stakeholders involved in the development and use of plant data infrastructures, the project investigates current models and future prospects for managing environmental and biological data collected from experiments and field trials around the world.
Explaining the science
The intersection of historically and sociologically informed philosophy of science and data science, including existing work on bio-ontologies and data visualisation tools, is uniquely equipped to provide understanding of both the technical and social conditions under which data can be mined and reused. This approach will help to develop semantics for the dissemination and linkage of phenotypic data that support a global, context-sensitive and sustainable knowledge base for research on food security and related environmental challenges. The overarching goal of the project will be achieved in the following three ways.
Building an international collaborative network
Facilitating innovative thinking and the pursuit of both technical and regulatory solutions for effective data linkage. The network would bring together data scientists at the Turing with experts in numerous other fields, including:
- Plant scientists and bioinformaticians at the Earlham Institute and other key biology institutes in the UK, as well as in CGIAR centres for research in tropical agriculture (such as the International Institute for Tropical Agriculture in Ibadan, Nigeria), will provide understanding of specific data resources and local variations in data sharing and reuse.
- Data curators responsible for the development of the key global infrastructures for the sharing and mining of plant data, such as the Crop Ontology, can provide an overarching view of the challenges involved in setting up data linkage strategies and tools.
- Representatives of international agencies such as the United Nations (through the Food and Agriculture Organisation) and the CGIAR Big Data Platform, which are responsible for the governance of data semantic systems, can help to relate the technical challenges of data linkage to the overarching social, economic and political challenges of increasing food security in sustainable ways.
Identifying salient differences between existing initiatives
Providing a preliminary analysis of the conceptual, historical and sociological reasons for those differences, so as to develop an understanding of the current state of play that can inform analysis and technical solutions.
Developing a framework
A framework for data linkage in this area that can be applied to produce innovative technical solutions for improving current data linkage strategies and tools.
The project aims to provide building blocks for investigating the conditions under which plant data can be efficiently and reliably linked across data platforms and infrastructures around the world, in ways that could serve the development of global indicators for the United Nations’ Sustainable Development Goals (particularly SDG 2: Zero Hunger; SDG 3: Good Health and Wellbeing; SDG 10: Reduced Inequalities; and SDG 15: Life on Land).
Having usable, meaningful and reliable ways to link plant data from different sources, species and approaches is essential to data analysis and interpretation, and thus to the development of evidence-based agriculture and governance strategies for food security, as well as to an improved translation between basic and applied research in the plant sciences.
The project will thus benefit agricultural and farming policies, farming practices, biotechnology R&D, the development of policy over the international transfer and conservation of biological materials and related data, and research addressing food security challenges within the plant sciences.
Sabina Leonelli, Contributed talk at IGAD Seminar, Research Data Alliance, Helsinki, October 22-25 2019: “Intelligent Plant Data Linkage: A View from History, Philosophy and Social Studies of Science”
Sabina Leonelli, Invited panellist at Springer/Wellcome Trust Conference “Better Science through Better Data 2019” (#scidata19), panel “Who is afraid of data misuse?”, London, November 6 2019.
Sabina Leonelli, Invited speaker at Sawyer Seminar “Information Ecosystems: Creating Data (and Absence) From the Quantitative to the Digital Age”, Humanities Centre, University of Pittsburgh, November 14 2019: “Data In and Out of Information Ecosystems: Lessons from the Study of Data Journeys”
Sabina Leonelli, Keynote at Cidacs anniversary celebration, Salvador, Bahia, Brazil, December 6 2019: “Placing AI at the service of public health: The role of data management and linkage”
Hugh Williamson and Sabina Leonelli, Invited talk at International Plant and Animal Genome conference (PAG) 2020, Workshop on “Challenges and Opportunities in Plant Science Data Management”, January 11-15 2020, San Diego: “Tracking data linkage for intelligent and responsible reuse”
Sabina Leonelli, Invited speaker at Francis Bacon Conference “Transnational Transactions: Negotiating the Movement of Knowledge Across Borders”, Caltech, Pasadena, February 19-22: “How Data Cross Borders: Globalising Plant Knowledge through Transnational Data Management”
Sabina Leonelli and Hugh Williamson, Invited Webinar, Webinar Series on Agriculture, Research Data Alliance (RDA) & Food and Agriculture Organisation (FAO), April 9: “Intelligent Plant Data Linkage: A View from History, Philosophy and Social Studies of Science”
Hugh Williamson and Sabina Leonelli, Invited talk, The CogX Global Leadership Summit and Festival of AI & Breakthrough Technology 2020, Session Data Science for Science: “Plant and Agricultural Science”, June 11
Sabina Leonelli, Keynote, Advances in Data Science Conference, University of Manchester, June 22-23 2020: “Intelligent Data Linkage and Distributed Semantics for (Big) Data Interpretation”
Sabina Leonelli, Invited speaker, Synthace industry webinar “The Metadata Responsibility”, 25 August 2020: “Understanding Data Science through Data Journeys”
Hugh Williamson and Sabina Leonelli, Contributed talk, Political Ecology Network Conference POLLEN 2020, September 23 2020: “Speed, Statistics and Speed: Implications of Focusing Plant Breeding on Genetic Gain”.
Invited speaker, Falling Walls Circle Table “The Understanding of the Scientific Method in the 21st Century”, organised by the European Research Council for the World Science Forum, November 5 2020
Sabina Leonelli and Gavin Shaddick, Keynote, Conference “UK-China Tech for Global Good Roundtable: Climate, Clean and Green Tech”, British Embassy in Beijing, October 28 2020: “AI for the Green Agenda: What Data Landscape Do We Need?”
Sabina Leonelli, Keynote, Society for Computation in Psychology, virtual annual conference, November 19 2020: “Prospects for the Automation of Research: Reproducibility and Human Agency”.
Sabina Leonelli, Invited lecture, Data Ethics Webinar Series, University of Oregon, December 9: “Data Science in Times of Pan(dem)ic: From FAIR Data to Fair Data Use” (delivered remotely)
Hugh Williamson and Sabina Leonelli, Contributed talk, International FAIR Convergence Symposium, Session “Plant Data Semantics and Food Security: Incorporating Local Imperatives into FAIR Data Linkage Tools”, Paris, November 30 2020: “FAIR Data and Climate-Adaptive Plant Breeding”
Williamson H, Brettschneider J, Caccamo M, Davey R, Goble C, Kersey PJ, May S, Morris RJ, Ostler R, Pridmore T, Rawlings C, Studholme D, Tsaftaris S and Leonelli S. Data management challenges for artificial intelligence in plant and agricultural research [version 1; peer review: awaiting peer review]. F1000Research 2021, 10:324
Leonelli, S. (2022, in press) How Data Cross Borders: Globalising Plant Knowledge through Transnational Data Management and Its Epistemic Economy. In: Krige, J (ed) Writing the Transnational History of Knowledge Flows in a Global Age. Chicago, IL: University of Chicago Press.
Beaulieu, A. and Leonelli, S. (in press, 2021) Data and Society: A Critical Introduction. London, UK: SAGE.
The Alan Turing Institute (2021) Report “Data science and AI in the age of COVID-19”
Leonelli S, Lovell B, Fleming L, Wheeler B and Williams H. (2021) From FAIR data to fair data use: Methodological data fairness in health-related social media research. Big Data and Society 8 (1) DOI: 10.1177/20539517211010310
Krige, J and Leonelli, S (2021) Mobilizing the Transnational History of Knowledge Flows: COVID-19 and the Politics of Knowledge at the Borders. History and Technology.
Leonelli, S. (2021) Data Science in Times of Pan(dem)ic. Harvard Data Science Review.
Arnaud E, Laporte MA, Kim S, Aubert C, Leonelli S, Cooper L, Jaiswal P, Kruseman G, Shrestha R, Buttigieg PL, Mungall C, Pietragalla J, Agbona A, Muliro J, Detras J, Hualla V, Rathore A, Das R, Dieng I, King B (2020) The Ontologies Community of Practice: An Initiative by the CGIAR Platform for Big Data in Agriculture. Patterns 1: 100105
Geraint P, Yoselin BA, Gibbs D, Grant M, Harper A, Harrison J, Kaiserli E, Leonelli S, May S, McKim S, Spoel S, Turnbull C, van der Hoorn R, Murray J (2020) How to Build an Effective Research Network: Lessons from Twenty Years of the GARNet Plant Science Community. Journal of Experimental Botany, eraa307.
DATA TOGETHER (2020) Open Science for a Global Transformation. Data Together Response to UNESCO Consultation on Open Science. 29 pages.
Leonelli, S (2020) Scientific Research and Big Data. The Stanford Encyclopedia of Philosophy (Summer 2020 Edition), Edward N. Zalta (ed.).
Leonelli, S. (2019) Data – From Objects to Assets. Nature 574, 317-321. DOI: 10.1038/d41586-019-03062-w
Leonelli, S. (2019) Data Governance is Key to Interpretation: Reconceptualising Data in Data Science. Harvard Data Science Review, inaugural issue.
Leonelli, Sabina (2020) big data; plurality; neutrality; AI; Coronavirus; pandemic; relativism; open science. In: The Index of Evidence, 61
Leonelli, Sabina. (2020). Opening the Research Process: From Publications to Data, and Back Again. Zenodo.
Cousins T, Leonelli S, Pentecost M, Rajan KS. (2020) Situating the Biology of COVID-19: A Conversation on Disease and Democracy. The India Forum
Leonelli S and Williamson H. (2020) Intelligent plant data linkage: A view from history, philosophy and social studies of science [version 1; not peer reviewed]. F1000Research 2020, 9:260
16 December: AI between Plant and Agricultural Science: Green Paths towards Environmental Intelligence
This one-day workshop will bring together experts in the plant and agricultural sciences who are working with complex datasets spanning genomic, physiological and
- Developing and consolidating the Turing Institute’s capabilities in the area of data science for plant science
- Integrating plant science within emerging networks of Environmental Intelligence.
Particular emphasis will be given to mapping the current needs of the plant and agricultural science community, in order to
Plant Data Semantics and Food Security: Incorporating Local Imperatives into FAIR Data Linkage Tools at the International FAIR Convergence Symposium, November 30 2020: “FAIR Data and Climate-Adaptive Plant Breeding”
Towards Responsible Plant Data Linkage: This workshop series brought together leading researchers from the plant and agricultural sciences with scholars from the history, philosophy and social studies of science to discuss the challenges of plant data linkage. More information and video of workshop presentations can be accessed at our website. Proceedings will be collected into an edited volume for publication in 2022.