A versatile machine learning image recognition software for automating synchrotron Macromolecular Beamlines

Lead Research Organisation: Diamond Light Source
Department Name: Science Division

Abstract

Macromolecular crystallography is one of the most widely used techniques for studying proteins, the most important molecular machines in biology, because it allows us to determine the 3D structure of these molecules and infer their function. It has proven results in addressing human diseases, from genetic disorders and cancers to infections caused by human pathogens. The technique is also used in agricultural and food research, for example in developing novel herbicides or drought-resistant crops to address the current impacts of climate change. More recently, energy storage and battery technologies have also benefited from synchrotron crystallography instruments, helping to meet the key manufacturing and clean growth challenges of our era.
Crystallography is used by a wide range of researchers, from academia to pharmaceutical companies in industry. These researchers often send their samples to large research facilities, such as synchrotrons, and then either collect X-ray diffraction data remotely or rely on fully automated systems. With recent advances in synchrotron technology, the bottleneck has moved from the intensity of the synchrotron X-rays and the speed of the detectors to the hardware and software that place the sample in the X-ray beam by centring it and preparing it for data collection. Data collection on a single crystal usually takes less than 10 seconds, but all the other tasks bring the time per sample to ~2 minutes.
Recent advances in AI have created a paradigm shift in image analysis. A few prototypes at synchrotron facilities outside the UK already use AI to improve the speed and reliability of these essential tasks. We propose to take one of the proven prototypes and develop it further for sample centring, synchrotron X-ray beamline diagnostics, and robot collision risk mitigation. This will be extremely beneficial for the MX beamlines at the UK national synchrotron, Diamond Light Source (DLS).
Many DLS sister facilities could benefit from AI but lack the know-how to implement working AI code from scratch. This project aims not only to bring the technology to the UK but also to facilitate the use of AI at macromolecular crystallography beamlines across the world. A first step of integrating the neural network trained at the French national synchrotron, SOLEIL, for sample holder and sample identification into an easily accessible module for use at any synchrotron worldwide would be of huge benefit. The system will then be extended by leveraging the databases of prior images held at our different synchrotrons to train even more advanced models. The coming Swiss Light Source (SLS) shutdown at the Paul Scherrer Institute creates an opportunity: their staff will be available for collaboration, and their planned sabbatical programme aligns strongly with our project vision. Finally, this project would contribute significantly to the roadmap for the planned Diamond 2 upgrade.
