Artificial and Augmented Intelligence for Automated Scientific Discovery

Lead Research Organisation: University of Southampton
Department Name: Sch of Chemistry

Abstract

AI is a widely used term that conjures up many of the computers from science fiction. It stands for a whole collection of ideas, algorithms, computational models and knowledge systems. Recent successes of particular types of machine learning (e.g. deep neural nets) have again excited the interest of the scientific community in delivering insight into the complexity of the real world. This type of approach complements the knowledge engineering systems that have previously been used; however, such models require massive amounts of data to be trained. Taking the chemical and materials sciences as exemplar areas, we can see that the traditional approaches to scientific discovery work with relatively small amounts of often uncertain data, which is distilled by human insight to yield predictions and testable theories that may evolve as new data becomes available. In these areas of science more data is becoming available, and the impact of 'larger data' parallels the reality that almost all science now depends on computational assistance. Nevertheless, the quantity of quality data needed to train the new AI systems is simply not directly available, even with recent advances in automation. As a basis for the network we propose to use 'amplification by simulation' as a key element of the cycle of automated experiments, simulation, AI learning, prediction, comparison, design and further experiments, to create the environment in which leading AI developments can be applied to chemical and materials discovery.

Planned Impact

The network partners for the proposed Network+ provide an extensive reach to the academic and industrial communities in the UK interested in research in computing and the physical and life sciences, providing access to the relevant people to create impact throughout the whole community. Because cutting-edge areas of chemical and materials science are to be supported with cutting-edge AI, successful research projects will be of immediate importance to industry, giving rise directly to new pharmaceuticals, antimicrobials, green production techniques and functional materials. More importantly, they will yield new techniques to generate future critically needed chemicals and materials.

The membership of the network will mean that relevant researchers and industrialists will get to hear about the work through the network and be able to support applications of the work. Through the network connections we will have access to all the major players in the chemical and materials areas, and the nature of the proposed network is such that we will also be able to bring on board SMEs, providing opportunities for them to interact with academic groups and other commercial entities. The network will also engage with the media and governmental (regulatory) agencies, as the potential role of AI in driving UK productivity and innovation is a current and significant public agenda.

Open Science and quality software, supported by the network, will, we believe, lead to greater open innovation in this space, combining the best of open source software with open source hardware and wetware. We see this as a very effective underpinning of wider public engagement, for example in supporting presentations and discussions at schools and public events, to be able to demonstrate to the public what AI approaches offer to science and, through these activities, to the public.
 
Title AI3SD Video: A Career in Chemistry & Beyond 
Description This talk forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 28 external views in addition to being part of our Skills4Scientists Series. This was a collaboration between AI3SD and PSDS. 
URL https://eprints.soton.ac.uk/451125/
 
Title AI3SD Video: A vision of Medicinal Chemistry for the future 
Description Historically, medicinal chemistry and computational chemistry have been separate roles within a drug discovery organization requiring differing backgrounds and expertise. Increasingly in the modern world and going forwards these skillsets are coming closer and closer together, empowered by automation and increasingly advanced computational methods. Over time, there has been an increase in the amount of time a medicinal chemist spends at a computer compared with time in the lab which is expected to continue. For computational chemists, historically, the core skills were of computer science, not of chemistry due to the lack of well developed tools and computing power. As time goes on, with increases in the user friendliness of available tools and increases in automation, these roles are likely to move closer together, with others joining the field with more mixed skillsets too. The role of automation will likely increase allowing for easier synthetic tasks to be carried out by robots, and the more trivial thinking tasks to be carried out by a computer, leaving human scientists to perform more challenging laboratory work, and think about the most important problems. This presentation will discuss the likely necessary skills of the future, compared to those of today, and the role automation has to play in this transition. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 179 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/453340/
 
Title AI3SD Video: AI and multi-omics discovery science: A case study in understanding ageing at a systems level 
Description Metabolism is central to all processes of life and the metabolome -- large-scale measurement of the quantities of small molecular entities in cells and tissues -- gives a readout of cellular functioning at a point in time. Harnessing metabolomic information together with transcriptomic information about gene expression allows for multi-level insights into genetic dysregulation and its cellular effects. I will describe a multi-omics approach based on genome-scale modelling that is able to integrate the two levels and provide insights into the systems-level deregulation of cellular function due to ageing by transforming the cellular reaction space into a constraint-based linear optimisation problem. Metabolic models such as these, and their interpretation, depend on publicly available data about small molecular metabolites. Chemical ontologies provide structured classifications of chemical entities that can be used for navigation and filtering of chemical space, including in metabolic models. ChEBI is a prominent example of a chemical ontology, widely used in life science contexts including to annotate metabolites in genome-scale models, and recent work has involved using deep learning to automatically extend the ChEBI classification to a wider range of metabolites, thus enhancing the benefit of genome-scale models for ageing systems research. Finally, I will discuss the role of artificial intelligence technologies in systems-level -omics research more generally. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 67 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/468647
 
Title AI3SD Video: AI and optimisation in Computational Chemistry 
Description Numerical methods of optimisation are vital in chemistry, ranging from finding the lowest energy of a molecule through to reactor design. This talk will discuss two examples of how we are using optimisation techniques to work towards automating the discovery of new materials and developing computational chemistry methodology. The first of these is an AI3SD funded project where we set out to develop new membranes for water desalination using artificial intelligence techniques. In the second, we will demonstrate how the basis sets used in quantum chemistry calculations can be optimised in an automated fashion. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 94 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/468646
 
Title AI3SD Video: AI for Science: Transforming Scientific Research 
Description There is now broad recognition within the scientific community that the ongoing deluge of scientific data is fundamentally transforming academic research. Turing Award winner Jim Gray referred to this revolution as "The Fourth Paradigm: Data Intensive Scientific Discovery". Researchers now need tools and technologies to manipulate, analyze, visualize, and manage vast amounts of research data. This talk will begin by reviewing the challenges posed by the explosive growth of experimental and observational data generated by large-scale facilities such as the Diamond Synchrotron and the CryoEM Facilities at the Rutherford Appleton Laboratory. Increasingly, scientists are beginning to use sophisticated machine learning and other AI technologies both to automate parts of the data pipeline and also to find new scientific discoveries in the deluge of experimental data. In particular, "Deep Learning" neural networks have already transformed several areas of computer science and research scientists are now exploring their use in analyzing their "Big Scientific Data". The talk concludes with a vision of how this "AI for Science" agenda can be truly transformative for experimental scientific discovery. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 176 external views in addition to being part of our Summer Seminar Series. This was the first online series we created to engage with our Network members worldwide during the COVID-19 pandemic. This series helped launch our YouTube channel which now has over 700 subscribers, and greatly increased our Network membership. 
URL https://eprints.soton.ac.uk/447159/
 
Title AI3SD Video: AI insights from billions of dollars of ready-cleaned data 
Description Two of the greatest pain points in Artificial Intelligence (AI)-assisted research workflows are data quantity and data organisation. Estimates suggest that 60-80% of the time in data science workflows is spent simply cleaning and arranging the data, depending on researcher skill and the type of data. As AI relies on pattern recognition, the larger the dataset, the more likely the algorithm is to recognise a useful pattern. Due to organised input via the Studies ELN and the underlying architecture of the database, extracting AI-ready data is made simple. We hold billions of dollars' worth of data for our clients, and by working with each organisation to show them the untapped potential that existing projects already hold, future research can be designed with these methodologies in mind to further boost research turnover and outcomes. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 36 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/470020
 
Title AI3SD Video: Accelerating design of organic materials with machine learning and AI 
Description Deep learning is revolutionizing many areas of science and technology, particularly in natural language processing, speech recognition, and computer vision. In this talk, we will provide an overview of the latest developments of machine learning and AI methods and their application to the problem of drug discovery and development at Isayev's Lab at CMU. We identify several areas where existing methods have the potential to accelerate materials research and disrupt more traditional approaches. First, we will present a deep learning model that approximates the solution of the Schrödinger equation. We introduce the AIMNet-NSE (Neural Spin Equilibration) architecture, which can predict molecular energies for an arbitrary combination of molecular charge and spin multiplicity. The AIMNet-NSE model allows QM calculations to be bypassed entirely when deriving the ionization potential, electron affinity, and conceptual Density Functional Theory quantities like electronegativity, hardness, and condensed Fukui functions. We show that these descriptors, along with learned atomic representations, could be used to model chemical reactivity through an example of regioselectivity in electrophilic aromatic substitution reactions. Second, we proposed a novel ML-guided materials discovery platform that combines synergistic innovations in automated flow synthesis and automated machine learning (AutoML) method development. A software-controlled, continuous polymer synthesis platform enables rapid iterative experimental-computational cycles that resulted in the synthesis of hundreds of unique copolymer compositions within a multi-variable compositional space. The non-intuitive design criteria identified by ML, obtained by exploring less than 0.9% of the overall compositional space, upended conventional wisdom in the design of 19F MRI agents and led to the identification of >10 copolymer compositions that outperformed state-of-the-art materials. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 146 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/452737/
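The abstract above mentions deriving the ionization potential, electron affinity, electronegativity and hardness from predicted energies at different molecular charges. A minimal sketch of that arithmetic follows, using the standard finite-difference definitions (Mulliken electronegativity, and hardness with the factor of one half; some texts omit that factor). The function name and the energy values are hypothetical illustrations and are not taken from the talk or from AIMNet-NSE.

```python
def conceptual_dft(e_cation, e_neutral, e_anion):
    """Finite-difference conceptual DFT descriptors from three total energies.

    All energies must share one unit (eV here) and one geometry.
    """
    ip = e_cation - e_neutral   # ionization potential: cost of removing an electron
    ea = e_neutral - e_anion    # electron affinity: energy released on adding one
    chi = 0.5 * (ip + ea)       # Mulliken electronegativity
    eta = 0.5 * (ip - ea)       # chemical hardness (convention with the 1/2 factor)
    return {"ip": ip, "ea": ea, "electronegativity": chi, "hardness": eta}


# Hypothetical energies in eV, relative to the neutral molecule.
descriptors = conceptual_dft(e_cation=9.0, e_neutral=0.0, e_anion=-1.0)
print(descriptors)
```

In practice the three energies would come from whichever model or calculation is available for the cation, neutral and anion states of the same molecule.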
 
Title AI3SD Video: Accelerating structure prediction models for materials discovery 
Description The discovery of new functional materials can be guided by computational screening, particularly if the structure of a material can be reliably predicted from its chemical composition. For this application, we have been developing the use of energy-structure-function maps [1], which summarise the crystal structures available to a given molecule and the relevant properties that are predicted for these structures. The use of these methods is still limited by the computational cost of crystal structure prediction (CSP). Most of the cost of CSP is associated with the calculation of the relative energies of predicted crystal structures using energy models that are sufficiently accurate to provide reliable energetic rankings. To speed up these methods, we have been developing machine learning approaches to predict high quality energies (e.g. from solid state density functional theory) from structures that have been generated with computationally efficient energy models [2-4]. The talk will discuss the performance of these methods, which use Gaussian Process Regression based on descriptors of local environments of atoms within crystal structures. I will also describe how these descriptors can be used to more quickly navigate the structure-property landscapes of molecular crystals [5] and how fast CSP can be applied to screen chemical space for the most promising molecules for a given application [6]. [1] Functional materials discovery using energy-structure-function maps, A. Pulido et al, Nature 2017, 543, 657. [2] Machine learning for the structure-energy-property landscapes of molecular crystals, F. Musil, S. De, J. Yang, J. E. Campbell, G. M. Day and M. Ceriotti, Chem. Sci. 2018, 9, 1289-1300. [3] Machine-Learned Fragment-Based Energies for Crystal Structure Prediction, D. McDonagh, C.-K. Skylaris and G. M. Day, J. Chem. Theory Comput. 2019, 15, 2743-2758. [4] Multi-fidelity Statistical Machine Learning for Molecular Crystal Structure Prediction, O. Egorova, R. Hafizi, D. C. Woods and G. M. Day, J. Phys. Chem. A 2020, 124, 39, 8065-8078. [5] Distributed Multi-Objective Bayesian Optimization for the Intelligent Navigation of Energy Structure Function Maps For Efficient Property Discovery, E. Pyzer-Knapp, G. M. Day, L. Chen, A. I. Cooper, ChemRxiv 2020, https://doi.org/10.26434/chemrxiv.13019960.v1 [6] Evolutionary chemical space exploration for functional materials: computational organic semiconductor discovery, C. Y. Cheng, J. E. Campbell and G. M. Day, Chem. Sci. 2020, 11, 4922-4933. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 484 external views in addition to being part of our Winter Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/447673/
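The structure prediction abstract above describes using Gaussian Process Regression to predict high quality energies from structures generated with cheaper energy models. The toy sketch below illustrates one way such a correction could be set up: a GP is fitted to the difference between reference and cheap energies and added back to the cheap value. It is not the authors' implementation. The single scalar descriptor, the RBF kernel settings, all function names and all numbers are assumptions made for illustration, whereas the talk refers to descriptors of local atomic environments.

```python
import math


def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two scalar descriptors."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def gp_fit(xs, ys, noise=1e-6):
    """Return GP weights alpha = (K + noise * I)^-1 y."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    return solve(K, ys)


def gp_predict(x, xs, alpha):
    """Posterior mean of the GP at descriptor value x."""
    return sum(rbf(x, xi) * ai for xi, ai in zip(xs, alpha))


# Hypothetical training set: descriptor, cheap energy, reference energy.
descriptors = [0.0, 1.0, 2.0, 3.0]
cheap = [-10.0, -9.5, -9.8, -9.1]
reference = [-10.3, -9.6, -10.0, -9.4]
deltas = [r - c for r, c in zip(reference, cheap)]
alpha = gp_fit(descriptors, deltas)


def corrected_energy(descriptor, cheap_energy):
    """Cheap energy plus the learned correction towards the reference model."""
    return cheap_energy + gp_predict(descriptor, descriptors, alpha)


print(corrected_energy(1.5, -9.7))
```

Learning the correction rather than the total energy is a design choice of this sketch: the difference between two energy models is typically smoother and smaller than either energy, so fewer expensive reference calculations are needed.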
 
Title AI3SD Video: All's Fair in love and data management 
Description This talk forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video was the first talk in the Skills4Scientists #1 - Research Data Management Session, which focussed on several areas of good data management practices. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 46 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL http://eprints.soton.ac.uk/id/eprint/450266
 
Title AI3SD Video: An Open Competition of People and Machines to Develop Predictive Models for Antimalarial Drug Discovery 
Description One of the most promising series within the Open Source Malaria (OSM) consortium involves compounds that are active in the in vivo model of the disease. A molecular mechanism of action is strongly implicated, and is a mechanism shared with several leading antimalarials in the drug development pipeline, but no crystal structure has been obtained for the protein target. This OSM project is in the lead optimisation phase, with small changes being made to the structures synthesised. Yet even now many compounds designed by the human chemists are proving to be inactive, which can be wasteful of project resources. Over the last several years the consortium has run open competitions to see if the broader community can derive more predictive models for which molecules to synthesise. The most recent, funded by AI3SD, elicited high quality, open submissions from academia and several new companies specialising in artificial intelligence and machine learning. To close the loop, and examine the utility of these predictions, several of the novel structures proposed were synthesised and evaluated in a blood stage antimalarial assay. Were the machine-assisted predictions better than those derived from human intuition? An answer will be provided. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 144 external views in addition to being part of our Winter Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/448781/
 
Title AI3SD Video: Applying Machine Learning to Structured Time-course sensor data 
Description This talk forms part of the ML4MC (Machine Learning for Materials and Chemicals) Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Directed Assembly Network. This series ran over summer 2021 and covers topics that encompass our overlapping Network interests of AI, Machine Learning, Artificial Photosynthesis, Biomimetic Materials, Crystal Design & Engineering, Materials, Molecules, Photochemistry, Photocatalysis and Supramolecular Chemistry. This video was the third talk in the ML4MC series and formed part of the session "Research Talks". 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 49 external views in addition to being part of our ML4MC Summer School which supported 31 PhD and Postdoc students in learning more about Machine Learning for Materials and Chemicals. 
URL https://eprints.soton.ac.uk/450667/
 
Title AI3SD Video: Artificial Intelligence for Safer Urban Space 
Description The ever-growing adoption of big data technologies, smart sensing, data science and artificial intelligence is enabling the development of new intelligent urban spaces with real-time monitoring and advanced cyber-physical situational awareness capabilities. Advances in cyber-physical situational awareness are being trialled to achieve safer smart city spaces in Europe and beyond. The deployment of digital twins supports real-time situational awareness and an understanding of the risks of potential physical and/or cyber-attacks, specifically on urban critical infrastructure. Digital twins, which ingest, process and fuse observation data and information, can also be used to extract critical knowledge prior to machine reasoning. In this setting, cyber behaviour detection modules, which identify unusualness in cyber traffic networks, can be deployed together with a physical behaviour detection module based on computer vision and statistical methods. The two modules function within the so-called Malicious Attacks Information Detection System (MAIDS) digital twin. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 77 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/453345/
 
Title AI3SD Video: Artificial Intelligence's new clothes? From General Purpose Technology to Large Technical System 
Description Artificial Intelligence (AI) is expected to be characterised by wide applicability; for this reason, it has been quickly labelled a General Purpose Technology (GPT). In this paper, we critically assess whether AI is really a GPT. Given that the answer is 'not exactly', we suggest that an alternative framework - drawn from the literature on large technical systems (LTS) - could be useful to understand the nature of AI. AI, in its current understanding, is a 'system technology' - a collection of techniques built and enabled by the conjunction of many sub-systems. From this premise, we apply the fundamental building blocks of LTS to AI to provide new insights on its nature, goal orientation, and the actors and factors playing a role in enabling or constraining its development. Thinking in terms of AI LTS can help researchers to identify how control is distributed, how coordination is achieved, where decisions take place, and which levers actors (among them policy makers) can pull to relax constraints or steer the evolution of AI. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 197 external views in addition to being part of our Summer Seminar Series. This was the first online series we created to engage with our Network members worldwide during the COVID-19 pandemic. This series helped launch our YouTube channel which now has over 700 subscribers, and greatly increased our Network membership. 
URL https://eprints.soton.ac.uk/447295/
 
Title AI3SD Video: Assembling peptide and protein structures 
Description This talk forms part of the ML4MC (Machine Learning for Materials and Chemicals) Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Directed Assembly Network. This series ran over summer 2021 and covers topics that encompass our overlapping Network interests of AI, Machine Learning, Artificial Photosynthesis, Biomimetic Materials, Crystal Design & Engineering, Materials, Molecules, Photochemistry, Photocatalysis and Supramolecular Chemistry. This video was the fifth talk in the ML4MC series and formed part of the session "Research Talks". 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 73 external views in addition to being part of our ML4MC Summer School which supported 31 PhD and Postdoc students in learning more about Machine Learning for Materials and Chemicals. 
URL https://eprints.soton.ac.uk/450668/
 
Title AI3SD Video: Audacity of huge: Machine Learning for the discovery of transition metal catalysts and materials 
Description I will discuss our efforts to use machine learning (ML) to accelerate the computational tailoring and design of transition metal complexes and metal-organic framework (MOF) materials. One limitation in a challenging materials space such as open shell, 3d transition metal chemistry is that ML models and ML-accelerated high-throughput screening traditionally rely on density functional theory (DFT) for data generation, but DFT is both computationally demanding and prone to errors that limit its accuracy in predicting new materials. I will describe three ways we've overcome these limitations: i) through efficient global optimization to minimize the numbers of calculations carried out to obtain design rules in weeks instead of decades while satisfying multiple objectives; ii) through machine-learned consensus from a family of dozens of functionals to more robustly uncover new materials; and iii) by the use of natural language processing to extract, learn, and directly predict experimental measures of stability on heterogeneous MOF materials. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 166 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/453344/
 
Title AI3SD Video: Automated Chemical Ontology Expansion 
Description Ontologies provide a shared vocabulary and semantic resource for a domain. Manual construction enables them to achieve high quality and capture subtle semantic nuances, essential for wide acceptance and applicability across a community. However, the manual curation process does not scale for large domains. I will present a methodology for automatic ontology extension based on deep learning using ontology annotations, and show how we apply this methodology to the ChEBI ontology, a prominent reference ontology for life sciences chemistry. We used a Transformer-based deep learning architecture trained on the chemical structures from ontology leaf nodes, and the system learns to predict membership in multiple mid-level ontology classes as a multi-class classification task. Additionally, I will illustrate how visualizing the model's attention weights can help to explain the results by providing insight into how the model made its decisions. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 208 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/451925/
 
Title AI3SD Video: Automated Rational Design of Metal-Organic Polyhedra 
Description Metal-organic polyhedra (MOPs) are hybrid organic-inorganic nanomolecules, whose rational design depends on harmonious consideration of chemical complementarity and spatial compatibility between two or more types of chemical building units (CBUs). In this work, we apply knowledge engineering technology to automate the derivation of MOP formulations based on existing knowledge. For this purpose we have: i) curated relevant MOP and CBU data; ii) developed an assembly model concept that embeds rules in the MOP construction; iii) developed an OntoMOPs ontology that defines MOPs and their key properties; iv) developed software tools that populate the knowledge graph; and v) developed an algorithm that uses information from the knowledge graph to derive a list of new constructible MOPs. Our result provides rapid and automated instantiation of MOPs in the knowledge graph, unveils the immediate chemical space of known MOPs, and sheds light on new MOP targets for future investigations. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 47 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/468644
 
Title AI3SD Video: Building your professional contacts - Networking for Scientists and/or Introverts 
Description This talk forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video was the third talk in the Skills4Scientists #6 - Careers 1 Session, which focussed on several areas of careers advice that will be useful to you as you complete your studies and begin your careers. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 23 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL https://eprints.soton.ac.uk/451153/
 
Title AI3SD Video: Calibrated deep representations and entropy based active learning for materials property prediction 
Description This talk forms part of the ML4MC (Machine Learning for Materials and Chemicals) Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Directed Assembly Network. This series ran over summer 2021 and covers topics that encompass our overlapping Network interests of AI, Machine Learning, Artificial Photosynthesis, Biomimetic Materials, Crystal Design & Engineering, Materials, Molecules, Photochemistry, Photocatalysis and Supramolecular Chemistry. This video was the ninth talk in the ML4MC series and formed part of the session "Mentor Talks". 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 37 external views in addition to being part of our ML4MC Summer School which supported 31 PhD and Postdoc students in learning more about Machine Learning for Materials and Chemicals. 
URL https://eprints.soton.ac.uk/450849/
 
Title AI3SD Video: Capturing and Tracking your Outputs 
Description This talk will look at how to capture a wealth of different outputs. Gone are the days when journal papers were the only items to be considered outputs; many different types of output can be produced when working as part of a Network or large scale project. These can include videos, reports, presentations, posters, interviews and much more. This talk will look at creating templates for some of these items, how to capture them, different methods for storing and sharing them, and how to collate all of your outputs in your institutional repository. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 20 external views in addition to being part of our Failed it to Nailed it Seminar Series. This was a collaboration between AI3SD, PSDS and Patterns to educate on and discuss different aspects of data sharing based on a survey run in 2020. 
URL https://eprints.soton.ac.uk/457085/
 
Title AI3SD Video: Chemical Space Exploration 
Description We explain why genetic algorithms can find molecules with particular properties in an enormous chemical space (ca 10^60 molecules) by considering only a tiny subset (typically 10^3 to 10^6 molecules). I show how genetic algorithms can be used to optimise optical, binding, and catalytic properties of molecules, while ensuring synthetic accessibility. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 138 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/453341/
 
Title AI3SD Video: Collaborative Data Management 
Description This talk forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video was the second talk in the Skills4Scientists #1 - Research Data Management Session, which focussed on several areas of good data management practices. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 50 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL http://eprints.soton.ac.uk/id/eprint/450268
 
Title AI3SD Video: Collaborative Reports/Presentations 
Description This talk forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video was the fourth talk in the Skills4Scientists #5 - Posters, Presentations & Reports Session, which focussed on several areas of communication for your research: presentations, posters and reports. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 18 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL https://eprints.soton.ac.uk/450846/
 
Title AI3SD Video: Combining robotics and Machine Learning for accelerated drug discovery 
Description Artificial intelligence has an increasing impact on drug discovery and development, offering opportunities to identify novel targets, hit, and lead-like compounds in accelerated timeframes. However, the success of any AI/ML model depends on the quality of the input data, and the speed with which in silico predictions can be validated in vitro. The talk will cover laboratory automation and robotics, and how the benefits they offer in terms of quality and speed of data generation synergise with AI/ML-powered drug discovery approaches. The talk will cover some of the general trends in the industry, and also highlight successfully implemented case studies that show how the combination of robotics and AI/ML leads to accelerated project timelines and superior research outputs. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 89 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/452735/
 
Title AI3SD Video: Cross-architecture tuning of quantum devices faster than human experts 
Description A concerning consequence of quantum device variability is that the tuning of each qubit in a quantum circuit constitutes a time-consuming non-trivial process that has to be independently performed for each device, requiring a deep understanding of the particular device to be tuned and "muscle memory". I will show a machine-learning based approach that can tune quantum devices completely automatically, regardless of the device architecture and being agnostic to the material realisation. Our algorithm was able to tune double quantum dot devices defined in a Si FinFET, a Ge/Si core/shell nanowire, and both SiGe and AlGaAs/GaAs heterostructures, successfully accommodating the different modes of gate operation and noise characteristics. We report tuning times as fast as 10 minutes starting from scratch - well over an order of magnitude faster than what would be achievable by a dedicated expert human operator. Just as AlphaZero showed that the achievements of AlphaGo could be extended to learning to win at different board games without needing to be reprogrammed for each, so our result shows that cross-architecture tuning of quantum devices can be achieved using machine learning. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 53 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/468643
 
Title AI3SD Video: Cultivating your Web Presence 
Description This talk forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video was the second talk in the Skills4Scientists #6 - Careers 1 Session, which focussed on several areas of careers advice that will be useful to you as you complete your studies and begin your careers. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 28 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL https://eprints.soton.ac.uk/450840/
 
Title AI3SD Video: DNA: coding blocks for biocompatible assembly & disassembly 
Description This talk forms part of the ML4MC (Machine Learning for Materials and Chemicals) Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Directed Assembly Network. This series ran over summer 2021 and covers topics that encompass our overlapping Network interests of AI, Machine Learning, Artificial Photosynthesis, Biomimetic Materials, Crystal Design & Engineering, Materials, Molecules, Photochemistry, Photocatalysis and Supramolecular Chemistry. This video was the second talk in the ML4MC series and formed part of the session "Research Talks". 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 31 external views in addition to being part of our ML4MC Summer School which supported 31 PhD and Postdoc students in learning more about Machine Learning for Materials and Chemicals. 
URL https://eprints.soton.ac.uk/450666/
 
Title AI3SD Video: Data Analysis Case Study 
Description This talk forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video was the third talk in the Skills4Scientists #4 - Intro to Python 2 Session, which was a follow-on from our Intro to Python 1 course, with a focus on working further with the core elements of Python and performing data analysis, using Jupyter notebooks and Anaconda. This course is designed to allow you to follow along with the content and examples as the course goes, but you will also be provided with course material to allow you to cover it again after the live event. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 78 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL https://eprints.soton.ac.uk/450568/
 
Title AI3SD Video: Data Generation, Data Standards and Metadata Capture in Drug Discovery 
Description Biomedical research and drug discovery are based on a continuous cycle of scientific findings being made, refined, and translated into new treatments. However, over recent years it has become clear that only a fraction of all published research findings are actually reproducible, causing waste and delays in our efforts to bring new drugs to patients. The answer is changing the way we generate and capture data, including experimental metadata. Especially in light of the increasing role of Artificial Intelligence in drug discovery, it is critical to rethink the way we approach data generation as the most important input for AI-driven drug discovery. The talk will address these recent advances in data and metadata capture based on fully automated experimentation and novel data standards. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 166 external views in addition to being part of our Failed it to Nailed it Seminar Series. This was a collaboration between AI3SD, PSDS and Patterns to educate on and discuss different aspects of data sharing based on a survey run in 2020. 
URL https://eprints.soton.ac.uk/447526/
 
Title AI3SD Video: Data legislation, personal and non-personal data, ethical issues and protecting your IP rights 
Description Scientists and researchers handle vast amounts of data in the course of their work, and in recent years technology and computational power have revolutionised the ability to create, store and analyse data. Scientific research now increasingly requires scientists to be skilled in computational analysis and able to work with algorithms, as they deal with larger and larger data sets. The handling of data in science brings with it legal considerations in relation to data protection and intellectual property rights. This talk will give an overview of the data legislation that applies to scientists and researchers including guidance from data protection authorities and decisions taken by courts, the differences between personal and non-personal data, the ethical issues involved in the use of algorithms and how you can protect your intellectual property rights. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 99 external views in addition to being part of our Failed it to Nailed it Seminar Series. This was a collaboration between AI3SD, PSDS and Patterns to educate on and discuss different aspects of data sharing based on a survey run in 2020. 
URL http://eprints.soton.ac.uk/id/eprint/447091
 
Title AI3SD Video: Data management: at the root of high-throughput experimentation 
Description High-throughput experimentation (HTE) is an enabling technology that has had major effects on efficiency in small-molecule industrial chemistry, particularly pharma and agrochemicals. Data management and curation can, and perhaps should, be a guiding strategy for building up advantageous HTE capabilities, with futureproofing for goals including machine learning for reaction optimization. Beyond HTE, rapid access to analytical and project data enables chemists in any industrial role to make not only faster, but better decisions. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 111 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/469332/
 
Title AI3SD Video: Data publication - a personal tale 
Description In this talk, I will discuss the theory and practice of data publication, both from the perspective of an academic journal editor and as a scientific researcher who created datasets, and who got scooped. I'll touch on the importance of data management and data citation, and give an overview of how data publication has grown over the past years, and where we want to be heading in the future. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 47 external views in addition to being part of our Failed it to Nailed it Seminar Series. This was a collaboration between AI3SD, PSDS and Patterns to educate on and discuss different aspects of data sharing based on a survey run in 2020. 
URL https://eprints.soton.ac.uk/447368/
 
Title AI3SD Video: Data-Driven Molecular Design in Computational Toxicology 
Description Timely drug discovery and toxicology approaches have seen a rise in strategies which use data as a basis for decisions at various stages. Such approaches include (automated) data integration and curation efforts, predictive machine learning approaches, as well as structure-based molecular design strategies that make use of the wealth of publicly available data sources and data types. In this talk, various computational workflows which have been developed in my lab for addressing research questions related to toxicology will be presented. In one project, ligand- and structure-based methods have been combined in an effective data-driven manner to decipher the molecular basis of ligand recognition and selectivity for hepatic Organic Anion Transporting Polypeptides (OATPs). In the framework of this successful project, novel highly potent inhibitors of these SLC uptake transporters have been identified by an AI-driven virtual screening approach. At the other end of the spectrum, we use target-agnostic information when the underlying mechanism of toxicity is insufficiently understood. Such approaches allow in vivo data to be leveraged for building predictive machine learning models, but they also make the incorporation of in vitro bioactivity data possible. Another example will illustrate how data integration strategies can be used to consolidate Adverse Outcome Pathway (AOP) hypotheses, which are effective tools in toxicology and risk assessment to capture mechanistic knowledge of critical toxicological effects that span different layers of biological organization. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 86 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/453519/
 
Title AI3SD Video: Data-driven materials discovery for functional applications 
Description Large-scale data-mining workflows are increasingly able to predict successfully new chemicals that possess a targeted functionality. The success of such materials discovery approaches is nonetheless contingent upon having the right data source to mine, adequate supercomputing facilities and machine-learning workflows to calculate or sample a large range of materials, and algorithms that suitably encode structure-function relationships as datamining workflows which progressively short list data toward the prediction of a lead material for experimental validation. This talk shows how to meet these data-science requirements via 'chemistry-aware' natural language processing, image recognition and machine learning developments using case studies to showcase their successful application to data-driven materials discovery. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 270 external views in addition to being part of our Winter Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/448778/
 
Title AI3SD Video: Deep Learning Enhanced Quantum Chemistry: Pushing the limits of Materials Discovery 
Description Atomistic simulation based on quantum mechanics (QM) is currently being revolutionized by machine-learning (ML) methods. Many existing approaches use ML to predict molecular properties from quantum chemical calculations. This has enabled molecular property prediction within vast chemical compound spaces and the high-dimensional parametrization of energy landscapes for the efficient molecular simulation of measurable observables. However, as all properties derive from the QM wave function, an ML model that is able to predict the wave function also has the potential to predict all other molecular properties. In this talk, I will explore ML approaches that directly represent wave functions and QM Hamiltonians and their derivatives for developing methods that use ML and QM in synergy. [1] Using examples from molecular dynamics [1] and heterogeneous catalysis, [2,3] I will discuss the challenges associated with encoding physical symmetries and invariance properties into deep learning models. Upon overcoming these challenges, integrated ML-QM methods offer the combined benefits of big-data-driven parametrization and first-principles-based methods. I will discuss several opportunities associated with building ML-augmented quantum chemical methods, including Inverse Chemical Design based on ML-predicted wave functions and the development of efficient and accurate semi-empirical methods to study hybrid metal-organic materials. [4] [1] KT Schütt, M Gastegger, A Tkatchenko, K-R Müller & RJ Maurer, Nature Communications 10, 5024 (2019). [2] Y Zhang, RJ Maurer, and B Jiang, J. Phys. Chem. C 124, 186-195 (2020); [3] Y Zhang, RJ Maurer, H Guo, and B Jiang, Chem. Sci. 10, 1089-1097 (2019). [4] M Gastegger, A McSloy, M Luya, KT Schütt, RJ Maurer, J. Chem. Phys. 153, 044123 (2020). 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 153 external views in addition to being part of our Winter Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/448773/
 
Title AI3SD Video: Deep Learning enhanced prediction of protein structure and dynamics 
Description Proteins exist in several different conformations. These structural changes are often associated with fluctuations at the residue level. Recent findings showed that co-evolutionary analysis coupled with machine-learning techniques improved the prediction precision by providing quantitative distance predictions between pairs of residues. The predicted statistical distance distribution from the Multi Sequence Analysis (MSA) revealed the presence of different local maxima, suggesting the flexibility of key residue pairs. Here we investigate the ability of the residue-residue distance prediction to provide insights into the protein conformational ensemble. We apply a combination of deep learning approaches and mechanistic modeling to a set of proteins that experimentally showed conformational changes. The predicted protein models were filtered based on their energy score, RMSD clustered, and the centroids locally refined. The models were compared to the experimental-Molecular Dynamics (MD) relaxed structure by analyzing the backbone residue torsional distribution and the sidechain orientations. Our pipeline allows us to retrieve not only the global experimental folding but also the experimental structural dynamics due to local and global conformational changes. Based on the insight of this study we are proposing a protocol that allows the in-silico investigation of protein dynamics suited for pharmacological research on catalysis and molecular recognition. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 146 external views in addition to being part of our AI4Proteins Series. This was a collaboration between AI3SD and RSC-CICAG to discuss key elements of AI 4 Proteins Research. 
URL https://eprints.soton.ac.uk/450159/
 
Title AI3SD Video: DeepDock: a deep learning approach to predict ligand binding conformations 
Description Understanding the interactions formed between a ligand and its molecular target is key to guiding the optimization of molecules. Different experimental and computational methods have been key to better understanding these intermolecular interactions. In this talk I will describe DeepDock, a method based on deep learning that is capable of predicting the binding conformations of ligands to protein targets. Overall, this method performs similarly to or better than well-established scoring functions for docking and screening tasks. The results presented in this talk are an example of how artificial intelligence can be used to improve structure-based drug design. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 383 external views in addition to being part of our AI4Proteins Series. This was a collaboration between AI3SD and RSC-CICAG to discuss key elements of AI 4 Proteins Research. 
URL https://eprints.soton.ac.uk/450162/
 
Title AI3SD Video: Design Fiction as a Method, and why we might use it to consider AI. 
Description AI is a fast moving field that is rapidly advancing and becoming embedded in a multitude of sectors and applications. With such a fast pace, and excitement over the possibilities it allows, there is often a rush to get things going. This being the case, sometimes not enough time is spent considering the implications and unforeseen outcomes that might come from the introduction of new technologies, processes and practices. Ideas that seem plausible and useful can turn out to be problematic when actually implemented, by which time it is often too late. By using the methodology of speculative design, we can more closely examine these implications and outcomes before the technologies become a reality. This talk will introduce speculative design and give some examples of design fiction, a method wherein objects from fictional futures or alternate presents are created to provoke discussion and explore possibilities. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 122 external views in addition to being part of our Summer Seminar Series. This was the first online series we created to engage with our Network members worldwide during the COVID-19 pandemic. This series helped launch our YouTube channel which now has over 700 subscribers, and greatly increased our Network membership. 
URL https://eprints.soton.ac.uk/447293/
 
Title AI3SD Video: Designing molecular models by machine learning and experimental data 
Description The last years have seen an immense increase in high-throughput and high-resolution technologies for experimental observation as well as high-performance techniques to simulate molecular systems at a microscopic level, resulting in vast and ever-increasing amounts of high-dimensional data. However, experiments provide only a partial view of macromolecular processes and are limited in their temporal and spatial resolution. On the other hand, atomistic simulations are still not able to sample the conformation space of large complexes, thus leaving significant gaps in our ability to study molecular processes at a biologically relevant scale. We present our efforts to bridge these gaps, by exploiting the available data and using state-of-the-art machine-learning methods to design optimal coarse models for complex macromolecular systems. We show that it is possible to define simplified molecular models to reproduce the essential information contained both in microscopic simulation and experimental measurements. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 118 external views in addition to being part of our AI4Proteins Series. This was a collaboration between AI3SD and RSC-CICAG to discuss key elements of AI 4 Proteins Research. 
URL https://eprints.soton.ac.uk/450164/
 
Title AI3SD Video: Digitising your Chemistry for Recordability, Shareability and Reproducibility 
Description Mark's talk focused on three specific areas as to how you can digitise your data and workflows to improve your productivity, increase discovery of your data and make your research more reproducible. These tips were broken down into smaller areas in which you could implement them, with examples taken from the chemistry and life sciences domains. The three main areas which Mark included in his tips were: binning the old fashioned write up, collecting data throughout the whole experiment and sharing your data in accessible and transferable formats. The talk gave examples using the Digital Glassware™ products offered by DeepMatter, in addition to ways to incorporate the tips in different systems. Mark concludes his talk by commenting on the large proportion of science that is currently irreproducible and the ways in which human interaction introduces opportunities for error. These tips aim to resolve these issues, increasing the reproducibility of science and reducing the errors. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 200 external views in addition to being part of our Failed it to Nailed it Seminar Series. This was a collaboration between AI3SD, PSDS and Patterns to educate on and discuss different aspects of data sharing based on a survey run in 2020. 
URL https://eprints.soton.ac.uk/447531/
 
Title AI3SD Video: Dimensionality in chemistry: using multidimensional data for machine learning 
Description In the last hundred years mankind has fully absorbed the idea of multi-dimensional space, starting with 4D space time. Due to the increase in computational power, scientists can now manipulate molecules in 4D (3D vibrating molecules in VR) and work with multidimensional datasets, which are needed to utilize big data and machine learning. However, our intuition from 3D space can fall down when dealing with higher dimensions and a lack of intuition can lead to mistakes in analysis. In this talk I will discuss how to think about the best dimensional space to use to describe chemical problems, how multi-dimensional space is different, techniques for using it and analysing the outputs of machine learning. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 215 external views in addition to being part of our Summer Seminar Series. This was the first online series we created to engage with our Network members worldwide during the COVID-19 pandemic. This series helped launch our YouTube channel which now has over 700 subscribers, and greatly increased our Network membership. 
URL https://eprints.soton.ac.uk/447294/
 
Title AI3SD Video: Directed Assembly of Materials - A 50 year retrospective 
Description This talk forms part of the ML4MC (Machine Learning for Materials and Chemicals) Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Directed Assembly Network. This series ran over summer 2021 and covers topics that encompass our overlapping Network interests of AI, Machine Learning, Artificial Photosynthesis, Biomimetic Materials, Crystal Design & Engineering, Materials, Molecules, Photochemistry, Photocatalysis and Supramolecular Chemistry. This video was the first talk in the ML4MC series and was part of the session "All about Directed Assembly". 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 92 external views in addition to being part of our ML4MC Summer School which supported 31 PhD and Postdoc students in learning more about Machine Learning for Materials and Chemicals. 
URL https://eprints.soton.ac.uk/450665/
 
Title AI3SD Video: Discovery of synthesisable organic materials 
Description The computational discovery of new materials with useful properties is currently hindered by the difficulty in transitioning from a computational prediction to synthetic realisation. Attempts at experimental validation are often time-consuming, expensive, and frequently, the key bottleneck of material discovery.[1] Porous organic cages (POCs) have been discovered as a possible alternative material for molecular separations, catalysis, and sensing applications.[2] For POCs, a priori property prediction is possible,[3] however, it can be time-consuming and computationally expensive to explore a large number of possible candidate molecules. Despite being able to predict materials with exceptional properties, it is often challenging to predict whether it is possible to synthetically realise a potential candidate compound. In the field of drug discovery, machine learning techniques have been able to readily distinguish between synthesisable and unsynthesisable molecules, accelerating the drug discovery process.[4] Incorporating a synthetic accessibility scoring function into the precursor selection process favoured less complex, synthetically accessible precursors; thus, bridging the gap between computational screening and experimental synthesis of POCs. Using data-driven synthetic accessibility scoring techniques and high-throughput experimentation, we developed a POC screening workflow to accelerate discovery of experimentally realisable POC candidates, which we demonstrate using high-throughput, automated experimentation. Existing measures of synthetic accessibility are often tailored towards predicting synthesisable drug-like molecules, whose synthetic requirements often do not align with those of materials discovery programs. 
By redefining synthetic accessibility as a classification problem, we were able to develop an alternative model able to predict synthesisable materials precursors.[5] Biasing towards easy-to-synthesise precursors facilitated the synthesis of several precursors predicted to form shape-persistent POCs. Using these novel precursors, we were able to construct a precursor library able to be combined using automation, enabling the accelerated discovery of POCs and the construction of an experimentally derived POC reaction dataset. Using this dataset, we aim to develop a model able to predict POC formation, a question challenging to address using conventional computational methods. [1] Szczypinski, F. T.; Bennett, S.; Jelfs, K. E. Can We Predict Materials That Can Be Synthesised? Chem. Sci. 2021, 12 (3), 830-840. [2] Hasell, T.; Cooper, A. I. Porous Organic Cages: Soluble, Modular and Molecular Pores. Nat Rev Mater 2016, 1 (9), 16053. [3] Greenaway, R. L.; Jelfs, K. E. High-Throughput Approaches for the Discovery of Supramolecular Organic Cages. ChemPlusChem 2020, 85 (8), 1813-1823. [4] C. W. Coley, L. Rogers, W. H. Green and K. F. Jensen, SCScore: Synthetic Complexity Learned from a Reaction Corpus, J. Chem. Inf. Model., 2018, 58, 252-261 [5] Bennett, S., Szczypinski, F.T., Turcani, L., Briggs, M.E., Greenaway, R.L., and Jelfs, K.E. (2021) Materials Precursor Score: Modelling Chemists' Intuition for the Synthetic Accessibility of Porous Organic Cage Precursors. J. Chem. Inf. Model., 61 (9), 4342-4356. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 37 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/470004
 
Title AI3SD Video: Drug Repositioning for COVID-19 
Description Pandemics, such as Covid-19, are by definition essentially unanticipatable and rapid in onset, features that are unfortunately incompatible with current industry capabilities in drug discovery. This has led to a large number of studies, both theoretical and experimental, to reposition, or reuse, an existing drug for Covid-19 therapy. There are some general patterns of success in historical repositioning that point to the most likely strategies for drug repositioning and, following some specific data gathering and curation, towards specific actionable activities for Covid-19. The presentation will briefly overview drug repositioning as a general strategy, and then the focussed application of core concepts towards the treatment of Covid-19. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 679 external views in addition to being part of our Summer Seminar Series. This was the first online series we created to engage with our Network members worldwide during the COVID-19 pandemic. This series helped launch our YouTube channel which now has over 700 subscribers, and greatly increased our Network membership. 
URL https://eprints.soton.ac.uk/446448/
 
Title AI3SD Video: Equality, Diversity & Inclusion in Networks: Developing your inclusive approach 
Description This session will give delegates a framework in which to consider how they can develop an inclusive approach to their work, team and research. This talk will provide some examples of how to do this and inspire delegates to develop their own approaches. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 25 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/468642
 
Title AI3SD Video: Ethical data management - balancing individual privacy and public benefit 
Description This talk will cover aspects of ethical data management, focussing on the key issues of participant consent, data minimisation, and data anonymisation, using examples from health sciences and engineering. Content within the talk aims to cover: big picture issues (societal benefits to data sharing versus individual right to privacy), relevant legislation (GDPR, DPA 2018 and FoIA 2000), what happens when things go wrong, managing risk via informed consent, data minimisation and anonymisation (formal, statistical and functional) and best practice guidelines and tools. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 54 external views in addition to being part of our Failed it to Nailed it Seminar Series. This was a collaboration between AI3SD, PSDS and Patterns to educate on and discuss different aspects of data sharing based on a survey run in 2020. 
URL http://eprints.soton.ac.uk/id/eprint/447090
 
Title AI3SD Video: Event detection in single-molecule data - how to find molecular signatures without (too many) prior assumptions 
Description Data from single-molecule experiments, such as from current-time or conductance-distance spectroscopy or sensors, are often "noisy" and characterised by complex molecular behaviour. In some cases, extracting the physically relevant information may be based on supervised approaches, i.e. where labelled data are available for training. In other cases, such data are either not available or it may simply be undesirable to make a priori assumptions about the molecular characteristics, for example to prevent loss of information and expectation bias.[1,2] This may require unsupervised methods or alternative approaches that put an emphasis on "what is not background?", rather than "what does an event look like?". In my talk, I will discuss some of the approaches we have taken, including some based on image recognition networks (AlexNet, VGG16),[3,4] and show those can be used to extract not only physically meaningful characteristics, but also previously unknown molecular behaviour. [1] M. Lemmer et al., "Unsupervised vector-based classification of single-molecule charge transport data", Nat. Commun. 2016, 7, art. no. 12922 [2] T. Albrecht et al., "Deep learning for single-molecule science", Nanotechnol. 2017, 28, 423001. [3] A. Vladyka, T. Albrecht, "Unsupervised classification of single-molecule data with autoencoders and transfer learning", Machin. Learn.: Sci. Technol. 2020, 1, 035013. [4] C. Weaver et al., "Unsupervised Classification of Voltammetric Data with Image Recognition and Dimensionality Reduction" (in preparation) 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 44 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/468641
 
Title AI3SD Video: Explainable Machine Learning for Trustworthy AI 
Description Black box AI systems for automated decision making, often based on machine learning over (big) data, map a user's features into a class or a score without exposing the reasons why. This is problematic not only for the lack of transparency, but also for possible biases inherited by the algorithms from human prejudices and collection artifacts hidden in the training data, which may lead to unfair or wrong decisions. The future of AI lies in enabling people to collaborate with machines to solve complex problems. Like any efficient collaboration, this requires good communication, trust, clarity and understanding. Explainable AI addresses such challenges, and for years different AI communities have studied this topic, leading to different definitions, evaluation protocols, motivations, and results. This lecture provides a reasoned introduction to the work of Explainable AI (XAI) to date, and surveys the literature with a focus on machine learning and symbolic AI related approaches. We motivate the need for XAI in real-world and large-scale applications, while presenting state-of-the-art techniques and best practices, as well as discussing the many open challenges. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 124 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/451923/
 
Title AI3SD Video: Finding Small Molecules in Big Data 
Description The environment and the chemicals to which we are exposed are incredibly complex, with over 111 million chemicals in the largest open chemical databases, 300,000 estimated in global inventories of high use, and over 70,000 in household use alone. Detectable molecules in environmental samples, metabolomics and exposomics can now be captured using high resolution mass spectrometry (HRMS), which provides a "snapshot" of all chemicals present in a sample and allows for retrospective data analysis through digital archiving. However, there is no "one size fits all" analytical method, and scientists cannot yet identify most of the tens of thousands of features in each sample, let alone associate them with health or disease, leading to critical bottlenecks in identification and data interpretation. Defining the chemical space to search is a huge challenge, especially considering that chemicals transform in both organisms (metabolism) and the environment (both biotic and abiotic processes). This talk will cover European and worldwide community initiatives and resources to help find and identify small molecules and their metabolites (transformation products) - from compound databases to spectral libraries, from literature mining to transformation prediction. It will show how FAIR and Open interdisciplinary efforts and data sharing can facilitate research in many areas of small molecule research. Various contributors to this massive collaborative effort will be acknowledged throughout the talk. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 78 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/453352/
 
Title AI3SD Video: Finding new in silico-based therapeutic strategies for IAHSP 
Description Infantile-onset ascending hereditary spastic paralysis (IAHSP) is a rare neurodegenerative autosomal recessive disease which affects fewer than 50 people worldwide. The pathogenesis starts in early childhood, with a progressive degeneration of the upper spinal motoneuron, progressively hindering ambulation before spreading to the upper limbs and to the involuntary musculature (1). As often occurs for rare diseases, despite little interest from the pharmaceutical sector, some information regarding this condition is available from case reports: the key events responsible for this condition are mutations in the gene ALS2, which encodes the cell trafficking-related protein alsin. Nevertheless, the relatively broad mutational landscape and the low number of reported cases still make a complete understanding of the pathophysiology and the search for suitable therapeutic strategies challenging. The majority of mutations described in the literature result in a truncated form of alsin which is thought to be degraded, thus depicting a scenario of loss-of-function pathogenesis. Nevertheless, some patients carry missense mutations, leading to non-degraded, mutated forms. In those cases, the majority of amino-acid (aa) substitutions occur in the N-terminal RLD domain, essential for alsin localization to the plasma membrane and eventually to early and late endosomes upon activation of the RAC1 pathway. In endosomes, alsin binds to the small GTPase Rab5 and performs a guanine nucleotide exchange factor (GEF) activity through its C-terminal VPS9 domain (2). This pathway is thought to be the major strategy that mammalian cells follow in order to assemble endosomes and exchange materials within the cell architecture. In large cells such as motoneurons, coordinated and efficient cell trafficking is crucial for correct development and maintenance of function. 
Alsin exists in cytoplasmic solution as a tetramer, first assembled by parallel dimerization through the VPS9 domain and subsequently by interaction of two dimers through their DH/PH domain, located upstream of the VPS9 region (2). The first challenge that such a broad mutational landscape offers is that different mutations correspond to different multimers. These states do not just affect stability and solubility, but also subcellular localization and GEF activity. To make this situation more challenging, there is no experimentally-resolved 3D structure of alsin, and a homology modeling effort to build the whole protein seems questionable because of the lack of a reliable template. In contrast with the majority of reports, here we present a patient case harboring two alsin mutations in the C-terminal region: one allele translates a frame-shifted, truncated form which gets degraded. The other allele harbors the R1611W aa substitution in the VPS9 domain. With the aid of in silico computational tools, we managed to predict the 3D structure of normal and mutated forms of this domain. Moreover, we characterized physiologic and pathologic dimerization modes, discovering that mutated VPS9 preferentially forms an antiparallel dimer by interacting with the aforementioned RLD domain. We could link this discovery to the experimentally-determined loss of tetrameric aggregation and, more importantly, to the incorrect endosome localization. This finding corroborates and gives a mechanistic explanation for the experimentally-characterized reduced Rab5 GEF endosomal activity. Furthermore, we performed an in silico virtual screening, repurposing an already commercialized drug which is able to shield the pathologically-acquired hydrophobic moiety. In our hypothesis, this mechanism of action re-establishes physiological dimerization mode, subcellular localization and Rab5 activity in R1611W-mutated patients. 
The candidate is currently under pre-clinical testing in an alsin R1611W cellular model. Our hope and the scope of our effort is twofold: first, we want to provide a reliable treatment for alleviating symptoms and disease progression for our patient. Second, we would like to broaden the knowledge in the field and, by integrating in silico and in vitro procedures, establish a lean research pipeline that might one day serve as a mutation-based platform for individual drug repurposing for the treatment of alsin-related diseases. References: (1) Lesca, G. et al. Infantile ascending hereditary spastic paralysis (IAHSP): Clinical features in 11 families. Neurology 60, 674-682 (2003). (2) Sato, K. et al. Altered oligomeric states in pathogenic ALS2 variants associated with juvenile motor neuron diseases cause loss of ALS2-mediated endosomal function. J. Biol. Chem. 293, 17135-17153 (2018). 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 51 external views in addition to being part of our AI4Proteins Series. This was a collaboration between AI3SD and RSC-CICAG to discuss key elements of AI 4 Proteins Research. 
URL https://eprints.soton.ac.uk/450163/
 
Title AI3SD Video: Fireflies-Lévy Flights algorithm for peptides conformational optimization 
Description Over the last 50 years, several algorithms and approaches have been introduced and improved to tackle the challenges of exploring a large and multidimensional conformational space. Optimisation algorithms are frequently used to guide the search in the conformational space of complex molecules such as proteins. This is a crucial step in accessing the molecular properties corresponding to the most stable conformer. The optimisers are usually buried in docking software with limited tuning possibilities. We implement a Fireflies algorithm with a Lévy flights distribution to search for the lowest energy conformations of peptides. The hyperparameters of this bio-inspired metaheuristic algorithm are tuned and its performance is compared with the state-of-the-art method. Our results show that the Fireflies-Lévy flights algorithm is able to improve upon the genetic algorithm method with fewer energy evaluations. To the best of our knowledge, this is the first cheminformatics application that will open the door to additional nature-inspired metaheuristics to support the conformational analysis of large biomolecules. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 107 external views in addition to being part of our AI4Proteins Series. This was a collaboration between AI3SD and RSC-CICAG to discuss key elements of AI 4 Proteins Research. 
URL https://eprints.soton.ac.uk/450210/
 
Title AI3SD Video: General Effects of AI on Drug Discovery 
Description The advent of gradually more effective AI/ML techniques is already having effects on the traditional practices of medicinal chemistry and drug discovery in general. What can we expect as the process goes on, and how will drug discovery scientists have to adjust their thinking and their research roles? 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 1248 external views in addition to being part of our AI4Proteins Series. This was a collaboration between AI3SD and RSC-CICAG to discuss key elements of AI 4 Proteins Research. 
URL https://eprints.soton.ac.uk/450173/
 
Title AI3SD Video: Generating a Machine-Learned Equation of State for Fluid Properties 
Description Equations of state (EoS) for fluids have been a staple of engineering design and practice for over a century. Available EoS are based on the fitting of a closed-form analytical expression to suitable experimental thermophysical data of fluids. The mathematical structure and the underlying physical model significantly restrain the applicability and accuracy of the resulting EoS. This contribution explores the issues surrounding the substitution of machine-learned models for analytical EoS. In particular, we describe, as a proof of concept, the effectiveness of a machine-learned model to replicate the statistical associating fluid theory (SAFT-VR Mie) EoS for pure fluids. To quantify the effectiveness of machine-learning techniques, a large set of pseudo-data is obtained from the EoS and used to train the machine-learning models. We employ artificial neural networks and Gaussian process regression to correlate and predict thermodynamic properties such as critical pressure and temperature, vapor pressures, and densities of pure model fluids; these are performed on the basis of molecular descriptors. The comparisons between the machine-learned EoS and the surrogate data set suggest that the proposed approach shows promise as a viable technique for the correlation and prediction of thermophysical properties of fluids. This work opens a pathway for employing classical molecular simulations with classical force fields as a feeder of pseudo-data of fluids in the search for ML physical property prediction. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 82 external views in addition to being part of our Winter Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/448772/
 
Title AI3SD Video: GitHub & LaTeX Demo 
Description This talk forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video was the third talk in the Skills4Scientists #3 - Version Control and LaTeX Session, which focussed on teaching the basics of LaTeX and version control. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 92 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL https://eprints.soton.ac.uk/450565/
 
Title AI3SD Video: Giving your Open Data the best chance to realise its potential 
Description Chris is not a researcher, but he's worked with a lot of them over 23 years at the University of Southampton. He's seen a lot of hard work on open data fail to achieve the potential it could have. A common issue is that rather than face the reality of what's going wrong, it's easier to invest more in the aspects of your dataset and data service that are good than to identify and fix aspects that are bad. Such "hygiene factors" don't have to be perfect but they must all be good enough. Failure in any one may lead to failure of your data to achieve its potential, no matter how well you do on other factors. Chris will give some examples of the most common open data hygiene factors, and some tips from the public sector open data community on how to address them pragmatically. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 65 external views in addition to being part of our Failed it to Nailed it Seminar Series. This was a collaboration between AI3SD, PSDS and Patterns to educate on and discuss different aspects of data sharing based on a survey run in 2020. 
URL https://eprints.soton.ac.uk/447527/
 
Title AI3SD Video: H2020 Project Onto Trans 
Description The OntoTrans project addresses the need of industry to respond to manufacturing challenges more efficiently by accessing the relevant information and utilising materials modelling more effectively. In particular, there is a need to strengthen the use of translation as a router supporting end users to get to relevant data and models. OntoTrans will provide a general-purpose ontology-based Open Translation Environment (OTE) and End User Apps for four innovation challenges, delivering smart guidance for materials producers and product manufacturers through all the steps of the translation process. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 277 external views in addition to being part of our Winter Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/448774/
 
Title AI3SD Video: Hints and tips for optimising your researchfish data 
Description Gavin Reddick (Chief Analyst at Interfolio UK) will talk about practical things that researchers and universities can do to reduce the amount of time and effort needed to complete their annual reporting to funders via the researchfish platform, as well as helping to ensure more accurate and useful data is collected. He will also touch on what you can get from researchfish and things you might want to use it for. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 29 external views in addition to being part of our Failed it to Nailed it Seminar Series. This was a collaboration between AI3SD, PSDS and Patterns to educate on and discuss different aspects of data sharing based on a survey run in 2020. 
URL http://eprints.soton.ac.uk/id/eprint/457241
 
Title AI3SD Video: How can Explainable AI help scientific exploration? 
Description Although models developed using machine learning are increasingly prevalent in scientific research, their opacity poses a threat to their utility. Explainable AI (XAI) aims to diminish this threat by rendering opaque models transparent. But XAI is more than just the solution to a problem: it can also play an invaluable role in scientific exploration. In this talk, I will consider different techniques from Explainable AI to demonstrate their potential contribution to different kinds of exploratory activities. In particular, I argue that XAI tools can be used (1) to better understand what a "big data" model is a model of, (2) to engage in causal inference over high-dimensional nonlinear systems, and (3) to generate algorithmic-level hypotheses in cognitive science and neuroscience. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 109 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/451912/
 
Title AI3SD Video: How to detect unexpected features & physical processes in single-molecule data 
Description This talk forms part of the ML4MC (Machine Learning for Materials and Chemicals) Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Directed Assembly Network. This series ran over summer 2021 and covers topics that encompass our overlapping Network interests of AI, Machine Learning, Artificial Photosynthesis, Biomimetic Materials, Crystal Design & Engineering, Materials, Molecules, Photochemistry, Photocatalysis and Supramolecular Chemistry. This video was the sixth talk in the ML4MC series and formed part of the session "Research Talks". 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 38 external views in addition to being part of our ML4MC Summer School which supported 31 PhD and Postdoc students in learning more about Machine Learning for Materials and Chemicals. 
URL https://eprints.soton.ac.uk/450670/
 
Title AI3SD Video: Hyperparameter Optimisation for Graph Neural Networks 
Description Traditional deep learning has made significant progress on various problems, from computer vision to natural language processing. For graph problems, there are still many challenges. Graph neural networks (GNNs) have been proposed for a wide range of learning tasks in the graph domain. In particular, in recent years, an increasing number of GNN models have been applied to model molecular graphs and predict the properties of the corresponding molecules. However, a direct impediment to achieving good performance at lower computational cost is the selection of appropriate hyperparameters. Meanwhile, many molecular datasets are far smaller than the datasets used in typical deep learning applications. Most hyperparameter optimization (HPO) methods for deep learning have not been explored in terms of their efficiency on such small datasets in the molecular domain. We conducted theoretical analyses for popular HPO methods (random search, TPE, and CMA-ES) and proposed a genetic algorithm with a hierarchical evaluation strategy and tree-structured mutation for HPO. Finally, we believe that our work will motivate further research into GNNs as applied to molecular machine learning problems and facilitate scientific discovery. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 68 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/453342/
 
Title AI3SD Video: InChI: Measuring the Molecules 
Description The IUPAC international chemical identifier, InChI, provides a way to name molecules. It is defined by an open algorithm that transforms molecular structures into unique strings of text. Each molecule should have exactly one InChI, and each InChI should correspond to exactly one molecule. This property makes it a useful tool in the management of chemical information, and it is widely used. The InChI Trust and IUPAC are continuing to work on developing the standard and on creating new tools which are built on the InChI. This talk will outline how the InChI is used now, and how this may develop in the future. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 1584 external views in addition to being part of our Summer Seminar Series. This was the first online series we created to engage with our Network members worldwide during the COVID-19 pandemic. This series helped launch our YouTube channel which now has over 700 subscribers, and greatly increased our Network membership. 
URL https://eprints.soton.ac.uk/447425/
 
Title AI3SD Video: Inference from Medical Images: Subspaces for Low Data Regimes 
Description Unlike in the field of visual scene recognition, where tremendous advances have taken place due to the availability of very large datasets to train deep neural networks, inference from medical images is often hampered by the fact that only small amounts of data may be available. When working with very small dataset problems, of the order of a few hundred items of data, the power of deep learning may still be exploited by using a model pre-trained on natural images as a feature extractor and carrying out classic pattern recognition techniques in this feature space, the so-called few-shot learning problem. In regimes where the dimension of this feature space is comparable to or even larger than the number of items of data, dimensionality reduction is a necessity and is often achieved by principal component analysis or singular value decomposition (PCA/SVD). In this paper, noting the inappropriateness of using SVD for this setting, we explore two alternatives based on non-negative matrix factorization (NMF) and discriminant analysis. Using 14 different datasets spanning 11 distinct disease types, we demonstrate that at low dimensions, discriminant subspaces achieve significant improvements over SVD and the original feature space. We also show that at modest dimensions, NMF is a competitive alternative to SVD in this setting. Joint work with Jiahui Liu, Keqiang Fan and Xiaohao Cai. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 49 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/468640
 
Title AI3SD Video: Interpretable Machine Learning for Materials' Design and Characterisation 
Description "Where is the knowledge we have lost in information?" (T.S. Eliot, The Rock). Machine learning (ML) and artificial intelligence (AI) are the subjects of wildly differing opinions on utility and potential impact. Depending on who we talk to, ML is the solution to almost every human challenge, from open borders to pandemic control, or presents an existential crisis for the species. In materials science the polarisation is perhaps less extreme, but nonetheless pervasive: while the number of ML-related works is exploding, one highly respected theoretical chemist recently pronounced "[a]t least 50% of the machine learning papers I see regarding electronic structure are junk". Part of the issue that many detractors have with ML methods is related to their perception of the techniques as "black-box" approaches; at the same time, the same lack of understanding of the limitations of the models leads to some of the more outlandish boosterism surrounding the subject. In this talk I will discuss, with examples from our work, how we can open up the black-box of ML methods, highlighting and understanding limitations, increasing trust in results, and potentially improving the methods themselves. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 126 external views in addition to being part of our AI4SD Network Conference, which was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/468639
 
Title AI3SD Video: Interpretable machine learning for materials design and characterization 
Description In a plenary lecture at a recent international conference, one leading researcher in theoretical chemistry remarked "at least 50% of the machine learning papers I see regarding electronic structure theory are junk, and do not meet the minimal standards of scientific publication", specifically referring to the lack of insight in many publications applying ML in that field. But is knowledge inevitably lost in machine learning studies? If not, how can it be extracted, and how does this apply to machine learning in the context of materials science? In this talk I will look at how we can open up black-box machine learning models to understand the results and gain confidence in predictions. I will present topical examples from designing new dielectric crystals, understanding inelastic neutron scattering data and trusting deep neural networks for tomographic reconstruction. By understanding how and why these models work, we can trust the results and even discover new physical relationships. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 144 external views in addition to being part of our Winter Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/448780/
 
Title AI3SD Video: Interpreting opacity: understanding gaps in our explanations of artificial neural networks 
Description We know everything that goes on within artificial neural networks. We tend to know of all the data such systems have been trained on. And designers will be aware of the various design decisions, training algorithms and techniques that went into their construction, too. At the same time, leading AI designers tell us that their systems are in some sense uninterpretable, inexplicable or opaque. That's puzzling. Drawing on discussions in the philosophy of neuroscience and science more generally, I will make use of this puzzle to try to advance our understanding of what explanations we lack with respect to ANNs, and hence of the nature and scope of explanation. The puzzle helps us to distinguish different phenomena in need of explanation, and some limits to the mechanistic explanatory strategies so often helpfully employed in the cognitive neurosciences. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 44 external views in addition to being part of our AI4SD Network Conference, which was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/470014
 
Title AI3SD Video: Intro to Ethics 
Description This talk forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video was the first talk in the Skills4Scientists #7 - Ethical Research event. This talk provided a brief intro to some of the areas of consideration when looking at the ethical conduct of research. This is not any form of formal ethics training and if you wish to learn more about ethics you should contact the relevant departments at your institution. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 31 external views in addition to being part of our Summer Seminar Series. This was the first online series we created to engage with our Network members worldwide during the COVID-19 pandemic. This series helped launch our YouTube channel which now has over 500 subscribers, and greatly increased our Network membership. 
URL https://eprints.soton.ac.uk/451137/
 
Title AI3SD Video: Intro to LaTeX 
Description This talk forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video was the first talk in the Skills4Scientists #3 - Version Control and LaTeX Session, which focussed on teaching the basics of LaTeX and version control. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 117 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL https://eprints.soton.ac.uk/450562/
 
Title AI3SD Video: Introducing the Future Blood Testing Network 
Description The Future Blood Testing Network+ is a new Network funded by EPSRC. We are aiming to build a multi-disciplinary community to develop digital health technologies for remote, rapid, affordable and inclusive monitoring and personalised analytics. This presentation will introduce our Network, detailing our plans for the next three years, in particular highlighting the funding calls and opportunities that will be relevant to the AI4SD Community. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact This talk introduced the Future Blood Testing Network to the AI4SD network community; through this, the Future Blood Testing Network gained some new members. It received 78 views. 
URL https://eprints.soton.ac.uk/id/eprint/468627
 
Title AI3SD Video: Introduction to Git 
Description This talk forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video was the second talk in the Skills4Scientists #3 - Version Control and LaTeX Session, which focussed on teaching the basics of LaTeX and version control. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 76 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL https://eprints.soton.ac.uk/450563/
 
Title AI3SD Video: LaTeX in Overleaf 
Description This talk forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video was the fourth talk in the Skills4Scientists #3 - Version Control and LaTeX Session, which focussed on teaching the basics of LaTeX and version control. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 67 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL https://eprints.soton.ac.uk/450567/
 
Title AI3SD Video: Learning to Control Quantum Systems Robustly 
Description Quantum control provides methods to steer the dynamics of quantum systems. The robustness of such controls, in addition to high fidelity, is important for practical applications due to the presence of uncertainties arising from limited knowledge about system and control Hamiltonians, initial state preparation errors, and interactions with the environment leading to decoherence. We introduce a novel robustness measure based on the Wasserstein distance, and discuss structured singular value analysis and log-sensitivity approaches from classical robust control. These are employed to analyse the robustness of controllers found by reinforcement learning and gradient-based optimisation algorithms. Some, but not all, high-fidelity controllers are also robust, and controllers found by reinforcement learning appear less affected by noise than those found by gradient-based optimisation. We briefly discuss applications in information transfer in spin networks and magnetic resonance spectroscopy. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 86 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/453337/
 
Title AI3SD Video: Lessons learned from generative models of biological sequences 
Description De novo protein design for catalysis of any desired chemical reaction is a long-standing goal in protein engineering because of the broad spectrum of technological, scientific and medical applications. However, mapping protein sequence to protein function is currently neither computationally nor experimentally tangible. Here, I will present a recently developed ProteinGAN approach, a self-attention-based variant of the generative adversarial network that is able to "learn" natural protein sequence diversity and enables the generation of functional protein sequences. ProteinGAN learns the evolutionary relationships of protein sequences directly from the complex multidimensional amino-acid sequence space and creates new, highly diverse sequence variants with natural-like physical properties. Using malate dehydrogenase (MDH) as a template enzyme, we show that 24% (13 out of 55 tested) of the ProteinGAN-generated and experimentally tested sequences are soluble and display MDH catalytic activity in the tested conditions in vitro, including a highly mutated variant of 106 amino-acid substitutions. ProteinGAN therefore demonstrates the potential of artificial intelligence to rapidly generate highly diverse functional proteins within the allowed biological constraints of the sequence space. Talk is based on recently published work: Repecka, D., Jauniskis, V., Karpus, L. et al. Expanding functional protein sequence spaces using generative adversarial networks. Nat Mach Intell 3, 324-333 (2021). https://doi.org/10.1038/s42256-021-00310-5 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 203 external views in addition to being part of our AI4Proteins Series. This was a collaboration between AI3SD and RSC-CICAG to discuss key elements of AI 4 Proteins Research. 
URL https://eprints.soton.ac.uk/450161/
 
Title AI3SD Video: Linked Data - Examples and Heuristics 
Description With their inherent flexibility and robustness to change, the decentralised interconnected knowledge graphs that lie at the heart of semantic web technologies are ideally suited for the challenges of converting the messy, often incomplete, and internally heterogeneous datasets of the Humanities into machine processable data. Although a matter of some debate, the reuse and adoption of known ontologies, schema, and taxonomies across disparate projects across the Arts, Humanities, and Social Sciences landscape has been steadily increasing over the last decade in particular. This talk will describe the practical approaches and heuristics of such Linked Data projects, commenting on the effect of political, institutional, and socio-cultural factors in their planning, implementation, and evaluation. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 78 external views in addition to being part of our Failed it to Nailed it Seminar Series. This was a collaboration between AI3SD, PSDS and Patterns to educate on and discuss different aspects of data sharing based on a survey run in 2020. 
URL https://eprints.soton.ac.uk/447528/
 
Title AI3SD Video: Love notes to the future: the importance of metadata 
Description Isobel's talk focused on the importance of good research data management and how this can pay off in the future. This talk discussed four main aspects of data management: the data management plan, data storage, finding your data and sharing your data. Good data management is really important. Ultimately, managing your data well will save you time and effort in the future, making it easier to find, use, and distribute to others later on. A core part of data management is the data management plan, which should set out the full plan for what data is going to be gathered, how it is going to be catalogued and in what formats it is going to be stored (paper/electronic/physical data). 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 119 external views in addition to being part of our Failed it to Nailed it Seminar Series. This was a collaboration between AI3SD, PSDS and Patterns to educate on and discuss different aspects of data sharing based on a survey run in 2020. 
URL https://eprints.soton.ac.uk/447529/
 
Title AI3SD Video: ML1: Mathematical Foundations for ML 
Description This video is the first of Mahesan Niranjan's 5 part lecture series on Machine Learning for our Summer School. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 71 external views in addition to being part of our Hybrid Machine Learning Summer School that provided training to over 100 students. 
URL https://eprints.soton.ac.uk/469865/
 
Title AI3SD Video: ML2: Estimation with Machine Learning 
Description This video is the second of Mahesan Niranjan's 5 part lecture series on Machine Learning for our Summer School. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 87 external views in addition to being part of our Hybrid Machine Learning Summer School that provided training to over 100 students. 
URL https://eprints.soton.ac.uk/469866/
 
Title AI3SD Video: ML3: Classification and Clustering 
Description This video is the third of Mahesan Niranjan's 5 part lecture series on Machine Learning for our Summer School. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 29 external views in addition to being part of our Hybrid Machine Learning Summer School that provided training to over 100 students. 
URL https://eprints.soton.ac.uk/469867/
 
Title AI3SD Video: ML4: Linear Regression to Perceptron Convergence 
Description This video is the fourth of Mahesan Niranjan's 5 part lecture series on Machine Learning for our Summer School. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 37 external views in addition to being part of our Hybrid Machine Learning Summer School that provided training to over 100 students. 
URL https://eprints.soton.ac.uk/469868/
 
Title AI3SD Video: ML5: Radial Basis Functions and Multi-Layer Perceptrons 
Description This video is the fifth of Mahesan Niranjan's 5 part lecture series on Machine Learning for our Summer School. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 30 external views in addition to being part of our Hybrid Machine Learning Summer School that provided training to over 100 students. 
URL https://eprints.soton.ac.uk/469869/
 
Title AI3SD Video: Machine Learning and AI for Drug Design 
Description Artificial Intelligence has become impactful during the last few years in chemistry and the life sciences, pushing the scientific boundaries forward as exemplified by the recent success of AlphaFold2. In this presentation I will provide an overview of how AI has impacted drug design in the last few years, where we are now and what progress we can reasonably expect in the coming years. The presentation will focus on deep-learning-based molecular de novo design; however, aspects of synthesis prediction, molecular property prediction and chemistry automation will also be covered. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 151 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/452735/
 
Title AI3SD Video: Machine Learning for Early Stage Drug Discovery 
Description Professor Charlotte Deane from the University of Oxford speaks about some of the work her research group have done on Machine Learning for Early Stage Drug Discovery to give a flavour of the different kinds of approaches they have been looking at. These run from predicting whether molecules will bind or not bind to a given protein target, to trying to remove biases from that kind of work, to finally how do we generate novel molecules in the protein binding sites. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 477 external views in addition to being part of our Summer Seminar Series. This was the first online series we created to engage with our Network members worldwide during the COVID-19 pandemic. This series helped launch our YouTube channel which now has over 700 subscribers, and greatly increased our Network membership. 
URL https://eprints.soton.ac.uk/447162/
 
Title AI3SD Video: Machine Learning for biological sequence design 
Description Prediction of protein functional properties from sequence is a central challenge that would allow us to discover new proteins with specific functionality. Experimental breakthroughs allow data on the relationship between sequence and function to be rapidly acquired that can be used to train and validate machine learning models that predict protein function directly from sequence. However, the cost and latency of wet-lab experiments require methods that find good sequences in few experimental rounds, where each round contains large batches of sequence designs. In this setting, I will discuss model-based optimization approaches that allow us to take advantage of sample inefficient methods and find diverse optimal sequence candidates for experimental evaluation. The potential of this approach is illustrated through the design and experimental validation of viable AAV capsid protein variants for gene therapy applications. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 634 external views in addition to being part of our AI4Proteins Series. This was a collaboration between AI3SD and RSC-CICAG to discuss key elements of AI 4 Proteins Research. 
URL https://eprints.soton.ac.uk/450084/
 
Title AI3SD Video: Machine Learning with Causality in Chemistry 
Description This talk forms part of the ML4MC (Machine Learning for Materials and Chemicals) Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Directed Assembly Network. This series ran over summer 2021 and covers topics that encompass our overlapping Network interests of AI, Machine Learning, Artificial Photosynthesis, Biomimetic Materials, Crystal Design & Engineering, Materials, Molecules, Photochemistry, Photocatalysis and Supramolecular Chemistry. This video was the seventh talk in the ML4MC series and formed part of the session "Research Talks". 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 96 external views in addition to being part of our ML4MC Summer School which supported 31 PhD and Postdoc students in learning more about Machine Learning for Materials and Chemicals. 
URL https://eprints.soton.ac.uk/450672/
 
Title AI3SD Video: Machine Learning with Causality: Solubility Prediction in Organic Solvents and Water 
Description Solubility prediction remains a critical challenge in drug development, synthetic route and chemical process design, extraction and crystallisation. Here we report a successful approach to solubility prediction in organic solvents and water using a combination of machine learning (ANN, SVM, RF, ExtraTrees, Bagging and GP) and computational chemistry. Rational translation of the dissolution process into a numerical problem led to a small set of selected descriptors and subsequent predictions which are independent of the applied machine learning method. These models gave significantly more accurate predictions compared to benchmarked open-access and commercial tools, achieving accuracy close to the expected level of noise in training data (LogS ± 0.7). Finally, they reproduced the physicochemical relationships between solubility and molecular properties in different solvents, which led to rational approaches to improve the accuracy of each model. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 106 external views in addition to being part of our Winter Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/448771/
 
Title AI3SD Video: Machine learning applications for macro-molecular X-ray crystallography at Diamond 
Description Proteins are the core machinery in any living organism. Understanding their structure means understanding their function and the mechanism with which they carry out this function. In many diseases, the structure of a protein is altered through amino acid exchange, usually as a result of mutations in the encoding DNA. The changes in the structure in turn alter the functions and mechanisms in proteins. Being able to understand these changes on an atomic level offers the opportunity to design drugs 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 129 external views in addition to being part of our AI4Proteins Series. This was a collaboration between AI3SD and RSC-CICAG to discuss key elements of AI 4 Proteins Research. 
URL https://eprints.soton.ac.uk/450085/
 
Title AI3SD Video: Machine learning for electronically excited states of molecules 
Description Westermayr, Julia (2021) AI3SD Video: Machine learning for electronically excited states of molecules. Kanza, Samantha, Frey, Jeremy G., Niranjan, Mahesan and Hooper, Victoria (eds.) AI3SD Winter Seminar Series, Online, 18 Nov 2020 - 21 Apr 2021 (doi:10.5258/SOTON/P0080). An accurate simulation of the excited states of molecules can enable the study of many important processes that are fundamental to nature and the life forms we know, but these calculations are seriously limited by the high complexity and computational efforts involved. In this talk, I will discuss how machine learning algorithms can enable an efficient and accurate computation of photo-initiated reactions of molecules - from light excitation to nonradiative decay [1]. Using the example of the methylenimmonium cation, I will introduce the SchNarc approach [2] and demonstrate the accuracy of its machine-learned potentials via UV/visible absorption spectra and nonadiabatic dynamics simulations [2,3]. Better statistics and long time-scale dynamics simulations become accessible with SchNarc, which would not be feasible without the help of ML [2-4]. [1] J. Westermayr, P. Marquetand, "Machine learning and excited-state molecular dynamics" Chem. Rev., in press, doi:10.1021/acs.chemrev.0c00749 (2020). [2] J. Westermayr, M. Gastegger, P. Marquetand, "Combining SchNet and SHARC: The SchNarc machine learning approach for Excited-State Dynamics", J. Phys. Chem. Lett. 11(10), 3828-3834 (2020). [3] J. Westermayr, P. Marquetand, "Deep learning for UV absorption spectra with SchNarc: First steps towards transferability in chemical compound space", accepted in J. Chem. Phys. (2020). [4] J. Westermayr, M. Gastegger, M. Menger, S. Mai, L. González, P. Marquetand, "Machine learning enables long time scale molecular photodynamics simulations", Chem. Sci. 10, 8100-8107 (2019). 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 259 external views in addition to being part of our Winter Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/448777/
 
Title AI3SD Video: Machines Learning Chemistry 
Description Reinvigorated by algorithmic developments, faster hardware and large data sets, machine learning is pervading many aspects of chemistry. We present two examples from our recent studies, one in the area of drug discovery, the other focused on protein spectroscopy. In the first, we consider pharmaceutical lead discovery as active search in a space of labelled graphs [1]. We extend a recent data-driven adaptive Markov chain approach, and evaluate it on a focused drug design problem, where we search for an antagonist of an αv integrin, the target protein that belongs to a group of Arg-Gly-Asp integrin receptors. In the second example, we present a novel machine learning protocol that uses a few key structural descriptors to predict amide I IR spectra of proteins and agrees well with experiment [2]. Its transferability enabled us to distinguish protein secondary structures, probe atomic structure variations with temperature, and monitor protein folding. [1] Oglic, D., Oatley, S.A., Macdonald, S.J.F., McInally, T., Garnett, R., Hirst, J.D. & Gärtner, T. Active search for computer-aided drug design. Mol. Inf., 2018, 37, 1700130. https://doi.org/10.1002/minf.201700130 [2] Ye, S., Zhong, K., Zhang, J., Hu, W., Hirst, J.D., Zhang, G., Mukamel, S., Jiang, J. A transferable machine learning protocol for predicting protein amide-I infrared spectra. J. Am. Chem. Soc., 2020, 142, 19071. https://doi.org/10.1021/jacs.0c06530 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 209 external views in addition to being part of our Winter Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/448982/
 
Title AI3SD Video: Making sense of highly flexible molecular simulations: Where AI can help and where not 
Description By simulating the dynamic behaviour of ever bigger molecular systems for longer simulation times, we simultaneously achieve more realistic timescales and gain much better insights into the physiologically relevant time-dependent behaviour of molecular systems, but also generate significantly more data and thus pose new challenges for filtering noise and analysing the simulation data. In simulation analysis and data dimensionality reduction we often rely on linear dependencies and behaviour within the simulated timespan. This generally is true for systems that show slow structural or conformational transitions over the simulated timespan. For example, protein and enzyme simulations that undergo large-scale conformational changes can adequately be analysed by methods of principal component analysis (PCA) and by analysing and visualising low-frequency normal modes. However, much more flexible molecular systems undergoing multiple and seemingly chaotic conformational changes pose challenges for their analysis. Here, we need to advance our analysis toolbox. Moreover, it is important to understand the flexibility and linear behaviour of the simulated system before choosing the analysis methods. In this talk we present three very differently behaving molecular systems including an enzymatic activation process, a flexible self-assembling host-guest system, and a highly flexible dataset of lipid molecules relevant for antibiotic resistance of Mycobacterium tuberculosis. We will show how chaos theory can help to understand the flexibility patterns of the simulations, then present classical PCA-based simulation analysis, before introducing the opportunities and challenges for unsupervised machine learning methodologies. The presented methods will concentrate on competitive learning methods (self-organizing maps) and density-based clustering algorithms (DBSCAN) to analyse dominant and hidden structural features in seemingly chaotic simulation data. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 66 external views in addition to being part of our AI4SD Network Conference, which was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/468638
 
Title AI3SD Video: Multiscale simulation of biomolecular mechanisms and dynamics: from enzyme evolution to receptor activation 
Description Simulations are revealing detailed mechanisms of biomolecular systems and functionally relevant dynamics, and contributing to enzyme design. Biomolecular simulations can be used as computational assays of biological activity, e.g. to predict drug resistance or the effects of mutation. Molecular simulation methods of various types are now capable of modelling processes ranging from biochemical reactions to membrane dynamics, and offer increasing predictive power. Recently, this has included identifying key features of SARS-CoV-2 proteins. Molecular dynamics (MD) simulations on long timescales can model substrate binding, and reveal dynamical changes associated with thermoadaptation and directed evolution of enzyme catalytic activity. MD simulations can calculate thermodynamic properties such as activation heat capacities. Increasingly, simulations are contributing to the design and engineering of natural enzymes and de novo biocatalysts. Interactive MD simulation in virtual reality allows direct manipulation of biological macromolecules, going beyond mere visualization to allow e.g. fully flexible docking of drugs into protein targets such as the SARS-CoV-2 main protease. Groups of researchers can work together in the same virtual environment. Mechanisms of signal transduction in receptors can be studied by a combination of equilibrium and nonequilibrium MD simulations, e.g. identifying a general mechanism of signal propagation in nicotinic acetylcholine receptors. Different types of application (e.g. ranging from chemical reactions to signal transduction) require different levels of treatment, which can be combined in multiscale models to tackle a range of time- and length-scales, e.g. to study drug metabolism by cytochrome P450 enzymes combining coarse-grained and atomistic MD and QM/MM methods. By coupling together different levels of description, multiscale methods can address e.g. 
how chemical changes in individual molecules cause changes at larger scales. QM/MM methods are an archetype of multiscale methods in biochemistry and can be used for modelling transition states and reaction intermediates, to identify catalytic interactions, and to analyse determinants of reactivity. QM/MM modelling can identify mechanisms of covalent inhibition and predict the activity of bacterial enzymes against antibiotics. References: Evolution of dynamical networks enhances catalysis in a designer enzyme H.A. Bunzel et al. Nature Chemistry, in press (2021). https://www.biorxiv.org/content/10.1101/2020.08.21.260885v1; Designing better enzymes: Insights from directed evolution HA Bunzel, JLR Anderson, AJ Mulholland Current Opinion in Structural Biology 67, 212-218 (2021); Allosteric communication in class A β-lactamases occurs via cooperative coupling of loop dynamics I. Galdadas et al. eLife 10:e66567 DOI: 10.7554/eLife.66567 (2021); Mechanism of covalent binding of ibrutinib to Bruton's tyrosine kinase revealed by QM/MM calculations A Voice et al. Chemical Science https://doi.org/10.1039/D0SC06122K (2021); Interactive Molecular Dynamics in Virtual Reality Is an Effective Tool for Flexible Substrate and Inhibitor Docking to the SARS-CoV-2 Main Protease HM Deeks et al. Journal of Chemical Information and Modeling 60, 5803-5814 (2020) https://doi.org/10.1021/acs.jcim.0c01030; Molecular Simulations suggest Vitamins, Retinoids and Steroids as Ligands of the Free Fatty Acid Pocket of the SARS-CoV-2 Spike Protein D.K. Shoemark et al. 133, 7174-7186 (2021); Biomolecular Simulations in the Time of COVID-19, and After R.E. Amaro & A.J. Mulholland Computing in Science & Engineering 22, 30-36 (2020) DOI: 10.1109/MCSE.2020.3024155 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 161 external views in addition to being part of our AI4Proteins Series. This was a collaboration between AI3SD and RSC-CICAG to discuss key elements of AI 4 Proteins Research. 
URL https://eprints.soton.ac.uk/450596/
 
Title AI3SD Video: Neural Networks and Explanatory Opacity 
Description Deep artificial neural network (DANN) designers often accept that the systems they construct lack interpretability, are not transparent - in other words, that they are 'inexplicable'. It should not be obvious what they mean. Explanations, particularly in the neurosciences, are often thought to consist of the mechanisms which underpin observed phenomena. But DANN designers have complete access to the mechanisms underpinning the systems they build - as well as access to their training sets, design parameters, training algorithms and so on. In this talk I distinguish various senses of 'explanation' - ontic, epistemic, objective, subjective. The aims are (1) to help map out the various questions we might be interested in, (2) to scope the limits of mechanistic approaches to the question of explanation, and (3) to try to narrow down the sense in which DANNs are supposed to be explanatorily opaque. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 160 external views in addition to being part of our Summer Seminar Series. This was the first online series we created to engage with our Network members worldwide during the COVID-19 pandemic. This series helped launch our YouTube channel which now has over 700 subscribers, and greatly increased our Network membership. 
URL https://eprints.soton.ac.uk/446698/
 
Title AI3SD Video: New Trends in Drug Discovery - Robotics & AI 
Description The drug discovery ecosystem is undergoing transformational change. New technologies have an increasing impact on how drugs are being discovered and developed, including in particular Artificial Intelligence and robotics. A critical precondition for successful development of AI applications is having access to large amounts of structured, annotated, machine-learnable data. The talk will cover how recent advances in laboratory automation and robotics enable the generation of biomedical data at scale, providing the most critical input for AI systems. The talk will present real-world case studies, showing how leading AI drug discovery firms benefit from full automation of their discovery processes, delegating experiment execution to advanced robotics, and allowing for projects to move to the next stage faster and more efficiently. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 158 external views in addition to being part of our Winter Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/448782/
 
Title AI3SD Video: New theoretical and data-driven approaches to the study of molecular conformational spaces and energy landscapes 
Description In this talk I will introduce new geometric models for both molecules and molecular conformational spaces. I will show that these models allow us to define a symmetry group associated to a molecule that, in some sense, generalises the so-called complete nuclear permutation inversion group. I will also present the results of a systematic analysis on the conformational spaces of molecules and their energy landscapes using topological data analysis (TDA). These results will show that TDA provides the chemistry community with efficient methods to study the mathematical structures underlying the molecular conformational spaces and their energy landscapes. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 190 external views in addition to being part of our Winter Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/447579/
 
Title AI3SD Video: On the Basis of Brain: Neural Network Inspired Changes in General Purpose Chips 
Description Presenting the paper: On the Basis of Brain: Neural Network Inspired Changes in General Purpose Chips. In this paper, we disentangle the changes that the rise of Artificial Intelligence Technologies (AITs) is inducing in the semiconductor industry. The prevailing von Neumann architecture at the core of the established intensive technological trajectory of chip production is currently challenged by the rising difficulty of improving product performance over a growing set of computation tasks. In particular, the challenge is exacerbated by the increasing success of Artificial Neural Networks (ANNs) in application to a set of tasks barely tractable for classical programs. The inefficiency of the von Neumann architecture in the execution of ANN-based solutions opens room for competition and pushes for an adequate response from hardware producers in the form of exploration of new chip architectures and designs. Based on a historical overview of the industry and on collected data, we identify three characteristics of a chip, namely (i) computing power, (ii) heterogeneity of computation, and (iii) energy efficiency, as focal points of demand interest and simultaneously as directions of product improvement for the semiconductor industry players, and consolidate them into a techno-economic trilemma. Pooling together the trilemma and an analysis of the economic forces at work, we construct a simple model formalising the mechanism of demand distribution in the semiconductor industry, stressing in particular the role of its supporting services, the software domain. We conclude by deriving two possible scenarios for chip evolution: (i) the emergence of a new dominant design in the form of a "platform chip" comprising heterogeneous cores; (ii) the fragmentation of the semiconductor industry into submarkets with dedicated chips. The convergence toward one of the proposed scenarios is conditional on (i) technological progress along the trilemma's edges, (ii) advances in the software domain and its compatibility with hardware, (iii) the number of tasks successfully addressed by this software, and (iv) market structure and dynamics. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 169 external views in addition to being part of our Summer Seminar Series. This was the first online series we created to engage with our Network members worldwide during the COVID-19 pandemic. This series helped launch our YouTube channel which now has over 700 subscribers, and greatly increased our Network membership. 
URL https://eprints.soton.ac.uk/447161/
 
Title AI3SD Video: One does not simply "digitise scientific research": The challenges and opportunities of technology in the 21st century 
Description We live in a technology-driven era where emails, electronic systems and smart assistants are commonplace, and yet a large amount of scientific research is still recorded on paper. Additionally, even when data and research are captured electronically, they are of limited use unless they are adequately stored, labelled and made available in a machine-readable format. This talk explores some of the challenges and opportunities of digitising scientific research in the 21st century. We will also discuss the affordances of the semantic web, demonstrating where it can be used across the entire scientific research process, noting some lessons learned, and providing some recommendations for the future. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact This was a CAPTURE talk delivered by Dr Samantha Kanza and Dr Nicola Knight, also published on the AI4SD YouTube Channel for the Network audiences. 
URL http://eprints.soton.ac.uk/id/eprint/451481
 
Title AI3SD Video: Ontologies, Natural Language, Annotation and Chemistry 
Description Ontologies have a wide range of uses beyond the semantic web. If you have had a productive conversation with an intelligent assistant such as Siri or Alexa, then you have used an ontology, perhaps without realising it. In this talk I discuss some of the varied ways ontologies are used in practice and what we've done at the Royal Society of Chemistry. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 69 external views in addition to being part of our Winter Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/448775/
 
Title AI3SD Video: Open Access Data: A Cornerstone for Artificial Intelligence Approaches to Protein Structure Prediction 
Description The Protein Data Bank (PDB) was established in 1971 to archive three-dimensional (3D) structures of biological macromolecules as a public good. Fifty years later, the PDB is providing millions of data consumers around the world with open access to more than 175,000 experimentally determined structures of proteins and nucleic acids (DNA, RNA) and their complexes with one another and small-molecule ligands. PDB data users are working, teaching, and learning in fundamental biology, biomedicine, bioengineering, biotechnology, and energy sciences. They also represent the fields of agriculture, chemistry, physics and materials science, mathematics, statistics, computer science, and zoology, and even the social sciences. The enormous wealth of 3D structure data stored in the PDB has underpinned significant advances in our understanding of protein architecture, culminating in recent breakthroughs in protein structure prediction accelerated by artificial intelligence approaches and machine learning methods. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 82 external views in addition to being part of our AI4Proteins Series. This was a collaboration between AI3SD and RSC-CICAG to discuss key elements of AI 4 Proteins Research. 
URL https://eprints.soton.ac.uk/450176/
 
Title AI3SD Video: Organising your Networks & Projects 
Description This talk will cover tips and tricks for how to organise your networks and projects, including how to set up your communication methods, how to structure and organise all of the different types of data that you collect and best practices for project management. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 29 external views in addition to being part of our Failed it to Nailed it Seminar Series. This was a collaboration between AI3SD, PSDS and Patterns to educate on and discuss different aspects of data sharing based on a survey run in 2020. 
URL https://eprints.soton.ac.uk/457081/
 
Title AI3SD Video: Outlier Detection in Scientific Data 
Description A range of outlier detection methods are discussed and tested on synthetic data. The application of these methods to real world scientific data is discussed. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 61 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/468635
 
Title AI3SD Video: Outlier detection in Scientific Discovery 
Description Detecting anomalous readings in data is a problem. Humans are good at some types, for example with images, however machines find it rather more difficult. Detecting anomalies in time series data is even more tricky. Discriminating between data that is part of the same distribution, or caused by some other process is also nontrivial. Anomaly detection is used in a wide range of applications, for example fraud detection for bank accounts, condition monitoring of mechanical systems, and in medical imagery. In all these applications, an outlier is indicative of a problem that requires further attention. A range of outlier detection methods is presented, and tested on a range of synthetic multivariate time series data. A novel method, cyclic regression, is presented and compared to more traditional methods. The application of these methods to real world data is demonstrated. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 124 external views in addition to being part of our Winter Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/447701/
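
As an illustration of the kind of outlier detection on synthetic time series that the description above refers to, the following is a minimal sketch only. It is not the cyclic regression method presented in the talk, whose details are not given here; it uses a generic local-median test scaled by the median absolute deviation (MAD). The function names, the window size, the threshold and the synthetic signal are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def make_series(n=200, period=100, noise=0.05, seed=0):
    """Synthetic periodic signal with Gaussian noise (illustrative assumption)."""
    rng = random.Random(seed)
    return [math.sin(2 * math.pi * t / period) + rng.gauss(0, noise)
            for t in range(n)]


def mad_outliers(values, window=11, threshold=5.0):
    """Return indices of points far from the median of their neighbours.

    Distance is measured in units of the neighbours' MAD, rescaled so that
    it is comparable to a standard deviation for Gaussian data.
    """
    half = window // 2
    flagged = []
    for i, v in enumerate(values):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        # Exclude the point itself so it cannot mask its own deviation.
        neighbours = values[lo:i] + values[i + 1:hi]
        med = statistics.median(neighbours)
        mad = statistics.median([abs(x - med) for x in neighbours])
        scale = (1.4826 * mad) or 1e-9  # avoid division by zero on flat data
        if abs(v - med) / scale > threshold:
            flagged.append(i)
    return flagged


series = make_series()
for idx in (50, 120):  # inject two artificial anomalies
    series[idx] += 4.0

flagged = mad_outliers(series)
print(flagged)
```

The window and threshold trade sensitivity against false alarms: a short window follows the local trend closely but gives a noisier MAD estimate, so some ordinary points may also be flagged alongside the injected anomalies.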
 
Title AI3SD Video: Pitfalls and Gotcha's with bioactivity data 
Description John's talk focused on his experiences working with bioactivity data and drug discovery research along with some of the problems and errors that people have encountered when working in this sphere. In the past researchers could reasonably know 'most' of the research within a field, but now we have much larger scale research, more participants and more data but without a lot of the groundwork being laid for good data sharing and reusability. Now there is a lot of messy data out there; inaccessible data, cryptic data and poorly described data. John talks about some of the bioactivity resources that are available for researchers and some of the successes that these data sources have had when handling large amounts of data. John gives plenty of tips about things to look out for when examining chemical and biological data, with plenty of examples. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 131 external views in addition to being part of our Failed it to Nailed it Seminar Series. This was a collaboration between AI3SD, PSDS and Patterns to educate on and discuss different aspects of data sharing based on a survey run in 2020. 
URL https://eprints.soton.ac.uk/447530/
 
Title AI3SD Video: Practical Ethics for Data Science and Algorithm Design 
Description 'Ethics' is widely considered to be fuzzy, vague and an oft moving target. In this talk we will not discuss the dead white men who have debated the nature of ethics down the centuries (here's looking at you, Kant). We will instead take the lack of consensus as given and explore instead what concrete measures may be pursued as valid proxies for ethical behavior in the fields of data science, machine learning and AI. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 108 external views in addition to being part of our Failed it to Nailed it Seminar Series. This was a collaboration between AI3SD, PSDS and Patterns to educate on and discuss different aspects of data sharing based on a survey run in 2020. 
URL http://eprints.soton.ac.uk/id/eprint/447092
 
Title AI3SD Video: Prediction in organometallic catalysis - a challenge for computational chemistry 
Description Computational results are now routinely used to contribute to the interpretation of experimental data, including for the confirmation of mechanistic postulates, but their contribution to substantial predictions made before experiments remains the exception [1], at least in the area of organometallic catalysis. More effective use of what we know about chemical reactions, regardless of whether the information was generated from experiment or calculation, will clearly play a role in moving towards this kind of ab initio prediction in this field. Here the adoption of statistics and data science into the chemical sciences is proving crucial and we have built large databases of parameters characterising ligand and complex properties in a range of different environments [2-6]. In this session, I will use examples drawn from our recent work, including the early stages of our development of a reactivity database, to illustrate this approach and discuss why organometallic catalysis is such a challenging yet rewarding area for prediction. Website: https://feygroupchem.wordpress.com/ References: 1. J. Jover, N. Fey, Chem. Asian J., 9 (2014), 1714-1723; D. J. Durand, N. Fey, Chem. Rev., 119 (2019), 6561-6594. 2. A. Lai, J. Clifton, P. L. Diaconescu, N. Fey, Chem. Commun., 55 (2019), 7021-7024. 3. O. J. S. Pickup, I. Khazal, E. J. Smith, A. C. Whitwood, J. M. Lynam, K. Bolaky, T. C. King, B. W. Rawe, N. Fey, Organometallics, 33 (2014), 1751-1791. 4. J. Jover, N. Fey, J. N. Harvey, G. C. Lloyd-Jones, A. G. Orpen, G. J. J. Owen-Smith, P. Murray, D. R. J. Hose, R. Osborne, M. Purdie, Organometallics, 29 (2010), 6245-6258. 5. J. Jover, N. Fey, J. N. Harvey, G. C. Lloyd-Jones, A. G. Orpen, G. J. J. Owen-Smith, P. Murray, D. R. J. Hose, R. Osborne, M. Purdie, Organometallics, 31 (2012), 5302-5306. 6. A. I. Green, C. P. Tinworth, S. Warriner, A. Nelson, N. Fey, Chem. Eur. J. 2020, Accepted Article, DOI: 10.1002/chem.202003801. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 224 external views in addition to being part of our Winter Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/448981/
 
Title AI3SD Video: Predictive Retrosynthesis in SciFindern 
Description The Retrosynthetic Planner in SciFindern uncovers and prioritizes synthetic pathways to known and novel targets, leveraging sophisticated algorithms that mine the CAS REGISTRY® database. Each transformation is supported by literature evidence, and possible alternative steps are provided for review and substitution if desired. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 107 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/468634
 
Title AI3SD Video: Presenting in Person & Online 
Description This talk forms part of the Skills4Scientists Series, which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video was the second talk in the Skills4Scientists #5 - Posters, Presentations & Reports Session, which focussed on several areas of communication for your research: presentations, posters and reports. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 19 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL https://eprints.soton.ac.uk/450844/
 
Title AI3SD Video: Preserving Structural Motifs in Machine-Learning Approaches to Modeling Water Clusters 
Description Chemical structures are naturally viewed as collections of atoms connected through bonds, and graph theory provides a natural tool for capturing that intuition in a concrete mathematical fashion. Over the past several years, graph neural networks have become increasingly popular for modeling chemical systems. To build on this work, the multi-laboratory ExaLearn project, part of the DOE Exascale Computing Project, is developing novel capabilities that combine state-of-the-art machine-learning techniques with high-performance computing to enable the rapid exploration of chemical space on exascale-class systems. Water clusters offer an interesting use case for the development of machine-learning approaches that preserve intermolecular interactions and structural motifs. We apply a dataset of ~5 million hydrogen-bonded water clusters that display interesting long-range structural patterns to explore unique challenges in property prediction and molecular generation. [1] J. A. Bilbrey, J. P. Heindel, M. Schram, P. Bandyopadyay, S. S. Xantheas, S. Choudhury. "A look inside the black box: Using graph-theoretical descriptors to interpret a Continuous-Filter Convolutional Neural Network (CF-CNN) trained on the global and local minimum energy structures of neutral water clusters," J. Chem. Phys., 2020, 153, 024302. [2] S. Choudhury, J. A. Bilbrey, L. Ward, S. S. Xantheas, I. Foster, J. P. Heindel, B. Blaiszik, M. E. Schwarting. "HydroNet: Benchmark Tasks for Preserving Long-range Interactions and Structural Motifs in Predictive and Generative Models for Molecular Data," Machine Learning and the Physical Sciences workshop at NeurIPS, 2020. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 317 external views in addition to being part of our Winter Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/448776/
 
Title AI3SD Video: Producing Useful Code 
Description This talk forms part of the Skills4Scientists Series, which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video was the first talk in the Skills4Scientists #4 - Intro to Python 2 Session, which was a follow on from our Intro to Python 1 course, with a focus on working further with the core elements of Python and performing data analysis, using Jupyter notebooks and Anaconda. This course is designed to allow you to follow along with the content and examples as the course goes, but you will also be provided with course material to allow you to cover it again after the live event. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 88 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL https://eprints.soton.ac.uk/450615/
 
Title AI3SD Video: Producing a good Poster 
Description This talk forms part of the Skills4Scientists Series, which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video was the third talk in the Skills4Scientists #5 - Posters, Presentations & Reports Session, which focussed on several areas of communication for your research: presentations, posters and reports. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 44 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL https://eprints.soton.ac.uk/450845/
 
Title AI3SD Video: Protein-Ligand Structure Prediction for GPCR Drug Design 
Description From GPCR Structure Prediction to Structural GPCR-Ligand Interaction Prediction - The conserved TM helical fold of G Protein-Coupled Receptors (GPCRs) and progress in GPCR structural biology continue to provide homology modeling templates for protein structure prediction. - Novel structures of GPCR-ligand complexes solved at Sosei Heptares and elsewhere continue to reveal a diversity of protein-ligand binding sites and binding modes that are challenging to predict. Appreciating the Devil's in the Details of Structure-Based GPCR Drug Design - Novel structural insights into the GPCRome can be complemented by pharmacological, biophysical, and computational studies and data to identify and predict structural determinants of ligand-receptor binding and selectivity. - Orthogonal physics-based (Molecular Dynamics, e.g. Free Energy Perturbation FEP+, WaterMap from Schrödinger) and empirical (e.g. GRID and WaterFLAP from Molecular Discovery) structure-based drug design methods to target lipophilic hotspots and modulate water networks across GPCR families. Chemogenomic View to Navigate Structural GPCR-Ligand Interaction Space - Integrated GPCR-ligand chemogenomics views that combine structural, pharmacological, and chemical data allow the exploration of receptor-ligand interaction space for structure-based GPCR drug design. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 647 external views in addition to being part of our AI4Proteins Series. This was a collaboration between AI3SD and RSC-CICAG to discuss key elements of AI 4 Proteins Research. 
URL https://eprints.soton.ac.uk/450197/
 
Title AI3SD Video: Publishing and Citing Data in Practice 
Description Sharing data, whether openly or on a more restricted basis, is increasingly expected of researchers in many areas. This is incredibly valuable for the scientific community but does take more work than simply putting the results in a drawer after the paper is published, so how can we make sure that the originators of data get full credit for their labour while ensuring that the data continues to be accessible in the long term? Data citation has a big role to play in the answer to this question, and this talk will give you an overview of the principles of data citation and how to implement them in practice. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 611 external views in addition to being part of our Failed it to Nailed it Seminar Series. This was a collaboration between AI3SD, PSDS and Patterns to educate on and discuss different aspects of data sharing based on a survey run in 2020. 
URL https://eprints.soton.ac.uk/447369/
 
Title AI3SD Video: Quantifying crystal similarity 
Description This talk forms part of the ML4MC (Machine Learning for Materials and Chemicals) Series, which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Directed Assembly Network. This series ran over summer 2021 and covers topics that encompass our overlapping Network interests of AI, Machine Learning, Artificial Photosynthesis, Biomimetic Materials, Crystal Design & Engineering, Materials, Molecules, Photochemistry, Photocatalysis and Supramolecular Chemistry. This video was the eleventh talk in the ML4MC series and formed part of the session "Mentor Talks". 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 47 external views in addition to being part of our ML4MC Summer School which supported 31 PhD and Postdoc students in learning more about Machine Learning for Materials and Chemicals. 
URL https://eprints.soton.ac.uk/450848/
 
Title AI3SD Video: Quantum Machine Learning 
Description Many of the most relevant observables of matter depend explicitly on atomistic and electronic details, rendering a first principles approach to computational materials design mandatory. Alas, even when using high-performance computers, brute force high-throughput screening of material candidates is beyond any capacity for all but the simplest systems and properties due to the combinatorial nature of compound space, i.e. all the possible combinations of compositional and structural degrees of freedom. Consequently, efficient exploration algorithms exploit implicit redundancies and correlations. I will discuss recently developed statistical learning based approaches for interpolating quantum mechanical observables throughout compound space. Numerical results indicate promising performance in terms of efficiency, accuracy, scalability and transferability. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 96 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/453336/
 
Title AI3SD Video: RSC CICAG "Who we are, what we do and what we are planning" 
Description The Royal Society of Chemistry Chemical Information and Computer Applications Group (CICAG) is one of the RSC's member-led interest groups. The storage, retrieval, analysis and preservation of chemical information and data are of critical importance for research, development and education in the chemical sciences. CICAG works to support users of chemical information by providing training workshops and conferences highlighting the latest research in the area, and to promote wider recognition of the importance of chemical information via the newsletter. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 30 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/468633
 
Title AI3SD Video: Referencing & Using Reference Managers 
Description This talk forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video was the first talk in the Skills4Scientists #1 - Research Data Management Session, which focussed on several areas of good data management practice. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 65 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL https://eprints.soton.ac.uk/450267/
 
Title AI3SD Video: Reinforcement Learning Methods 
Description Reinforcement learning is a machine learning paradigm in which an agent learns to make decisions to achieve a long-term goal. In the past five years, the previously somewhat niche method has seen substantially increased interest from within the chemistry community, driven by the need for a machine learning approach to problems of planning and sequential decision making and recent developments in harnessing the power of neural networks to make reinforcement learning achievable for large problems. This talk will introduce the theory that underpins reinforcement learning, review its applications in chemistry to date with a particular focus on the field of drug discovery and molecular design, and consider the future of this still-developing approach. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 57 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/468632
 
Title AI3SD Video: Semantic Web in Scientific Research - Possibilities & Practices 
Description The use of semantic web technologies within scientific research has slowly gained momentum over the last twenty years. Researchers have realised that these technologies are key to dealing with large volumes of data, and that they enable better organisation of scientific documents and practices. However, we still have a long way to go before these technologies realise their true potential, and this is as much a human endeavour as a technological one. This talk will discuss the affordances of the semantic web, demonstrating where it can be used across the entire scientific research process; but it will also note some lessons learned throughout the last twenty years, and provide some recommendations for going forward in the future. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 75 external views in addition to being part of our Winter Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/447788/
 
Title AI3SD Video: Setup, environments, installing packages, intro to Jupyter 
Description This talk forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 122 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL http://eprints.soton.ac.uk/id/eprint/450248
 
Title AI3SD Video: Sharing Data Science Solutions Across Domains via Patterns 
Description We are generating more data than we ever have before and are developing new and exciting ways to derive new insights from them. Sharing our research is a fundamental part of expanding humanity's knowledge, as well as ensuring that researchers get credit for their work. In this talk I will introduce Patterns, the data science journal from Cell Press, along with wider discussions about the current state of AI and data science, and how to ensure that what we learn is shared and communicated effectively. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 37 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/468631
 
Title AI3SD Video: Simulation of chemical dynamics and spectroscopy with deep learning representations of electronic structure 
Description This talk forms part of the ML4MC (Machine Learning for Materials and Chemicals) Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Directed Assembly Network. This series ran over summer 2021 and covers topics that encompass our overlapping Network interests of AI, Machine Learning, Artificial Photosynthesis, Biomimetic Materials, Crystal Design & Engineering, Materials, Molecules, Photochemistry, Photocatalysis and Supramolecular Chemistry. This video was the twelfth talk in the ML4MC series and formed part of the session "Mentor Talks". 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 68 external views in addition to being part of our ML4MC Summer School which supported 31 PhD and Postdoc students in learning more about Machine Learning for Materials and Chemicals. 
URL https://eprints.soton.ac.uk/451151/
 
Title AI3SD Video: Skills4Scientists - Poster & Careers Symposium - Poster Compilation 
Description This video forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video is a compilation of posters presented at the Skills4Scientists Posters & Careers Symposium. These poster presentations are predominantly from summer students involved in the AI3SD 2021 summer internship program. Higher resolution versions of the posters are available on the poster symposium website: https://www.ai3sd.org/s4s-symposium20... Not all poster presenters requested a recording of their talk. The following poster recordings are included in this compilation video.
Poster 1 - Nearer the nearsightedness principle: Large-scale quantum chemical calculations - Andras Vekassy (University of Southampton)
Poster 3 - Combining Ultrasonic Methods and Machine Learning Techniques to Assess Baked Products Quality - Erhan Gulsen (University of Nottingham)
Poster 4 - Interactive Knowledge-Based Solvent Selection Tool - Hewan Zewdu (University of Nottingham)
Poster 5 - CV in High Throughput Chemistry - Jamie Longino (University of Strathclyde)
Poster 9 - Dewetting in Thin Liquid Films: Using Sparse Optimization to Learn Evolution Equations - Aspen Fenzl (University of Sheffield)
Poster 12 - Creating a merged dataset and its exploration with different Machine Learning algorithms - Maximilian Hoffman (Freie Universität of Berlin)
Poster 14 - Bayesian optimisation in Chemistry - Rubaiyat Khondaker (University of Cambridge)
Poster 15 - A deep neural network for generation of functional organic materials - Rhyan Barrett (University of Warwick)
Thank you to our sponsors Optibrium (https://www.optibrium.com/) and Dotmatics (https://www.dotmatics.com/) who supported this event. These poster presentations were live cartooned by ErrantScience (errantscience.com) which is also available on our YouTube Channel.
Sections
Intro: (0:00)
Andras Vekassy - Nearer the nearsighted principle: Large-scale quantum chemical calculations: (0:17)
Erhan Gulsen - Combining Ultrasonic Methods and Machine Learning Techniques to Assess Baked Products Quality: (06:11)
Hewan Zewdu - Interactive Knowledge-Based Solvent Selection Tool: (12:09)
Jamie Longino - CV in High Throughput Chemistry: (16:52)
Aspen Fenzl - Dewetting in Thin Liquid Films: Using Sparse Optimization to Learn Evolution Equations: (21:21)
Maximillian Hoffman - Creating a merged dataset and its exploration with different machine learning algorithms: (27:31)
Rubaiyat Khondaker - Bayesian optimisation in Chemistry: (34:33)
Rhyan Barrett - A deep neural network for generation of functional organic materials: (40:06)
Further details from this series can be found at: https://www.ai3sd.org/skills4scientists 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 34 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL https://eprints.soton.ac.uk/451464/
 
Title AI3SD Video: Skills4Scientists - Poster & Careers Symposium - Poster Compilation 
Description This video forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video is a compilation of posters presented at the Skills4Scientists Posters & Careers Symposium. These poster presentations are predominantly from summer students involved in the AI3SD 2021 summer internship program. Higher resolution versions of the posters are available on the poster symposium website: https://www.ai3sd.org/s4s-symposium20... Not all poster presenters requested a recording of their talk. The following poster recordings are included in this compilation video.
Poster 1 - Nearer the nearsightedness principle: Large-scale quantum chemical calculations - Andras Vekassy (University of Southampton)
Poster 3 - Combining Ultrasonic Methods and Machine Learning Techniques to Assess Baked Products Quality - Erhan Gulsen (University of Nottingham)
Poster 4 - Interactive Knowledge-Based Solvent Selection Tool - Hewan Zewdu (University of Nottingham)
Poster 5 - CV in High Throughput Chemistry - Jamie Longino (University of Strathclyde)
Poster 9 - Dewetting in Thin Liquid Films: Using Sparse Optimization to Learn Evolution Equations - Aspen Fenzl (University of Sheffield)
Poster 12 - Creating a merged dataset and its exploration with different Machine Learning algorithms - Maximilian Hoffman (Freie Universität of Berlin)
Poster 14 - Bayesian optimisation in Chemistry - Rubaiyat Khondaker (University of Cambridge)
Poster 15 - A deep neural network for generation of functional organic materials - Rhyan Barrett (University of Warwick)
Thank you to our sponsors Optibrium (https://www.optibrium.com/) and Dotmatics (https://www.dotmatics.com/) who supported this event. These poster presentations were live cartooned by ErrantScience (errantscience.com) which is also available on our YouTube Channel.
Sections
Intro: (0:00)
Andras Vekassy - Nearer the nearsighted principle: Large-scale quantum chemical calculations: (0:17)
Erhan Gulsen - Combining Ultrasonic Methods and Machine Learning Techniques to Assess Baked Products Quality: (06:11)
Hewan Zewdu - Interactive Knowledge-Based Solvent Selection Tool: (12:09)
Jamie Longino - CV in High Throughput Chemistry: (16:52)
Aspen Fenzl - Dewetting in Thin Liquid Films: Using Sparse Optimization to Learn Evolution Equations: (21:21)
Maximillian Hoffman - Creating a merged dataset and its exploration with different machine learning algorithms: (27:31)
Rubaiyat Khondaker - Bayesian optimisation in Chemistry: (34:33)
Rhyan Barrett - A deep neural network for generation of functional organic materials: (40:06)
Further details from this series can be found at: https://www.ai3sd.org/skills4scientists
This video is an output from the AI3SD Network+ (Artificial Intelligence and Augmented Intelligence for Automated Investigations for Scientific Discovery) which is funded by EPSRC under Grant Number EP/S000356/1 and PSDS (Physical Sciences Data-Science Service) which is funded by EPSRC under Grant Number EP/S020357/1. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 64 external views (and 124 downloads from Pure) in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL https://eprints.soton.ac.uk/451464/
 
Title AI3SD Video: Smart Cleaning & COVID-19 
Description Industrial Digital Technologies (IDTs) such as robotics, AI and IoT are transforming manufacturing worldwide with significant productivity, efficiency and environmental sustainability benefits. This digital revolution is often labelled Industry 4.0 and at its heart is the enhanced collection and use of data. The food and drink sector has been slow to adopt IDTs for a variety of reasons including the availability of cost effective sensing technologies, capable of operating in production environments. This presentation will discuss the use of IDTs within the important task of food factory cleaning. It will cover the benefits and challenges of deploying robots, sensors and machine learning technologies for factory cleaning tasks in addition to the ever growing importance of effective factory cleaning during a global pandemic. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 69 external views in addition to being part of our Summer Seminar Series. This was the first online series we created to engage with our Network members worldwide during the COVID-19 pandemic. This series helped launch our YouTube channel which now has over 700 subscribers, and greatly increased our Network membership. 
URL https://eprints.soton.ac.uk/447163/
 
Title AI3SD Video: So you predicted a protein structure - What now? 
Description Recent advances in technologies like cryoEM structure resolution and protein de novo folding prediction have resulted in a wealth of macromolecular structures that have not been resolved to the level of detail a high-resolution X-ray crystal structure could provide. Taking full advantage of these structures for rational drug design would benefit from additional validation and refinement. In this presentation, we investigate whether computational refinement and structure-based modeling methods can be utilized to generate reliable complex poses. We present a solution to the induced fit docking problem for protein-ligand binding by combining ligand-based pharmacophore docking, rigid receptor docking, and protein structure prediction with explicit solvent molecular dynamics simulations. This methodology succeeded in determining protein-ligand binding modes with a root-mean-square deviation within 2.5 Å compared to experiment in over 90% of cross-docking cases in our testing. The application of the predicted ligand-receptor structure in free energy perturbation calculations for additional validation is demonstrated. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 87 external views in addition to being part of our AI4Proteins Series. This was a collaboration between AI3SD and RSC-CICAG to discuss key elements of AI 4 Proteins Research. 
URL https://eprints.soton.ac.uk/450208/
 
Title AI3SD Video: Statistics Are a Girl's best Friend: Expanding the mechanistic Study Toolbox with Data Science 
Description The value of amassing and standardizing chemical data for improving the efficiency of chemical discovery is becoming increasingly clear. Machine learning analyses of these data are focused on finding correlations, trends and patterns to uncover needles of knowledge in the haystack of chemical reactions. However, in many cases, especially in academic settings, we do not have the means to produce large data sets, so by necessity we remain in the Small Data regime. In this talk, I will present our work in the field of organocatalysis focused on applying machine learning strategies to small data sets as a means to uncover underlying mechanisms. We aim to show that whereas Big Data serves to identify hidden correlations, Small Data encourages the discovery of causation. In this sense, Small Data is not just a necessity, but is key to bridging the gap between human intuition and machine learning. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 167 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/452734/
 
Title AI3SD Video: Supramolecular Antimicrobials - the next target for AI/Machine Learning? 
Description Since the 1980s the development of novel antibiotics has declined dramatically. This, combined with the ever-increasing prevalence of antibiotic resistance in bacteria, means that some bacterial strains have now been identified that are resistant to treatment with all known classes of antibiotic currently available. Supramolecular Self-associating Amphiphiles (SSAs) are a novel class of amphiphilic salts that contain an uneven number of covalently linked hydrogen bond donating and accepting groups, meaning that they are frustrated in nature. The hydrogen-bonded, self-associative properties for members of this class of over 70 compounds synthesised to date have been extensively studied in the gas phase, solution state, solid state and in silico. Through these studies we have shown correlations between certain physicochemical properties that may be predicted by simple, low-level, high-throughput, easily accessible computational modelling. In addition, members from this class of compound have been shown to kill a variety of different bacteria, including those with known antibiotic resistance (e.g. Methicillin Resistant Staphylococcus aureus (MRSA)). These initial studies have highlighted within the supramolecular chemistry community a vast amount of experimental data, not yet accessed by AI/machine learning. Could data sets such as these be the next targets of interest for this community? Is there room for a consortium or community-led approach to solving predictive modelling within this branch of chemistry? 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 103 external views in addition to being part of our Summer Seminar Series. This was the first online series we created to engage with our Network members worldwide during the COVID-19 pandemic. This series helped launch our YouTube channel which now has over 700 subscribers, and greatly increased our Network membership. 
URL https://eprints.soton.ac.uk/447160/
 
Title AI3SD Video: The "almost druggable" genome 
Description This talk will briefly introduce the "Illuminating the Druggable Genome" knowledge management center, with focus on its protein-centric data aggregator, Pharos (https://pharos.nih.gov/), and the DrugCentral online pharmaceutical compendium (https://drugcentral.org/). Using Pharos/DrugCentral data, we then examine the question, "what proteins that could potentially be ligandable, are currently not?", in a disease context. To do this, we examine proteins available in the RCSB PDB (https://www.rcsb.org/) - the "PDB-ome" => 347 proteins that lack known ligands; proteins for which chemical matter is known, N=2644 - the "SAR-ome" => of these, 115 proteins meet the "ligandable" criteria; the "Pocket-ome", i.e., proteins that have a close - by sequence identity - homologue with known 3D structure, which leads to ~700 ligandable proteins with PDB structures; 180 that have close homologues but lack 3D structures; and N=2623 proteins that could be modeled with reasonable confidence; last but not least, the "Phen-ome", which looks at this entire list (N = 6742) from the perspective of rare and common diseases, GWAS and mouse phenotype data, etc., and narrows down the previous lists. The "almost druggable genome" contains 715 ligandable (3D exists) proteins, 180 proteins for which chemical matter is likely to be found, and at least 100 proteins that could be subject to chemical probe optimization. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 168 external views in addition to being part of our AI4Proteins Series. This was a collaboration between AI3SD and RSC-CICAG to discuss key elements of AI 4 Proteins Research. 
URL https://eprints.soton.ac.uk/450167/
 
Title AI3SD Video: The Application of Machine Learning in Molecular Spectroscopy Study 
Description Optical spectroscopy provides powerful toolkits to decipher molecular structures and their configuration evolutions. However, the theoretical analysis of spectroscopic signals and connecting them with structural detail is a challenging task. Moreover, the intrinsic complexity of spectroscopic signals of molecular systems makes it difficult to correlate spectral characteristics with the underlying molecular structure and dynamics. Herein, we have developed data-driven machine learning (ML) protocols that can predict infrared (IR), ultraviolet/visible (UV/Vis) and Raman spectra of molecular systems with 3 to 5 orders of magnitude reduced computation cost compared to direct quantum chemistry calculations. A convolutional neural network (CNN) model was trained and tested on a dataset consisting of 87993 spectra computed from protein peptide segments with α-helical, β-sheet, and other typical secondary structures. The secondary structure classification accuracy reached near 100% and over 98.7% on spectra sets of new segments extracted from the same and homologous proteins, respectively. Importantly, we demonstrate that the ML protocol realizes cost-effective relations between spectra, structure, and chemical properties, i.e. spectra determination/prediction from structural information, and configuration or chemical properties determination/recognition from spectroscopic signals.
1. S. Ye, K. Zhong, J.X. Zhang, W. Hu, J. Hirst, G.Z. Zhang, S. Mukamel, J. Jiang*, A Machine Learning Protocol for Predicting Protein Infrared Spectra, J. Am. Chem. Soc. 142 (2020) 19071-19077.
2. X.J. Wang, S. Ye, W. Hu, E. Sharman, R. Liu, Y. Liu, Y. Luo, J. Jiang*, Electric Dipole Descriptor for Machine Learning Prediction of Catalyst Surface-Molecular Adsorbate Interactions, J. Am. Chem. Soc. 142 (2020) 7737-7743.
3. S. Ye, W. Hu, X. Li, J.X. Zhang, K. Zhong, G.Z. Zhang, Y. Luo, S. Mukamel*, J. Jiang*, A Neural Network Protocol for Electronic excitations of N-Methylacetamide, Proc Natl Acad Sci USA. 116 (2019) 11612-11617.
4. W. Hu, S. Ye, Y.J Zhang, T.D. Li, G.Z. Zhang, Y. Luo, S. Mukamel, J. Jiang*, Machine Learning Protocol for Surface-Enhanced Raman Spectroscopy, J. Phys. Chem. Lett. 10 (2019) 6026-6031. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Has received 1 download and 135 hits on Pure in addition to being part of our AI4Proteins Series. This was a collaboration between AI3SD and RSC-CICAG to discuss key elements of AI 4 Proteins Research. 
URL http://eprints.soton.ac.uk/id/eprint/450086
 
Title AI3SD Video: The Bluffers Guide to Symbolic AI 
Description Symbolic AI, sometimes referred to as Good Old-fashioned AI, has its roots in the earliest days of the AI project. It seeks to represent reasoning using explicit data structures often drawn from logic. Symbolic AI systems have the advantage of being comparatively easy to understand and analyse and potentially allow compact forms of representation and communication. Their disadvantages tend to include inflexibility, a high knowledge engineering cost, and difficulty handling non-symbolic, statistical and analogue processes such as vision and motion. This talk will cover a brief history of the field and current topics within it as well as looking at proposals for combining symbolic and non-symbolic reasoning. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 102 external views in addition to being part of our Summer Seminar Series. This was the first online series we created to engage with our Network members worldwide during the COVID-19 pandemic. This series helped launch our YouTube channel which now has over 700 subscribers, and greatly increased our Network membership. 
URL https://eprints.soton.ac.uk/447164/
 
Title AI3SD Video: The Crystal Isometry Principle 
Description The strongest and most practical equivalence of periodic crystals is rigid motion or isometry preserving all inter-atomic distances. The Crystal Isometry Principle (CRISP) says [1] that all real non-equivalent crystals should have non-isometric structures of atomic centres without chemical labels. If one atom is replaced by another one, distances to neighbouring atoms are inevitably perturbed, which can be detected by recent geometric invariants independent of any thresholds. More than 200 million pairwise comparisons of all periodic crystals with full geometric data from the Cambridge Structural Database (CSD) over two days on a modest desktop found five pairs of suspicious entries with different compositions but identical geometries [1, section 7]. For instance, all geometric parameters of HIFCAB and JEPLIA are identical to the last decimal place, but one atom of Cadmium is replaced by Manganese. With the help of the Cambridge Crystallographic Data Centre, all journals that published the underlying papers started investigations into data integrity. These experiments confirm that all periodic crystals (without restricting them to any chemical composition) live in a common Crystal Isometry Space (CRISP) parameterised by complete invariants. For example, diamond and graphite consisting of identical carbon atoms occupy in this CRISP space different positions given by unique geographic-style coordinates and a well-defined distance. In the same way, Mendeleev put all chemical elements (despite their obvious differences) into a single periodic table parameterised by two discrete coordinates: the period and group number. The new invariant coordinates extend Mendeleev's table to the continuous space CRISP containing all existing and not yet discovered periodic crystals. [1] Widdowson, Mosca, Pulido, Kurlin, Cooper. Average Minimum Distances. MATCH Commun. Math. Comput. Chem. 87, 529-559 (2022), kurlin.org/projects/periodic-geometry-topology/AMD.pdf 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 46 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/468630
 
Title AI3SD Video: The Shape of Data in Chemistry - Insights Gleaned from Complex Solutions and Their Interfaces 
Description Highly non-ideal solutions are ever-present within chemistry, physics, and materials science, and are characterized by many-body effects across length and time scales. Understanding, and predicting, many-body correlations in the condensed phase is a grand challenge for the modeling and simulation community. Yet within the data science community, a large suite of tools exists for elucidating complex, correlated relationships amongst variables. Molecular modeling and simulation data are in fact well-suited for study by methods that include the topology of graphs, point cloud data, and recent advances in applied mathematics methods that investigate surfaces, like sublevel set persistent homology and geometric measure theory. We adapt, develop, and apply these tools to study highly non-ideal solutions and their interfaces, with examples drawn from separations science. The new physical insight derived from these methods is paving the way for bespoke liquid/liquid interfaces that optimize transport characteristics for purification and synthesis. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 94 external views in addition to being part of our Winter Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/448783/
 
Title AI3SD Video: The Universal Digital Twin - accessing the world of chemistry 
Description In my talk I shall present the "universal digital twin" (UDT) and some of its applications in the realm of chemistry. The UDT is a dynamic knowledge graph and is implemented using technologies from the Semantic Web. It is composed of concepts and instances that are defined using ontologies, and of computational agents that operate on both the concepts and instances to update the dynamic knowledge graph. By construction, it is distributed, supports cross-domain interoperability, and ensures that data is connected, portable, discoverable and queryable via a uniform interface. We present a small number of use cases that demonstrate the ability of the dynamic knowledge graph to host and query chemical knowledge, control chemistry experiments and combine it with geospatial data. For example, we shall present Marie, which is a proof-of-concept Question Answering system for accessing chemical data in the UDT. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 150 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/453351/
 
Title AI3SD Video: The Variational Quantum Eigensolver - progress and near term applications for quantum chemistry 
Description The Variational Quantum Eigensolver is among the most promising near term applications for quantum computing. It offers the possibility to model some wave functions accurately in polynomial time. Despite this, many hurdles and open questions remain. We will go through these questions, discuss possible answers and consider the direction of research. After this we will discuss recent applications of the method, its integration with quantum chemistry methods such as CASSCF, and experimentation on quantum computers. 
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 111 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/453335/
 
Title AI3SD Video: Topology: From shapes to numbers 
Description Topology is a branch of mathematics that concerns itself with the study of shapes and how to describe them through computable numerical characteristics. While this sounds inauspicious, in the last two decades we have witnessed a very rapid development of Topological Data Analysis (TDA), which has now established itself as a key part of modern data-driven science alongside machine learning and statistics. There are now very many examples of how TDA can be used to derive information from complex data sets and to provide insight into complex scientific problems. In this talk, we will introduce the main tools of TDA and illustrate their applications to the problem of solubility of chemical compounds. The talk will also serve as an introduction to the other talks in this session.
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 281 external views in addition to being part of our Winter Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/447582/
 
Title AI3SD Video: Towards Biological Plausibility Using Linked Open Data 
Description Behind risk assessment is experimental evidence. Behind biological knowledge is primary literature. However, because the amount of knowledge keeps growing and our experimental technologies are advancing and becoming increasingly complex, even experts can no longer keep up with the progress in mechanistic understanding outside their increasingly specialised domains. At the same time, the number of biological questions with a simple answer keeps dropping and many modern questions have complex answers. Access to the right facts at the right time needs a change of thinking. The idea of linking facts and data at a large scale was envisioned long ago, but only recently became viable, with the introduction of the semantic web and linked open data. These new technologies make it possible to easily link remote knowledge, taking advantage of globally unique identifiers and exact meaning with ontologies [1,2]. This presentation outlines how we applied these ideas to the life sciences in general, with applications to toxicology. Using eNanoMapper [3], WikiPathways [4], and Wikidata [5], it will show how semantic web approaches can be used to answer questions that are much harder to answer with older approaches. Examples will show (1) how we can use SPARQL to return all assay experiments for all types of metal oxides, (2) how biological pathway knowledge can be combined with knowledge from chemical databases, and (3) how we can find research about and scholars that study particular genes, proteins, or toxicants. References: [1] Samwald, M. et al. Linked open drug data for pharmaceutical research and development. Journal of Cheminformatics 3, 19 (2011). [2] Willighagen, E.L. et al. The ChEMBL database as linked open data. Journal of Cheminformatics 5, 23 (2013). [3] Hastings, J. et al. eNanoMapper: harnessing ontologies to enable data integration for nanomaterial risk assessment. Journal of Biomedical Semantics 6, (2015). [4] Waagmeester, A. et al. Using the Semantic Web for Rapid Integration of WikiPathways with Other Biological Online Data Resources. PLOS Comp Biology 12, e1004989 (2016). [5] Waagmeester, A. et al. Wikidata as a knowledge graph for the life sciences. eLife 9, e52614 (2020).
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 132 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/451924/
 
Title AI3SD Video: Translating innovations out of the lab and into the clinic: the importance of data curation, AI and ML? 
Description Our novel patented (European Patent Application No. 18743767.8, U.S. Patent Application No. 16/632,194) Supramolecular Self-associating Amphiphile (SSA) platform technology, invented by J. Hiscock in 2016, currently contains a library of ~120 molecules (Figure 1) and has since been developed by an international, transdisciplinary team of ~50 academic/industrial/governmental scientists, social scientists and clinicians. To date this molecular technology has been shown to: 1. act as broad-spectrum antimicrobials [1-6]; 2. increase the efficacy of other antibiotic/antiseptic agents and anticancer agents against bacteria [7] and ovarian cancer cells [8] respectively; 3. selectively interact with phospholipid membranes of different compositions [9,10]; 4. have the potential to act as drug delivery vehicles [11]; 5. exhibit a druggable profile when administered by i.v. in vivo (unpublished data); and 6. enable the production of novel flow battery electrolytes [12]. However, this means that we not only create a lot of data, but that these multiple outputs exist in multiple forms, come from all over the world and are underutilised when designing the next project steps. Here we will attempt to introduce you to our approach to solving these problems.
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 21 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/468628/
 
Title AI3SD Video: Typing, Variables, Data Types & Functions 
Description This talk forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video was the second talk in the Skills4Scientists #4 - Intro to Python 2 Session, which was a follow-on from our Intro to Python 1 course, with a focus on working further with the core elements of Python and performing data analysis, using Jupyter notebooks and Anaconda. This course is designed to allow you to follow along with the content and examples as the course goes, but you will also be provided with course material to allow you to cover it again after the live event.
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 50 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL https://eprints.soton.ac.uk/450614/
 
Title AI3SD Video: Understanding the solid form: Structural Systematics and Quantum Crystallography 
Description This talk forms part of the ML4MC (Machine Learning for Materials and Chemicals) Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Directed Assembly Network. This series ran over summer 2021 and covers topics that encompass our overlapping Network interests of AI, Machine Learning, Artificial Photosynthesis, Biomimetic Materials, Crystal Design & Engineering, Materials, Molecules, Photochemistry, Photocatalysis and Supramolecular Chemistry. This video was the fourteenth talk in the ML4MC series and formed part of the session "Mentor Talks".
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 41 external views in addition to being part of our ML4MC Summer School which supported 31 PhD and Postdoc students in learning more about Machine Learning for Materials and Chemicals. 
URL https://eprints.soton.ac.uk/451136/
 
Title AI3SD Video: Using RDKit 
Description This talk forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students.
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 2691 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL https://eprints.soton.ac.uk/450309/
 
Title AI3SD Video: Using Scopus and SciVal to track your research impact and find collaborators 
Description During this presentation, Chris James, a Senior Product Manager for Elsevier, will introduce Scopus and SciVal and demonstrate how the products can be used as part of your workflow to track your and others' research impact and identify peers for potential collaboration opportunities. Information gathered from these products can also be used to help support grant applications and identify relevant parallel areas of research. This session will be comprised of a short presentation, followed by a live demo and time for Q&A at the end. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 73 external views in addition to being part of our Failed it to Nailed it Seminar Series. This was a collaboration between AI3SD, PSDS and Patterns to educate on and discuss different aspects of data sharing based on a survey run in 2020. 
URL https://eprints.soton.ac.uk/id/eprint/470005
 
Title AI3SD Video: Using icospherical input data in machine learning on the protein-binding problem 
Description Determining the binding coefficients of ligands to proteins is an essential step in targeted drug development. The 3-dimensional structure of both the protein binding pocket and the ligand is crucial in solving this problem. I will present ICOSPHERER (Icospherical Chemical Objects Surpassing Traditional A.I. Restrictions Through Replacing Existing Representations), a new methodology and software, and demonstrate its use on the protein binding problem.
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 33 external views in addition to being part of our AI4Proteins Series. This was a collaboration between AI3SD and RSC-CICAG to discuss key elements of AI 4 Proteins Research. 
URL https://eprints.soton.ac.uk/450160/
 
Title AI3SD Video: Welcome & AI4SD network retrospective 
Description The AI 4 Scientific Discovery Network, now rebranded as AI4SD, has been running for nearly 4 years. This conference marks the end of our Network term and this talk will reflect on what AI4SD has achieved over the years, and where we are looking to go next.
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 111 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/474022/
 
Title AI3SD Video: What a Medicinal Chemist Needs to Know about Explainable Artificial Intelligence 
Description The latest developments in artificial intelligence (AI) have arrived into an existing state of creative tension between computational and medicinal chemists. At their most productive, medicinal and computational chemists have made significant progress in delivering new therapeutic agents into the clinic. However, the relationship between these communities has the prospect of being weakened by application of over-simplistic AI methods which, if they fail to deliver, will reinforce unproductive prejudices. AI systems are action orientated; they suggest, and even automate, possible steps to take next in drug hunting projects. They do this by generating options, performing predictions, then ranking the options. A key piece of critical learning is that any AI system for chemists should be open and "explainable". The requirement for the chemist to understand how models have been built and "drill back" to original data is key to explain how the computer has arrived at the prediction. For example, an AI system that aids the medicinal chemist in evaluating and designing new biologically active compounds should focus on communicating, in the preferred mode of the medicinal chemist, via critical substructures, and map to the original compounds and test data on which the model was built. Without this, any model becomes labelled as "black-box" and confidence is reduced in the suggestions made. Therefore, to develop them further and to understand the quality of any prediction, transparency and auditability should be designed into a system from inception (see example below where the predictions are made for a compound and the contributions from the model highlighted). The talk will show practical examples from drug discovery projects and recent compounds from the global Covid Moonshot drug discovery program.
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 108 external views in addition to being part of our Autumn Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/453338/
 
Title AI3SD Video: When charge transport data are a worm - a transfer learning approach for unsupervised data classification 
Description Advanced data analysis methodologies, and in particular dimensionality reduction techniques, are now used more and more widely in the single-molecule charge transport community. They allow for comprehensive exploration of large datasets, where data display significant variance and sometimes contain (unknown) sub-populations. To this end, unsupervised approaches, which do not rely on class labels or pre-defined expectations, can be advantageous. Multi-Parameter Vector Classification (MPVC) is one example and PCA-based methods have also been employed in this context [1,2,3]. We have recently shown how Transfer Learning may be employed to identify and quantify hidden features in single-molecule charge transport data [3]. Using open-access neural networks such as AlexNet, trained on millions of seemingly unrelated images, feature recognition then does not require network training with application-specific data. Instead, the network recognises features in the input that it had learned in other contexts and, for example, identifies different shapes in conductance-distance traces as images of different worm species. Thus, our results show how Deep Learning methodologies can readily be employed for unsupervised data classification, even if the amount of problem-specific, 'own' data is limited. [1] M Lemmer, MS Inkpen, K Kornysheva, NJ Long, T Albrecht, "Unsupervised vector-based classification of single-molecule charge transport data", Nat. Comm. 2016, 7, 12922. [2] T Albrecht, G Slabaugh, E Alonso, SMMR Al-Arif, "Deep learning for single-molecule science", Nanotechnology 2017, 28 (42), 423001. [3] A Vladyka, T Albrecht, "Unsupervised classification of single-molecule data with autoencoders and transfer learning", Mach. Learn.: Sci. Technol. 2020, 1, 035013.
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Forms part of our YouTube Channel. Has received 91 external views in addition to being part of our Winter Seminar Series. After the success of our first seminar series, we started running regular online series to engage with our Network members worldwide. 
URL https://eprints.soton.ac.uk/448779/
 
Title AI3SD Video: Why you should take up PhD Opportunities in the Physical Sciences 
Description This talk forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video was the first talk on day 2 of the Skills4Scientists Posters & Careers Symposium, which was a virtual two-day event on the 1st and 2nd of September (between AI3SD, PSDS & RSC-CICAG). Attendees were invited to present their research to a range of experts from industry and academia. A number of companies were also invited to present on the opportunities that they have to offer graduates and PhD students.
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 36 external views in addition to being part of our Summer Seminar Series. This was the first online series we created to engage with our Network members worldwide during the COVID-19 pandemic. This series helped launch our YouTube channel which now has over 500 subscribers, and greatly increased our Network membership. 
URL https://eprints.soton.ac.uk/451463/
 
Title AI3SD Video: Writing a CV 
Description This talk forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video was the first talk in the Skills4Scientists #6 - Careers 1 Session, which focussed on several areas of careers advice that will be useful to you as you complete your studies and begin your careers.
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 47 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL https://eprints.soton.ac.uk/450847/
 
Title AI3SD Video: Writing a good Abstract & Best Practices for Scientific Communication 
Description This talk forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video was the first talk in the Skills4Scientists #5 - Posters, Presentations & Reports Session, which focussed on several areas of communication for your research: presentations, posters and reports.
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 36 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL https://eprints.soton.ac.uk/450843/
 
Title AI3SD Video: Writing an ethics application 
Description This talk forms part of the Skills4Scientists Series which has been organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aims to educate and improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. This series is primarily aimed at final year undergraduates / early stage PhD students. This video was the second talk in the Skills4Scientists #7 - Ethical Research Session, which focussed on several areas of ethical research including discussions on why ethics is important and how to write an ethics application.
Type Of Art Film/Video/Animation 
Year Produced 2021 
Impact Forms part of our YouTube Channel. Has received 34 external views in addition to being part of our Skills4Scientists Series which was aimed at educating final year undergraduates / early stage PhD students. 
URL https://eprints.soton.ac.uk/451154/
 
Title AI3SD video: AI standardization to enable digital development 
Description Standardisation is often viewed hand in hand with legislation but there are significant differences in the purpose each serves. While both look to deliver a common level of safety and trust for industry as well as consumers, standards are voluntary in nature. The role of standards is to respond to common problems identified across industry, through collaborative consensus-based processes. Standards look to produce guidance to bridge gaps in understanding and practices across sectors and nations. In the fast evolving landscape of AI, standards look to build a solid foundation of trust through common languages and understandings, for industries to evolve through innovation without hindering design or commercial favouritism. BSI hosts an established committee with an ever growing membership including an extensive range of representation from academia, industry, to regulators that champion UK positions at European and international level. This presentation will explore the landscape of how standards are developed and the impact they will have on the AI community.
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 27 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/469295/
 
Title AI3SD video: AI4SD & IoFT AI for ethics working group: introducing the working group & our methodologies: moral IT cards & design fiction. 
Description Our Ethics for AI Working group was born out of a meeting held by the Internet of Food Things Network just before the pandemic to consider research challenges for developing a data trust system for the food sector. Our group came together through a shared interest in Ethics, and formed a working group to consider the ethical dimensions of digital collaboration in the food sector, such as the unintended consequence of AI. We have run several workshops and produced several papers as part of this work (1 published, 2 submitted awaiting review, 3 in progress). We used two key methodologies to explore these ethical issues: Design Fiction and the Moral IT Cards. Whilst our focus was on the food industry, we found these methods extremely useful in terms of considering different ethical aspects of technology and as such we believe they would be of great use to the wider scientific community. Our talk will introduce the working group and our activities, and explain how our methodologies can be put into practice for any Ethical Project. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 20 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/469296/
 
Title AI3SD video: Bayesian optimisation in chemistry 
Description Recent work on the problem of optimising the yield of a chemical reaction has focused on Bayesian optimisation methods. We extend this work in several directions by: determining the effect on the performance of the optimiser of altering the acquisition function and batch size; testing the robustness of the optimiser by applying it to other existing reaction yield data sets; and applying the optimiser to the new problem domain of molecular power conversion efficiency in photovoltaic cells. The talk will give an overview of the results obtained from the project, including how this may guide future developments in this area. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 74 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/469301
 
Title AI3SD video: curated large inorganic datasets of reconnected InChI, InChI and IUPAC name 
Description Speaking about my experience with the Skills4Scientists Seminar series on e-chemistry, networking and careers and the AI4SD internship itself. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 19 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/469299
 
Title AI3SD video: development of a full stack for digital R&D in chemistry and chemical process development 
Description In order to enable seamless access to AI tools in research, it is necessary to transform how our laboratories are equipped. AI requires access to data, and it takes too long to gain access and to clean up datasets. Our experimental hardware is not wired and is not accessible to algorithms. What is required is the development of a data architecture that enables access to experimental and literature data for both human-in-the-middle and fully algorithmic research tasks. In this talk I'll present our joint effort with the group of Prof Markus Kraft to implement a knowledge graph for ML workflows in chemical synthesis development, and the work at the iDMT centre in Cambridge on expanding this to fully digital R&D in molecular sciences.
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 66 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/469294/
 
Title AI3SD video: harnessing advanced algorithms to enable the automated optimisation of telescoped chemical reactions; performance directed self-optimisation of bimetallic nanoparticle catalysts 
Description The catalytic performance of nanoparticles is dependent on an extensive number of properties, reaction conditions and combinations thereof; however, very few methods employing multivariate closed-loop optimisation of nanoparticle catalysts have been reported to date. Here we demonstrate a machine learning-driven reactor platform for the performance directed synthesis of nanoparticle catalysts. Our experimental strategy uses an automated two-stage continuous flow reactor with decoupled residence times, allowing the precise synthesis of gold-silver nanoparticles (AuAgNP) with variable metal compositions, and subsequent performance analysis using a 4-nitrophenol reduction reaction. Quantification of the reaction conversion using inline UV-Vis spectroscopy enables the direct observation of the catalyst performance in real time, providing an efficient response for the performance directed synthesis of the most catalytically active nanoparticles. This approach paves the way for the rapid synthesis and optimisation of new nanoparticle catalysts, thereby streamlining the development of sustainable chemical processes. In terms of algorithm development, as many real-world optimisation problems consist of multiple conflicting objectives and constraints which can be composed of both continuous and discrete variables, we are addressing a significant issue. Given the inherent nature of continuous variables, i.e. they have a real value in the desired optimisation range of the variable, they are often easier to explore with a wider array of applicable optimisation techniques. Discrete variables, however, can take the form of integer values or categorical values (materials, reaction solvents). In many cases these optimisations can be expensive to evaluate in terms of time or monetary resources. It is therefore necessary to utilise algorithms that can efficiently guide the search towards the optimum set of conditions for a given problem to reduce costs. He will harness a mixed variable multi-objective Bayesian optimisation algorithm to tackle the problem of simultaneously exploring continuous and discrete variables in the same optimisation, which, if successful, will represent a step change in this AI field.
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 53 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/469297/
 
Title AI3SD video: internship talk - making music with automated processes and AI for the AI3SD Network 
Description Making the music for the AI3SD Network was an exciting and daunting project. In my talk I'll walk through the process I used to find and create ideas for the music, while talking about the sounds and technologies I used to make a successful musical package. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 26 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/469300
 
Title AI3SD video: internship talk-high-throughput generation of chemical isomers for the development of molecular models of biocrude oils 
Description The identification of chemical species in complex fluid materials like biocrude oils is a problem that can be largely solved by a computational optimisation of a molecular design space to expand the limited experimental data. This is especially useful owing to the intrinsic difficulty of characterising these bitumen-like materials. We used available experimental data to generate molecular models of any biocrude oil from different biomass sources (e.g., chitin, coffee grounds, algae), and we expanded the molecular space beyond the initial characterisation to constitute structural datasets for the training of machine learning (ML) algorithms. We have developed an algorithm to automate the generation of structural isomers for any given molecule, as well as to perform high throughput DFT calculations and to provide the lowest energy molecular structures, the associated electron density, and the reactivity of each atom in the molecules, which is used to suggest biocrude upgrading models. The DFT results will constitute a series of ab-initio-refined models and expanded datasets for training ML algorithms in the future. These models can be used for further computational studies, like molecular dynamics simulations.
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 46 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/469298/
 
Title AI3SD video: physical sciences data infrastructure: shaping the physical sciences roadmap 
Description Digital technologies and computational resources are being utilised in scientific research at an increasing rate; however, the vast potential of these resources has yet to be realised. The physical sciences data infrastructure (PSDI) project aims to accelerate research in the physical sciences by providing a data infrastructure that brings together and builds upon the various data systems researchers currently use. This project is currently funded through the EPSRC digital research infrastructure funding to undertake a period of community engagement and requirements gathering. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of the AI4SD YouTube Channel. Has received 326 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. This video was a talk introducing the Physical Sciences Data Infrastructure (PSDI) Initiative to the AI4SD Community, many of whom were very interested in this project. 
URL https://eprints.soton.ac.uk/469293/
 
Title AI3SD video: the summer school, but not as we know it! 
Description How to engage with a summer school in the COVID 19 world. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 23 external views in addition to being part of our AI4SD Network Conference, that was run to mark the end of our Network Funding and report back on the activities of the Network. 
URL https://eprints.soton.ac.uk/id/eprint/469302
 
Title AI4SD Video: (Reproducible) Data Visualisation with R and how to make interactive things with R 
Description This video forms part of the 'Failed it to Nailed it' series. This series is run by the Artificial Intelligence for Scientific Discovery Network+ (AI4SD), the Cell Press Patterns Journal and the Physical Sciences Data-Science Service (PSDS). In this talk Charlie introduced the bare bones of doing reproducible data visualisation with R using the {ggplot2} package. She also introduced both html widgets and shiny as tools for making interactive things with R. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 64 external views in addition to being part of our Failed it to Nailed it Seminar Series. This was a collaboration between AI3SD, PSDS and Patterns to educate on and discuss different aspects of data sharing based on a survey run in 2020. 
URL https://eprints.soton.ac.uk/id/eprint/472635
 
Title AI4SD Video: Data Visualisation in Publishing & Communication 
Description This video forms part of the 'Failed it to Nailed it' series. This series is run by the Artificial Intelligence for Scientific Discovery Network+ (AI4SD), the Cell Press Patterns Journal and the Physical Sciences Data-Science Service (PSDS). This talk will discuss different aspects of data visualisation in publishing. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 27 external views in addition to being part of our Failed it to Nailed it Seminar Series. This was a collaboration between AI3SD, PSDS and Patterns to educate on and discuss different aspects of data sharing based on a survey run in 2020. 
URL https://eprints.soton.ac.uk/id/eprint/472638
 
Title AI4SD Video: Data Visualisation with Python 
Description This video forms part of the 'Failed it to Nailed it' series. This series is run by the Artificial Intelligence for Scientific Discovery Network+ (AI4SD), the Cell Press Patterns Journal and the Physical Sciences Data-Science Service (PSDS). We'll be covering how to create different types of plots in Matplotlib, ranging from simple line graphs to interactive 3D visualisations, before finishing with a demonstration of how to integrate RDKit with Matplotlib in order to show interesting properties of chemical structures. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 25 external views in addition to being part of our Failed it to Nailed it Seminar Series. This was a collaboration between AI3SD, PSDS and Patterns to educate on and discuss different aspects of data sharing based on a survey run in 2020. 
URL https://eprints.soton.ac.uk/id/eprint/472636
 
Title AI4SD Video: Introduction to Data Visualisation 
Description This video forms part of the 'Failed it to Nailed it' series. This series is run by the Artificial Intelligence for Scientific Discovery Network+ (AI4SD), the Cell Press Patterns Journal and the Physical Sciences Data-Science Service (PSDS). This talk will provide an introduction to different aspects of data visualisation. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 31 external views in addition to being part of our Failed it to Nailed it Seminar Series. This was a collaboration between AI3SD, PSDS and Patterns to educate on and discuss different aspects of data sharing based on a survey run in 2020. 
URL https://eprints.soton.ac.uk/id/eprint/472631
 
Title AI4SD Video: Introduction to EPSRC and funding opportunities 
Description This presentation will cover a brief overview of UKRI and EPSRC highlighting the different funding opportunities at EPSRC and explain the application and peer review process. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 19 external views in addition to being part of our AI4SD ECR Conference, that was specifically designed to inform, upskill and facilitate networking between Early Career Researchers. 
URL https://eprints.soton.ac.uk/id/eprint/472130
 
Title AI4SD Video: Introduction to equality, diversity and inclusion and development of your code of conduct 
Description This session will explore what equality, diversity and inclusion means, what EDI can look like in research and why this is important. The session will also have an interactive element to help you create a code of conduct for your event. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 6 external views in addition to being part of our AI4SD ECR Conference, that was specifically designed to inform, upskill and facilitate networking between Early Career Researchers. 
URL https://eprints.soton.ac.uk/472341/
 
Title AI4SD Video: Opportunities for ECRs in the Royal Society of Chemistry 
Description Develop your career beyond your research by exploring your career options and how to get involved with the RSC. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 9 external views in addition to being part of our AI4SD ECR Conference, that was specifically designed to inform, upskill and facilitate networking between Early Career Researchers. 
URL https://eprints.soton.ac.uk/id/eprint/472131
 
Title AI4SD Video: Reproducibility, Jupyter notebooks and associated research software engineering 
Description The talk will introduce the topic of reproducibility in science: what is reproducibility, why does it matter and why is it so hard to achieve? It will discuss abstract requirements for reproducibility and try to connect these to concrete measures we can take in day-to-day research to make our results more reproducible. The potential of Jupyter Notebooks and the Jupyter ecosystem is explored. This talk will be of interest to everyone who wants to ensure that the computational aspects of their research are reproducible and re-usable and to see how this can help meet the requirements of journals for manuscript submission and for UKRI Responsible Research and Innovation (RRI). Following good practice for reproducibility will also make research life easier, especially when writing up work. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 45 external views and was broadcast as a special AI4SD Seminar. 
URL https://eprints.soton.ac.uk/472779/
 
Title AI4SD Video: Research to Startup 
Description This video is the third talk that was given for the AI4SD ECR Event 2022. Samuel Munday is a University of Southampton graduate who is now an ECR in the Chemistry department and co-founder of Data Revival. In this talk he describes his journey so far of taking cutting edge AI research to the market, the hurdles he's faced along the way and some of the fantastic support he's had from within the University. He goes into detail of how the idea for the spin out started, how he used the enterprise ecosystem within Southampton to both validate the viability of the proposition and to upskill in all things business before finally finishing with what the future holds for him and the company as they move away from the University. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 14 external views in addition to being part of our AI4SD ECR Conference, that was specifically designed to inform, upskill and facilitate networking between Early Career Researchers. 
URL https://eprints.soton.ac.uk/472332/
 
Title AI4SD Video: SciData: Semantic representation of scientific data and applications in chemistry 
Description Findable, Accessible, Interoperable and Reusable (FAIR) is a paradigm shift in how we should make our research data available and useful for other scientists. The SciData framework (https://stuchalk.github.io/scidata/) is a specification for constructing JavaScript Object Notation for Linked Data (JSON-LD) documents that are semantically encoded, giving meaning to research data and its contextual metadata. This talk will cover the basics of how to create SciData JSON-LD using Python, some example data files for different types of data, use of the file format as the basis of a digital research notebook, and digital chemical twins. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 44 external views and was broadcast as a special AI4SD Seminar. 
URL https://eprints.soton.ac.uk/471458/
 
Title AI4SD video: Create a killer CV 
Description Your CV's role is to get you an interview... and that's it. Find out how to create a killer CV. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 11 external views in addition to being part of our AI4SD ECR Conference, that was specifically designed to inform, upskill and facilitate networking between Early Career Researchers. 
URL https://eprints.soton.ac.uk/id/eprint/472134
 
Title AI4SD video: Transitioning to industry: A long, short road 
Description Academic and industrial research are different environments in many ways, yet still share many points of similarity. Some people make a permanent leap, while some decide to leap backwards and forwards every few years, or even comfortably straddle both worlds. Will shall be discussing the pros, cons, trials and tribulations in a very candid way, touching on general skills, but also personal experience. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Forms part of our YouTube Channel. Has received 9 external views in addition to being part of our AI4SD ECR Conference, that was specifically designed to inform, upskill and facilitate networking between Early Career Researchers. 
URL https://eprints.soton.ac.uk/id/eprint/472133
 
Title The (long) journey from supporting information to Publishing and Finding FAIR data in chemistry 
Description Electronic supporting information had its origins in the early to mid 1990s and it has evolved in a highly ad hoc manner since then. The concept of FAIR data arose about five years ago to try in part to rationalise the chaotic state of ESI. The talk will illustrate these developments by presenting a case study illustrating how one (either human or AI) might use the properties of FAIR to "F"ind some highly focused chemical spectroscopic and computational data. I will conclude by trying to unpick some of the supporting infrastructures which enable this and how the creators of the data facilitate this by using metadata to describe and then publish the data. The talk incorporates some elements of FAIR by having its own metadata and its own persistent identifier (as a DOI): https://doi.org/ff6g so that you can yourself Find, Access, Interoperate, Re-use and Cite it as appropriate. 
Type Of Art Film/Video/Animation 
Year Produced 2020 
Impact Part of our Failed it to Nailed it Seminar Series. This was a collaboration between AI3SD, PSDS and Patterns to educate on and discuss different aspects of data sharing based on a survey run in 2020. 
URL https://data.hpc.imperial.ac.uk/resolve/?doi=7629
 
Description A recurring feature at many of the AI3SD Network+ meetings is the lack of the necessary amounts of quality chemical and materials data with which to build AI and ML models.

Interdisciplinary working is key with respect to projects that work across the domains of chemistry and computer science, and it is vital that we equip our scientists with the necessary technical skills and underpinnings to ensure that the software they produce is robust and of good quality.

We also learnt a lot about running a successful, engaging and diverse interdisciplinary Network.
Exploitation Route The use of cutting edge AI to further cutting edge scientific development will help academia and industry meet the challenges of the UN SDGs. Engaging the AI/ML community in academic science is hard, but when achieved it has been highly rewarding and has taken both the science and AI/ML forward. The emphasis on quality data, the need for trust, and ethical approaches are all areas that can be taken forward. Quality, reliable software is also essential to continue the growth of AI in Science. The UKRI AI scene has become broad and extensive and the lessons from the Network will be vital to the success of the field.

Based on our findings regarding the need for training and education, we ran two Machine Learning Summer Schools, one virtually in Summer 2021 and one as a hybrid event in Summer 2022. Across these two schools we supported over 120 students to upskill them in different aspects of Machine Learning and other useful skills drawn from our Skills4Scientists series.

The AI4SD Team worked together with the Network of Networks to produce a resource on running a Network: https://network-mgmt.ai3sd.org/ . This resource has been produced by a group of diverse research management professionals, representing different disciplines and organisations to aid Network Managers and Investigators in the creation and management of research communities. They have drawn together their collective experience in managing different aspects of these NetworkPluses to produce advice and tips for best practices.

ED&I is clearly an issue for every Network+. At AI4SD we have tried to address it, although it is very hard to track it in a non-invasive manner. As part of our work to address ED&I issues, we collaborated with the Network of Networks Group to run an ED&I Survey across 2021 and 2022 and have been using the results to influence how we run our Network. We have also run and taken part in ED&I workshops, in addition to running an ECR workshop with an ED&I discussion group.
The results of our ED&I survey were written up as part of a conference presentation: Chandler-Wilde, S., Kanza, S., Fisher, O., Fearnshaw, D. and Jones, E., 2022. Reflections on an EDI Survey of UK-Government-Funded Research Networks in the UK. We hope these findings will be useful to others in the future.
Sectors Agriculture, Food and Drink,Chemicals,Digital/Communication/Information Technologies (including Software),Education,Energy,Environment,Manufacturing, including Industrial Biotechnology,Pharmaceuticals and Medical Biotechnology

URL https://www.ai4science.network/
 
Description The Network membership is now over 1500 (via the email list) with over 900 followers on Twitter and over 24,000 views of the YouTube videos (the channel has over 800 subscribers). These membership numbers are from March 2023. The membership is drawn from academia, industry, and government, and has a significant international component. The network online events and the recent hybrid conference demonstrated both the breadth and extent of the interactions promoted by the network, breaking down silos and promoting cutting edge work in science and computer science, and showing the future of scientific discovery in the digital age. Throughout the Network term, we funded 7 pilot projects. We also ran two Machine Learning Summer Schools, one virtually in Summer 2021 and one as a hybrid event in Summer 2022. Across these two schools we supported over 120 students to upskill them in different aspects of Machine Learning and other useful skills drawn from our Skills4Scientists series. We asked the people we funded to provide an impact statement about how our funding helped them and surveyed the summer school students to understand their experiences. Overall, our funding seems to have had a very positive impact on those who received it. Most relevant impact statements: Keith Butler (Funded in our second funding call): The work funded by AI3SD formed the basis of an on-going programme of work that has since been funded by an Innovate A4i grant and an STFC Cross-Cluster grant with the ISIS Neutron and Muon Source. We have also published a paper based on the work started with this grant (https://www.nature.com/articles/s41524-021-00542-4) and the methods developed are being applied by Finden in analysis of real industrial tomography data. Paul Dingwall (Funded in our second funding call): Most importantly though, this funding has allowed me to build a collaboration with a colleague in Computer Science at QUB. 
There are very few instances, that I am aware of, of collaboration across Chemistry and Computer Science and this will hopefully herald the start of a long and fruitful partnership. Dr Grant Hill (Funded as part of our second funding call): The funding of our project has helped define new research directions for my group, all of which are associated with artificial intelligence and/or automated design in the physical sciences. The project itself was concerned with the design of new materials, but the skills and knowledge acquired during this project have been incorporated into other activities, which has led to a successful EPSRC proposal on machine learning basis sets in quantum chemistry (grant EP/T027134/1). The visibility afforded by the award and associated network activities, including invited conference talks, has enabled me to grow my personal network of researchers using AI for scientific discovery and has culminated in me leading a "Creating and using molecules and materials" interest group of the Sheffield AI network (affiliated with the Turing Institute). Additional impact statements from our funded projects can be found in our final report: https://eprints.soton.ac.uk/474022/, which include applicants applying for further funding based on the results, publications and collaborations created as part of their funded AI4SD pilot projects. The success of the network has encouraged many applications to the UK AI Hubs in Chemistry and Materials and promoted the statement of need for the Physical Sciences Data Infrastructure project (PSDI) which is now being funded.
First Year Of Impact 2021
Sector Agriculture, Food and Drink,Chemicals,Digital/Communication/Information Technologies (including Software),Education,Energy,Government, Democracy and Justice,Manufacturing, including Industrial Biotechnology,Pharmaceuticals and Medical Biotechnology
Impact Types Cultural,Economic

 
Description An EPSRC National Research Facility to facilitate Data Science in the Physical Sciences: The Physical Sciences Data science Service (PSDS)
Amount £2,996,067 (GBP)
Funding ID EP/S020357/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 01/2019 
End 01/2024
 
Description Arctoris 
Organisation Arctoris
Country United Kingdom 
Sector Private 
PI Contribution We organised and ran some online event series that Arctoris sponsored.
Collaborator Contribution Arctoris sponsored our AI4Proteins series - http://www.ai3sd.org/ai4proteins and our Autumn Seminar Series: https://www.ai3sd.org/ai3sd-online-seminar-series/autumn-seminar-series-2021/ They have also given talks at various events.
Impact Arctoris are still interested in sponsoring further relevant events and being involved with our Network.
Start Year 2021
 
Description BSI 
Organisation British Standards Institute (BSI Group)
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution We ran an event which BSI sponsored, exhibited at and gave a talk at.
Collaborator Contribution BSI sponsored and exhibited at our AI4SD Conference and gave a talk about standards for AI/ML.
Impact BSI are interested in working with us in the future, including sponsoring/exhibiting at future events. We are also planning to run a standards workshop with them.
Start Year 2022
 
Description CAS 
Organisation American Chemical Society
Country United States 
Sector Academic/University 
PI Contribution We have run a few events which CAS exhibited at, sponsored, and gave a talk at.
Collaborator Contribution CAS sponsored our AIReact2020 and AI4SD2022 Conferences and gave talks at both.
Impact CAS are interested in working with us in the future, including sponsoring/exhibiting at future events.
Start Year 2020
 
Description CellPress Patterns 
Organisation Elsevier
Department Cell Press
Country United States 
Sector Private 
PI Contribution We helped organise the series, ran the events and produced some of the presentations for them. We have made all the outputs available via our Website and YouTube Channel.
Collaborator Contribution We collaborated with PSDS and Patterns to run a Failed it to Nailed it series about many different aspects of dealing with data: https://www.ai4science.network/ai3sd-online-seminar-series/data-seminar-series-2020/. Members of the Patterns team presented at some of these meetings and helped organise and run them.
Impact The full list of outputs can be found here: https://www.ai4science.network/ai3sd-online-seminar-series/data-seminar-series-2020/ With the formal DOI: 10.5258/SOTON/AI3SD0270
Start Year 2020
 
Description Digital Discovery 
Organisation Royal Society of Chemistry
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution We ran an event which Digital Discovery sponsored and gave a talk at.
Collaborator Contribution Digital Discovery sponsored our AI4SD Conference and gave a talk about Open Access Publishing.
Impact Digital Discovery are interested in working with us in the future, including sponsoring/exhibiting at future events. They are also interested in commissioning some articles from us based on our research.
Start Year 2022
 
Description Dotmatics 
Organisation Dotmatics Limited
Country United Kingdom 
Sector Private 
PI Contribution We ran some events that Dotmatics sponsored and exhibited at.
Collaborator Contribution Dotmatics sponsored our Skills4Scientists careers and posters symposium and gave a recruitment talk: https://www.ai3sd.org/ai3sd-event/01-02-09-21-skills4scientists-poster-symposium/ Dotmatics also sponsored and exhibited at our AI4SD Conference and gave a talk about their work.
Impact Dotmatics are interested in working with us in the future, including sponsoring/exhibiting at future events. Dotmatics are moving into adding semantics to their ELN which is an area our Network is passionate about, and so we expect further collaboration to occur.
Start Year 2021
 
Description Internet of Food Things 
Organisation Food Standards Agency (FSA)
Country United Kingdom 
Sector Public 
PI Contribution The ITaaU Network brought the ideas of the Digital Economy, and in particular the potential role of digital and IT utilities, to all aspects of the food network.
Collaborator Contribution The FSA brought the concerns, data and information about food safety and security.
Impact We have held several joint workshops and have produced a report on IoT and Food
Start Year 2014
 
Description Merck 
Organisation Merck
Department Merck UK
Country United Kingdom 
Sector Private 
PI Contribution We organised and ran an event that Merck sponsored, exhibited at and spoke at.
Collaborator Contribution Merck sponsored and exhibited at our event: https://www.ai3sd.org/aireact2020/ - https://www.ai3sd.org/ai3sd-online-seminar-series/ml4mc-seminar-series-2021/
Impact Merck are still interested in sponsoring further events and being involved with our Network.
Start Year 2020
 
Description Mettler Toledo 
Organisation Mettler-Toledo Auto Chem
Country United States 
Sector Private 
PI Contribution We organised and ran an event that Mettler Toledo sponsored, exhibited at and spoke at.
Collaborator Contribution Mettler Toledo sponsored and exhibited at our event: https://www.ai3sd.org/aireact2020/ - https://www.ai3sd.org/ai3sd-online-seminar-series/ml4mc-seminar-series-2021/
Impact Mettler Toledo are still interested in sponsoring further events and being involved with our Network.
Start Year 2020
 
Description Optibrium 
Organisation Optibrium Ltd
Country United Kingdom 
Sector Private 
PI Contribution We organised and ran an event (The Skills4Scientists careers and posters symposium) which Optibrium sponsored.
Collaborator Contribution Optibrium sponsored our Skills4Scientists careers and posters symposium and gave a recruitment talk: https://www.ai3sd.org/ai3sd-event/01-02-09-21-skills4scientists-poster-symposium/
Impact Optibrium are still interested in sponsoring further relevant events and being involved with our Network.
Start Year 2021
 
Description Pistoia Organisation 
Organisation Poverty Alliance
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution Network organisation and ideas
Collaborator Contribution Network organisation, running sessions and ideas
Impact Contributions to several Network meetings
Start Year 2019
 
Description RSC-CICAG 
Organisation Royal Society of Chemistry
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution We have collaborated with the RSC-CICAG (Royal Society of Chemistry - Chemical Information and Computer Applications Group) over several events. We organised and ran the following joint events: - https://www.ai3sd.org/ai3sd-event/31-01-2020-ai3sd-osm-rsc-cicag-ai-and-ml-in-drug-discovery-predicting-bioactive-molecules-when-there-is-no-target/ - https://www.ai3sd.org/ai3sd-event/14-04-2021-ai3sd-winter-seminar-series-proteins/ - https://www.ai3sd.org/ai3sd-event/05-02-2021-ai3sd-rsc-cicag-ai-4-proteins-ii/ - https://www.ai3sd.org/ai3sd-event/05-02-2021-ai3sd-rsc-cicag-ai-4-proteins-iii/ - https://www.ai3sd.org/ai3sd-event/17-06-2021-ai3sd-rsc-cicag-protein-structure-prediction/ - https://www.ai3sd.org/ai3sd-event/01-02-09-21-skills4scientists-poster-symposium/ AI4SD organised the events and contributed funding. AI4SD contributed the report for the AI4Proteins event, which we commissioned: https://eprints.soton.ac.uk/452733/ AI4SD produced videos of all of the online events. AI4SD regularly contributes to the RSC-CICAG newsletters.
Collaborator Contribution RSC-CICAG contributed time to organising these events, and linked us up with a number of their contacts in industry to be sponsors and speakers at the events. Dr Chris Swain wrote the report for our first event together: https://eprints.soton.ac.uk/438123/ RSC-CICAG help spread our news via their newsletters.
Impact This collaboration is multi-disciplinary spanning across all the physical sciences, with a particular focus on chemistry and computer science. Reports - AI3SD, OSM & RSC-CICAG Predicting the activity of Drug Candidates when there is no target Workshop Report - http://dx.doi.org/10.5258/SOTON/P0020 - AI 4 Proteins: Protein Structure Prediction - http://dx.doi.org/10.5258/SOTON/AI3SD0176 Videos - AI4Proteins Series: https://www.youtube.com/watch?v=_LNsLNnEx7E&list=PLyeHH3bEQqIYL-qAtu5BadBUzbpFajIQ9&ab_channel=AI4ScientificDiscovery - Skills4Scientists Posters: https://www.youtube.com/watch?v=FI6imNOdw6g&ab_channel=AI4ScientificDiscovery Posters - https://www.ai3sd.org/s4s-symposium2021/posters/
Start Year 2020
 
Description Reaxys 
Organisation Elsevier
Country Netherlands 
Sector Private 
PI Contribution We organised and ran an event that Reaxys sponsored, exhibited at and spoke at.
Collaborator Contribution Reaxys sponsored and exhibited at our event: https://www.ai3sd.org/aireact2020/ - https://www.ai3sd.org/ai3sd-online-seminar-series/ml4mc-seminar-series-2021/
Impact Reaxys are still interested in sponsoring further events and being involved with our Network.
Start Year 2020
 
Description AI & ML in Chemical Discovery & Development 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact This was a joint networks event between AI3SD, Dial-a-Molecule, Directed Assembly Network and the University of Leeds. This was a residential event aiming to bring together stakeholders with different backgrounds, e.g. academic/industry, researchers/data owners, and chemists/engineers/computer scientists, to discuss applications of AI and Machine Learning in Chemical Discovery and Development. The event was made up of some priming talks to stimulate discussion, and a series of structured discussion sessions over the two days to form a general consensus on some key objectives and milestones to deliver the promised impacts of these important tools within the remit of the three networks.
Year(s) Of Engagement Activity 2019
URL https://eprints.soton.ac.uk/434338/
 
Description AI for Allergen Detection 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact This event was run by AI3SD (Artificial Intelligence and Augmented Intelligence for Automated Investigations for Scientific Discovery) and the IoFT (Internet of Food Things) Networks. It was organised in conjunction with the Food, Water and Waste Research Group run by Dr Nicholas Watson from the University of Nottingham. This workshop was centred around the usage of Artificial Intelligence in Allergen Detection and Smart Cleaning within Food Production; research areas that co-align between both AI3SD & IoFT. The programme was made up of several presentations that were designed to report on the current state of affairs, and consider where we need to be going in the future. There were five main working group topics identified for this workshop, and talks were given on the different aspects that needed to be considered with respect to allergen detection and smart cleaning. Following this, the event broke up into the working groups for more formal discussions. There were multiple sessions for the working group discussions, ensuring that there were opportunities to take part in as many group discussions as the attendees wished. The workshop will be formally recorded and the suggestions for going forward will be captured in a position paper. There was plenty of time for networking, as there was both a lunch and drinks session included as part of the day.
Year(s) Of Engagement Activity 2019
URL https://eprints.soton.ac.uk/437031/
 
Description AI in Drug Discovery and Drug Safety Workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Drug discovery is a long and long-term scientific investigation involving interdisciplinary research methods coupled with large heterogeneous datasets. The research and data space in this area is vast, and AI3SD and MDC believe that the use of AI and machine learning technologies can help spur on advances in this domain. The current workshop was designed to draw together those with a keen interest in using AI and machine learning technologies in the domain of drug discovery, both to aid future drug discovery, and to help improve drug safety. AI3SD firmly believes that interdisciplinary collaboration is the key to many of these advances. At the workshop, keynote talks were interspersed with general group discussions and working groups around the key topics that arose.
Year(s) Of Engagement Activity 2019
URL https://eprints.soton.ac.uk/432067/
 
Description AI in Materials Discovery Workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact This workshop was a half-day event hosted at the University of Southampton, with the focus of the event being the use of AI in materials discovery and innovation. The event was structured with an introductory talk followed by 5 keynote sessions from experts in the materials field. These talks were separated by a coffee break giving participants time to network. This workshop is one in a series of events hosted by the AI3SD Network designed to provide a platform for discussion, innovation and collaboration for using AI in scientific discovery. The events bring together researchers and experts to disseminate knowledge and allow new connections to flourish.
Year(s) Of Engagement Activity 2019
URL https://eprints.soton.ac.uk/432068/
 
Description AI3SD & Directed Assembly ML4MC Summer School 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact In the summer of 2021, AI3SD teamed up with the Directed Assembly Network to run a virtual summer school on Machine Learning 4 Materials & Chemicals which encompass our overlapping Network interests of AI, Machine Learning, Artificial Photosynthesis, Biomimetic Materials, Crystal Design & Engineering, Materials, Molecules, Photochemistry, Photocatalysis and Supramolecular Chemistry. Overall, most students said they were very satisfied with the summer school, but they gave us some useful feedback which we put into place for our second summer school. The students noted that they wanted the school to start with some more introductory sessions and raised a desire for some more hands on practical sessions with more interaction with their mentors. When we asked if they would want the school to run physically virtually or hybrid the year after they all voted for either in person or hybrid.
Year(s) Of Engagement Activity 2021
URL https://www.ai3sd.org/ai3sd-online-seminar-series/ml4mc-seminar-series-2021/
 
Description AI3SD & PSDS Skills4Scientists Series 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Undergraduate students
Results and Impact This series was organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI3SD) and the Physical Sciences Data-Science Service (PSDS). This series ran over summer 2021 and aimed to educate and improve scientists skills in a range of areas including research data management, python, version control, ethics, and career development. This series was primarily aimed at final year undergraduates / early stage PhD students.
Year(s) Of Engagement Activity 2021
URL https://eprints.soton.ac.uk/453198/
 
Description AI3SD & RSC-CICAG AI4Proteins Series 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact AI3SD collaborated with RSC-CICAG (The Royal Society of Chemistry - Chemical Information and Computer Applications Group) to run an #AI4Proteins Seminar Series in 2021.
Year(s) Of Engagement Activity 2021
URL https://eprints.soton.ac.uk/452733/
 
Description AI3SD Autumn Seminar Series 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact After the success of our summer and winter seminar series we decided to run an autumn series. These sessions comprised of 2-3 talks each, and we ran 10 sessions between October - December 2021.
Year(s) Of Engagement Activity 2021
URL https://www.ai3sd.org/ai3sd-online-seminar-series/autumn-seminar-series-2021/
 
Description AI3SD Machine Learning for Chemistry Training Workshop & Hackathon 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Undergraduate students
Results and Impact This training workshop and hackathon was intended to help upskill scientists in AI and Machine Learning techniques for Chemistry and provide some challenges to test out their new skills.
Year(s) Of Engagement Activity 2019
URL https://www.eventbrite.co.uk/e/ai3sd-machine-learning-for-chemistry-training-workshop-hackathon-tick...
 
Description AI3SD Network+ Conference 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact The second big conference was run in November 2019 to showcase our first year of activities and report on our funded projects. This conference ran over 2 days at the Winchester Holiday Inn, and featured some keynote industry speakers including Dr Andrew Senior from DeepMind to discuss Alphafold and Dr Richard Tomsett from IBM who talked about AI and Explainability. The conference also featured keynotes from academics working in highly relevant areas such as Dr Lucy Colwell from the University of Cambridge discussing protein prediction and Professor Juan Garrahan from the University of Nottingham discussing his work in non-equilibrium physics and machine learning methods. There was a wide range of additional presentations and a number of posters were also presented, demonstrating the wide range of work in the AI for Scientific Discovery area and showcasing the need for the Network to bring these communities together. A blog post about the event can be found on our website: https://www.ai4science.network/2019/11/18/ai3sd-conference-blog-post/ written by Michelle Pauli who also produced a formal report on the event: https://eprints.soton.ac.uk/444601/
Year(s) Of Engagement Activity 2019
URL https://eprints.soton.ac.uk/444601/
 
Description AI3SD Network+ Town Meeting & Funding Workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact This meeting was organised to provide useful information about both the AI3SD-FundingCall2 and funding call applications in general. There were talks from experts in the areas of IP for AI, there were be a top tips for writing your funding application session where we provided advice on strengthening your applications based on previous experience of reviewing funding applications. There was also be an opportunity to ask questions about our second funding call and to find other people / institutions to collaborate with. All interested parties were strongly encouraged to come along, as we hoped that this event would not only answer any questions people have, but also offer an opportunity to match up companies with academic institutes for collaboration on projects proposals. All questions and answers were written up and added to our Funding Call Page as an FAQ.
Year(s) Of Engagement Activity 2019
URL https://www.eventbrite.co.uk/e/ai3sd-network-town-meeting-funding-workshop-tickets-65421219629
 
Description AI3SD Seminar: Quantum Computers: a guide for the perplexed 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Andy Stanford-Clark introduced the mind-bending principles of quantum computing, give some history of the technology, and describe potential application areas for quantum computers. He took us on tour inside a real quantum computer, and explained how you can get free hands-on experience of IBM's quantum computer, and start to learn how to program these exciting new machines.
Year(s) Of Engagement Activity 2019
URL https://www.eventbrite.co.uk/e/ai3sd-seminar-quantum-computers-a-guide-for-the-perplexed-tickets-758...
 
Description AI3SD Summer Seminar Series 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact During the first COVID-19 Lockdown we organised a virtual summer seminar series. This comprised 14 talks: https://www.ai3sd.org/ai3sd-online-seminar-series/summer-seminar-series-2020/. We vastly expanded our Network membership and engagement reach through this series.
Year(s) Of Engagement Activity 2020
URL https://eprints.soton.ac.uk/453248/
 
Description AI3SD Winter Seminar Series 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact After the success of our summer seminar series we decided to run a winter series during the second lockdown. These sessions comprised of 2-3 talks each, and we ran 10 sessions between November 2020 and April 2021.
Year(s) Of Engagement Activity 2021
URL https://www.ai3sd.org/ai3sd-online-seminar-series/winter-seminar-series-2021/
 
Description AI3SD, Dial-a-Molecule & Directed Assembly: AI for Reaction Outcome and Synthetic Route Prediction 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact This was a joint meeting between the Dial-a-Molecule, Directed Assembly and AI3SD (Artificial Intelligence and Augmented Intelligence for Automated Investigations for Scientific Discovery) Networks. The meeting will examine the state of the art and future opportunities in the use of Artificial Intelligence to predict the outcome of unknown chemical reactions, and consequently design optimum synthetic routes to desired molecules. A wide variety of AI approaches will be illustrated including expert systems, statistical methods, mechanism based and Machine Learning.
Year(s) Of Engagement Activity 2020
URL https://eprints.soton.ac.uk/441628/
 
Description AI3SD, OSM & RSC-CICAG: AI and ML in Drug Discovery: Predicting Bioactive Molecules when there is No Target 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact This one-day meeting concerns the application of machine learning/artificial intelligence (ML/AI) approaches to the discovery of new drug leads. Specifically the meeting is about cases where the biological target is not clearly established - so-called phenotypic drug discovery. The meeting centers on a real example - a competition run by Open Source Malaria (OSM), funded by a grant from the EPSRC/AI3SD+ Network. Data on active and inactive compounds in one OSM antimalarial series were published online, and anyone was able to submit a model able to predict the actives. The models were judged against a dataset that was kept private, and the winners were asked to use their models to predict novel molecules. These are currently being made in the lab and biologically evaluated, and the results will be reported at the meeting, providing a real-world test, and a complete case study, of the capabilities of ML/AI approaches to accelerate modern drug discovery.
We will hear from some of the eleven competition entrants about how their models were constructed, and will have other presentations on related developments. We hope during this meeting to establish which approaches worked well, which did not, and why. All those interested in the application of ML/AI methods to drug discovery are encouraged to attend.
Year(s) Of Engagement Activity 2020
URL https://eprints.soton.ac.uk/438123/
 
Description AI4SD ECR Meeting 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact In July we held a 2 day hybrid event at Chilworth Manor in Southampton. This event was for Early Career Researchers working across the domains of Computer Science and Chemistry. It was specifically designed to inform, upskill and facilitate networking between Early Career Researchers. The event contained talks on scientific publishing, ED&I, grant and fellowship applications, advice on CVs, networking and much more. There was also a dedicated time for networking.
Year(s) Of Engagement Activity 2022
URL https://www.ai4science.network/ai3sd-online-seminar-series/ai4sd-ecr-series-2022/
 
Description AI4SD Machine Learning Summer School 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Taking the feedback on board from the previous year, we ran our 2022 summer school for over 100 students (70 in person) as a week-long hybrid course at the University of Southampton and on Zoom. We organised some introduction to python sessions at the beginning of day 1 with an example Jupyter Notebook as was requested in the previous feedback. As with the previous year we split the students into teams for a hackathon, we brought in helpers for the hackathon sessions so that the groups could have hands on help.
Again, the summer school was very well received overall. There was still feedback that some students would have liked more introductory material at the beginning, especially more python training. Next time we run one of these events we are considering either running multiple ones at different levels, or potentially assigning a full 2 days to hands on phyton training at the beginning.

We also asked students what other sorts of events they would like to see us run, and there was a clear appetite for more summer schools, events on Python and ML for chemical applications, more basic training courses, and sessions on other coding languages such as R (which we subsequently addressed in one of our latest events).
Year(s) Of Engagement Activity 2022
URL https://www.ai4science.network/ai3sd-online-seminar-series/ai4sd-machine-learning-summer-school-2022...
 
Description AI4SD Network Conference 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact We ran a hybrid conference in March 2022 to round up the activities of the Network and talk about the future. This was a three day hybrid event at Chilworth Manor Hotel that opened with a retrospective from our PI Professor Jeremy Frey on our Network. The event featured a number of diverse sessions considering AI and ML in medicine, chemicals, molecules and materials, and talks on different techniques in these technologies, and other important aspects such as opacity, explainable AI and ethics. The event also featured three panel sessions which generated a lot of lively conversation, and presentations from a number of ECRs and PhD students who were funded by the Network either for pilot projects or internships. This conference was sponsored by Digital Discovery, CAS, Patterns, BSI and Dotmatics. A formal report was written on this event by Dr Wendy Warr and is available here: https://eprints.soton.ac.uk/471408/.
Year(s) Of Engagement Activity 2022
URL https://www.ai4science.network/ai3sd-online-seminar-series/ai4sd-conference-video-series-2022/
 
Description Artificial and Augmented Intelligence for Automated Scientific Discovery Network Launch Meeting 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Launch meeting attended by 120 people from academia, industrial, commercial, and governmental organisations raised considerable interest in the Network activities and the membership (as defined by the email list) is now 300.
Year(s) Of Engagement Activity 2018
URL https://eprints.soton.ac.uk/427810/
 
Description Exhibited for AI4SD at BSI Standards Awards 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Exhibited for AI4SD to raise awareness of the Network.
Year(s) Of Engagement Activity 2022
 
Description Failed it to Nailed it Seminar Series 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact This 'Failed it to Nailed it - Getting Data Sharing Right' series is a series of events run by the Artificial Intelligence for Scientific Discovery Network+ (AI3SD), the Cell Press Patterns Journal and the Physical Sciences Data-Science Service (PSDS). These events are a product of the data sharing survey we ran in 2020.
Year(s) Of Engagement Activity 2020
URL https://www.ai3sd.org/ai3sd-online-seminar-series/data-seminar-series-2020/
 
Description How to Use AI for Good? The Ethical and Societal Implications of Using AI in Scientific Discovery - an AI3 Science Discovery Network+ Workshop at the WebSci'20 Conference 07/07/2020 AI3 Science Discovery Network+ & WebSci'20 Online Conference 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact This workshop was run by AI3SD (Artificial Intelligence and Augmented Intelligence for Automated Investigations for Scientific Discovery) as part of the ACM Web Science 2020 Conference. This workshop was run online via zoom as it took place during the COVID-19 pandemic lockdown. The programme was made up of several related presentations that were designed to get the attendees thinking about different ethical considerations for AI in Scientific Discovery. Each presentation was followed by a short discussion session where the attendees could ask the speaker questions and could discuss the main topics of the presentation. The workshop also included an interactive activity using Moral IT Cards which were designed as a tool to help reflect on different ethical issues.
Year(s) Of Engagement Activity 2020
URL https://sites.google.com/site/ai3sdusingaiforgood/provisional-programme
 
Description ISTCP 2019: 10th Triennial Congress of the International Society for Theoretical Chemical Physics 11-17 July 2019 Tromsø, Norway 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact AI3SD funded an ECR from the Network to attend this conference, who produced a report.

This seven-day conference mainly consisted of a mixture of plenary lectures and parallel presentation sessions, with two poster sessions and a session on European Research Council funding. Each day had two coffee breaks and time for lunch, which, combined with the poster sessions, allowed time for networking. The conference was broad in its overall scope - covering most aspects of theoretical chemical physics / physical chemistry, but three sessions were dedicated to machine learning / AI. A number of talks within other sessions also contained significant AI for chemical discovery content too.
Year(s) Of Engagement Activity 2019
URL http://istcp-2019.org/program.html
 
Description Molecules Graphs & AI Meeting, 6th February 2018, Ageas Bowl Southampton 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Workshop to understand the state of the art in the use of molecular graphs to predict molecular and material properties
Year(s) Of Engagement Activity 2019
URL http://www.ai3sd.org/events/ai3sd-event-list/ai3sdmoleculesgraphsai
 
Description PLA2019, 7th Edition 9th - 10th April 2019 Grand Hotel Dino 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact This event was run by the Paperless Lab Academy which is organized by NL42 Consulting and is the 7th iteration of this congress, this year hosted again at Lake Maggiore in Italy. The central theme for this edition of the conference was #eDataLifeCycle @Work - turn your laboratory into a data-driven knowledge center. The congress was a two-day event comprised of a number of presentations, interactive workshops, and an expert panel discussion session. It was also possible to attend pre-congress training sessions on the day before the event. In the main congress the talks all ran consecutively so it was possible to attend each one, although the workshops ran 2-3 to a workshop session.

Members of AI3SD attended this event, networked with the attendees and produced a report.
Year(s) Of Engagement Activity 2019
URL https://www.paperlesslabacademy.com/agenda-2019/
 
Description PhenoHarmonIS Conference, 14-18th May 2018, Agropolis Scientific Park 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact PhenoHarmonIS is an invite only week long workshop run every two years that is focused on harmonizing agronomic data using semantic web technologies. The scientific domains that were represented in this workshop are conservation, breeding, crop traits, agronomy and agro-ecology. The invited participants ranged from agronomists in many different areas of agriculture to data scientists and ontologists working in the agricultural domain. The participants were encouraged to provide feedback on the standards and tools that they use in their data management and analysis to identify which tools are useful, and where the issues and gaps currently lie. The workshop was also intended to assess progress made since the last PhenoHarmonIS conference in 2016, to see what tools and standards have been adopted and developed by the community. The workshop was made up of presentations on a range of topics in the semantic agro space, interactive breakout sessions and some visits to other locations to demonstrate the use of technology in agriculture. The main presentations were all run sequentially such that it was possible to attend each one. A number of breakout sessions were run in parallel to one another across several days, and some ran across multiple sessions. The breakout sessions also included flash talks and presentations that linked specifically to that section (either a presentation of the tool to be discussed and evaluated, or projects that were related to the breakout topic). There were also two social events and plenty of time for networking and discussions outside of the presentations and breakout sessions. This was the format monday through thursday and the friday was used for specific project group meetings.

Members of AI3SD attended this event, networked with the attendees and produced a report. SK presented about a project we were working on in agricultural ontologies and was given some useful feedback.
Year(s) Of Engagement Activity 2018
URL https://sites.google.com/a/cgxchange.org/cropontologycommunity/2018-phenoharmonis
 
Description Presented AI4SD Poster at Royal Society of Chemistry Meeting 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact Presented our AI4SD poster at the RSC Ultra Large Chemical Libraries Meeting.
Year(s) Of Engagement Activity 2022
URL https://www.rsc.org/events/detail/73675/ultra-large-chemical-libraries
 
Description Re-Coding Black Mirror Workshop 30/01/2019 CPDP 2019 (Computers, Privacy & Data Protection Conference) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The intriguingly named 'Re-Coding Black Mirror' was a one day workshop at the CPDP Conference (Computers, Privacy & Data Protection) which had three main themes of addressing the Ethical and Societal challenges of digital technologies, considering Computer Science solutions against the misuse of technologies, and Technological approaches to prevent Black Mirror's dystopian future. The programme was made up of several sessions of presentations, followed by a group activity to design your own dystopian episode of Black Mirror. These events were all run consecutively so it was possible to attend each talk and take part in the group activities.

Members of AI3SD attended this event, networked with the attendees and produced a report.

This event led to locating new connections and collaborations at Nottingham University.
Year(s) Of Engagement Activity 2019
URL https://kmitd.github.io/recoding-black-mirror/cpdp2019.html
 
Description Report on Centre of Machine Intelligence Showcase, 26th October 2018, University of Southampton 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Centre of Machine Intelligence Showcase, 26th October 2018, University of Southampton. Showcase of many University research interest in Machine Intelligence and IoT etc
Year(s) Of Engagement Activity 2019
URL https://www.cmi.ecs.soton.ac.uk/who-we-are
 
Description Semantics Conference, 10-13th September 2018, Vienna Technical University (Report on meeting attended) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Report on the Semantics Conference, 10-13th September 2018, Vienna Technical University
Year(s) Of Engagement Activity 2018
URL https://2018.semantics.cc/
 
Description Semantics and Knowledge Learning for Chemical design 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Postgraduate students
Results and Impact This event was run by AI3SD and was designed to explore the different aspects of using semantic web technologies in the chemical space, and to promote discussions about why these technologies are so important and the different ways in which they can be used. The event was a full day, hosted at the Solent Conference Centre in Southampton. The programme was made up of a number of presentations, starting with the importance of provenance of data and a history of the semantic web over the last 20 years with some details of where we are now. The event then progressed to some more specific talks about using semantic web technologies in the chemical domain, to make predictions and to model different aspects of chemistry in ontologies. These presentations were all run consecutively so it was possible to attend each talk, and the final order of the day was an expert panel made up of the speakers. There was plenty of time for networking, as there was both a lunch and drinks session included as part of the day.
Year(s) Of Engagement Activity 2019
URL https://eprints.soton.ac.uk/432447/