Accelerating Theoretical Spectroscopy through Machine-learning

Lead Research Organisation: University of Warwick
Department Name: Physics

Abstract

Spectroscopic techniques are among the most powerful experimental tools we have for learning about crucial processes involving energy and charge transfer at the nanoscale. Such processes are central to the function of practically all electronic devices, e.g. semiconductors, batteries and photovoltaics, and of organic and biological systems, such as the light-harvesting complexes that enable photosynthesis. Spectroscopic tools are becoming ever more sophisticated, and their energy, spatial and temporal resolution is continually improving. For example, Warwick hosts the Warwick Centre for Ultrafast Spectroscopy, a state-of-the-art femtosecond laser facility with experiments ranging from the UV to the THz regime.

However, as tools become more sophisticated, it becomes ever harder to interpret their results. This is where quantum mechanical computational modelling techniques can be enormously beneficial. Theoretical spectroscopy predicts and explains the absorption and emission spectra of materials and molecules. From these we can, for example, routinely predict the colour of a dye purely from its molecular structure and environment. Theoretical spectroscopy can also help us understand photostability and reactions initiated by light absorption, such as those that cause dyes to degrade in UV light or those that cause the DNA damage leading to skin cancer. While state-of-the-art quantum mechanical simulation tools can deliver the properties of electronic excited states, they are very computationally expensive, and ab initio molecular dynamics on excited-state energy surfaces is not feasible for describing processes over timescales of more than a few picoseconds. Furthermore, large, complex systems such as heterogeneous interfaces, solvated molecules and biomolecules are a great challenge to traditional methods, whose scaling with system size can be poor.

To address these challenges, research by a UK-based team over the last decade has produced a robust implementation of Linear-Scaling Density Functional Theory in the form of the ONETEP code (www.onetep.org), which can perform quantum mechanical calculations at an unprecedented scale. Recently, Hine and co-workers at Warwick have added functionality for theoretical spectroscopy, including optical absorption, Electron Energy Loss Spectroscopy, and vibrational spectroscopy.

In this PhD project we will take these capabilities to a new level by coupling them with advanced machine-learning techniques, potentially accelerating both the electronic structure calculations themselves and dynamics based on potential energy surfaces derived from them. QM simulations can provide training data to machine-learning packages, enabling them to learn the excited-state potential energy surfaces of a system. A neural-network representation takes the vector of input coordinates, forms molecular descriptors, and outputs the energies and forces required for dynamics much more rapidly than any QM-based method possibly could. This enables rapid calculation of molecular dynamics trajectories, helping to understand the "fate" of photoexcited molecules in complex environments. Further possible application areas for ML techniques involve directly deriving an "optimal" set of local orbitals for a system from the positions of nearby atoms, accelerating otherwise costly computational optimisation.
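
The coordinates-to-descriptors-to-energies-and-forces pipeline described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the project's or ONETEP's actual method: the descriptors are plain inverse interatomic distances, the network is a single hidden layer with random (untrained) weights, and the forces come from finite differences. In practice the weights would be fitted to QM training data, and forces would usually be obtained by automatic differentiation.

```python
import math
import random

def descriptors(coords):
    """Hypothetical descriptor: inverse pairwise distances, which are
    unchanged by translating or rotating the whole molecule."""
    n = len(coords)
    return [1.0 / math.dist(coords[i], coords[j])
            for i in range(n) for j in range(i + 1, n)]

def init_params(n_in, n_hidden, seed=0):
    """Random, untrained weights; a real model would fit these to QM data."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.gauss(0.0, 0.5) for _ in range(n_hidden)]
    return w1, b1, w2

def energy(coords, params):
    """One-hidden-layer network mapping descriptors to a scalar energy."""
    w1, b1, w2 = params
    d = descriptors(coords)
    hidden = [math.tanh(sum(w * x for w, x in zip(row, d)) + b)
              for row, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden))

def forces(coords, params, step=1e-5):
    """Forces F = -dE/dR estimated by central finite differences."""
    result = []
    for i in range(len(coords)):
        f_atom = []
        for k in range(3):
            plus = [list(a) for a in coords]
            minus = [list(a) for a in coords]
            plus[i][k] += step
            minus[i][k] -= step
            dE = energy(plus, params) - energy(minus, params)
            f_atom.append(-dE / (2.0 * step))
        result.append(f_atom)
    return result

# Illustrative three-atom geometry (arbitrary units, not a real system).
coords = [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]]
params = init_params(n_in=3, n_hidden=4)
E = energy(coords, params)
F = forces(coords, params)
```

Because the descriptors depend only on interatomic distances, the predicted energy does not change when the molecule is translated, and the forces on all atoms sum to zero. These are the kinds of physical constraints a descriptor layer is meant to build in before any training takes place.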

The goals of this project are closely aligned with a range of sub-themes in the EPSRC Physical Sciences portfolio (chemical reaction dynamics, chemical structure, light-matter interactions, electronic structure and theoretical chemistry) and with the Artificial Intelligence and Research Infrastructure (Software Engineering) themes.

Planned Impact

Impact on Students. The primary impact will be on the 50+ PhD students trained by the Centre. They will be high-quality computational scientists who can develop and implement new methods for modelling complex systems in collaboration with scientists and end-users, who are comfortable working in interdisciplinary environments, and who have excellent communication skills and are well prepared for a wide range of future careers. The students will tackle exciting PhD projects with strong potential for direct impact and disseminate the results. Exemplar research themes we have identified together with our industrial and international partners are: (i) design of electronic devices, (ii) catalysis across scales, (iii) high-performance alloys, (iv) direct drive laser fusion, (v) future medicine exploration, (vi) smart nanofluidic interfaces, (vii) composite materials with enhanced functionality, (viii) heterogeneity of underground systems.

Impact on Industry. Students trained by HetSys will make a significant impact on UK industry, as they will be ideally prepared for R&D careers and will help to address the skills shortage in science and engineering. They will be in high demand for their ability to (i) work across disciplines, (ii) perform calculations accompanied by error estimates, and (iii) develop well-designed software, implementing novel solutions to scientific problems, that other researchers can readily use and modify. More generally, incorporating error bars into models to take account of incomplete data and insufficient models could lead to significantly enhanced adoption of materials modelling in industry, reducing trial and error and costly, time-consuming R&D procedures. The global market for simulation software is expected to more than double from now to 2022, indicating a very strong absorptive capacity for graduates. Moreover, a recent European Materials Modelling Consortium report identified a typical eight-fold return on investment for materials modelling research, leading to cost savings of 12M Euros per industrial project.

Impact on Society. Scarcity of resources and the high energy requirements of traditional materials processing techniques raise ever-increasing sustainability concerns. Limitations on jet engine fuel efficiency and the difficulties of designing materials for fusion power stations reflect the social and economic cost of our incomplete knowledge of how complex heterogeneous systems behave. High costs of laboratory investigations mean that theory must aid experiment to produce new knowledge and guidance. By training students who can develop the new methodology needed to model such issues, HetSys will support society's long-term need for improved materials and processes.

There will also be a direct impact locally and regionally through engagement by HetSys in outreach projects. For example, we will encourage CDT students to be involved with the annual 'Inspire' residential courses at Warwick for Year 11 girls, which show what STEM subjects are like at degree level. CDT students will present highlights from projects to secondary-school pupils during these courses and also visit local schools, particularly in areas currently under-represented in the student body, in coordination with relevant professional bodies.

Impact on Collaboration. Our international partners have identified the same urgent challenges for computational modelling. We will build flourishing links with research institutes abroad, with long-term benefit to UK research via our links to computational science networks. Shared research projects will strengthen links between academic staff and industry R&D personnel and across disciplines. The work will also lead to accessible, robust and reusable software. The Centre will achieve cross-disciplinary academic impact on the physical and materials sciences, engineering, manufacturing and mathematics communities at Warwick and beyond, and on the generation of new ideas, insights and techniques.

Publications

Studentship Projects

Project Reference | Relationship | Related To | Start | End | Student Name
EP/S022848/1 | | | 01/04/2019 | 30/09/2027 |
2228390 | Studentship | EP/S022848/1 | 01/10/2019 | 05/07/2024 | Carlo Maino