Data-driven Exploration of Metastable Molybdenum Chalcogenides

Lead Research Organisation: University of Oxford
Department Name: OxICFM CDT


The goal of this research is first to understand, and second to predict, the properties of useful materials by means of computer simulations. An example is the capacity of a material to store and transport lithium, which determines the life and maximum electrical power of a battery made from it.
Specifically, this research will develop Machine Learning (ML) approaches to materials modelling, which falls under the EPSRC area of manufacturing the future with artificial intelligence. ML is the science of computer algorithms that improve with experience rather than through the explicit intervention of a human programmer. Modern computer simulations of chemical systems can be highly accurate, whether in predicting useful properties such as Lithium capacity or unravelling the structure of complex crystals. However, the simulations usually involve solving (approximately) the equations of quantum mechanics, which requires prohibitively large computing power for models that exceed roughly 1000 atoms in size. We develop methods that 'learn' from quantum-mechanical data to reach that same level of accuracy in simulations, but with far less computing time. This provides access to length-scales in simulations that correspond much more closely to the situation in real devices, giving us new insights with which to rationally design and discover better materials.
One of the main themes of the project is the construction of the training database: a set of example chemical compounds that the ML models generalise from to predict the properties of new compounds. We will examine the following outstanding questions:
- What kind of data is required for sufficient learning? Is it enough to show the model lots of disordered liquid atoms for it to generalise to all the possible situations an atom might find itself in, or are e.g. solid crystals important too?
- Training data typically includes hundreds of thousands of examples of atoms in different environments. How do we evaluate their information content systematically? This is important because computer memory limitations restrict the number of data points that can be used directly for learning, so we must select a subset, ideally the most informative ones.
- How do we tell how accurate the ML model is? It is simple to compare the force on an atom predicted by the model with the value in the training data, which gives us one measure of the error of the model. However, this error does not seem to correlate well with the error in predictions of properties of the bulk material, e.g. how well it conducts heat. We are likely to be interested in such properties when designing a useful material, so it is important that we have confidence in the values we calculate for them.
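As a rough illustration of the last two questions, the sketch below shows one simple way to select a diverse subset of training points and one simple per-atom force error measure. It is a minimal sketch under stated assumptions, not the project's method: farthest-point sampling is swapped in here as one common selection heuristic, each atomic environment is assumed to be already encoded as a numeric descriptor vector, and the function names `farthest_point_sample` and `force_rmse` are hypothetical.

```python
import math


def force_rmse(predicted, reference):
    """Root-mean-square error over all force components.

    Both arguments are sequences of per-atom force vectors (e.g. [fx, fy, fz]).
    The result has the same units as the inputs.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    squared_sum = 0.0
    count = 0
    for p_vec, r_vec in zip(predicted, reference):
        for p, r in zip(p_vec, r_vec):
            squared_sum += (p - r) ** 2
            count += 1
    return math.sqrt(squared_sum / count)


def farthest_point_sample(descriptors, n_select, start=0):
    """Greedily pick n_select descriptor indices that are mutually far apart.

    Assumption: Euclidean distance between descriptor vectors is a useful
    proxy for how different two atomic environments are.
    """
    if not 0 < n_select <= len(descriptors):
        raise ValueError("n_select must be between 1 and len(descriptors)")
    selected = [start]
    # Distance from every point to its nearest already-selected point.
    min_dist = [math.dist(d, descriptors[start]) for d in descriptors]
    while len(selected) < n_select:
        # The next pick is the point farthest from everything chosen so far.
        nxt = max(range(len(descriptors)), key=min_dist.__getitem__)
        selected.append(nxt)
        min_dist = [
            min(m, math.dist(d, descriptors[nxt]))
            for m, d in zip(min_dist, descriptors)
        ]
    return selected
```

A low value of `force_rmse` on its own does not guarantee accurate bulk properties, which is the concern raised in the final question above.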

As a test case for the ML methods development, we will also study the compounds of the elements Molybdenum and Sulfur. The reasons for choosing these compounds in particular are twofold. Firstly, they are used extensively in the chemical industry as lubricants, to remove sulfur from crude oil, and to capture Mercury fumes before they escape into the environment. More recently, Lithium has been found to move freely between the sheets of their layered structure, enabling applications in battery materials. The many uses of Molybdenum Sulfides alone make understanding and predicting the relationship between their structure and properties important.
Secondly, there has been debate among scientists since 1975 over the true structure of MoS3: whether the Molybdenum atoms arrange themselves in triangles or in long chains. Because of the complexity of the disordered network formed by these units, the available experimental evidence is open to interpretation. Computational studies have been able to provide only limited assistance to date in understanding the experimental results because they cannot use models with enough atoms to correspond well to reality. We hope that the new ML methods will be able to resolve the conundrum, in the process proving both their accuracy and utility.



Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S023828/1                                   01/04/2019  30/09/2027
2404170            Studentship   EP/S023828/1  01/10/2020  30/09/2024  Joe Morrow