Deep Learning with Limited Data for Battery Materials Design
Lead Research Organisation: University College London
Department Name: Chemistry
Abstract
The discovery and design of new materials is critical for advancing the state of the art in batteries, which in turn are required for advancing a range of carbon-emission-reducing technologies such as renewable energy and electric vehicles. Experimental discovery of new materials is typically slow and costly, and quantum mechanics (QM) calculations have brought computational materials design within reach. However, QM calculations are often limited to relatively small sets of materials because their computational costs are too great for large-scale screening; this is the case for calculating the properties required for new battery materials. New methods in machine learning (ML) have emerged as a powerful complementary tool to QM calculations: they learn rules from QM-calculated data and apply cheap, efficient models to explore large chemical spaces. However, these ML models have hitherto been restricted to cases where relatively large datasets of QM properties (tens of thousands of instances or more) are available for training, which limits their utility. In this project we will combine the expertise of our two groups (ML for materials design and computational modelling of battery materials) to tackle this important issue using transfer learning (TL). In TL, a prior model trained on a large dataset, but on an apparently different problem, is used as a foundation for learning on a new, smaller dataset of direct relevance to the battery problem. TL has been transformative in many other fields, and with this project we aim to bring this potential to materials design in general and battery materials in particular.
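The pre-train then fine-tune idea can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration: a toy linear model stands in for a deep network, the synthetic weights `SOURCE_W` and `TARGET_W` stand in for a large QM dataset and a small battery-relevant one, and the step counts and learning rates are arbitrary. It is not the project's method or code.

```python
import random


def make_data(weights, n, noise, rng):
    """Draw n samples (x, y) with y = weights . x + Gaussian noise."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in weights]
        y = sum(w * xi for w, xi in zip(weights, x)) + rng.gauss(0.0, noise)
        data.append((x, y))
    return data


def mse(weights, data):
    """Mean squared error of a linear model on a dataset."""
    total = 0.0
    for x, y in data:
        pred = sum(w * xi for w, xi in zip(weights, x))
        total += (pred - y) ** 2
    return total / len(data)


def train(weights, data, steps, lr):
    """Full-batch gradient descent on the MSE, starting from `weights`."""
    w = list(weights)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += 2.0 * err * xj / len(data)
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


rng = random.Random(0)

# Hypothetical "true" weights: the source and target tasks are related but not identical.
SOURCE_W = [2.0, -1.0, 0.5]
TARGET_W = [2.2, -0.9, 0.4]

source = make_data(SOURCE_W, 500, 0.05, rng)       # large, cheap dataset
target_train = make_data(TARGET_W, 5, 0.05, rng)   # small dataset of interest
target_test = make_data(TARGET_W, 200, 0.05, rng)  # held-out target data

# Pre-train on the large source task.
pretrained = train([0.0, 0.0, 0.0], source, steps=100, lr=0.3)

# Fine-tune on the small target set, starting from the pre-trained weights.
finetuned = train(pretrained, target_train, steps=5, lr=0.1)

# Baseline: the same small training budget, but starting from scratch.
scratch = train([0.0, 0.0, 0.0], target_train, steps=5, lr=0.1)

mse_finetuned = mse(finetuned, target_test)
mse_scratch = mse(scratch, target_test)
print("fine-tuned test MSE:", mse_finetuned)
print("from-scratch test MSE:", mse_scratch)
```

In this toy setting the fine-tuned model should reach a lower held-out error than the one trained from scratch, because it starts close to the target solution and needs only a few updates.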
Publications
Devi R (2024) Optimal pre-train/fine-tune strategies for accurate material property predictions, in npj Computational Materials
| Description | We established a protocol for developing deep learning models that can accurately predict the properties of new materials despite a scarcity of data for these properties. Highly accurate deep learning models typically require very large amounts of training data. This is a bottleneck for many important properties in materials science, as there are simply not enough training examples to learn from. We developed a new transfer learning protocol - a procedure where one trains the deep learning model on a large available dataset and then 'fine-tunes' this model on a smaller, but related, dataset which pertains to the property of interest. Specifically, we showed that training the original model on many properties simultaneously resulted in more generalisable models that adapt more easily to new small datasets. This work has been published, and the open source code associated with it is available online. We think that this could have a big impact by bringing deep learning methods to many materials science datasets that were previously unsuitable for them. In particular, we are interested in applying this approach to properties of interest for battery materials. |
| Exploitation Route | This can be used in many areas of science where large, labelled datasets are hard to obtain. For example, it can be used in the discovery of new quantum materials in condensed matter physics or new herbicides in organic chemistry. |
| Sectors | Chemicals, Energy, Pharmaceuticals and Medical Biotechnology |
| URL | https://www.nature.com/articles/s41524-024-01486-1 |
| Title | Machine Learning Accelerated Monte Carlo for Small Angle X-ray Scattering |
| Description | This is Python code that uses Markov Chain Monte Carlo (MCMC) sampling with a neural network surrogate model to predict colloidal interaction parameters (effective charge and Debye length) from a Small Angle X-ray Scattering (SAXS) profile. |
| Type Of Material | Data analysis technique |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | None yet, but we are collaborating with others to implement this in an autonomous lab setup. |
| URL | https://github.com/mdi-group/saxs-mcmc/ |
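As a rough illustration of MCMC sampling driven by a fast surrogate, the sketch below runs a random-walk Metropolis sampler over two parameters. It is not taken from the saxs-mcmc repository: the `surrogate` function is a hypothetical analytic placeholder for a trained neural network (and is not a physical SAXS model), and the `q` grid, noise level `SIGMA`, prior bounds, and step size are all assumed values chosen for the example.

```python
import math
import random

Q_GRID = [0.1 * i for i in range(1, 21)]  # assumed scattering-vector grid
SIGMA = 0.05                              # assumed measurement noise


def surrogate(charge, debye_length):
    """Hypothetical placeholder for a trained neural-network surrogate.

    Maps (effective charge, Debye length) to a profile on Q_GRID.
    The functional form is illustrative only, not SAXS physics.
    """
    return [charge * math.exp(-q * debye_length) for q in Q_GRID]


def log_posterior(params, observed):
    """Gaussian log-likelihood with a flat prior on the box (0, 10) x (0, 10)."""
    if any(p <= 0.0 or p >= 10.0 for p in params):
        return float("-inf")
    predicted = surrogate(*params)
    return -0.5 * sum(((p - o) / SIGMA) ** 2 for p, o in zip(predicted, observed))


def metropolis(observed, start, n_steps, step_size, rng):
    """Random-walk Metropolis sampler; returns the list of visited states."""
    current = list(start)
    current_lp = log_posterior(current, observed)
    samples = []
    for _ in range(n_steps):
        proposal = [p + rng.gauss(0.0, step_size) for p in current]
        proposal_lp = log_posterior(proposal, observed)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        samples.append(list(current))
    return samples


rng = random.Random(0)
TRUE_PARAMS = (2.0, 1.5)  # synthetic ground truth for the demonstration
observed = [v + rng.gauss(0.0, SIGMA) for v in surrogate(*TRUE_PARAMS)]

samples = metropolis(observed, start=(1.0, 1.0), n_steps=10000, step_size=0.03, rng=rng)
kept = samples[2000:]  # discard burn-in
mean_charge = sum(s[0] for s in kept) / len(kept)
mean_debye = sum(s[1] for s in kept) / len(kept)
print("posterior mean charge:", mean_charge)
print("posterior mean Debye length:", mean_debye)
```

Because each likelihood evaluation calls only the cheap surrogate, many thousands of sampling steps are affordable; that is the general motivation for pairing MCMC with a surrogate model.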
| Title | Negative Muon Spectroscopy Data for Ag-Al-Au Alloys |
| Description | Negative muon spectroscopy data for Ag/Al/Au alloys. The data is generated by mixing elemental spectra of each of the species in randomly selected ratios. The underlying physical data was collected at the ISIS Neutron and Muon Source. The data is associated with the manuscript 'Enhancing Performance of Multilayer Perceptrons by Knot-Gathering Initialization'. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This dataset has been used in several training courses provided by us and the Scientific Computing Department at STFC. |
| URL | https://zenodo.org/doi/10.5281/zenodo.13862951 |
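The description of mixing elemental spectra in random ratios can be sketched as follows. This is a guess at the general shape of such a procedure, not the code used to build the dataset: the Gaussian peaks and their centres in `PEAK_CENTRES` are hypothetical placeholders for the measured elemental spectra, and drawing the ratios uniformly over all compositions that sum to one is an assumption.

```python
import math
import random

N_CHANNELS = 200
# Hypothetical peak positions; real elemental spectra would be measured data.
PEAK_CENTRES = {"Ag": 50.0, "Al": 100.0, "Au": 150.0}


def elemental_spectrum(centre, width=5.0):
    """Placeholder elemental spectrum: a single Gaussian peak."""
    return [math.exp(-0.5 * ((ch - centre) / width) ** 2) for ch in range(N_CHANNELS)]


ELEMENTAL = {el: elemental_spectrum(c) for el, c in PEAK_CENTRES.items()}


def random_ratios(rng):
    """Draw mixing ratios that sum to 1 (normalised exponential draws)."""
    draws = {el: rng.gammavariate(1.0, 1.0) for el in ELEMENTAL}
    total = sum(draws.values())
    return {el: d / total for el, d in draws.items()}


def mix_spectra(ratios):
    """Weighted sum of the elemental spectra, channel by channel."""
    return [
        sum(ratios[el] * ELEMENTAL[el][ch] for el in ELEMENTAL)
        for ch in range(N_CHANNELS)
    ]


def make_dataset(n_samples, seed=0):
    """Return a list of (mixed spectrum, ratios) pairs."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(n_samples):
        ratios = random_ratios(rng)
        dataset.append((mix_spectra(ratios), ratios))
    return dataset


dataset = make_dataset(100)
spectrum, ratios = dataset[0]
print("ratios of first sample:", ratios)
print("number of channels:", len(spectrum))
```

Each pair gives a model an input (the mixed spectrum) and a regression target (the composition), which is the kind of supervised task such a dataset lends itself to in a training course.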
| Title | Negative Muon Spectroscopy Data for Ag-Al-Au Alloys |
| Description | Negative muon spectroscopy data for Ag/Al/Au alloys. The data is generated by mixing elemental spectra of each of the species in randomly selected ratios. The underlying physical data was collected at the ISIS Neutron and Muon Source. The data is associated with the manuscript 'Enhancing Performance of Multilayer Perceptrons by Knot-Gathering Initialization'. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This dataset has been used in several training courses provided by us and the Scientific Computing Department at STFC. |
| URL | https://zenodo.org/doi/10.5281/zenodo.13862952 |
| Title | Negative Muon Spectroscopy Data for Ag-Al-Au Alloys |
| Description | Negative muon spectroscopy data for Ag/Al/Au alloys. The data is generated by mixing elemental spectra of each of the species in randomly selected ratios. The underlying physical data was collected at the ISIS Neutron and Muon Source. The data is associated with the manuscript 'Enhancing Performance of Multilayer Perceptrons by Knot-Gathering Initialization'. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This dataset has been used in several training courses provided by us and the Scientific Computing Department at STFC. |
| URL | https://zenodo.org/doi/10.5281/zenodo.14277749 |
| Description | Machine Learning School in IISc Bangalore |
| Organisation | Indian Institute of Science Bangalore |
| Country | India |
| Sector | Academic/University |
| PI Contribution | We developed the materials and delivered a series of lectures to approximately 50 students at IISc Bangalore. |
| Collaborator Contribution | IISc Bangalore provided the local organisation (venue, catering, registration, etc.) for the school. |
| Impact | We have developed a publicly accessible repository of lecture notes and interactive Python notebooks. We will also soon publish the lecture videos online. |
| Start Year | 2024 |
| Title | Transfer Learning for Materials Property Prediction |
| Description | This repository contains the pre-trained and fine-tuned atomistic line graph neural network (ALIGNN) architectures and the corresponding results for the calculations carried out as part of the research paper titled "Optimal pre-train/fine-tune strategies for accurate material property predictions". The paper has been published in npj Computational Materials. |
| Type Of Technology | Software |
| Year Produced | 2024 |
| Open Source License? | Yes |
| Impact | We are currently using this model to identify new materials for use as battery cathodes. |
| URL | https://github.com/sai-mat-group/transfer-learning-material-properties |
| Description | Machine Learning School in IISc Bangalore |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Postgraduate students |
| Results and Impact | Approximately 50 students from academia and industry across India attended the school. The school was intended to expose a wider battery research community to machine learning techniques and to how they can be applied to materials design. |
| Year(s) Of Engagement Activity | 2025 |
| URL | https://github.com/mdi-group/iisc-ml-school/ |
