Automated Chemical Structure Extraction

Lead Research Organisation: University of Nottingham
Department Name: Sch of Chemistry


Automating the extraction of chemical structures from literature sources such as PDF documents, journals and images into a machine-readable format.

This is designed to extract a wealth of data from a resource that currently has to be mined by hand, so that the data can be used at scale to tackle unoptimised processes within the chemical industry. It also offers a potential general solution to the currently unsolved reliability problem of converting images of molecular structures into other formats using image recognition. Automating access to this untapped resource opens opportunities for more data-driven optimisation of chemical reaction and mechanism steps.

Proposed solution and methodology

Use image recognition to automatically segment images from journal articles and other resources, collecting a large set of machine-readable instances of molecules. The currently proposed methodology is to use an end-to-end deep learning pipeline covering every stage from segmentation to conversion.
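
The two stages named above, segmentation followed by conversion, can be illustrated with a minimal sketch. This is not the project's pipeline: the segmentation step here is a classical connected-component search standing in for a learned model, the page is assumed to be an already binarised grid of 0s and 1s, and the names segment, crop, extract and recognise are hypothetical. The recognise argument is a placeholder for whatever model would turn a cropped depiction into a machine-readable string.

```python
from collections import deque
from typing import Callable, List, Tuple

Image = List[List[int]]          # assumed input: binarised page, 1 = ink, 0 = background
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), all inclusive


def segment(page: Image) -> List[Box]:
    """Return bounding boxes of 4-connected ink regions, in scan order.

    Classical stand-in for the learned segmentation stage.
    """
    rows = len(page)
    cols = len(page[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes: List[Box] = []
    for r in range(rows):
        for c in range(cols):
            if page[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited ink pixel.
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and page[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def crop(page: Image, box: Box) -> Image:
    """Cut the rectangular patch described by box out of the page."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in page[top:bottom + 1]]


def extract(page: Image, recognise: Callable[[Image], str]) -> List[str]:
    """Segment the page, then pass each patch to the conversion stage.

    recognise is a hypothetical placeholder for a trained
    image-to-structure model.
    """
    return [recognise(crop(page, box)) for box in segment(page)]
```

Keeping the conversion stage as a plain callable means the segmentation step can be exercised on its own, and either stage can later be swapped for a trained model without changing the other.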



Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S022236/1                                   01/10/2019  31/03/2028
2285004            Studentship   EP/S022236/1  01/10/2019  30/09/2023  Tevyn Allen