# Structure Determination from X-ray Scattering: Parameter extraction from cosmology for nanobiology

Lead Research Organisation:
University of Edinburgh

Department Name: Sch of Physics and Astronomy

### Abstract

Conventional medical examinations use the absorption of X-rays to yield information about the structure of biological tissues. The measurement of X-rays scattered in this process can, however, yield a richer wealth of structural information. Characterising molecular structures on very small scales - fractions of a micrometre - is now possible, and this can now be done repeatedly on short timescales, as short as micro- or milliseconds. This sort of experiment presents a number of challenges in analysis. One is how to characterise the structures in the first place. In typical cases the characterisation will be complex, with a fairly large number of components to be determined from the X-ray data. This can be very demanding computationally, and real-time processing is not generally feasible. The results of the experiment are then only determined after it has finished. A reasonable goal would be to have an analysis method fast enough that real-time processing is possible, opening up the possibility of taking further data immediately. One can certainly envisage many situations where manipulation of a sample or a change in conditions could be a great advantage. A second challenge is that the data collected may be of rather poor quality, in that the signal may be rather small in comparison with background 'noise'; indeed, in some circumstances one may wish to reduce the exposure to X-rays to avoid damage, in which case the data will become noisier. Picking out faint signals in these circumstances can be difficult. It is essential to know how small a signal can be reliably extracted from the data, as this might allow a smaller X-ray exposure, which for biological samples may be critical to avoid damage. Finally, the quantity of data to be analysed can lead to a bottleneck. Conceptually similar problems are encountered in astronomy, and are generally tackled via methods which are firmly rooted in Bayesian probability theory.
In cosmology in particular, researchers strive to extract as much information as possible from their data, which in many cases are substantial in size, very noisy, and may depend on a relatively large number of model parameters. Several techniques used in this field may be applied to the materials characterisation problem. These include very rapid methods to find: the best-fitting solution; how uncertain the solution is; whether the solution is unique or not. In addition to commonly-applied tools, the PI holds a patented algorithm called MOPED, which can do this sort of task extremely rapidly indeed. For some problems, acceleration by several orders of magnitude has been achieved. If MOPED is applicable to these problems, then real-time analysis could be within our reach.
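As an illustration of the kind of compression MOPED performs, the sketch below builds one weighting vector per model parameter from the derivatives of the expected data at a fiducial parameter choice, following the construction published for galaxy spectra. Everything specific here is an assumption made for illustration and is not taken from this project: the toy Guinier-type intensity model, the parameter names `amplitude` and `rg`, the noise level, and the function name `moped_vectors`. It also assumes Gaussian noise with a known covariance that does not depend on the parameters.

```python
import numpy as np

def moped_vectors(dmu, cov):
    """Build MOPED-style weighting vectors.

    dmu : (n_params, n_data) derivatives of the model mean with respect to
          each parameter, evaluated at a fiducial parameter choice.
    cov : (n_data, n_data) noise covariance, assumed parameter-independent.
    Returns B with shape (n_params, n_data); row m is the vector b_m.
    """
    cinv_dmu = np.linalg.solve(cov, dmu.T).T      # rows are C^{-1} dmu_m
    B = np.zeros_like(dmu, dtype=float)
    for m in range(dmu.shape[0]):
        proj = B[:m] @ dmu[m]                     # b_q . dmu_m for q < m
        vec = cinv_dmu[m] - proj @ B[:m]          # remove earlier directions
        norm2 = dmu[m] @ cinv_dmu[m] - np.sum(proj ** 2)
        B[m] = vec / np.sqrt(norm2)               # so that b_m^T C b_m = 1
    return B

# Toy forward model (an assumption): I(q) = amplitude * exp(-(q * rg)^2 / 3)
q = np.linspace(0.01, 0.5, 60)
amplitude, rg = 100.0, 3.0                        # fiducial values
shape = np.exp(-(q * rg) ** 2 / 3.0)
dmu = np.vstack([
    shape,                                        # dI/d(amplitude)
    amplitude * shape * (-2.0 * q ** 2 * rg / 3.0),  # dI/d(rg)
])
cov = 4.0 * np.eye(q.size)                        # independent noise, sigma = 2

B = moped_vectors(dmu, cov)

# Fisher information before and after compression to two numbers
fisher_full = dmu @ np.linalg.solve(cov, dmu.T)
M = B @ dmu.T                                     # M[m, a] = b_m . dmu_a
fisher_compressed = M.T @ M

# Compressing a data vector x is then a single matrix product: y = B @ x
```

In this toy setting the 60 measured intensities are reduced to two numbers, one per parameter, and the Fisher information at the fiducial point is unchanged. Any speed-up in practice comes from evaluating likelihoods on the short compressed vector rather than on the full data set.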

### Technical Summary

X-ray scattering can potentially yield rich information about the molecular structure of materials. Synchrotron sources now allow microdiffraction, probing scales of 100-300 nm. Furthermore, time-domain studies sampling micro- to millisecond changes are now possible. This wealth of data leads to a bottleneck in analysis - real-time analysis is virtually impossible and the success of the experiment is often known only after it has finished. We propose to adapt robust statistical methods of parameter estimation and model testing, which are used widely in cosmology, to improve the accuracy, speed and robustness of the analysis of X-ray scattering. There are various methods employed to model the materials. In each case the forward modelling can be done - i.e. for given structural components, the X-ray data can be predicted (give or take noise). Thus the problem is a classic inverse problem - what are the most likely structural parameters given the data? In astrophysics, Monte Carlo Markov Chains are now used fairly routinely for this sort of problem, allowing very rapid determination of the most likely solution, and quantifying errors in the given structural parameters. We also intend to use Bayesian evidence where possible to determine how many structural elements are required or demanded by the data. These problems are computationally very demanding because the parameter space is large. We intend to investigate the use of the patented MOPED algorithm to accelerate calculations. MOPED was developed for the analysis of galaxy spectra by the PI, and in that context offers acceleration by two orders of magnitude. Finally, if time allows, we intend to investigate alternative modelling of the data which may be more flexible than some existing methods, through Gaussian mixture models.
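The inverse problem described above can be sketched with a random-walk Metropolis sampler, one of the simplest Markov chain Monte Carlo methods. This is a minimal illustration under stated assumptions, not the project's analysis code: the two-parameter Guinier-type forward model, the parameter names `amplitude` and `rg`, the flat positive priors, the noise level and the proposal step sizes are all hypothetical choices made so the example is self-contained.

```python
import numpy as np

def forward_model(q, amplitude, rg):
    """Toy forward model (an assumption): predicted intensity at each q."""
    return amplitude * np.exp(-(q * rg) ** 2 / 3.0)

def log_posterior(theta, q, data, sigma):
    """Gaussian log-likelihood with flat priors on positive parameters."""
    amplitude, rg = theta
    if amplitude <= 0 or rg <= 0:
        return -np.inf
    resid = data - forward_model(q, amplitude, rg)
    return -0.5 * np.sum((resid / sigma) ** 2)

def metropolis(log_post, start, step, n_steps, rng):
    """Random-walk Metropolis sampler with Gaussian proposals."""
    current = np.asarray(start, dtype=float)
    current_lp = log_post(current)
    chain = np.empty((n_steps, current.size))
    for i in range(n_steps):
        proposal = current + step * rng.standard_normal(current.size)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio)
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain[i] = current
    return chain

# Simulate noisy data from known "true" parameters
rng = np.random.default_rng(0)
q = np.linspace(0.01, 0.5, 60)
true_theta = (100.0, 3.0)
sigma = 2.0
data = forward_model(q, *true_theta) + sigma * rng.standard_normal(q.size)

chain = metropolis(
    lambda t: log_posterior(t, q, data, sigma),
    start=(80.0, 2.0),
    step=np.array([0.5, 0.02]),
    n_steps=20000,
    rng=rng,
)
samples = chain[5000:]                 # discard burn-in
post_mean = samples.mean(axis=0)       # point estimate of the parameters
post_std = samples.std(axis=0)         # quantified uncertainty
```

The chain yields both a best estimate and an error bar for each structural parameter from the same set of samples. A realistic model with many structural components would need a more efficient sampler and convergence checks, which this sketch omits.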

## People

Alan Heavens (Principal Investigator)

Timothy Wess (Co-Investigator)