Large scale spatio-temporal point processes: novel machine learning methodologies and application to neural multi-electrode arrays.

Lead Research Organisation: University of Edinburgh
Department Name: Sch of Informatics

Abstract

Large scale spatio-temporal data sets are becoming increasingly available due to progress in data gathering technology. In this proposal we are concerned with event-based data: data points are spatial and temporal coordinates of an event, as opposed to analogue measurements of a variable. Such data is pervasive in a number of applications, ranging from epidemiology to social sciences, and poses considerable computational issues: the data is intrinsically high dimensional (indeed infinite dimensional if working in a continuous time framework) and nonlinear, and it is often a noisy observation of complex dynamical processes. Scalable data modelling solutions for this data type are urgently needed, and will require novel research in computational statistics and machine learning.
In this proposal, we will address fundamental methodological problems motivated by an application of great biomedical relevance: recordings of neural electrical activity by High Density Multi Electrode Arrays (HD-MEA). These are electronic chips with many (>1000) recording channels, which are used to measure electrical activity in a range of in vitro preparations, and enable simultaneous measurement of the spiking activity of thousands of neurons. This novel technology (commercially developed within the last five years) has the potential to enable scientists to answer fundamental questions on how neurons communicate with each other, and it has direct translational potential as an effective tool for testing in vitro the impact of drug treatment on neuronal function. Providing data modelling tools for such data is challenging: HD-MEA recordings are a prime example of big data (data rates of ~3 GB/minute) where complex behaviours rule out simple scalable models.
In this exciting multi-disciplinary project, we propose to treat an HD-MEA data set as a realisation of a spatio-temporal point process (a random set of points, i.e. neural spikes), and to use and develop techniques from Bayesian statistics and machine learning to infer salient dynamical properties of the biophysical process underlying the data. The major challenges to be addressed concern devising statistical machine learning methods that can accommodate non-linearities and scale to the large size of HD-MEA data, while still giving biologically meaningful insights. In particular, we will focus on determining from data the connectivity of the network of neurons, neuron-intrinsic dynamics, and how chemical stimuli (e.g. drugs administered to the culture) and other stimuli influence the electrical response and network properties of the culture. Addressing these challenges will entail considerable work on approximate Bayesian inference for large-scale spatio-temporal point processes, generating methodologies which will be general and applicable to many other domains of science and engineering.
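As a purely illustrative sketch of what treating spikes as a point process means in practice, the snippet below evaluates the standard log-likelihood of a single spike train under an inhomogeneous Poisson process with piecewise-constant intensity: the sum of the log intensity at each spike minus the integral of the intensity over the recording window. This is a textbook baseline chosen for illustration and is not the model developed in the project; the function name `poisson_process_loglik`, its arguments and the example numbers are all hypothetical.

```python
import math
from bisect import bisect_right


def poisson_process_loglik(spike_times, bin_edges, rates):
    """Log-likelihood of spike times under a piecewise-constant intensity.

    Assumption (illustrative only): the intensity equals rates[k] on the
    interval [bin_edges[k], bin_edges[k + 1]) and every rate is positive.
    """
    if len(bin_edges) != len(rates) + 1:
        raise ValueError("need exactly one rate per bin")

    # Integral of the intensity over the whole observation window.
    integral = sum(
        r * (right - left)
        for r, left, right in zip(rates, bin_edges[:-1], bin_edges[1:])
    )

    # Sum of log-intensities evaluated at the observed spike times.
    log_sum = 0.0
    for t in spike_times:
        if not bin_edges[0] <= t < bin_edges[-1]:
            raise ValueError("spike time outside the observation window")
        k = bisect_right(bin_edges, t) - 1  # index of the bin containing t
        log_sum += math.log(rates[k])

    return log_sum - integral


# Hypothetical example: three spikes in a 4-second window with two rate bins.
example = poisson_process_loglik([0.5, 2.5, 3.0], [0.0, 2.0, 4.0], [1.0, 3.0])
```

Models of the kind the proposal targets would replace the fixed rates with an intensity that depends on space, on the past activity of other neurons and on external stimuli, which is what makes inference at HD-MEA scale hard.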
The project brings together the machine learning and systems biology expertise of the PI and named RA and the neuroscience expertise of the Co-I, as well as strong collaborative ties with international experimental groups and industrial players, making the team of researchers ideally suited to tackle this challenging project.

Planned Impact

This project will achieve impact both through widely applicable methodological advances and through an important application of direct biomedical significance:
- academic beneficiaries will include researchers in computational statistics, machine learning and computational neuroscience, as well as people working in closely related disciplines and applications. The major channels for achieving such impact will be publications, attendance and presentation at conferences and workshops, and organisation of workshops.
- industrial beneficiaries will include the project's industrial partner, 3Brain GmbH, who will directly benefit from the analytical methodologies developed in the project, as well as companies interested in the collection and analysis of spatio-temporal data (a burgeoning field involving some major global players such as Google and Microsoft). In the longer term, the improved analytical tools will increase the usefulness of the HD-MEA technology in drug testing and medical research, leading to possible long-term impact on the pharma sector. Impact on the direct application will be achieved through the planned collaboration with the project partner. Broader methodological impact will be achieved through nurturing existing contacts (e.g. the applicant currently holds funding from Microsoft) and through presenting at machine learning conferences, which are usually well attended by industrial researchers.
- societal beneficiaries include biomedical practitioners and the general public, who in the long term are likely to benefit indirectly from the availability of strong drug prototyping technologies, as well as from the insights into fundamental neuroscience we are likely to obtain. Furthermore, this research uses mathematical tools to address fundamental questions on the workings of neural cells, which are likely to appeal to the general public in outreach efforts. We will pursue this avenue by participating in and instigating outreach activities within the School of Informatics and the University of Edinburgh, and more generally by exploiting high-profile Edinburgh-based events such as the science festival.
 
Description: The project proposed novel models to understand the spiking behaviour of neural cells from a data-driven perspective.
Exploitation Route: Models produced in this project might be useful in discovering patterns in neural data which can elucidate the responses of neurons to drugs or disease.
Sectors: Digital/Communication/Information Technologies (including Software), Healthcare, Pharmaceuticals and Medical Biotechnology