Scalable and Exact Data Science for Security and Location-based Data
Lead Research Organisation: Lancaster University
Department Name: Mathematics and Statistics
Abstract
Incredible technological advances in data collection and storage have created a world in which we are constantly generating data. From supermarket loyalty cards and social media posts to healthcare records and credit card transactions, a digital footprint exists for every aspect of our lives. The ability of data science to analyse and act upon these complex and varied data sources has the potential to improve and revolutionise our lives in a myriad of ways, for example, through the development of driverless cars and personalised medicine.
The great challenge of data science lies in the trade-off between the speed and accuracy with which large volumes of data can be analysed and acted upon within complex data environments. Extracting deeper knowledge from data requires increasingly sophisticated mathematical models. However, applying such models introduces significant computational constraints, forcing data scientists to rely upon simpler models or approximate inference tools.
In collaboration with strategic partners, this project will bring together industry experts to investigate new approaches to data science driven by fundamental challenges in modelling and analysing large-scale spatial and security data. The data and issues within this domain are highly significant to modern society: they cover, for example, fraud detection and computer hacking, as well as understanding and predicting human behaviour within a Smart City environment.
Novel mathematical advances in computational statistics and machine learning will be developed to produce scalable techniques for applying sophisticated mathematical models to large-scale heterogeneous and structured data sources. A key component of this project is reproducibility through the creation of open-source software. These tools will allow data scientists to implement research outcomes to extract key features from complex data and make decisions with high accuracy under uncertainty.
Planned Impact
This research agenda is designed to address the significant topical challenges of modern data science which impede its applicability within complex data environments. Through close engagement with industrial stakeholders, this research will produce a transformative approach to analysing large-scale heterogeneous data in the areas of spatio-temporal modelling and security & defence applications.
This project is supported by an impressive array of committed partners: Prowler.io, The Heilbronn Institute of Mathematical Research (HIMR) and The Alan Turing Institute (ATI), who provide significant expertise in the areas of security and spatio-temporal modelling. Through an integrative research programme with the project partners, key research outcomes will be tested and deployed on the data and systems owned by these partners, providing real-world verification of the applicability of research outputs.
Through the co-design and implementation of research objectives with project partners, the scalable data science tools created under this fellowship will contribute to the knowledge economy of the UK, by enabling researchers and practitioners to employ complex mathematical models to previously prohibitively high-dimensional data sources. Key engagements with HIMR will support the application of this research to address imperative national security challenges.
Open-source software will be developed from the research outcomes. This will support the far-reaching impact of this work beyond the academic community, providing tools for end-users to freely implement on a wide variety of data sources beyond the security and spatio-temporal domains. This will become part of the core toolbox for both public and private sector organisations seeking to fit complex models to large data.
Organisations
- Lancaster University (Fellow, Lead Research Organisation)
- Elsevier (Collaboration)
- Alan Turing Institute (Collaboration)
- Queensland University of Technology (QUT) (Collaboration)
- Prowler.io (Collaboration, Project Partner)
- Stanford University (Collaboration)
- The Alan Turing Institute (Project Partner)
- Heilbronn Institute for Mathematical Research (Project Partner)
Publications
- Coullon J (2022) SGMCMCJax: a lightweight JAX library for stochastic gradient Markov chain Monte Carlo algorithms in Journal of Open Source Software
- Coullon J (2021) Ensemble sampler for infinite-dimensional inverse problems in Statistics and Computing
- Coullon J (2022) Markov chain Monte Carlo for a hyperbolic Bayesian inverse problem in traffic flow modeling in Data-Centric Engineering
- Coullon J (2020) Ensemble sampler for infinite-dimensional inverse problems
- Coullon J (2023) Efficient and generalizable tuning strategies for stochastic gradient MCMC in Statistics and Computing
- Fairbrother J (2022) GaussianProcesses.jl: A Nonparametric Bayes Package for the Julia Language in Journal of Statistical Software
- Nemeth C (2018) Merging MCMC Subposteriors through Gaussian-Process Approximations in Bayesian Analysis
- Nemeth C (2019) Stochastic gradient Markov chain Monte Carlo
- Nemeth C (2021) Stochastic Gradient Markov Chain Monte Carlo in Journal of the American Statistical Association
- Nemeth C (2019) Pseudo-extended Markov chain Monte Carlo in Advances in Neural Information Processing Systems
Description | The original goal of this research was to create a suite of new algorithms that can handle large-scale data sources, in particular those with complex structures, such as network data. The software and publications generated from this work have advanced the field of scalable inference, and we have shown through this grant that algorithms such as stochastic gradient MCMC can be applied widely across many scientific disciplines to address the challenge that researchers face when trying to perform Bayesian inference with large datasets. Many real-world datasets either need to be subsampled or are stored across multiple databases, and this research has shown how it is practically possible to perform parameter estimation in these settings. |
Exploitation Route | The software developed during the grant should be maintained to ensure that researchers in other disciplines have the opportunity to utilise this work for their purposes. Further work is needed to collaborate with researchers in other fields to help them transition towards using these new inference tools. |
Sectors | Environment, Security and Diplomacy |
Description | This grant has led to significant developments in scalable inference for Bayesian models. The development of new algorithms, particularly in the area of stochastic gradient MCMC, has been very influential in the model fitting for Bayesian neural networks (BNNs). Stochastic gradient MCMC algorithms are now the default MCMC algorithm used for large-scale BNNs and are used by both academic and industry practitioners. The software developed as part of this grant has extended the reach of these algorithms and provided new tools for practitioners to use and build upon the algorithms developed through this grant. |
First Year Of Impact | 2022 |
Sector | Environment, Other |
Description | Detecting soil degradation and restoration through a novel coupled sensor and machine learning framework |
Amount | £934,689 (GBP) |
Funding ID | NE/T012307/1 |
Organisation | Natural Environment Research Council |
Sector | Public |
Country | United Kingdom |
Start | 01/2020 |
End | 12/2024 |
Description | Explainable AI for UK agricultural land use decision-making |
Amount | £43,151 (GBP) |
Funding ID | NE/T004002/1 |
Organisation | Natural Environment Research Council |
Sector | Public |
Country | United Kingdom |
Start | 12/2019 |
End | 07/2020 |
Description | Scalable Monte Carlo in the General Big Data Setting. |
Amount | £120,000 (GBP) |
Funding ID | 1949442 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 09/2017 |
End | 05/2022 |
Description | Turing AI Fellowship: Probabilistic Algorithms for Scalable and Computable Approaches to Learning (PASCAL) |
Amount | £1,097,294 (GBP) |
Funding ID | EP/V022636/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 01/2021 |
End | 12/2025 |
Title | Stochastic Gradient Markov Chain Monte Carlo |
Description | Markov chain Monte Carlo (MCMC) algorithms are generally regarded as the gold standard technique for Bayesian inference. They are theoretically well-understood and conceptually simple to apply in practice. The drawback of MCMC is that performing exact inference generally requires all of the data to be processed at each iteration of the algorithm. For large datasets, the computational cost of MCMC can be prohibitive, which has led to recent developments in scalable Monte Carlo algorithms that have a significantly lower computational cost than standard MCMC. In this article, we focus on a particular class of scalable Monte Carlo algorithms, stochastic gradient Markov chain Monte Carlo (SGMCMC), which utilizes data subsampling techniques to reduce the per-iteration cost of MCMC. We provide an introduction to some popular SGMCMC algorithms and review the supporting theoretical results, as well as comparing the efficiency of SGMCMC algorithms against MCMC on benchmark examples. The supporting R code is available online at https://github.com/chris-nemeth/sgmcmc-review-paper. |
Type Of Material | Database/Collection of data |
Year Produced | 2021 |
Provided To Others? | Yes |
URL | https://tandf.figshare.com/articles/dataset/Stochastic_gradient_Markov_chain_Monte_Carlo/13221999/3 |
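The data-subsampling idea described in the abstract above can be illustrated with stochastic gradient Langevin dynamics (SGLD), one widely used SGMCMC algorithm. The following is a minimal Python sketch, not the supporting R code from the article: the toy model (Gaussian observations with unknown mean and a Gaussian prior), the function name `sgld_gaussian_mean` and all default parameter values are assumptions chosen for illustration.

```python
import math
import random


def sgld_gaussian_mean(data, n_iter=5000, batch_size=32,
                       step_size=1e-4, prior_var=100.0, seed=0):
    """SGLD for the mean of N(theta, 1) data with a N(0, prior_var) prior.

    Hypothetical toy example: each iteration touches only `batch_size`
    observations rather than the full dataset.
    """
    rng = random.Random(seed)
    n_data = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_iter):
        # Subsample the data without replacement.
        batch = rng.sample(data, batch_size)
        # Gradient of the log prior.
        grad_prior = -theta / prior_var
        # Unbiased estimate of the full-data log-likelihood gradient:
        # rescale the minibatch sum by n_data / batch_size.
        grad_lik = (n_data / batch_size) * sum(x - theta for x in batch)
        grad = grad_prior + grad_lik
        # Langevin update: half-step along the gradient plus Gaussian noise.
        theta += 0.5 * step_size * grad + math.sqrt(step_size) * rng.gauss(0.0, 1.0)
        samples.append(theta)
    return samples
```

With a fixed step size the chain only approximates the posterior, so the step size trades bias against mixing speed; the value used here is an untuned assumption for a dataset of roughly one thousand points.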
Title | Stochastic Gradient Markov Chain Monte Carlo |
Description | Markov chain Monte Carlo (MCMC) algorithms are generally regarded as the gold standard technique for Bayesian inference. They are theoretically well-understood and conceptually simple to apply in practice. The drawback of MCMC is that performing exact inference generally requires all of the data to be processed at each iteration of the algorithm. For large datasets, the computational cost of MCMC can be prohibitive, which has led to recent developments in scalable Monte Carlo algorithms that have a significantly lower computational cost than standard MCMC. In this article, we focus on a particular class of scalable Monte Carlo algorithms, stochastic gradient Markov chain Monte Carlo (SGMCMC), which utilizes data subsampling techniques to reduce the per-iteration cost of MCMC. We provide an introduction to some popular SGMCMC algorithms and review the supporting theoretical results, as well as comparing the efficiency of SGMCMC algorithms against MCMC on benchmark examples. The supporting R code is available online at https://github.com/chris-nemeth/sgmcmc-review-paper. |
Type Of Material | Database/Collection of data |
Year Produced | 2021 |
Provided To Others? | Yes |
URL | https://tandf.figshare.com/articles/dataset/Stochastic_gradient_Markov_chain_Monte_Carlo/13221999 |
Title | Stochastic Gradient Markov Chain Monte Carlo |
Description | Markov chain Monte Carlo (MCMC) algorithms are generally regarded as the gold standard technique for Bayesian inference. They are theoretically well-understood and conceptually simple to apply in practice. The drawback of MCMC is that performing exact inference generally requires all of the data to be processed at each iteration of the algorithm. For large datasets, the computational cost of MCMC can be prohibitive, which has led to recent developments in scalable Monte Carlo algorithms that have a significantly lower computational cost than standard MCMC. In this article, we focus on a particular class of scalable Monte Carlo algorithms, stochastic gradient Markov chain Monte Carlo (SGMCMC), which utilizes data subsampling techniques to reduce the per-iteration cost of MCMC. We provide an introduction to some popular SGMCMC algorithms and review the supporting theoretical results, as well as comparing the efficiency of SGMCMC algorithms against MCMC on benchmark examples. The supporting R code is available online at https://github.com/chris-nemeth/sgmcmc-review-paper. |
Type Of Material | Database/Collection of data |
Year Produced | 2021 |
Provided To Others? | Yes |
URL | https://tandf.figshare.com/articles/dataset/Stochastic_gradient_Markov_chain_Monte_Carlo/13221999/2 |
Title | Stochastic gradient Markov chain Monte Carlo |
Description | Markov chain Monte Carlo (MCMC) algorithms are generally regarded as the gold standard technique for Bayesian inference. They are theoretically well-understood and conceptually simple to apply in practice. The drawback of MCMC is that performing exact inference generally requires all of the data to be processed at each iteration of the algorithm. For large data sets, the computational cost of MCMC can be prohibitive, which has led to recent developments in scalable Monte Carlo algorithms that have a significantly lower computational cost than standard MCMC. In this paper, we focus on a particular class of scalable Monte Carlo algorithms, stochastic gradient Markov chain Monte Carlo (SGMCMC), which utilises data subsampling techniques to reduce the per-iteration cost of MCMC. We provide an introduction to some popular SGMCMC algorithms and review the supporting theoretical results, as well as comparing the efficiency of SGMCMC algorithms against MCMC on benchmark examples. The supporting R code is available online. |
Type Of Material | Database/Collection of data |
Year Produced | 2020 |
Provided To Others? | Yes |
URL | https://tandf.figshare.com/articles/dataset/Stochastic_gradient_Markov_chain_Monte_Carlo/13221999/1 |
Description | Alan Turing Institute (ATI) collaboration |
Organisation | Alan Turing Institute |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | I worked with PI Chris and collaborators on writing, methods development and theory for a paper that has now been submitted to a top-tier statistical journal. |
Collaborator Contribution | The specific collaborators from the Alan Turing Institute involved in this project include Chris Oates, Toni Karvonen and Mark Girolami. Their contributions were in funding two research visits to the Alan Turing Institute and jointly working on writing the manuscript, methods development and theory with myself and PI Chris. |
Impact | This collaboration has resulted in a computational statistics (single discipline) research paper submitted to a top-tier journal in the area. |
Start Year | 2018 |
Description | Collaboration with Dr Leah South |
Organisation | Queensland University of Technology (QUT) |
Country | Australia |
Sector | Academic/University |
PI Contribution | My team and I are meeting with Dr South on a weekly basis to prepare a research project for academic publication. |
Collaborator Contribution | Dr Leah South is advising our project on the application of Stein's method within the context of stochastic gradient MCMC. |
Impact | Currently in development |
Start Year | 2020 |
Description | Data subsampling for scalable inference |
Organisation | Stanford University |
Country | United States |
Sector | Academic/University |
PI Contribution | Monte Carlo methods are often required to produce exact inference and to evaluate models in the Bayesian setting. These algorithms are widely implemented by scientists and industrial practitioners, due to their versatility and strong theoretical properties. Unfortunately, standard Monte Carlo algorithms are ill-suited for conducting inference on large datasets. This is because they require an evaluation over the full data at each iteration, leading to a computational cost that increases (at the very least) proportionally with the data size. These issues have prompted considerable interest amongst the machine learning and statistics communities in developing Bayesian inference methods that scale easily with the size of the data. The project has developed new scalable Markov chain Monte Carlo (MCMC) algorithms based on stochastic gradient MCMC. In particular, we have developed new techniques for modelling temporally-varying data and new ways to subsample data optimally, which lead to lower-variance stochastic gradient estimates. |
Collaborator Contribution | This project has been in collaboration with Prof Emily Fox (formerly of the University of Washington). Prof Fox is a world leader in statistical machine learning and her expertise has been invaluable in the development of scalable MCMC techniques in the temporally-evolving setting. |
Impact | Two publications were produced as a result of this collaboration. One paper has been accepted for publication in AISTATS and a second publication is currently under review. |
Start Year | 2018 |
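To make the link between the subsampling scheme and the variance of a stochastic gradient concrete, the sketch below compares uniform subsampling with subsampling weighted by gradient magnitude. It is a generic importance-sampling illustration in Python under stated assumptions, not the scheme developed in this collaboration: the function names are hypothetical, gradients are scalars, and the weights are computed from the exact per-datum gradients, which would normally be unavailable and have to be approximated.

```python
import random


def magnitude_weights(per_datum_grads):
    """Sampling probabilities proportional to |gradient| (illustrative only)."""
    total = sum(abs(g) for g in per_datum_grads)
    return [abs(g) / total for g in per_datum_grads]


def subsampled_gradient(per_datum_grads, probs, batch_size, rng):
    """Unbiased estimate of sum(per_datum_grads) from a weighted subsample.

    Index i is drawn with probability probs[i] (probs must sum to one), and
    each sampled term is reweighted by 1 / probs[i] to keep the estimate unbiased.
    """
    idx = rng.choices(range(len(per_datum_grads)), weights=probs, k=batch_size)
    return sum(per_datum_grads[i] / probs[i] for i in idx) / batch_size


def estimator_variance(per_datum_grads, probs, batch_size, n_rep=2000, seed=0):
    """Empirical variance of the subsampled gradient over repeated draws."""
    rng = random.Random(seed)
    draws = [subsampled_gradient(per_datum_grads, probs, batch_size, rng)
             for _ in range(n_rep)]
    mean = sum(draws) / n_rep
    return sum((d - mean) ** 2 for d in draws) / (n_rep - 1)
```

Passing `[1 / N] * N` as `probs` recovers ordinary uniform subsampling, so the same function can be used to compare the two schemes on a given set of per-datum gradients.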
Description | Methods for multimodal sampling |
Organisation | PROWLER.io |
Country | United Kingdom |
Sector | Private |
PI Contribution | Developed a new algorithm for sampling from multimodal posterior distributions |
Collaborator Contribution | Provided new insights and developed software |
Impact | A paper on this work was published in a top AI conference - http://papers.nips.cc/paper/8683-pseudo-extended-markov-chain-monte-carlo |
Start Year | 2018 |
Description | Statistical analysis of multiple interaction data |
Organisation | Elsevier |
Department | Elsevier UK |
Country | United Kingdom |
Sector | Private |
PI Contribution | Elsevier provides various online services and tools for researchers, such as Mendeley and ScienceDirect, and is interested in the problem of user segmentation: understanding who its users are and how they interact with its platforms. Our goal is to develop novel methodologies to assist with this task. Of particular interest is the analysis of clickstream data, which contains information regarding visits of users to Elsevier webpages. The data have two key properties that we are leveraging: they are both intermittent and bursty, with cascades of clicks in quick succession followed by periods of inactivity. This has provided a means to interpret them as network data. Using the intermittent and bursty properties of these data, we are able to partition a single user's data into a sequence of paths over webpages. This represents an instance of a so-called interaction network, where one observes interactions amongst entities over time (here entities=webpages and interactions=paths). This differs subtly from the case where relations amongst entities are observed explicitly, such as in traditional social network data, and has led to recent work in the literature on new models. |
Collaborator Contribution | Elsevier has provided data and domain expertise that has assisted in our analysis. We have had regular meetings with Elsevier staff and visits to their offices. These interactions have been invaluable to making progress on this project. |
Impact | Two publications on this work are currently in submission |
Start Year | 2019 |
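The partitioning of one user's clickstream into paths, as described in this record, can be sketched as splitting the click sequence wherever a long period of inactivity occurs. This Python example is only an assumption about how such a split might look: the function name `split_into_paths`, the `(timestamp, page)` input format and the 30-minute gap threshold are all hypothetical and are not taken from the project.

```python
def split_into_paths(clicks, max_gap_seconds=1800):
    """Split one user's clicks into paths separated by inactivity.

    `clicks` is a list of (timestamp_in_seconds, page) pairs sorted by time.
    A new path starts whenever the gap since the previous click exceeds
    `max_gap_seconds` (a hypothetical threshold).
    """
    paths = []
    current = []
    last_time = None
    for timestamp, page in clicks:
        # A long silence closes the current burst of clicks.
        if last_time is not None and timestamp - last_time > max_gap_seconds:
            paths.append(current)
            current = []
        current.append(page)
        last_time = timestamp
    if current:
        paths.append(current)
    return paths
```

Each returned path is an ordered list of webpages, so a user's history becomes a sequence of paths, matching the interaction-network view in which webpages are the entities and paths are the interactions.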
Description | Statistical network modelling for populations of networks |
Organisation | Elsevier |
Department | Elsevier UK |
Country | United Kingdom |
Sector | Private |
PI Contribution | Developing a tool to cluster researchers who use Elsevier's platforms. |
Collaborator Contribution | Elsevier has provided data and technical expertise which has allowed us to make methodological developments on this project. |
Impact | Ongoing |
Start Year | 2019 |
Title | GaussianProcesses.jl |
Description | Gaussian processes are a family of stochastic processes which provide a flexible nonparametric tool for modelling data. A Gaussian Process places a prior over functions, and can be described as an infinite dimensional generalisation of a multivariate Normal distribution. Moreover, the joint distribution of any finite collection of points is a multivariate Normal. This process can be fully characterised by its mean and covariance functions, where the mean of any point in the process is described by the mean function and the covariance between any two observations is specified by the kernel. Given a set of observed real-valued points over a space, the Gaussian Process is used to make inference on the values at the remaining points in the space. This package allows the user to fit exact Gaussian process models when the observations are Gaussian distributed about the latent function. In the case where the observations are non-Gaussian, the posterior distribution of the latent function is intractable. The package allows for Monte Carlo sampling from the posterior. |
Type Of Technology | Software |
Year Produced | 2015 |
Open Source License? | Yes |
Impact | This software is widely used by the Julia community. |
URL | https://github.com/STOR-i/GaussianProcesses.jl |
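For the Gaussian-observation case described above, the posterior over the latent function is available in closed form. The sketch below shows the standard calculation at a single new input using only the Python standard library. It does not use or reproduce the GaussianProcesses.jl API (which is a Julia package): the zero mean function, the squared-exponential kernel, the one-dimensional inputs and every function name are assumptions made for illustration.

```python
import math


def sq_exp_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return variance * math.exp(-0.5 * ((x1 - x2) / lengthscale) ** 2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def solve_cholesky(low, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def gp_predict(x_train, y_train, x_new, noise_var=1e-2):
    """Posterior mean and variance of the latent function at x_new.

    Assumes a zero prior mean and Gaussian observation noise with
    variance `noise_var`.
    """
    n = len(x_train)
    # Covariance of the noisy observations: kernel matrix plus noise on the diagonal.
    k = [[sq_exp_kernel(x_train[i], x_train[j]) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    low = cholesky(k)
    k_star = [sq_exp_kernel(x_new, xi) for xi in x_train]
    alpha = solve_cholesky(low, y_train)
    mean = sum(ks * a for ks, a in zip(k_star, alpha))
    v = solve_cholesky(low, k_star)
    var = sq_exp_kernel(x_new, x_new) - sum(ks * vi for ks, vi in zip(k_star, v))
    return mean, var
```

Far from the training inputs the prediction reverts to the prior (mean zero, variance equal to the kernel variance), while near the observations the posterior variance shrinks towards the noise level.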
Title | SGMCMC R package |
Description | This software implements a host of stochastic gradient MCMC algorithms for fast Bayesian inference. It has been developed for the R language and is built upon the Google TensorFlow library. Utilising the efficient computation of TensorFlow and, in particular, its automatic differentiation tools, this software is the first R package which provides a simple user interface for statisticians to use gradient-based MCMC algorithms, without requiring the gradients to be hand-coded. |
Type Of Technology | Software |
Year Produced | 2018 |
Open Source License? | Yes |
Impact | The software has only recently been released and is yet to achieve its full potential. However, several papers have already cited this software in their work, indicating that it is being used within the community. |
URL | https://github.com/STOR-i/sgmcmc |
Title | SGMCMCJax |
Description | The software provides a toolbox of algorithms for stochastic gradient Markov chain Monte Carlo (MCMC). The package builds on the Jax library to offer users automatic differentiation tools that can be used to create gradient-based MCMC samplers. |
Type Of Technology | Software |
Year Produced | 2021 |
Open Source License? | Yes |
Impact | The software has been used in publications and is part of a new book on probabilistic machine learning written by Kevin Murphy. |
URL | https://github.com/jeremiecoullon/SGMCMCJax |
Title | ZVCV R package |
Description | This R package can be used to implement gradient-based variance reduction techniques, including a method that we developed as part of the grant. The package is on the main R package repository (CRAN) and on GitHub, with the updated version on GitHub to be sent to CRAN in the next month or so. |
Type Of Technology | Software |
Year Produced | 2020 |
Open Source License? | Yes |
Impact | This is the only R package for gradient-based variance reduction techniques that I'm aware of. It has been downloaded over 3500 times. |
URL | https://github.com/LeahPrice/ZVCV |
Description | Alan Turing Institute reading group presentation |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Other audiences |
Results and Impact | I gave a tutorial on a parametric alternative to approximate Bayesian computation to the reading group. This sparked more interest in the approach and its theoretical properties. |
Year(s) Of Engagement Activity | 2019 |
Description | CEEDS Seminar: Machine Learning for the Natural Environment |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Other audiences |
Results and Impact | The presentation was designed as an introductory tutorial into machine learning techniques. The focus of the talk was to educate environmental scientists on the available machine learning tools that are applicable for analysing environmental data. |
Year(s) Of Engagement Activity | 2020 |
URL | https://ceeds.ac.uk/blogs/ceeds-seminar-machine-learning-natural-environment |
Description | Invited talk at Bayes4Health & CoSInES workshop |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | My talk on derivative-based variance reduction sparked a discussion with the first author of a Journal of the Royal Statistical Society read paper, which led to us submitting a comment on how methods from the talk could be used in their novel application. |
Year(s) Of Engagement Activity | 2019 |
Description | Newcastle University Seminar |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Other audiences |
Results and Impact | I gave a talk to the Newcastle University statistics department, which sparked interesting discussion about links to related fields. |
Year(s) Of Engagement Activity | 2020 |
Description | Poster presentation at BayesComp2020 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | I presented a poster on derivative-based variance reduction. I spoke with several people about the work, including one person who was interested in applying the proposed methods to his high-dimensional application. |
Year(s) Of Engagement Activity | 2020 |
URL | http://users.stat.ufl.edu/~jhobert/BayesComp2020/Conf_Website/ |
Description | Presentation at the Royal Statistical Society |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Gave a presentation at an RSS workshop on Bayesian computation for Stein's method. |
Year(s) Of Engagement Activity | 2021 |
URL | https://rss.org.uk/training-events/events/events-2021/sections/rss-applied-probability-and-computati... |
Description | Research presentation at Nottingham University |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Postgraduate students |
Results and Impact | Invited to give a seminar at Nottingham University on scalable MCMC algorithms. |
Year(s) Of Engagement Activity | 2020 |
Description | STORi conference 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Postgraduate students |
Results and Impact | Presented a talk at the annual conference for my CDT. This was targeted at current students and partners of the CDT. The conference was designed to provide an opportunity for those involved to get a sense of the wide variety of research being undertaken at my CDT. |
Year(s) Of Engagement Activity | 2019 |
Description | Seminar (University of Manchester) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Professional Practitioners |
Results and Impact | Seminar for the University of Manchester statistics, quantification of uncertainty, inverse problems and data science group. Debate and discussion on multimodal methods for Markov chain Monte Carlo algorithms. |
Year(s) Of Engagement Activity | 2019 |
Description | Seminar (University of Oslo) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Professional Practitioners |
Results and Impact | A seminar on scalable Markov chain Monte Carlo algorithms was given to the Statistics Department at the University of Oslo. Several interesting discussions and a new collaboration stemmed from this talk. |
Year(s) Of Engagement Activity | 2018 |
Description | Seminar (University of Oxford) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Professional Practitioners |
Results and Impact | A seminar was given to the University of Oxford Statistics Department on pseudo-extended MCMC methods. Many interesting questions and discussions followed on from this meeting. |
Year(s) Of Engagement Activity | 2019 |
Description | Seminar, Bocconi University |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | The PI gave a seminar at Bocconi University, Milan, Italy. The seminar was attended by approximately 30 people, including academic staff, PhD students and research associates. The talk covered research outputs from this grant pertaining to scalable Monte Carlo inference. Several members of the audience were interested in learning more about this work and incorporating these techniques in their research. |
Year(s) Of Engagement Activity | 2018 |
URL | http://didattica.unibocconi.eu/eventi/event.php?IdPag=5575&dip=55&id=5735&IdFld=265&See= |
Description | Talk at CMStatistics |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | I spoke in an invited session about a novel sequential Monte Carlo method. I received several questions, and one participant working in an applied area was interested in using the method in their own application. |
Year(s) Of Engagement Activity | 2019 |
URL | http://cmstatistics.org/RegistrationsV2/CFE2019/viewSubmission.php?in=401&token=30n7nssqr596820r98p4... |
Description | Talk at Imperial College London |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Professional Practitioners |
Results and Impact | This talk was given to approximately 30 people at a seminar for the Statistics group at Imperial College. The talk covered stochastic gradient MCMC methods and how standard methods are inefficient without utilising control variate approaches. |
Year(s) Of Engagement Activity | 2022 |
Description | Talk at Leeds University |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Postgraduate students |
Results and Impact | A research talk was given to the Mathematics department at the University of Leeds. The talk focused on data science for environmental science challenges, and in particular on how the environmental data scientists at Leeds could collaborate further with Lancaster University. |
Year(s) Of Engagement Activity | 2022 |
Description | Talk at Monte Carlo methods (MCM) 2019 conference |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | I gave a talk in an invited session about a novel sequential Monte Carlo method. It sparked interest from a researcher who has developed a new sampler and was interested in applying it within our method. |
Year(s) Of Engagement Activity | 2019 |
URL | http://www.mcm2019.unsw.edu.au/FinalProgram-rotated.pdf |
Description | Talk at an RSS workshop (Reading University) |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | Chris Nemeth was invited to give a talk on MCMC methods at the Royal Statistical Society Reading local group event on Bayesian computation. |
Year(s) Of Engagement Activity | 2018 |
Description | Talk at the conference of the International Society for Bayesian Analysis |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | This talk covered the recent developments in scalable Markov chain Monte Carlo and many of the pitfalls that exist with current methods. The audience was international and mostly university academics. |
Year(s) Of Engagement Activity | 2022 |
Description | University of New South Wales seminar |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Postgraduate students |
Results and Impact | I gave a seminar on derivative-based control variates to the UNSW statistics department and to postdocs from across Australia, who were attending an annual retreat. After the seminar, I had extensive discussions about the research with a few researchers who were working in related areas. |
Year(s) Of Engagement Activity | 2019 |
URL | http://www.maths.unsw.edu.au/seminars/archive/annual/2019?page=8 |
Description | University of Warwick seminar |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Other audiences |
Results and Impact | I was invited to give a seminar at the University of Warwick, where I spoke about gradient-based control variates. Several people were interested in discussing the methods in more detail after the talk. |
Year(s) Of Engagement Activity | 2019 |
URL | http://warwick.ac.uk/fac/sci/statistics/news/algorithms-seminars/2018-19/ |