The case time series design: a new tool for big data analysis

Lead Research Organisation: London School of Hygiene and Tropical Medicine
Department Name: Public Health and Policy

Abstract

Biomedical research has been transformed by the recent development of big data technologies. For instance, the collection of health records in linked electronic databases provides information on demographics, health events, medications, and lifestyle factors for large samples of patients. Similarly, portable devices such as mobile phones make it possible to recruit large numbers of participants and to collect real-time, geo-located individual-level measurements. While these resources offer the possibility to answer research questions that could not be feasibly addressed using traditional studies, they require innovative analytical approaches.
This project helps to address this issue through the development of a novel analytical method called the case time series design. This design combines features of existing approaches to generate a more adaptable tool, particularly well suited to the analysis of highly informative big data resources. The case time series design is applicable in different research areas for investigating health effects associated with various risk factors, such as environmental exposures, clinical conditions, or drug use. The research proposal is structured in a detailed work plan that includes the methodological development of the case time series design and examples that illustrate its application in various areas of biomedical research.

Technical Summary

Big data technologies are transforming the landscape of biomedical research. Epidemiological investigations can now rely on repeated measurements collected at individual level in large populations, linked with a wealth of data on personal characteristics and various risk factors. These resources offer the possibility to answer research questions that could not be feasibly addressed using traditional studies. However, this more complex big data setting presents important methodological challenges, and it requires innovative analytical approaches to model complex longitudinal associations between repeated measures of health outcomes and time-varying exposures using observational data.
This project helps to address this issue through the development of a novel analytical tool called the case time series design. This design offers an adaptable framework that combines the individual-level setting and ability to control for confounders of case-only methods, such as the self-controlled case series, with the flexibility and temporal structure of time series models. It represents a general tool, applicable in different research areas for investigating associations with environmental exposures, clinical conditions, or drug use. The case time series design is suitable for the analysis of highly informative big data resources, particularly those providing individual profiles with longitudinal measures of health outcomes and time-varying predictors.
The research proposal is structured in a detailed work plan. This includes the definition of the design setting and assumptions of the case time series, the description of the statistical framework, and simulation studies to validate the methodology. The novel design is illustrated in four case studies that demonstrate its application in various research areas within biomedical research, such as environmental, clinical, and pharmaco-epidemiology.
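As a rough illustration of the idea summarised above (individual-level series analysed within subject, so that stable subject characteristics cancel out), the following minimal sketch simulates daily counts for a set of subjects and estimates the effect of a time-varying exposure. It is not the project's implementation. The conditional Poisson model, the subject-by-period strata, the simulated data, and all function and parameter names (rpois, simulate, fit_conditional_poisson) are assumptions made purely for illustration.

```python
import math
import random
from collections import defaultdict


def rpois(lam, rng):
    """Draw one Poisson variate with mean lam (Knuth's method, fine for small lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate(n_subjects=200, n_days=112, beta=0.5, seed=1):
    """Simulate daily event counts per subject (all values are hypothetical)."""
    rng = random.Random(seed)
    strata, y, x = [], [], []
    for i in range(n_subjects):
        baseline = rng.uniform(0.1, 1.0)   # subject-specific rate, removed by conditioning
        for day in range(n_days):
            exposure = rng.random()        # time-varying exposure
            strata.append((i, day // 28))  # assumed subject-by-period stratum
            x.append(exposure)
            y.append(rpois(baseline * math.exp(beta * exposure), rng))
    return strata, y, x


def fit_conditional_poisson(strata, y, x, n_iter=25, tol=1e-10):
    """Estimate a single log rate ratio by Newton's method.

    Conditioning on the total count in each stratum eliminates the
    stratum-specific baseline and leaves a multinomial likelihood.
    """
    groups = defaultdict(list)
    for s, yi, xi in zip(strata, y, x):
        groups[s].append((yi, xi))
    beta = 0.0
    for _ in range(n_iter):
        score, info = 0.0, 0.0
        for obs in groups.values():
            total = sum(yi for yi, _ in obs)
            if total == 0:
                continue  # strata without events carry no information
            w = [math.exp(beta * xi) for _, xi in obs]
            sw = sum(w)
            mean_x = sum(wi * xi for wi, (_, xi) in zip(w, obs)) / sw
            mean_x2 = sum(wi * xi * xi for wi, (_, xi) in zip(w, obs)) / sw
            score += sum(yi * xi for yi, xi in obs) - total * mean_x
            info += total * (mean_x2 - mean_x ** 2)
        if info == 0.0:
            break  # no within-stratum exposure variation to learn from
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta


if __name__ == "__main__":
    strata, y, x = simulate()
    print("estimated log rate ratio:", fit_conditional_poisson(strata, y, x))
```

In this sketch the subject-specific baseline rates are never estimated: comparisons are made only within each stratum, which is the sense in which a case-only approach controls for time-invariant confounders. A real analysis would typically add flexible terms for time trends and lagged exposure effects.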

Planned Impact

There are several reasons to expect a strong impact from the proposed project. Specifically:
- A detailed publication plan of peer-reviewed contributions illustrating methodological developments and substantive analyses, targeting high-impact journals selected to maximize impact and widen the potential audience, with a preference for open access options;
- The flexibility and generality of the case time series methodology, applicable in a variety of settings and research areas, as demonstrated by the four case studies;
- The collaboration network, which includes researchers with established experience in various research fields and strong expertise in both methodological development and applications in real-data analyses;
- The implementation of the analytical techniques in freely available statistical programs, which will facilitate the independent application by other research teams;
- The provision of tutorials, software documentation and other material about the use of these novel analytical tools;
- The multidisciplinary background and track record of the applicant and his research team, and their experience in developing high-impact analytical techniques and in implementing them in well-documented software.

Publications

Armstrong BG (2020) Sample size issues in time series regressions of counts on environmental exposures. in BMC medical research methodology

Yu J (2020) Seasonality of suicide: a multi-country multi-community observational study. in Epidemiology and psychiatric sciences

 
Description Modern methods for time series analysis
Geographic Reach Multiple continents/international 
Policy Influence Type Influenced training of practitioners or researchers
 
Description Temperature, climate change and health
Geographic Reach Multiple continents/international 
Policy Influence Type Influenced training of practitioners or researchers
 
Description Use of DLNMs in temperature-health studies
Geographic Reach Multiple continents/international 
Policy Influence Type Influenced training of practitioners or researchers
URL http://www.ncbi.nlm.nih.gov/sites/myncbi/collections/public/10kxh85C77hGm5PFkch6qfQ/
 
Title Personal website 
Description The website provides access to the outputs of my research, such as pdf versions and supplemental material of the published papers, summaries and updates of my research activity, and other information. In particular, scripts and data for reproducing the results of methodological or substantive papers are made available through the website.
Type Of Material Improvements to research infrastructure 
Year Produced 2012 
Provided To Others? Yes  
Impact The website is visited by 5-10 visitors each day. They download materials such as articles, scripts and data. 
URL http://www.ag-myresearch.com/
 
Title R package mixmeta
Description The package contains functions and data examples for running extended meta-analytical models. It is provided within the free software R and can be downloaded from the internet through the R program.
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? Yes  
Impact The software implementation in a free program facilitates the use of extended meta-analytical techniques in the research community. Also, the package, developed in parallel with the statistical framework, offers a vehicle to promote the methodology and its use among non-statisticians.
URL http://cran.r-project.org/web/packages/mixmeta/index.html
 
Title R package mvmeta 
Description The package contains functions and data examples for running univariate or multivariate meta-analysis and meta-regression. It is provided within the free software R and can be downloaded from the internet through the R program.
Type Of Material Improvements to research infrastructure 
Year Produced 2011 
Provided To Others? Yes  
Impact The software implementation in a free program facilitates the use of multivariate meta-analytical techniques in the research community. Also, the package, developed in parallel with the statistical framework, offers a vehicle to promote the methodology and its use among non-statisticians.
URL http://cran.r-project.org/web/packages/mvmeta/index.html
 
Title Personal website 
Description The website provides access to the outputs of my research, such as pdf versions and supplemental material of the published papers, summaries and updates of my research activity, and other information. In particular, scripts and data for reproducing the results of methodological or substantive papers are made available through the website.
Type Of Material Database/Collection of data 
Year Produced 2012 
Provided To Others? Yes  
Impact The website is visited by 5-10 visitors each day. They download materials such as articles, scripts and data. 
URL http://www.ag-myresearch.com/
 
Title R package mixmeta 
Description The package contains functions and data examples for running extended meta-analytical models. It is provided within the free software R and can be downloaded from the internet through the R program.
Type Of Material Computer model/algorithm 
Year Produced 2019 
Provided To Others? Yes  
Impact The software implementation in a free program facilitates the use of extended meta-analytical techniques in the research community. Also, the package, developed in parallel with the statistical framework, offers a vehicle to promote the methodology and its use among non-statisticians. Its impact on the application of meta-analytical approaches is shown by the use of the technique in several peer-reviewed articles by different research groups: http://www.ncbi.nlm.nih.gov/sites/myncbi/collections/public/10kxh85C77hGm5PFkch6qfQ/
URL http://cran.r-project.org/web/packages/mixmeta/index.html
 
Title R package mvmeta 
Description The package contains functions and data examples for running univariate or multivariate meta-analysis and meta-regression. It is provided within the free software R and can be downloaded from the internet through the R program.
Type Of Material Computer model/algorithm 
Year Produced 2011 
Provided To Others? Yes  
Impact The software implementation in a free program facilitates the use of multivariate meta-analytical techniques in the research community. Also, the package, developed in parallel with the statistical framework, offers a vehicle to promote the methodology and its use among non-statisticians. Its impact on the application of meta-analytical approaches is shown by the use of the technique in several peer-reviewed articles by different research groups: http://www.ncbi.nlm.nih.gov/sites/myncbi/collections/public/10kxh85C77hGm5PFkch6qfQ/
URL http://cran.r-project.org/web/packages/mvmeta/index.html
 
Description Methodological work on distributed lag linear and non-linear models 
Organisation London School of Hygiene and Tropical Medicine (LSHTM)
Country United Kingdom 
Sector Academic/University 
PI Contribution Leading the development of the methodological research, leading the writing of peer-reviewed publications
Collaborator Contribution Technical contribution on statistical methods and software programs, intellectual contribution to publications
Impact Peer-reviewed publications in international journals and oral presentations in international congresses
Start Year 2014
 
Description Methodological work on distributed lag linear and non-linear models 
Organisation Ludwig Maximilian University of Munich (LMU Munich)
Department Department of Neurology
Country Germany 
Sector Academic/University 
PI Contribution Leading the development of the methodological research, leading the writing of peer-reviewed publications
Collaborator Contribution Technical contribution on statistical methods and software programs, intellectual contribution to publications
Impact Peer-reviewed publications in international journals and oral presentations in international congresses
Start Year 2014
 
Description Modelling health effects of environmental exposures 
Organisation Children's Hospital of Philadelphia
Department Center for Pediatric Clinical Effectiveness
Country United States 
Sector Hospitals 
PI Contribution Technical contribution on statistical methods and software programs, intellectual contribution to publications
Collaborator Contribution Leading the development of the methodological research, leading the writing of peer-reviewed publications
Impact Peer-reviewed publications in international journals and oral presentations in international congresses
Start Year 2012
 
Description Modelling health effects of environmental exposures 
Organisation Columbia University
Department Department of Environmental Health Sciences
Country United States 
Sector Academic/University 
PI Contribution Technical contribution on statistical methods and software programs, intellectual contribution to publications
Collaborator Contribution Leading the development of the methodological research, leading the writing of peer-reviewed publications
Impact Peer-reviewed publications in international journals and oral presentations in international congresses
Start Year 2012
 
Description Modelling health effects of environmental exposures 
Organisation Harvard University
Department Harvard T.H. Chan School of Public Health
Country United States 
Sector Academic/University 
PI Contribution Technical contribution on statistical methods and software programs, intellectual contribution to publications
Collaborator Contribution Leading the development of the methodological research, leading the writing of peer-reviewed publications
Impact Peer-reviewed publications in international journals and oral presentations in international congresses
Start Year 2012
 
Description Modelling health effects of environmental exposures 
Organisation Public Health Agency of Canada
Department Canada Prenatal Nutrition Program
Country Canada 
Sector Public 
PI Contribution Technical contribution on statistical methods and software programs, intellectual contribution to publications
Collaborator Contribution Leading the development of the methodological research, leading the writing of peer-reviewed publications
Impact Peer-reviewed publications in international journals and oral presentations in international congresses
Start Year 2012
 
Description Modelling health effects of environmental exposures 
Organisation University of Hasselt
Department Centre for Environmental Sciences
Country Belgium 
Sector Academic/University 
PI Contribution Technical contribution on statistical methods and software programs, intellectual contribution to publications
Collaborator Contribution Leading the development of the methodological research, leading the writing of peer-reviewed publications
Impact Peer-reviewed publications in international journals and oral presentations in international congresses
Start Year 2012
 
Description Modelling health effects of environmental exposures 
Organisation University of Leuven
Department Department of Public Health and Primary Care
Country Belgium 
Sector Academic/University 
PI Contribution Technical contribution on statistical methods and software programs, intellectual contribution to publications
Collaborator Contribution Leading the development of the methodological research, leading the writing of peer-reviewed publications
Impact Peer-reviewed publications in international journals and oral presentations in international congresses
Start Year 2012
 
Description Modelling health effects of environmental exposures 
Organisation University of Ottawa
Country Canada 
Sector Academic/University 
PI Contribution Technical contribution on statistical methods and software programs, intellectual contribution to publications
Collaborator Contribution Leading the development of the methodological research, leading the writing of peer-reviewed publications
Impact Peer-reviewed publications in international journals and oral presentations in international congresses
Start Year 2012
 
Description Multi-Country Multi-City (MCC) Collaborative Research Network 
Organisation Harvard University
Department Harvard T.H. Chan School of Public Health
Country United States 
Sector Academic/University 
PI Contribution I have established and currently coordinate an international collaboration of more than 80 researchers from more than 40 countries, working on a program that aims to produce epidemiological evidence on associations between environmental stressors, climate, and health (http://mccstudy.lshtm.ac.uk/). The list of partners is long: see http://mccstudy.lshtm.ac.uk/participants/.
Collaborator Contribution It is a collaborative network that has already produced important research outputs (http://mccstudy.lshtm.ac.uk/publications/).
Impact http://mccstudy.lshtm.ac.uk/publications/
Start Year 2013
 
Description Spatio-temporal modelling of environmental exposures 
Organisation Ben-Gurion University of the Negev
Country Israel 
Sector Academic/University 
PI Contribution Established the collaboration with several experts and institutions for the collection of data resources and development/application of machine learning methods to reconstruct high-resolution spatio-temporal maps of environmental exposures in the UK
Collaborator Contribution Data provision, technical assistance, expertise in modelling
Impact Multidisciplinary: remote sensing satellite products, re-analysis data repositories, machine learning, geospatial methods, epidemiology
Start Year 2018
 
Description Spatio-temporal modelling of environmental exposures 
Organisation European Centre for Medium-Range Weather Forecasts (ECMWF)
Country United Kingdom 
Sector Public 
PI Contribution Established the collaboration with several experts and institutions for the collection of data resources and development/application of machine learning methods to reconstruct high-resolution spatio-temporal maps of environmental exposures in the UK
Collaborator Contribution Data provision, technical assistance, expertise in modelling
Impact Multidisciplinary: remote sensing satellite products, re-analysis data repositories, machine learning, geospatial methods, epidemiology
Start Year 2018
 
Description Spatio-temporal modelling of environmental exposures 
Organisation European Space Agency
Country France 
Sector Public 
PI Contribution Established the collaboration with several experts and institutions for the collection of data resources and development/application of machine learning methods to reconstruct high-resolution spatio-temporal maps of environmental exposures in the UK
Collaborator Contribution Data provision, technical assistance, expertise in modelling
Impact Multidisciplinary: remote sensing satellite products, re-analysis data repositories, machine learning, geospatial methods, epidemiology
Start Year 2018
 
Description Spatio-temporal modelling of environmental exposures 
Organisation Lazio Regional Health Service
Country Italy 
Sector Public 
PI Contribution Established the collaboration with several experts and institutions for the collection of data resources and development/application of machine learning methods to reconstruct high-resolution spatio-temporal maps of environmental exposures in the UK
Collaborator Contribution Data provision, technical assistance, expertise in modelling
Impact Multidisciplinary: remote sensing satellite products, re-analysis data repositories, machine learning, geospatial methods, epidemiology
Start Year 2018
 
Description Spatio-temporal modelling of environmental exposures 
Organisation Swiss Tropical & Public Health Institute
Country Switzerland 
Sector Academic/University 
PI Contribution Established the collaboration with several experts and institutions for the collection of data resources and development/application of machine learning methods to reconstruct high-resolution spatio-temporal maps of environmental exposures in the UK
Collaborator Contribution Data provision, technical assistance, expertise in modelling
Impact Multidisciplinary: remote sensing satellite products, re-analysis data repositories, machine learning, geospatial methods, epidemiology
Start Year 2018
 
Title R package dlnm 
Description The package contains functions and data examples for running distributed lag non-linear models. The software is freely downloadable by everybody, and it is licensed under the GNU General Public License, meaning that, under appropriate reference and the assurance that novel material is provided under the same licence terms, it can be modified and extended by other researchers. 
IP Reference  
Protection Copyrighted (e.g. software)
Year Protection Granted 2009
Licensed Yes
Impact The software implementation in a free program has boosted the use of DLNMs among researchers in different countries, primarily (but not only) for studies on temperature and air pollution. Also, the package, developed in parallel with the statistical framework, offers a vehicle to promote the methodology and its use among non-statisticians.
 
Title R package mixmeta 
Description The package contains functions and data examples for running extended meta-analytical models. The software is freely downloadable by everybody, and it is licensed under the GNU General Public License, meaning that, under appropriate reference and the assurance that novel material is provided under the same licence terms, it can be modified and extended by other researchers. 
IP Reference  
Protection Copyrighted (e.g. software)
Year Protection Granted 2019
Licensed Yes
Impact The software implementation in a free program facilitates the use of extended meta-analytical techniques in the research community. Also, the package, developed in parallel with the statistical framework, offers a vehicle to promote the methodology and its use among non-statisticians.
 
Title R package mvmeta 
Description The package contains functions and data examples for running univariate or multivariate meta-analysis and meta-regression. The software is freely downloadable by everybody, and it is licensed under the GNU General Public License, meaning that, under appropriate reference and the assurance that novel material is provided under the same licence terms, it can be modified and extended by other researchers. 
IP Reference  
Protection Copyrighted (e.g. software)
Year Protection Granted 2011
Licensed Yes
Impact The software implementation in a free program facilitates the use of multivariate meta-analytical techniques in the research community. Also, the package, developed in parallel with the statistical framework, offers a vehicle to promote the methodology and its use among non-statisticians.
 
Title R package dlnm 
Description The package contains functions and data examples for running distributed lag non-linear models. It is provided within the free software R and can be downloaded from the internet through the R program.
Type Of Technology Software 
Year Produced 2009 
Open Source License? Yes  
Impact The software implementation in a free program has boosted the use of DLNMs among researchers in different countries, primarily (but not only) for studies on temperature and air pollution. Also, the package, developed in parallel with the statistical framework, offers a vehicle to promote the methodology and its use among non-statisticians.
URL http://cran.r-project.org/web/packages/dlnm/index.html
 
Title R package mixmeta 
Description The package contains functions and data examples for running extended meta-analytical models. It is provided within the free software R and can be downloaded from the internet through the R program.
Type Of Technology Software 
Year Produced 2019 
Open Source License? Yes  
Impact The software implementation in a free program facilitates the use of extended meta-analytical techniques in the research community. Also, the package, developed in parallel with the statistical framework, offers a vehicle to promote the methodology and its use among non-statisticians.
URL http://cran.r-project.org/web/packages/mixmeta/index.html
 
Title R package mvmeta 
Description The package contains functions and data examples for running univariate or multivariate meta-analysis and meta-regression. It is provided within the free software R and can be downloaded from the internet through the R program.
Type Of Technology Software 
Year Produced 2011 
Open Source License? Yes  
Impact The software implementation in a free program facilitates the use of multivariate meta-analytical techniques in the research community. Also, the package, developed in parallel with the statistical framework, offers a vehicle to promote the methodology and its use among non-statisticians.
URL http://cran.r-project.org/web/packages/mvmeta/index.html
 
Description Centre for Statistical Modelling (CSM) 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? Yes
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Centre of the London School of Hygiene & Tropical Medicine. Seminars and other activities are usually attended by 50-100 researchers and PhD or MSc students
Year(s) Of Engagement Activity 2011,2012,2013,2014,2015,2016,2017,2018,2019,2020
URL http://csm.lshtm.ac.uk/
 
Description Centre on Climate Change & Planetary Health 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Centre of the London School of Hygiene & Tropical Medicine
Year(s) Of Engagement Activity 2019,2020
URL https://www.lshtm.ac.uk/research/centres/centre-climate-change-and-planetary-health
 
Description Invited talks 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Series of invited talks and workshops in well-known research institutions and companies, such as Harvard School of Public Health, Karolinska Institute, Royal Statistical Society, Children's Hospital of Philadelphia, Centre for Research in Environmental Epidemiology (CREAL), University of Pennsylvania, Ludwig Maximilians University, Swiss Tropical and Public Health Institute, Open University, St George's University of London, IQVIA, European Centre for Medium-Range Weather Forecasts (ECMWF), Centers for Disease Control and Prevention (CDC), Emory University
Year(s) Of Engagement Activity 2010,2011,2012,2013,2014,2015,2016,2017,2018,2019,2020
 
Description LSHTM R Users Group 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact Group promoting the use of the freely available software R among staff and PhD students
Year(s) Of Engagement Activity 2018,2019,2020
URL http://blogs.lshtm.ac.uk/rusers/