The case time series design: a new tool for big data analysis
Lead Research Organisation:
London School of Hygiene & Tropical Medicine
Department Name: Public Health and Policy
Abstract
Biomedical research has been transformed by the recent development of big data technologies. For instance, the collection of health records in linked electronic databases provides information on demographics, health events, medications, and lifestyle factors for large samples of patients. Similarly, portable devices such as mobile phones make it possible to recruit large numbers of participants and to collect real-time, geo-located individual-level measurements. While these resources offer the possibility to answer research questions that could not be feasibly addressed using traditional studies, they require innovative analytical approaches.
This project helps to address this need through the development of a novel analytical method called the case time series design. This design combines features of existing approaches to generate a more adaptable tool, particularly well suited to the analysis of highly informative big data resources. The case time series design is applicable in different research areas for investigating health effects associated with various risk factors, such as environmental exposures, clinical conditions, or drug use. The research proposal is structured in a detailed work plan that includes the methodological development of the case time series design and examples that illustrate its application in various areas of biomedical research.
Technical Summary
Big data technologies are transforming the landscape of biomedical research. Epidemiological investigations can now rely on repeated measurements collected at the individual level in large populations, linked with a wealth of data on personal characteristics and various risk factors. These resources offer the possibility to answer research questions that could not be feasibly addressed using traditional studies. However, this more complex big data setting presents important methodological challenges, and it requires innovative analytical approaches to model complex longitudinal associations between repeated measures of health outcomes and time-varying exposures using observational data.
This project helps to address this need through the development of a novel analytical tool called the case time series design. This design offers an adaptable framework that combines the individual-level setting and confounder control of case-only methods, such as the self-controlled case series, with the flexibility and temporal structure of time series models. It represents a general tool, applicable in different research areas for investigating associations with environmental exposures, clinical conditions, or drug use. The case time series design is suitable for the analysis of highly informative big data resources, particularly those providing individual profiles with longitudinal measures of health outcomes and time-varying predictors.
The research proposal is structured in a detailed work plan. This includes the definition of the design setting and assumptions of the case time series, the description of the statistical framework, and simulation studies to validate the methodology. The novel design is illustrated in four case studies that demonstrate its application in various research areas within biomedical research, such as environmental, clinical, and pharmaco-epidemiology.
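The data layout implied by the summary above, individual profiles with longitudinal outcomes and time-varying predictors, can be sketched as follows. This is only an illustrative sketch, not the project's implementation: the function name `build_case_time_series`, the input structures, and the choice of subject-by-calendar-month strata are all assumptions made for the example, since the summary does not specify them.

```python
from datetime import date, timedelta


def build_case_time_series(subjects, exposure_by_day):
    """Expand each subject's follow-up into one row per person-day.

    subjects: dict mapping subject id -> (start date, end date, set of event dates)
    exposure_by_day: dict mapping (subject id, date) -> exposure value
    """
    rows = []
    for sid, (start, end, events) in subjects.items():
        day = start
        while day <= end:
            rows.append({
                "subject": sid,
                "date": day,
                # Assumed stratum: subject x calendar month, so comparisons
                # stay within the same person and the same month.
                "stratum": (sid, day.year, day.month),
                # Outcome indicator for this person-day.
                "outcome": 1 if day in events else 0,
                # Time-varying exposure; None when no measurement exists.
                "exposure": exposure_by_day.get((sid, day)),
            })
            day += timedelta(days=1)
    return rows


# Hypothetical toy data: two subjects, one event, one exposure reading.
subjects = {
    "A": (date(2019, 1, 30), date(2019, 2, 2), {date(2019, 2, 1)}),
    "B": (date(2019, 1, 31), date(2019, 2, 1), set()),
}
exposure = {("A", date(2019, 2, 1)): 3.5}
rows = build_case_time_series(subjects, exposure)
```

Each row is one person-day, so the stacked rows form a set of subject-specific time series. A model that conditions on the `stratum` column would then compare exposure and outcome only within the same subject and period, which is one way of obtaining the within-person confounder control that case-only methods provide.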
Planned Impact
There are several reasons to expect a strong impact from the proposed project. Specifically:
- A detailed publication plan of peer-reviewed contributions illustrating methodological developments and substantive analyses, targeting high-impact journals selected to maximize impact and widen the potential audience, with a preference for open access options;
- The flexibility and generality of the case time series methodology, applicable in a variety of settings and research areas, as demonstrated by the four case studies;
- The collaboration network, which includes researchers with established experience in various research fields and strong expertise in both methodological development and applications to real-data analyses;
- The implementation of the analytical techniques in freely available statistical programs, which will facilitate the independent application by other research teams;
- The provision of tutorials, software documentation and other material about the use of these novel analytical tools;
- The multidisciplinary background and track record of the applicant and his research team, and their experience in developing high-impact analytical techniques and implementing them in well-documented software.
Organisations
- London School of Hygiene & Tropical Medicine (Lead Research Organisation)
- National Institute for Health Research (Co-funder)
- Harvard University (Collaboration)
- Columbia University (Collaboration)
- Ben-Gurion University of the Negev (Collaboration)
- Swiss Tropical & Public Health Institute (Collaboration)
- University of Ottawa (Collaboration)
- European Centre for Medium Range Weather Forecasting ECMWF (Collaboration)
- University of Hasselt (Collaboration)
- Public Health Agency of Canada (Collaboration)
- Lazio Regional Health Service (Collaboration)
- London School of Hygiene and Tropical Medicine (LSHTM) (Collaboration)
- Ludwig Maximilian University of Munich (LMU Munich) (Collaboration)
- European Space Agency (Collaboration)
- Children's Hospital of Philadelphia (Collaboration)
- University of Leuven (Collaboration)
Publications
Lavigne E
(2019)
Spatial variations in ambient ultrafine particle concentrations and risk of congenital heart defects.
in Environment international
Dixon W
(2019)
How the weather affects the pain of citizen scientists using a smartphone app
in npj Digital Medicine
Armstrong B
(2019)
Erratum: "The Role of Humidity in Associations of High Temperature with Mortality: A Multicountry, Multicity Study".
in Environmental health perspectives
Lo YTE
(2019)
Increasing mitigation ambition to meet the Paris Agreement's temperature goal avoids substantial heat-related mortality in U.S. cities.
in Science advances
Stieb D
(2019)
Air pollution in the week prior to delivery and preterm birth in 24 Canadian cities: a time to event analysis
in Environmental Health
Sera F
(2019)
How urban characteristics affect vulnerability to heat and cold: a multi-country analysis
in International Journal of Epidemiology
Onozuka D
(2019)
Modeling Future Projections of Temperature-Related Excess Morbidity due to Infectious Gastroenteritis under Climate Change Conditions in Japan.
in Environmental health perspectives
Liu C
(2019)
Ambient Particulate Air Pollution and Daily Mortality in 652 Cities.
in The New England journal of medicine
Kim Y
(2019)
Suicide and Ambient Temperature: A Multi-Country Multi-City Study.
in Environmental health perspectives
Liu C
(2019)
Ambient Air Pollution and Mortality in 652 Cities. Reply.
in The New England journal of medicine
Description | Modern methods for time series analysis |
Geographic Reach | Multiple continents/international |
Policy Influence Type | Influenced training of practitioners or researchers |
Description | Temperature, climate change and health |
Geographic Reach | Multiple continents/international |
Policy Influence Type | Influenced training of practitioners or researchers |
Description | Use of DLNMs in temperature-health studies |
Geographic Reach | Multiple continents/international |
Policy Influence Type | Influenced training of practitioners or researchers |
URL | http://www.ncbi.nlm.nih.gov/sites/myncbi/collections/public/10kxh85C77hGm5PFkch6qfQ/ |
Title | Personal website |
Description | The website provides access to the outputs of my research, such as pdf versions and supplemental material of the published papers, summaries and updates of my research activity, and other information. In particular, scripts and data for reproducing the results of methodological or substantive papers are made available through the website. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2012 |
Provided To Others? | Yes |
Impact | The website is visited by 5-10 visitors each day. They download materials such as articles, scripts and data. |
URL | http://www.ag-myresearch.com/ |
Title | R package mixmeta |
Description | The package contains functions and data examples for running extended meta-analytical models. It is provided within the free software R and downloadable from the internet through the R program. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | Yes |
Impact | The software implementation in a free program facilitates the use of extended meta-analytical techniques in the research community. Also, the package, developed in parallel with the statistical framework, offers a vehicle to promote the methodology and its use among non-statisticians. |
URL | http://cran.r-project.org/web/packages/mixmeta/index.html |
Title | R package mvmeta |
Description | The package contains functions and data examples for running univariate or multivariate meta-analysis and meta-regression. It is provided within the free software R and downloadable from the internet through the R program. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2011 |
Provided To Others? | Yes |
Impact | The software implementation in a free program facilitates the use of multivariate meta-analytical techniques in the research community. Also, the package, developed in parallel with the statistical framework, offers a vehicle to promote the methodology and its use among non-statisticians. |
URL | http://cran.r-project.org/web/packages/mvmeta/index.html |
Title | Personal website |
Description | The website provides access to the outputs of my research, such as pdf versions and supplemental material of the published papers, summaries and updates of my research activity, and other information. In particular, scripts and data for reproducing the results of methodological or substantive papers are made available through the website. |
Type Of Material | Database/Collection of data |
Year Produced | 2012 |
Provided To Others? | Yes |
Impact | The website is visited by 5-10 visitors each day. They download materials such as articles, scripts and data. |
URL | http://www.ag-myresearch.com/ |
Title | R package mixmeta |
Description | The package contains functions and data examples for running extended meta-analytical models. It is provided within the free software R and downloadable from the internet through the R program. |
Type Of Material | Computer model/algorithm |
Year Produced | 2019 |
Provided To Others? | Yes |
Impact | The software implementation in a free program facilitates the use of extended meta-analytical techniques in the research community. Also, the package, developed in parallel with the statistical framework, offers a vehicle to promote the methodology and its use among non-statisticians. The impact on the application of meta-analytical approaches is attested by the use of the technique in several peer-reviewed articles by different research groups: http://www.ncbi.nlm.nih.gov/sites/myncbi/collections/public/10kxh85C77hGm5PFkch6qfQ/ |
URL | http://cran.r-project.org/web/packages/mixmeta/index.html |
Title | R package mvmeta |
Description | The package contains functions and data examples for running univariate or multivariate meta-analysis and meta-regression. It is provided within the free software R and downloadable from the internet through the R program. |
Type Of Material | Computer model/algorithm |
Year Produced | 2011 |
Provided To Others? | Yes |
Impact | The software implementation in a free program facilitates the use of multivariate meta-analytical techniques in the research community. Also, the package, developed in parallel with the statistical framework, offers a vehicle to promote the methodology and its use among non-statisticians. The impact on the application of meta-analytical approaches is attested by the use of the technique in several peer-reviewed articles by different research groups: http://www.ncbi.nlm.nih.gov/sites/myncbi/collections/public/10kxh85C77hGm5PFkch6qfQ/ |
URL | http://cran.r-project.org/web/packages/mvmeta/index.html |
Description | Methodological work on distributed lag linear and non-linear models |
Organisation | London School of Hygiene and Tropical Medicine (LSHTM) |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Leading the development of the methodological research, leading the writing of peer-reviewed publications |
Collaborator Contribution | Technical contribution on statistical methods and software programs, intellectual contribution to publications |
Impact | Peer-reviewed publications in international journals and oral presentations in international congresses |
Start Year | 2014 |
Description | Methodological work on distributed lag linear and non-linear models |
Organisation | Ludwig Maximilian University of Munich (LMU Munich) |
Department | Department of Neurology |
Country | Germany |
Sector | Academic/University |
PI Contribution | Leading the development of the methodological research, leading the writing of peer-reviewed publications |
Collaborator Contribution | Technical contribution on statistical methods and software programs, intellectual contribution to publications |
Impact | Peer-reviewed publications in international journals and oral presentations in international congresses |
Start Year | 2014 |
Description | Modelling health effects of environmental exposures |
Organisation | Children's Hospital of Philadelphia |
Department | Center for Pediatric Clinical Effectiveness |
Country | United States |
Sector | Hospitals |
PI Contribution | Technical contribution on statistical methods and software programs, intellectual contribution to publications |
Collaborator Contribution | Leading the development of the methodological research, leading the writing of peer-reviewed publications |
Impact | Peer-reviewed publications in international journals and oral presentations in international congresses |
Start Year | 2012 |
Description | Modelling health effects of environmental exposures |
Organisation | Columbia University |
Department | Department of Environmental Health Sciences |
Country | United States |
Sector | Academic/University |
PI Contribution | Technical contribution on statistical methods and software programs, intellectual contribution to publications |
Collaborator Contribution | Leading the development of the methodological research, leading the writing of peer-reviewed publications |
Impact | Peer-reviewed publications in international journals and oral presentations in international congresses |
Start Year | 2012 |
Description | Modelling health effects of environmental exposures |
Organisation | Harvard University |
Department | Harvard T.H. Chan School of Public Health |
Country | United States |
Sector | Academic/University |
PI Contribution | Technical contribution on statistical methods and software programs, intellectual contribution to publications |
Collaborator Contribution | Leading the development of the methodological research, leading the writing of peer-reviewed publications |
Impact | Peer-reviewed publications in international journals and oral presentations in international congresses |
Start Year | 2012 |
Description | Modelling health effects of environmental exposures |
Organisation | Public Health Agency of Canada |
Department | Canada Prenatal Nutrition Program |
Country | Canada |
Sector | Public |
PI Contribution | Technical contribution on statistical methods and software programs, intellectual contribution to publications |
Collaborator Contribution | Leading the development of the methodological research, leading the writing of peer-reviewed publications |
Impact | Peer-reviewed publications in international journals and oral presentations in international congresses |
Start Year | 2012 |
Description | Modelling health effects of environmental exposures |
Organisation | University of Hasselt |
Department | Centre for Environmental Sciences |
Country | Belgium |
Sector | Academic/University |
PI Contribution | Technical contribution on statistical methods and software programs, intellectual contribution to publications |
Collaborator Contribution | Leading the development of the methodological research, leading the writing of peer-reviewed publications |
Impact | Peer-reviewed publications in international journals and oral presentations in international congresses |
Start Year | 2012 |
Description | Modelling health effects of environmental exposures |
Organisation | University of Leuven |
Department | Department of Public Health and Primary Care |
Country | Belgium |
Sector | Academic/University |
PI Contribution | Technical contribution on statistical methods and software programs, intellectual contribution to publications |
Collaborator Contribution | Leading the development of the methodological research, leading the writing of peer-reviewed publications |
Impact | Peer-reviewed publications in international journals and oral presentations in international congresses |
Start Year | 2012 |
Description | Modelling health effects of environmental exposures |
Organisation | University of Ottawa |
Country | Canada |
Sector | Academic/University |
PI Contribution | Technical contribution on statistical methods and software programs, intellectual contribution to publications |
Collaborator Contribution | Leading the development of the methodological research, leading the writing of peer-reviewed publications |
Impact | Peer-reviewed publications in international journals and oral presentations in international congresses |
Start Year | 2012 |
Description | Multi-Country Multi-City (MCC) Collaborative Research Network |
Organisation | Harvard University |
Department | Harvard T.H. Chan School of Public Health |
Country | United States |
Sector | Academic/University |
PI Contribution | I established and currently coordinate an international collaboration of more than 80 researchers from more than 40 countries, working on a program that aims to produce epidemiological evidence on associations between environmental stressors, climate, and health (http://mccstudy.lshtm.ac.uk/). The list of partners is long: see http://mccstudy.lshtm.ac.uk/participants/. |
Collaborator Contribution | It is a collaborative network that has already produced important research outputs (http://mccstudy.lshtm.ac.uk/publications/). |
Impact | http://mccstudy.lshtm.ac.uk/publications/ |
Start Year | 2013 |
Description | Spatio-temporal modelling of environmental exposures |
Organisation | Ben-Gurion University of the Negev |
Country | Israel |
Sector | Academic/University |
PI Contribution | Established the collaboration with several experts and institutions for the collection of data resources and development/application of machine learning methods to reconstruct high-resolution spatio-temporal maps of environmental exposures in the UK |
Collaborator Contribution | Data provision, technical assistance, expertise in modelling |
Impact | Multidisciplinary: remote sensing satellite products, re-analysis data repositories, machine learning, geospatial methods, epidemiology |
Start Year | 2018 |
Description | Spatio-temporal modelling of environmental exposures |
Organisation | European Centre for Medium Range Weather Forecasting ECMWF |
Country | United Kingdom |
Sector | Public |
PI Contribution | Established the collaboration with several experts and institutions for the collection of data resources and development/application of machine learning methods to reconstruct high-resolution spatio-temporal maps of environmental exposures in the UK |
Collaborator Contribution | Data provision, technical assistance, expertise in modelling |
Impact | Multidisciplinary: remote sensing satellite products, re-analysis data repositories, machine learning, geospatial methods, epidemiology |
Start Year | 2018 |
Description | Spatio-temporal modelling of environmental exposures |
Organisation | European Space Agency |
Country | France |
Sector | Public |
PI Contribution | Established the collaboration with several experts and institutions for the collection of data resources and development/application of machine learning methods to reconstruct high-resolution spatio-temporal maps of environmental exposures in the UK |
Collaborator Contribution | Data provision, technical assistance, expertise in modelling |
Impact | Multidisciplinary: remote sensing satellite products, re-analysis data repositories, machine learning, geospatial methods, epidemiology |
Start Year | 2018 |
Description | Spatio-temporal modelling of environmental exposures |
Organisation | Lazio Regional Health Service |
Country | Italy |
Sector | Public |
PI Contribution | Established the collaboration with several experts and institutions for the collection of data resources and development/application of machine learning methods to reconstruct high-resolution spatio-temporal maps of environmental exposures in the UK |
Collaborator Contribution | Data provision, technical assistance, expertise in modelling |
Impact | Multidisciplinary: remote sensing satellite products, re-analysis data repositories, machine learning, geospatial methods, epidemiology |
Start Year | 2018 |
Description | Spatio-temporal modelling of environmental exposures |
Organisation | Swiss Tropical & Public Health Institute |
Country | Switzerland |
Sector | Academic/University |
PI Contribution | Established the collaboration with several experts and institutions for the collection of data resources and development/application of machine learning methods to reconstruct high-resolution spatio-temporal maps of environmental exposures in the UK |
Collaborator Contribution | Data provision, technical assistance, expertise in modelling |
Impact | Multidisciplinary: remote sensing satellite products, re-analysis data repositories, machine learning, geospatial methods, epidemiology |
Start Year | 2018 |
Title | R package dlnm |
Description | The package contains functions and data examples for running distributed lag non-linear models. The software is freely downloadable by everybody, and it is licensed under the GNU General Public License, meaning that, under appropriate reference and the assurance that novel material is provided under the same licence terms, it can be modified and extended by other researchers. |
IP Reference | |
Protection | Copyrighted (e.g. software) |
Year Protection Granted | 2009 |
Licensed | Yes |
Impact | The software implementation in a free program has boosted the use of DLNMs among researchers in different countries, primarily (but not only) for studies on temperature and air pollution. Also, the package, developed in parallel with the statistical framework, offers a vehicle to promote the methodology and its use among non-statisticians. |
Title | R package mixmeta |
Description | The package contains functions and data examples for running extended meta-analytical models. The software is freely downloadable by everybody, and it is licensed under the GNU General Public License, meaning that, under appropriate reference and the assurance that novel material is provided under the same licence terms, it can be modified and extended by other researchers. |
IP Reference | |
Protection | Copyrighted (e.g. software) |
Year Protection Granted | 2019 |
Licensed | Yes |
Impact | The software implementation in a free program facilitates the use of extended meta-analytical techniques in the research community. Also, the package, developed in parallel with the statistical framework, offers a vehicle to promote the methodology and its use among non-statisticians. |
Title | R package mvmeta |
Description | The package contains functions and data examples for running univariate or multivariate meta-analysis and meta-regression. The software is freely downloadable by everybody, and it is licensed under the GNU General Public License, meaning that, under appropriate reference and the assurance that novel material is provided under the same licence terms, it can be modified and extended by other researchers. |
IP Reference | |
Protection | Copyrighted (e.g. software) |
Year Protection Granted | 2011 |
Licensed | Yes |
Impact | The software implementation in a free program facilitates the use of multivariate meta-analytical techniques in the research community. Also, the package, developed in parallel with the statistical framework, offers a vehicle to promote the methodology and its use among non-statisticians. |
Title | R package dlnm |
Description | The package contains functions and data examples for running distributed lag non-linear models. It is provided within the free software R and downloadable from the internet through the R program. |
Type Of Technology | Software |
Year Produced | 2009 |
Open Source License? | Yes |
Impact | The software implementation in a free program has boosted the use of DLNMs among researchers in different countries, primarily (but not only) for studies on temperature and air pollution. Also, the package, developed in parallel with the statistical framework, offers a vehicle to promote the methodology and its use among non-statisticians. |
URL | http://cran.r-project.org/web/packages/dlnm/index.html |
Title | R package mixmeta |
Description | The package contains functions and data examples for running extended meta-analytical models. It is provided within the free software R and downloadable from the internet through the R program. |
Type Of Technology | Software |
Year Produced | 2019 |
Open Source License? | Yes |
Impact | The software implementation in a free program facilitates the use of extended meta-analytical techniques in the research community. Also, the package, developed in parallel with the statistical framework, offers a vehicle to promote the methodology and its use among non-statisticians. |
URL | http://cran.r-project.org/web/packages/mixmeta/index.html |
Title | R package mvmeta |
Description | The package contains functions and data examples for running univariate or multivariate meta-analysis and meta-regression. It is provided within the free software R and downloadable from the internet through the R program. |
Type Of Technology | Software |
Year Produced | 2011 |
Open Source License? | Yes |
Impact | The software implementation in a free program facilitates the use of multivariate meta-analytical techniques in the research community. Also, the package, developed in parallel with the statistical framework, offers a vehicle to promote the methodology and its use among non-statisticians. |
URL | http://cran.r-project.org/web/packages/mvmeta/index.html |
Description | Centre for Statistical Modelling (CSM) |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | Yes |
Geographic Reach | National |
Primary Audience | Postgraduate students |
Results and Impact | Centre of the London School of Hygiene & Tropical Medicine. Seminars and other activities are usually attended by 50-100 researchers and PhD or MSc students. |
Year(s) Of Engagement Activity | 2011,2012,2013,2014,2015,2016,2017,2018,2019,2020 |
URL | http://csm.lshtm.ac.uk/ |
Description | Centre on Climate Change & Planetary Health |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | Centre of the London School of Hygiene & Tropical Medicine |
Year(s) Of Engagement Activity | 2019,2020 |
URL | https://www.lshtm.ac.uk/research/centres/centre-climate-change-and-planetary-health |
Description | Invited talks |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Series of invited talks and workshops at well-known research institutions and companies, such as Harvard School of Public Health, Karolinska Institute, Royal Statistical Society, The Children's Hospital of Philadelphia, Centre for Research in Environmental Epidemiology (CREAL), University of Pennsylvania, Ludwig Maximilians University, Swiss Tropical and Public Health Institute, Open University, St George's University of London, IQVIA, European Centre for Medium-Range Weather Forecasts (ECMWF), Centers for Disease Control and Prevention (CDC), Emory University |
Year(s) Of Engagement Activity | 2010,2011,2012,2013,2014,2015,2016,2017,2018,2019,2020 |
Description | LSHTM R Users Group |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Postgraduate students |
Results and Impact | Group promoting the use of the freely available software R among staff and PhD students |
Year(s) Of Engagement Activity | 2018,2019,2020 |
URL | http://blogs.lshtm.ac.uk/rusers/ |