Geostatistical methods for disease risk-mapping
Lead Research Organisation: Lancaster University
Department Name: Medicine
Abstract
What is the research trying to achieve?
The fundamental aim of the proposed research project is to extend the classical geostatistical framework [1] by addressing three specific issues that must be resolved for the correct interpretation of data from disease prevalence surveys in poor countries, where health records for the whole population do not exist.
1. How should data from multiple prevalence surveys be combined in order to account for data-quality variation, for example when some of the surveys use so-called convenience sampling and may therefore not be representative of the underlying population at risk?
2. A zero prevalence estimate in a particular community can be either a chance finding, or a necessary consequence of the community being disease/infection-free. How can prevalence data be analysed to recognize these two different phenomena, and to correctly interpret data that may contain both kinds of zeros?
3. A key aim in any epidemiological study is to understand the relationship between exposure and risk. When exposure can only be measured imprecisely, this needs to be recognized to avoid biasing estimates of risk. What is the best way to do this in a geostatistical setting, i.e. when both exposure and risk vary geographically?
Why is this important?
Policy makers will use our methodology to better inform the implementation of disease control programmes and make the most efficient use of available resources. Yet in resource-poor settings the available data have numerous biases and limitations, which could hinder effective allocation of resources. The application of my research to practical disease control will be achieved via carefully chosen collaborative links with colleagues who are directly involved with in-country public health agencies.
To which diseases is the research relevant?
The developed methodology will be broadly applicable to any infectious disease. However, our specific applications will be in malaria and neglected tropical diseases, including malaria in Malawi, Tanzania and Ethiopia, and lymphatic filariasis, soil-transmitted helminths and schistosomiasis in all of the African countries where these are endemic.
How will the aim be achieved?
The proposed research project will develop high-quality methodology and apply this to important public health problems through collaborations with leading experts from the London School of Hygiene & Tropical Medicine, the Liverpool School of Tropical Medicine, the International Research Institute for Climate and Society at Columbia University, and the Norwegian University of Science and Technology. During the fellowship, open-source statistical software and substantive findings will be made available online. More details on how I intend to pursue these objectives are given in "Communications plan", "Impact summary" and "Case for support".
References
[1] Diggle, P. J., Tawn, J. A. and Moyeed, R. A. (1998) Model-based geostatistics. Journal of the Royal Statistical Society, Series C, 47, 299-350.
[2] Hay, S.I., Guerra, C.A., Gething, P.W., Patil, A.P., Tatem, A.J., Noor, A.M., Kabaria, C.W., Manh, B.H., Elyazar, I.R.F., Brooker, S., Smith, D.L., Moyeed, R.A. and Snow, R.W. (2009) A world malaria map: Plasmodium falciparum endemicity in 2007. PLoS Medicine, 6, e1000048.
Technical Summary
The overall aim is to develop and apply novel geostatistical methods for disease-risk mapping.
Specific objectives
A. Combining data from multiple spatially referenced surveys, including a mix of randomised and non-randomised surveys;
B. Geostatistical modelling of zero-inflated prevalence data;
C. Multivariate spatial modelling of risk and exposure when exposures are measured incompletely and/or with error;
D. Development of efficient computational procedures for analysis of large spatial data-sets.
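To make objective B concrete, the following is a minimal sketch of a zero-inflated binomial log-likelihood. It is an illustration under simplifying assumptions, not the model developed in the project: the names `zib_loglik`, `p` and `pi_free` are hypothetical, and both the prevalence `p` and the probability `pi_free` that a community is disease-free are treated as constants, whereas in a geostatistical model they would vary over space.

```python
import math


def log_binom_pmf(y, n, p):
    """Log of the binomial probability of y positives out of n sampled."""
    log_choose = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    return log_choose + y * math.log(p) + (n - y) * math.log1p(-p)


def zib_loglik(counts, sizes, p, pi_free):
    """Zero-inflated binomial log-likelihood (illustrative, non-spatial).

    counts  : positives observed in each community
    sizes   : individuals sampled in each community
    p       : prevalence where the disease is present (0 < p < 1), assumed constant
    pi_free : probability that a community is disease-free (0 <= pi_free < 1)
    """
    total = 0.0
    for y, n in zip(counts, sizes):
        if y == 0:
            # A zero is either structural (disease-free community)
            # or a chance finding (disease present, nobody sampled was positive).
            total += math.log(pi_free + (1.0 - pi_free) * (1.0 - p) ** n)
        else:
            # Any positive count implies the disease is present.
            total += math.log1p(-pi_free) + log_binom_pmf(y, n, p)
    return total


# Made-up counts, for illustration only.
counts = [0, 0, 3, 7, 0, 1]
sizes = [20, 25, 30, 40, 15, 20]
print(zib_loglik(counts, sizes, p=0.15, pi_free=0.3))
```

Setting `pi_free` to zero recovers the ordinary binomial likelihood, so the two models can be compared on the same data.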
Methods
In order to pursue objectives A, B and C, a review will first be carried out to identify existing statistical methods for each specific issue. New statistical models and associated likelihood-based methods will then be developed as discussed in 'Case for support'. Simulation studies will be conducted under a range of suitable scenarios to quantify the benefit of the newly developed methodology over existing approaches for each of objectives A, B and C. The methods will also be applied to real data-sets provided by our collaborators. For objective D, two different approaches will be explored: 1) low-rank approximations of spatial processes based on convolution kernel representations; 2) 'tapering' techniques, i.e. computational procedures that treat observations at locations sufficiently far apart as independent responses. The resulting algorithms will then be used to fit the proposed models to the simulated and real data, and compared with existing approaches that use analytical approximations, such as [1].
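The two approaches named for objective D can be sketched as follows. This is a hedged illustration rather than the project's algorithms: the function names, the exponential covariance, the spherical taper, the Gaussian kernel and all parameter values are assumptions chosen for the example, and dense NumPy arrays are used for brevity where a practical implementation would store the tapered matrix in a sparse format.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distances between the rows of a and the rows of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def exp_cov(d, sigma2=1.0, phi=0.2):
    """Exponential covariance function (an assumed choice)."""
    return sigma2 * np.exp(-d / phi)


def spherical_taper(d, radius):
    """Spherical correlation: exactly zero for distances >= radius."""
    r = np.clip(d / radius, 0.0, 1.0)
    return (1.0 - r) ** 2 * (1.0 + r / 2.0)


def low_rank_cov(coords, knots, bandwidth, tau2=1.0):
    """Covariance of S(x) = sum_j K(x - u_j) Z_j with i.i.d. N(0, tau2) Z_j.

    The rank is at most the number of knots u_j.
    """
    K = np.exp(-0.5 * (pairwise_dist(coords, knots) / bandwidth) ** 2)
    return tau2 * K @ K.T, K


rng = np.random.default_rng(0)
coords = rng.uniform(size=(200, 2))  # simulated survey locations
d = pairwise_dist(coords, coords)

# Tapering: the element-wise product with a compactly supported correlation
# sets the covariance of distant pairs to zero, i.e. treats them as independent.
tapered = exp_cov(d) * spherical_taper(d, radius=0.15)
print("fraction of zero entries:", np.mean(tapered == 0.0))

# Low-rank approximation on a 5 x 5 grid of knots.
g = np.linspace(0.1, 0.9, 5)
knots = np.array([(u, v) for u in g for v in g])
approx, K = low_rank_cov(coords, knots, bandwidth=0.2)
print("rank of low-rank covariance:", np.linalg.matrix_rank(approx))
```

The taper radius and the number of knots both trade accuracy for speed, which is the kind of trade-off the simulation studies would need to quantify.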
Opportunities
The developed methodology will mainly be applied to disease prevalence mapping of tropical diseases. Throughout the fellowship, we will be alert to new collaborative opportunities. For more details see 'Collaboration Explanations'.
References
[1] Rue H., Martino S. and Chopin N. (2009) Approximate Bayesian Inference for Latent Gaussian Models Using INLA. JRSSB, 71, 319-392.
Planned Impact
The proposed methodological research will be developed and validated using both research and programmatic data generated within disease control programmes in developing countries. This will allow any of the resulting methods to feed directly into policy-relevant applications that will in turn inform malaria and neglected tropical disease (NTD) control. The non-academic beneficiaries will thus include policy makers and staff of disease control programmes in malaria- and NTD-endemic countries. These include all the African countries where lymphatic filariasis, soil-transmitted helminths and schistosomiasis are endemic; among malaria-endemic countries, a particular focus will be on Malawi, Tanzania and Ethiopia.
The results of the research will inform public health policies and decision-making in each of the countries and disease areas covered by our specific applications, through our collaborative links with colleagues who are directly involved with in-country public health agencies. Specifically, I will collaborate with the Global Atlas of Helminth Infection (GAHI) at LSHTM, which develops and maintains a suite of tools and resources for the mapping of NTDs. The resources which I will develop will be made available on the GAHI training portal (www.thiswormyworld.org) and I will work to develop appropriate training material for collaborative short courses run by GAHI and partners, including Prof. Peter Diggle.
Discussions in the Roll Back Malaria programme's Monitoring and Evaluation Reference Group (MERG), of which Dr Terlouw is a member, have resulted in various national malaria control programmes stressing the need for more affordable, timely, sub-district data collection and analysis tools that provide more accurate risk maps of the spatial heterogeneity in disease burden. The proposed expansion and validation of methodology to combine mixed-source prevalence data from convenience and random sampling frames will therefore be of direct benefit to ongoing malaria monitoring and evaluation projects in Malawi and indirectly, through dissemination of the results of this specific application and the underlying methodology, to the wider international monitoring and evaluation community.
Our approach will specifically inform hybrid sampling strategies that combine randomized and non-randomized surveys in order to minimize costs without compromising the validity of the resulting inferences, leading to more accurate and more fine-grained disease risk maps that will inform targeted control efforts and the identification of local transmission hot-spots. This will become increasingly important as interventions serve to reduce transmission levels and make diseases more focal. Validated tools will be put forward for inclusion in the international MERG recommendations for survey methodologies, to maximize their impact.
Applications of the improved methodologies to programmatic evaluation tools and analyses will benefit the health and well-being of communities at risk as a result of more effective control programmes. In particular, by monitoring the changes in spatial risk maps over time, our methods will highlight the strengths of control measures that are spatially targeted at high-burden areas, compared to the current blanket control efforts.
Contributing to the improvement of the effectiveness of control programmes for the benefit of human health is consistent with my personal ambition to conduct high-quality research that directly benefits impoverished communities.
Policy makers will have free access to our research outputs. For the specific identified applications, the knock-on impact on policy development will be realised within the three-year term of the fellowship. The time-scale for the wider benefits will be longer, owing to the need to publish case-study reports and, through this and other dissemination activities including training courses run by GAHI, build new collaborations and spatial statistics capacity in the countries concerned.
People
Emanuele Giorgi (Principal Investigator / Fellow)
Publications
Landier J (2018) Spatiotemporal analysis of malaria for new sustainable control strategies, in BMC Medicine
Jary HR (2017) Household air pollution, chronic respiratory disease and pneumonia in Malawian adults: A case-control study, in Wellcome Open Research
Henning J (2017) Factors influencing the success of aerial rabies vaccination of foxes, in Scientific Reports
Hale AC (2019) A real-time spatio-temporal syndromic surveillance system with application to small companion animals, in Scientific Reports
| Description | Modelling of spatio-temporal variation in plague incidence in Madagascar |
| Organisation | University of Liverpool |
| Country | United Kingdom |
| Sector | Academic/University |
| PI Contribution | We developed novel statistical methodology, analysed the data and interpreted the results. |
| Collaborator Contribution | Our partners provided the data and contributed to the interpretation of the results. |
| Impact | Paper submitted and currently under review for "Biometrics". |
| Start Year | 2015 |
| Description | Prevalence mapping and visualization of repeated cross-sectional survey data |
| Organisation | Kenya Medical Research Institute (KEMRI) |
| Country | Kenya |
| Sector | Public |
| PI Contribution | We are carrying out the following tasks: statistical modelling, interpretation of the results and software implementation. |
| Collaborator Contribution | Our partners have provided the data and contribute significantly to the interpretation of the results. |
| Impact | Predictive maps of malaria in Senegal from 1905 to 2014 available at giorgi.shinyapps.io/example |
| Start Year | 2015 |
| Title | PrevMap package in the R software environment |
| Description | The software implements state-of-the-art statistical models for mapping disease risk. |
| Type Of Technology | Software |
| Year Produced | 2014 |
| Open Source License? | Yes |
| Impact | The software is being widely used by researchers in low-resource settings for prevalence mapping of tropical diseases. |
| URL | https://cran.r-project.org/web/packages/PrevMap/index.html |