Compressive Population Health: Cost-Effective Profiling of Prevalence for Multiple Non-Communicable Diseases via Health Data Science
Lead Research Organisation:
Coventry University
Department Name: Ctr for Intelligent Healthcare
Abstract
With a growing ageing population and changes in lifestyles, non-communicable diseases (NCDs), e.g. heart disease, diabetes, and cancer, have become extremely prevalent in our society, and the situation is more challenging in the UK than in other developed countries. Population health monitoring is a fundamental building block for public health services, and profiling the population-scale prevalence of multiple NCDs across different regions (e.g., building a spatially fine-grained morbidity rate map) is one of its most important tasks. However, traditional public health data collection and prevalence profiling approaches, such as clinic-visit-based data integration and health surveys, are often very costly and time-consuming. This project proposes a novel paradigm, called compressive population health (CPH for short), to reduce the data collection cost of prevalence profiling as far as possible.
The basic idea of CPH is that a subset of areas is intelligently selected for data collection and population health profiling in the traditional way, while inherent data correlations are leveraged to infer the data for the rest of the areas. CPH is facilitated by exploiting the following types of inherent data correlations found by epidemiologists. (a) Intra-Disease Spatial Correlations. That is, regions are more similar in the prevalence rate of some diseases when they are neighbouring, or when they share certain environmental, socioeconomic, and demographic attributes. (b) Inter-Disease Correlations. Multimorbidity, commonly defined as the co-presence of two or more chronic conditions, demonstrates that statistics for different types of disease may also correlate with each other. For example, regions with higher obesity rates are more likely to have higher rates of heart disease and cancers.
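As a rough illustration of how the two correlation types could feed an inference step, the Python sketch below estimates prevalence for one unprofiled area. It is not the project's method. The Gaussian distance kernel, the least-squares line, the equal-weight blend, every name (`spatial_estimate`, `fit_line`, `infer_prevalence`, `bandwidth`), and all figures are assumptions chosen only for illustration.

```python
import math


def spatial_estimate(target_xy, profiled, bandwidth=1.0):
    """Intra-disease spatial correlation: kernel-weighted mean of profiled areas.

    target_xy: (x, y) of the unprofiled area
    profiled:  list of ((x, y), prevalence) for traditionally profiled areas
    """
    weights = [
        math.exp(-((x - target_xy[0]) ** 2 + (y - target_xy[1]) ** 2)
                 / (2 * bandwidth ** 2))
        for (x, y), _ in profiled
    ]
    total = sum(weights)
    if total == 0:  # every profiled area is too far away: fall back to the mean
        return sum(p for _, p in profiled) / len(profiled)
    return sum(w * p for w, (_, p) in zip(weights, profiled)) / total


def fit_line(xs, ys):
    """Inter-disease correlation: least-squares fit of y = a + b * x.

    Assumes the xs are not all identical (otherwise the slope is undefined).
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def infer_prevalence(target_xy, target_other_rate, profiled, other_rates):
    """Blend the spatial estimate with the cross-disease estimate (equal weights)."""
    spatial = spatial_estimate(target_xy, profiled)
    intercept, slope = fit_line(other_rates, [p for _, p in profiled])
    cross = intercept + slope * target_other_rate
    return 0.5 * spatial + 0.5 * cross


# Made-up toy data: heart-disease prevalence in three profiled areas,
# plus obesity rates that are assumed to be known for every area.
profiled = [((0.0, 0.0), 0.10), ((1.0, 0.0), 0.12), ((0.0, 1.0), 0.14)]
obesity = [0.20, 0.24, 0.28]
estimate = infer_prevalence((0.5, 0.5), 0.26, profiled, obesity)
```

In this toy setup the target area is equally distant from all three profiled areas, so the spatial term reduces to their plain average, while the cross-disease term shifts the estimate according to the target's obesity rate. A real model would also need the environmental, socioeconomic, and demographic attributes mentioned above.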
To realize this idea, the project develops three technical work packages with the following goals: (1) Investigate and extract latent data correlations and use them to build learning models for prevalence inference on the target geographical grids. (2) Design intelligent algorithms for selecting the traditionally sensed areas for each disease under multi-objective optimization goals including cost, reliability, and latency. (3) Evaluate and interpret the inferred prevalence rates to ensure the reliability and robustness of the approach.
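Goal (2) can be pictured with a deliberately simple stand-in: a greedy, budget-limited selection that repeatedly picks the area whose survey would "cover" the most not-yet-covered areas per unit cost. This sketch is an assumption made for illustration, not the project's algorithm. It models cost only (not reliability or latency), treats an area as covered when it lies within a hypothetical `radius` of a surveyed area, and uses invented names and numbers throughout.

```python
import math


def greedy_select(areas, budget, radius):
    """Greedy budgeted selection of areas to survey the traditional way.

    areas:  dict mapping area name -> ((x, y), cost), with cost > 0
    budget: total survey cost that may be spent
    radius: an area counts as covered if a surveyed area lies within this distance
    Returns (selected names in pick order, set of covered names).
    """
    def neighbours(name):
        (x, y), _ = areas[name]
        return {other for other, ((ox, oy), _) in areas.items()
                if math.hypot(ox - x, oy - y) <= radius}

    selected, covered, spent = [], set(), 0.0
    while True:
        best, best_ratio = None, 0.0
        for name, (_, cost) in areas.items():
            if name in selected or spent + cost > budget:
                continue  # already chosen, or unaffordable
            gain = len(neighbours(name) - covered)
            if gain / cost > best_ratio:
                best, best_ratio = name, gain / cost
        if best is None:  # nothing affordable adds coverage
            break
        selected.append(best)
        covered |= neighbours(best)
        spent += areas[best][1]
    return selected, covered


# Made-up example: four areas on a line, unit survey cost each.
areas = {
    "A": ((0.0, 0.0), 1.0),
    "B": ((1.0, 0.0), 1.0),
    "C": ((2.0, 0.0), 1.0),
    "D": ((10.0, 0.0), 1.0),
}
selected, covered = greedy_select(areas, budget=2.0, radius=1.0)
```

With a budget of two surveys, the sketch picks the central area "B" first (it covers A, B, and C) and then the isolated area "D", so all four areas are covered by surveying only half of them.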
The proposed CPH is a novel solution to a public health data collection challenge, enabled by data science and artificial intelligence. It opens the door to a disruptive population health monitoring paradigm with potentially significant cost reductions for public health authorities. By working closely with partners from the public health sector, including NHS England and Public Health at Warwickshire County Council, we will evaluate the feasibility of this approach on multiple public health datasets together with relevant demographic/geographic statistics for the same regions.
Publications
Chang Q
(2023)
Deep Compressed Sensing based Data Imputation for Urban Environmental Monitoring
in ACM Transactions on Sensor Networks
Chen D
(2021)
Enabling Cost-Effective Population Health Monitoring By Exploiting Spatiotemporal Correlation: An Empirical Study
in ACM Transactions on Computing for Healthcare
Chen L
(2022)
Human-in-the-loop machine learning with applications for population health
in CCF Transactions on Pervasive Computing and Interaction
Chen L
(2023)
Quality-Guaranteed and Cost-Effective Population Health Profiling: A Deep Active Learning Approach
in ACM Transactions on Computing for Healthcare
Feng Y
(2023)
Spatial-Attention and Demographic-Augmented Generative Adversarial Imputation Network for Population Health Data Reconstruction
in IEEE Transactions on Big Data
Feng Y
(2023)
Towards Sustainable Compressive Population Health: A GAN-based Year-By-Year Imputation Method
in ACM Transactions on Computing for Healthcare
Wang J
(2021)
Crowd-Machine Hybrid Urban Sensing and Computing
in Computer
Wang J
(2023)
Mobile Crowdsourcing - From Theory to Practice
Wang J
(2024)
Toward Population Health Intelligence: When Artificial Intelligence Meets Population Health Research
in Computer
| Description | The proof of concept has been shown to be effective in preliminary studies on a decade of data for diseases such as obesity, diabetes, and hypertension across 500+ areas (wards) in London, indicating the potential to reduce the profiling cost by an average of 78% while ensuring the required inference accuracy. |
| Exploitation Route | As life expectancies increase, the UK faces the societal and economic challenges of needing more resources in public health surveillance with less revenue available. The capability of health data reconstruction will help understand the health inequalities and assess whether the strategies in different areas are making a real difference, particularly for regions that previously lacked public health data around disease prevalence. With a better understanding and more precise assessment, this research will further benefit administrators, epidemiologists, policymakers, and urban planners, in terms of co-developing evidence-based health intervention strategies at the right time, for the right population. An additional benefit of this project is that given the NCD prevalence data of a set of profiled areas, CPH can reconstruct the data for non-profiled areas. Given a fixed budget, it achieves higher spatial coverage of health monitoring than other existing methods. At the same time, given targeted coverage, it reduces the data collection workload and thus produces financial savings for governments and taxpayers. |
| Sectors | Digital/Communication/Information Technologies (including Software), Environment, Healthcare |
| Description | Societal & Economic Impact: As a nation we are living longer than ever before, and the UK faces the challenge of needing to spend more resources on public health surveillance with less revenue available. The primary benefit of this project is a significant reduction in population health monitoring costs for public health authorities while ensuring a certain degree of reliability. Academic & International Impact: Beyond a cost-effective solution to a pragmatic public health challenge (NCD prevalence profiling), this project also explores how to effectively integrate both intra- and inter-disease correlations in health data science. This may have wider implications, such as predicting more relevant health indicators or facilitating innovative health interventions. We will disseminate our research outcomes through publications in highly ranked conferences and journals (e.g., ICDM, AAAI, Ubicomp, Lancet Public Health), and develop prototypes that we will make available to both the research community and public health administrators. We will also build collaborations with world-leading research groups through academic visits and multiple workshops. In addition, the outcomes have the potential to be adopted worldwide with appropriate customizations, strengthening the UK's international leadership in health data science research. Knowledge Generation: As a by-product, some of the extracted data correlations may not have been studied in depth by epidemiologists before; these can be regarded as new knowledge and may have future implications for health prediction, intervention, and evidence-based policy making. |
| Sector | Digital/Communication/Information Technologies (including Software),Environment,Healthcare |
| Description | Collaboration with University of Padova, Italy |
| Organisation | University of Padova |
| Country | Italy |
| Sector | Academic/University |
| PI Contribution | Dr Jiangtao Wang recently concluded a productive one-week visit to Italy, engaging in fruitful discussions with Professor Bruno Arpino and his team at the University of Padova. During the visit, Jiangtao delivered an insightful talk on the use of AI in population research, highlighting the transformative potential of AI technologies in analyzing demographic trends. The discussions extended beyond the lecture hall, with both sides exploring future collaborations, particularly on EU bids. |
| Collaborator Contribution | The partners supported my application for AI collaboration funding from the Alan Turing Institute (although it was not successful). |
| Impact | Submitted a grant application to the Alan Turing Institute |
| Start Year | 2023 |
| Description | ArtInHCI 2023 (International Conference on Artificial Intelligence and Human-Computer Interaction) |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Dr Jiangtao Wang took the stage at ArtInHCI 2023 (International Conference on Artificial Intelligence and Human-Computer Interaction) on 28th Oct, delivering a keynote speech on "Epidemiologist-AI Synergistic Intelligence for Population Health." In this talk, Jiangtao shared his insights on how artificial intelligence and human collaboration can effectively address challenges in population health, emphasizing the transformative potential of this approach. |
| Year(s) Of Engagement Activity | 2023 |
| URL | https://www.artinhci.com/2023/ |
| Description | Invited talk at Intelligent Health AI 2023 |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Industry/Business |
| Results and Impact | Jiangtao Wang recently showcased his expertise at the Intelligent Health AI summit in Basel on September 13-14, 2023. Wang had the honor of delivering an invited talk titled "Health AI with Constrained Data Collection: Technologies and Applications." During his presentation, Wang addressed the challenges associated with implementing health AI in scenarios where data collection is constrained. He highlighted innovative technologies that offer insights and applications even when faced with limited datasets, a significant consideration in real-world healthcare settings where data privacy and scarcity can be hurdles. The talk resonated with professionals, researchers, and industry leaders present at the summit, sparking valuable discussions on the practical applications and ethical considerations of health AI in constrained data environments. |
| Year(s) Of Engagement Activity | 2023 |
