Intelligent Healthcare Systems for Large-scale Populations

Lead Research Organisation: Newcastle University
Department Name: Sch of Computing

Abstract

This project is dedicated to developing and applying core AI technologies for general health data science. So far, I have developed three workable AI models for detecting infant stroke on a small dataset of accelerometer data collected by the Institute of Neuroscience, Newcastle University. I am an expert in deep learning and experienced in handling large-scale, complex multi-modal data. My expertise in Zero-shot Learning focuses on addressing model interpretation and data insufficiency problems, and I have additional strengths in programming and software engineering. The project can be summarised in two main stages. The first stage focuses on proof-of-concept studies on two historical datasets, UK Biobank and NE 85+. UK Biobank has provided baseline measurements (such as eye measures and saliva samples). In addition to the baseline assessment, 100,000 UK Biobank participants have worn a 24-hour activity monitor for a week, 20,000 of whom have undertaken repeated measures. A programme of online questionnaires is being rolled out (diet, cognitive function, work history and digestive health), and UK Biobank has embarked on a major study to scan (image) 100,000 participants (brain, heart, abdomen, bones & carotid artery). UK Biobank is linking to a wide range of electronic health records (cancer, death, hospital episodes, general practice) and is developing algorithms to accurately identify diseases and their subsets. Blood biochemistry is being analysed (such as hormones & cholesterol). Genotyping has been undertaken on all 500,000 participants and these data are being used in health research. In NE 85+, a total of 484 participants aged 87-89 years recruited to the study completed a purpose-designed physical activity questionnaire (PAQ), which categorised participants as mildly active, moderately active or very active.
Of these, 337 participants wore a triaxial accelerometer on the right wrist over a 5-7-day period to obtain objective measures. Data from the subjective and objective measurement methods were then compared.
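To illustrate how an objective activity measure can be derived from triaxial accelerometer recordings, the sketch below computes a mean vector magnitude from toy samples and bins it into the PAQ-style categories. The sample values and thresholds are purely hypothetical, not those used in NE 85+.

```python
import math

# Hypothetical triaxial samples (x, y, z in g) for one participant.
samples = [(0.1, -0.98, 0.05), (0.4, -0.80, 0.30), (0.9, -0.20, 0.35)]

def mean_vector_magnitude(samples):
    """Average Euclidean magnitude of the acceleration vector."""
    return sum(math.sqrt(x*x + y*y + z*z) for x, y, z in samples) / len(samples)

def activity_category(vm, thresholds=(0.9, 1.1)):
    """Map a mean magnitude to PAQ-style categories (thresholds are illustrative)."""
    if vm < thresholds[0]:
        return "mildly active"
    if vm < thresholds[1]:
        return "moderately active"
    return "very active"

print(activity_category(mean_vector_magnitude(samples)))
```

In practice, cut-points for wrist-worn accelerometers would be calibrated against the cohort and device; the point here is only the subjective-vs-objective comparison pipeline.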

The first stage of the project, utilising the NE 85+ data, aims to integrate these complex data, e.g. MRI, accelerometer, and electronic health records. Development will follow a simple-to-complex logic. Initially, only one disease and one factor will be considered; subsequently, multiple factors will be considered simultaneously. The model will then be progressively upgraded to take into account different health data sources and the correlations between different diseases. At this stage, we focus on disease diagnosis and alerting, i.e. predicting disease risks from observed factors. After stable performance has been achieved, the model will focus on rationale studies. For example, which lifestyle or other factors result in a high risk of heart disease? Which attributes can we change to reduce such a risk? Besides these rationale studies and healthcare feedback, visualisation techniques can provide more qualitative results that help medical experts discover new knowledge.
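The "one disease, one factor" starting point amounts to a single-variable risk model. A minimal sketch, fitting a one-variable logistic regression by plain gradient descent, is shown below; the factor name, toy data and learning rate are all illustrative, not drawn from either dataset.

```python
import math

# Illustrative toy data: a single lifestyle factor (e.g. a daily activity
# score) and a binary disease label. Lower activity -> higher observed risk.
factor = [0.2, 0.4, 0.5, 1.2, 1.5, 1.9]
disease = [1, 1, 1, 0, 0, 0]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# One-variable logistic regression fitted by gradient descent.
w, b, lr = 0.0, 0.0, 0.5
for _ in range(2000):
    gw = gb = 0.0
    for x, y in zip(factor, disease):
        err = sigmoid(w * x + b) - y   # prediction error drives the update
        gw += err * x
        gb += err
    w -= lr * gw / len(factor)
    b -= lr * gb / len(factor)

risk = sigmoid(w * 0.3 + b)   # predicted risk for a new, low-activity participant
```

Extending this to multiple factors is a matter of replacing the scalar weight with a vector; the correlations between diseases would then motivate the multi-task upgrades described above.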

During the second stage, stable AI models can be packaged into apps, with the objective of encouraging more participants to engage in the study. Through smartphones or wearable sensors, participants can access direct healthcare feedback from a cloud-based AI server. In turn, the collected data will be used to upgrade the model, validate previous studies, and support large-scale cohort clinical studies. More details can be found in the technical summary.

Technical Summary

The first stage focuses on proof-of-concept studies. After getting access to UK Biobank and NE 85+, the initial step is data preprocessing, i.e. integration of complex data such as MRI, accelerometer, and electronic health records. My previous AI model for infant stroke detection can be employed for preliminary experiments, e.g. on associations between human activities and specific diseases. Following this, utilising a multi-modal learning framework, a comprehensive model will be proposed to study the correlations between different health data sources: for example, which MRI patterns correspond to a disease, and how these relate to accelerometer data on human activity. Furthermore, attribute learning techniques can provide health feedback, e.g. how adopting a habit can significantly reduce the risk of a disease. Finally, the model will be upgraded into an inference model that links predictions to attributes in a manner that is intuitive to understand. To further improve interpretability, high-level visualisation techniques, e.g. t-SNE and confusion matrices, will provide further guidance on the rationales behind the AI model.
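Of the visualisation tools mentioned above, the confusion matrix is the simplest to compute, and a minimal sketch is shown below (t-SNE would require a dedicated library). The labels and predictions are hypothetical stand-ins for a disease-classification model's output.

```python
from collections import Counter

# Hypothetical predictions from a two-class disease-screening model.
true_labels = ["stroke", "healthy", "stroke", "healthy", "stroke"]
pred_labels = ["stroke", "healthy", "healthy", "healthy", "stroke"]

def confusion_matrix(true_labels, pred_labels, classes):
    """Rows index the true class, columns the predicted class."""
    counts = Counter(zip(true_labels, pred_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]

classes = ["healthy", "stroke"]
print(confusion_matrix(true_labels, pred_labels, classes))
```

Reading across a row shows where a class's cases are being misdirected, which is exactly the kind of qualitative rationale a medical expert can inspect.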

During the second stage, proof-of-concept studies will continue to the end of the project. Meanwhile, I plan to implement stable models in apps and smart sensors to encourage more participants to take part in data collection. For example, stable models can be packaged into apps that use mobile phone-embedded accelerometers as an approximate measurement for data collection and large-scale cohort clinical studies. The expected outcomes of the project are a series of high-quality publications in both the AI and health data science fields. In addition, further grant proposals will be submitted for the continued study of large-scale health data collection using the delivered system.

Publications

Angelini F (2020) 2D Pose-Based Real-Time Human Action Recognition With Occlusion-Handling in IEEE Transactions on Multimedia

Bond-Taylor S (2022) Deep Generative Modelling: A Comparative Review of VAEs, GANs, Normalizing Flows, Energy-Based and Autoregressive Models in IEEE Transactions on Pattern Analysis and Machine Intelligence

Cai Z (2018) Adaptive RGB Image Recognition by Visual-Depth Embedding in IEEE Transactions on Image Processing

Cai Z (2019) Classification complexity assessment for hyper-parameter optimization in Pattern Recognition Letters

Gao R (2023) Visual-Semantic Aligned Bidirectional Network for Zero-Shot Learning in IEEE Transactions on Multimedia

Gao Y (2019) Towards Reliable, Automated General Movement Assessment for Perinatal Stroke Screening in Infants Using Wearable Accelerometers in Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies

Guan C (2019) Apparel-based deep learning system design for apparel style recommendation in International Journal of Clothing Science and Technology

 
Description supervision
Geographic Reach Local/Municipal/Regional 
Policy Influence Type Influenced training of practitioners or researchers
 
Description Semantic Attribute Representation and Feature Synthesis in Zero-shot Learning
Amount ¥63,000 (CNY)
Funding ID NSFC-61872187 
Organisation National Natural Science Foundation of China 
Sector Public
Country China
Start 01/2019 
End 12/2022
 
Title 2d pose-based real-time human action recognition with occlusion-handling 
Description Human Action Recognition (HAR) for CCTV-oriented applications is still a challenging problem. Implementing HAR in real-world scenarios is difficult because of the gap between deep learning data requirements and what CCTV-based frameworks can offer in terms of data recording equipment. We propose to reduce this gap by exploiting human poses provided by OpenPose, which has already been proven to be an effective detector in CCTV-like recordings for tracking applications. In this work, we therefore first propose ActionXPose: a novel 2D pose-based approach for pose-level HAR. ActionXPose extracts low- and high-level features from body poses, which are provided to a Long Short-Term Memory neural network and a 1D Convolutional Neural Network for classification. We also provide a new dataset, named ISLD, for realistic pose-level HAR in a CCTV-like environment, recorded in the Intelligent Sensing Lab. ActionXPose is extensively tested on ISLD under multiple experimental settings, e.g. dataset augmentation and cross-dataset settings, as well as on revised versions of other existing datasets for HAR. ActionXPose achieves state-of-the-art accuracy, very high robustness to occlusions and missing data, and promising results for practical implementation in real-world applications.
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? Yes  
Impact We propose the first human action recognition tool based on pose estimation. The model is computationally efficient and robust to occlusion. A new dataset, ISLD, is published for future study. The proposed framework can help understand the activities of a large-scale population via public CCTV cameras.
URL https://ieeexplore.ieee.org/abstract/document/8853267
 
Title From Zero-shot Learning to Supervised Learning 
Description Existing machine learning approaches rely heavily on supervised training, i.e. images need to be extensively labelled by human annotators. Robust object recognition systems usually rely on powerful feature extraction from a large number of real images. However, in many realistic applications, collecting sufficient images for ever-growing new classes is unattainable. In this paper, we propose a new Zero-shot Learning (ZSL) framework that can synthesise visual features for unseen classes without acquiring real images. Using the proposed Unseen Visual Data Synthesis (UVDS) algorithm, semantic attributes are effectively utilised as an intermediate clue to synthesise unseen visual features at the training stage. Thereafter, ZSL recognition is converted into a conventional supervised problem, i.e. the synthesised visual features can be straightforwardly fed to typical classifiers such as SVM.
Type Of Material Improvements to research infrastructure 
Year Produced 2018 
Provided To Others? Yes  
Impact The conference and journal publications have received 60 citations. On four benchmark datasets, we demonstrate the benefit of using synthesised unseen data. Extensive experimental results show that our proposed approach significantly improves on the state-of-the-art results.
URL https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8066319
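The record above describes synthesising unseen-class features from semantic attributes and then treating recognition as a conventional supervised problem. A toy sketch of that pipeline follows; the 2-D attributes and features are hypothetical, and a nearest-prototype classifier stands in for the SVM used in the actual paper.

```python
# Toy zero-shot pipeline: fit a linear map from class attributes to visual
# features on seen classes, synthesise a feature prototype for an unseen
# class from its attributes alone, then classify by nearest prototype.
seen_attrs = [[1.0, 0.0], [0.0, 1.0]]   # attribute vectors, seen classes
seen_feats = [[2.0, 0.5], [0.5, 2.0]]   # mean visual features, seen classes
unseen_attr = [0.5, 0.5]                # only attributes exist for the unseen class

# Fit W (attributes -> features) by stochastic gradient descent.
W = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(500):
    for a, f in zip(seen_attrs, seen_feats):
        pred = [sum(a[i] * W[i][j] for i in range(2)) for j in range(2)]
        for i in range(2):
            for j in range(2):
                W[i][j] -= 0.1 * (pred[j] - f[j]) * a[i]

def synthesise(attr):
    """Map an attribute vector into visual feature space."""
    return [sum(attr[i] * W[i][j] for i in range(2)) for j in range(2)]

prototypes = {"seen_A": seen_feats[0],
              "seen_B": seen_feats[1],
              "unseen": synthesise(unseen_attr)}

def classify(feat):
    """Nearest-prototype classification in feature space."""
    return min(prototypes,
               key=lambda c: sum((feat[k] - prototypes[c][k]) ** 2
                                 for k in range(2)))
```

Because the unseen prototype is synthesised rather than measured, a test feature near it is recognised without any real training images of that class, which is the essence of the UVDS idea.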
 
Title Fuzzy Interpolative Reasoning 
Description Collecting medical records for training AI models is time-consuming and expensive. For example, clinicians are required to label millions of positive and negative cancer images so that the machine can learn to predict under a supervised learning scheme. Fuzzy interpolative reasoning can utilise historical data and infer the associations of human annotations so as to speed up the labelling procedure.
Type Of Material Improvements to research infrastructure 
Year Produced 2018 
Provided To Others? Yes  
Impact The Fuzzy Interpolative Reasoning (FIR) algorithm can discover inter-class associations from lightweight Simile annotations based on visual similarities between classes. The inferred representation better bridges the visual-semantic gap and yields state-of-the-art experimental results.
URL http://bmvc2018.org/contents/papers/0303.pdf
 
Title Occluded Facial Recognition System 
Description Facial recognition is useful for analysing large-scale human behaviour and mental health-related problems. Existing approaches can suffer from occlusion problems. Also, clinical datasets can be very small, which existing deep learning models are not capable of dealing with.
Type Of Material Improvements to research infrastructure 
Year Produced 2018 
Provided To Others? Yes  
Impact My method is evaluated on two face recognition benchmarks. Experimental results suggest that our method leads to a remarkable margin of performance gain over the benchmark techniques. 
URL https://www.sciencedirect.com/science/article/pii/S0020025516317637
 
Title Zero-shot Video Analysis 
Description Current medical records include many video-based data. However, these big data are not ready to be utilised because of the requirement for manual annotations. Data privacy regulation, e.g. GDPR, also limits access to these critical data. My new technology aims to use generic training data, e.g. movies and sports, to recognise new video actions never seen before.
Type Of Material Improvements to research infrastructure 
Year Produced 2018 
Provided To Others? Yes  
Impact Predicted Universal Representation (UR) exemplars can be improved by a simple semantic adaptation, and an unseen action can then be directly recognised using UR at test time. Without further training, extensive experiments show significant improvements over the UCF101 and HMDB51 benchmarks. This fundamentally addresses the 'data hunger' problem in applying deep learning to healthcare without sufficient human annotations.
URL http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhu_Towards_Universal_Representation_CVPR_2018...
 
Title Intelligent Sensing Lab Dataset (ISLD) 
Description We provide a new dataset, named ISLD, for realistic pose-level HAR in a CCTV-like environment, recorded in the Intelligent Sensing Lab. ActionXPose is extensively tested on ISLD under multiple experimental settings, e.g. dataset augmentation and cross-dataset settings, as well as on revised versions of other existing datasets for HAR. ActionXPose achieves state-of-the-art accuracy, very high robustness to occlusions and missing data, and promising results for practical implementation in real-world applications.
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? Yes  
Impact Human Action Recognition (HAR) for CCTV-oriented applications is still a challenging problem. Implementing HAR in real-world scenarios is difficult because of the gap between deep learning data requirements and what CCTV-based frameworks can offer in terms of data recording equipment. We propose to reduce this gap by exploiting human poses provided by OpenPose, which has already been proven to be an effective detector in CCTV-like recordings for tracking applications. In this work, we therefore first propose ActionXPose: a novel 2D pose-based approach for pose-level HAR. ActionXPose extracts low- and high-level features from body poses, which are provided to a Long Short-Term Memory neural network and a 1D Convolutional Neural Network for classification.
URL https://ieeexplore.ieee.org/abstract/document/8853267
 
Title Perinatal Stroke Dataset 
Description Perinatal stroke (PS) is a serious condition that, if undetected and thus untreated, often leads to life-long disability, in particular Cerebral Palsy (CP). In clinical settings, Prechtl's General Movement Assessment (GMA) can be used to classify infant movements using a Gestalt approach, identifying infants at high risk of developing PS. Training and maintenance of assessment skills are essential and expensive for the correct use of GMA, yet many practitioners lack these skills, preventing larger-scale screening and leading to significant risks of missing opportunities for early detection and intervention for affected infants. We present an automated approach to GMA, based on body-worn accelerometers and a novel sensor data analysis method--Discriminative Pattern Discovery (DPD)--that is designed to cope with scenarios where only coarse annotations of data are available for model training. We demonstrate the effectiveness of our approach in a study with 34 newborns (21 typically developing infants and 13 PS infants with abnormal movements). Our method is able to correctly recognise the trials with abnormal movements with at least the accuracy that is required by newly trained human annotators (75%), which is encouraging towards our ultimate goal of an automated PS screening system that can be used population-wide. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? Yes  
Impact The first Automatic Perinatal Stroke Dataset for General Movement Analysis using wearable sensors. 
URL https://dl.acm.org/doi/10.1145/3314399
 
Description Biometric Authentication and Visual-Semantic Interaction System 
Organisation Chinese Academy of Sciences
Country China 
Sector Public 
PI Contribution I visited the National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences during 31/07/2018 - 30/08/2019. I collaborated with their researchers and students on machine learning and biometric technologies in order to apply these advanced algorithms to human-computer interaction systems. I also participated in the organisation of the ICPR (International Conference on Pattern Recognition) conference hosted by NLPR.
Collaborator Contribution My closest partner is Dr Yan Huang. He implemented a visual-language interaction system but found that its performance was severely affected by rare words. I proposed a zero-shot learning approach, and he implemented it in the system, which massively improved the results.
Impact We have co-authored and published a paper at AAAI, a top venue: Few-Shot Image and Sentence Matching via Gated Visual-Semantic Embedding. The project enables the AI to interact using free natural language containing unusual words, which previous methods found highly challenging to recognise.
Start Year 2018
 
Description Semantic Attributes and Feature Synthesis approaches in Zero-shot Learning 
Organisation Nanjing University of Science and Technology
Country China 
Sector Academic/University 
PI Contribution My fellowship aims to address large-scale healthcare problems with AI approaches. As stated in my proposal, the first year aims to examine pilot systems that can robustly handle urgent data science challenges. Zero-shot learning is an emerging technique that can deal with problems where data are noisy, incorrectly collected or even missing. My role in this collaboration is summarising theoretical models, deriving equations for optimisation, and supervising the progress of each project.
Collaborator Contribution My partner Prof Zhang is the PI of this project. He leads the machine learning team based at the Computer Science and Engineering School, Nanjing University of Science and Technology. He and his team implement my theoretical models, apply them to computer vision datasets, and report progress to me.
Impact We have co-authored 7 papers published in top journals and conferences in AI: 1) Zero-shot learning and hashing with binary visual similes; 2) Attribute relaxation from class level to instance level for zero-shot learning; 3) Adversarial unseen visual feature synthesis for zero-shot learning; 4) Triple verification network for generalized zero-shot learning; 5) Zero-shot hashing with orthogonal projection for image retrieval; 6) Dual-verification network for zero-shot learning; 7) Unsupervised deep hashing with pseudo labels for scalable image retrieval. These theoretical studies include hashing (for scalable data representation), image retrieval (for data management), attribute learning (for reasoning and interpretable AI), and data synthesis (to address data hunger problems). These applications have made an impact on large audiences of AI and machine learning engineers. Our recent medical application of these algorithms is published in: Towards Reliable, Automated General Movement Assessment for Perinatal Stroke Screening in Infants Using Wearable Accelerometers, IMWUT 2019.
Start Year 2018
 
Description The Inception Institute of Artificial Intelligence 
Organisation Inception Institute of Artificial Intelligence
Country United Arab Emirates 
Sector Private 
PI Contribution Collaboration on machine learning and AI models. Research outcomes are applied to clinical research and AI education at Mohamed bin Zayed University of Artificial Intelligence.
Collaborator Contribution Model and theory derivation and ablation study.
Impact 1) Modality independent adversarial network for generalized zero-shot image classification; 2) A probabilistic zero-shot learning method via latent nonnegative prototype synthesis of unseen classes
Start Year 2019
 
Title Infant Stroke Detection using wearable sensors 
Description Perinatal stroke (PS) is a serious condition that, if undetected and thus untreated, often leads to life-long disability, in particular Cerebral Palsy (CP). In clinical settings, Prechtl's General Movement Assessment (GMA) can be used to classify infant movements using a Gestalt approach, identifying infants at high risk of developing PS. Training and maintenance of assessment skills are essential and expensive for the correct use of GMA, yet many practitioners lack these skills, preventing larger-scale screening and leading to significant risks of missing opportunities for early detection and intervention for affected infants. We present an automated approach to GMA, based on body-worn accelerometers and a novel sensor data analysis method-Discriminative Pattern Discovery (DPD)-that is designed to cope with scenarios where only coarse annotations of data are available for model training. 
Type Of Technology New/Improved Technique/Technology 
Year Produced 2019 
Impact We demonstrate the effectiveness of our approach in a study with 34 newborns (21 typically developing infants and 13 PS infants with abnormal movements). Our method is able to correctly recognise the trials with abnormal movements with at least the accuracy that is required by newly trained human annotators (75%), which is encouraging towards our ultimate goal of an automated PS screening system that can be used population-wide. 
URL https://arxiv.org/pdf/1902.08068.pdf
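The key idea in the record above is coping with coarse, trial-level annotations. A much simpler sketch in the same multi-instance spirit is shown below; DPD itself is considerably more sophisticated, and every value here is hypothetical.

```python
# Each trial is a list of per-window movement scores, labelled only as a
# whole: 0 = typical movements, 1 = abnormal movements.
trials = [
    ([0.2, 0.3, 0.1], 0),
    ([0.1, 0.4, 0.2], 0),
    ([0.3, 0.9, 0.2], 1),   # one abnormal burst inside a mostly typical trial
    ([0.8, 0.2, 0.7], 1),
]

def trial_score(windows):
    """Represent a trial by its most extreme window."""
    return max(windows)

# Exhaustive 1D search for the threshold on the max-window score that best
# reproduces the trial-level labels.
best_t, best_acc = None, -1.0
for t in sorted(trial_score(w) for w, _ in trials):
    acc = sum((trial_score(w) >= t) == bool(y)
              for w, y in trials) / len(trials)
    if acc > best_acc:
        best_t, best_acc = t, acc
```

The point of the sketch is that no window-level labels are ever needed: the discriminative pattern (here, a single threshold on the most extreme window) is recovered from trial-level labels alone.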
 
Description Invited talk in machine learning teaching 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact The talk was mainly for MSc students. After the talk, students learned how to apply machine learning techniques to healthcare problems from our ongoing project demos. Many of them expressed interest in participating in our research, and I gained 2 co-supervised MSc projects from them: one on infant stroke detection using wearable sensors, the other on human activity recognition using accelerometers.
Year(s) Of Engagement Activity 2018
 
Description Tuspark Public Education 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Industry/Business
Results and Impact Tuspark is a local accelerator and incubator company in Newcastle. Many industrial companies are based at their office, including bioinformatics, public health and medical companies, and other IT companies. The audience is mainly software engineers who are eager to learn new AI technologies for their daily applications. We organise a series of machine learning education sessions teaching both essential AI skills and advanced techniques from our recent publications and research. Local undergraduate and postgraduate students are also welcome.
Year(s) Of Engagement Activity 2018
URL https://www.tuspark.co.uk/newcastle