Lung function trajectories from birth to school age in African children, and their early life determinants

Lead Research Organisation: Imperial College London
Department Name: National Heart and Lung Institute

Abstract

Lung diseases are a major cause of ill health and premature death globally, with a particularly large burden in Africa. Asthma and chronic obstructive pulmonary disease (COPD) are very common, and COPD is the third biggest killer in Africa. African patients develop more severe asthma and COPD, and at a younger age, compared to the rest of the world.

A low level of lung function in young adulthood is an important risk factor for the development of COPD. Furthermore, low lung function increases the likelihood of early death from all causes as early as the third decade of life. Studies from high-income countries have shown that lung function tracks from school age to old age, and that lung function in early childhood is an important determinant of COPD in adulthood. Childhood asthma and lower respiratory tract infections in early life reduce lung function through childhood. The burden and the type of respiratory infections and asthma, as well as environmental factors which adversely affect development of lung function (such as cigarette smoke, biomass exposure, allergens or psychosocial stressors), are markedly different in Africa compared to high-income countries. However, despite the high frequency and severity of childhood asthma, high incidence of respiratory infections, and many harmful environmental exposures, to date there are no data on early-life factors associated with poor lung function and its trajectory in Africa.

We have shown that there is scope to intervene in early childhood to improve lung function, and to reduce the long-term consequences of low lung function in childhood. To develop interventions that reduce the risk of low lung function in the population in need in Africa, we have to identify childhood lung function trajectories and discover their early-life environmental determinants, which are specific to this part of the world. This information is crucial for developing novel preventative strategies that will reduce COPD and the other adverse consequences of diminished lung function.

Our overall aim is to investigate lung function trajectories in African children from birth to 8 years of age, and to identify early-life risk factors associated with a low lung function trajectory. We will focus on early-life exposures and respiratory outcomes during childhood, as the trajectory of long-term lung health is established in early life. To achieve this, we will build on the unique South African birth cohort of 1000 mother-child pairs, with detailed measures of infectious diseases and non-infectious exposures through pregnancy and childhood, an extensive biobank of samples, and longitudinal measurements of lung function and respiratory diseases from birth to age 5 years. In the course of this project, we will measure lung function and the progression or severity of clinical symptoms through to age 8 years, with ongoing collection of environmental exposures. This will extend the number of time points at which clinical symptoms and lung function have been measured in an identical way, and lay foundations for longitudinal analyses, thereby creating a unique resource unparalleled anywhere in low-income and middle-income countries.

We will bring together leading UK and South African experts to investigate lung function trajectories in South African children from birth through age 8 years, and to determine the early-life exposures which lead to low lung function. We will build African research capacity through collaborations and training between South African and UK expert groups. Our overall vision is to inform the development of intervention strategies to reduce the risk of low lung function trajectories during the growth phase, and prevent multi-organ morbidity and premature death in African populations, an area of critical need.

Technical Summary

Early-life factors are crucial for lung function growth, and subsequent COPD pathogenesis. Childhood asthma and early-life lower respiratory tract infections (LRTIs) reduce lung function through childhood, leading to diminished lung function at the physiological plateau in the third decade of life. The burden and pattern of LRTIs and asthma in Africa, as well as environmental factors which affect developmental trajectories of lung function, are markedly different from those in high-income countries. However, despite the high prevalence and severity of childhood asthma, high incidence of LRTIs, and many harmful environmental exposures, to date there are no data on early-life factors associated with diminished lung function in Africa.

We propose to extend our work in the Drakenstein Child Health Study (DCHS), a birth cohort of 1000 South African children who have been followed from the antenatal period through early childhood, to address early-life determinants of lung function in the African population. A unique aspect of the study is that lung function measurements have been taken longitudinally from age 6 weeks. We collected detailed information on LRTI (including aetiology) and environmental exposures. We will bring these data together with the expertise in data mapping and analytical approaches in the UK STELAR consortium. We will extend the follow-up in DCHS through age 8 years, and carry out detailed measurement of lung function. This will extend the number of time points at which clinical outcomes and lung function have been measured in an identical way, and lay foundations for longitudinal analyses, thereby creating a unique resource unparalleled anywhere in low- and middle-income countries (LMICs). We will identify trajectories of lung function in African children from birth through age 8 years, and determine risk factors for a low lung function trajectory, to inform the development of intervention strategies to reduce the risk of low lung function and prevent multi-organ morbidity and premature death in Africa.

Planned Impact

Who might benefit from this research?

The proposed project will multiply the effects of previous investments, thereby having an overall scientific impact much greater than its level of requested funding. We will provide an infrastructure for large-scale interdisciplinary collaborations to conduct cutting-edge science, using existing and newly collected data resources, to produce health benefit for the African population and beyond. Respiratory diseases pose a particularly large burden in Africa. African patients develop more severe asthma and COPD, and at a younger age, compared to the rest of the world. The African population therefore represents an invaluable resource to identify early-life factors associated with poor respiratory outcomes. A recent Editorial in the Lancet Global Health emphasised that "health-care policy makers in Africa need to take notice of the silently growing epidemic of COPD and start taking measures to both prevent and treat COPD effectively, before it gets out of hand" (PMID: 25539971). The results of this project are intended to lead to the development of methods for prevention of chronic respiratory diseases and premature death, which could be generalisable to other populations.

Our results will identify risk factors and mechanisms that influence the onset and progression of diminished lung function trajectories, and subsequent respiratory disease, thereby identifying pathways that may provide information for targeted interventions to reduce the impact of childhood asthma and adult COPD in African populations. The discovery of risk factors for diminished lung function in Africa will form the basis for identification of novel interventions, as well as biomarkers which are predictive of health or disease, and is intended to allow life-style choices to be made to prevent long-term adverse health outcomes. This will be of great value to patients, society, health-care professionals and industry.

How might they benefit from this research?

The ability to access shared analysis resources in South Africa and the UK will be of great value for training and development of researchers, and the ability to access example analyses and expert advice will reduce their learning curve. Enabling the networking of datasets, expertise and methods for data preparation and analysis can help drive greater value from existing investments. Building South African capacity in statistical methodologies applied to longitudinal measures, and in novel analytical methods including latent class analysis and machine learning approaches, will ensure knowledge and skills transfer to clinicians and researchers in South Africa.

STELAR investigators will have access to a unique collection of well characterised birth cohorts with fundamentally different environmental exposures, and will capitalise on the heterogeneity of the collected data to gain insights into the pathophysiology of respiratory diseases. To further replicate our findings, we have established a collaboration with the US CREW consortium of 12 US cohorts, funded through the NIH ECHO Paediatric Cohorts initiative. CREW uses STELAR eLab to harmonise and integrate their data, which we have provided using the same open-source model which we propose for DCHS. This knowledge management platform will be further developed in a collaborative manner across all study sites on different continents, providing a global platform to investigate respiratory diseases.

Our findings will represent potentially valuable intellectual property, which we will seek to commercialise in collaboration with companies invested in diagnostics and/or therapeutics. Participating universities have mechanisms and structures in place for exploring industrial applications. Partnerships such as the one described in this application help to make the UK an attractive location to retain research activities, and help expose academics to the process of translating science into products.
 
Description Substantial progress has been made in developing models of wheezing phenotypes in children from birth through 5 years of age; lung function has been used to validate these. The focus has been on developing latent class models of wheezing phenotypes a priori. In this coming year, models for lung function trajectories will be further developed including for different, comprehensive measurements of lung function, with completion of lung function measurements that had to be postponed due to COVID.

Comprehensive lung function measurements have been done longitudinally from the age of 6 weeks, and annually thereafter, with ongoing testing from 6 to 8 years as part of this grant. Comprehensive lung function measures in unsedated children include measurements of tidal volumes and expiratory flow ratios, lung clearance index, forced oscillation technique (FOT), ventilation homogeneity and exhaled nitric oxide, and spirometry (in children from 5 years of age) providing a unique dataset.

A key delay has been the inability to obtain lung function measurements during COVID from March 2020 to January 2021, due to the lockdown, COVID infection control precautions and additional operational issues in the context of COVID. We have recently begun lung function testing again, with strict infection control precautions and careful booking of participants. We therefore propose an 18-month no-cost extension (NCE) to obtain the remaining 6-8 year lung function measurements, analyse these and undertake latent class modelling of lung function trajectories.

Lung function testing has been completed in most participants at 6 weeks (n=910), 1 year (n=784), 2 years (n=741), 3 years (n=768), and 4 years (n=809). Testing is ongoing at 5-8 years: currently 749 at 5 years; 413 at 6 years; 123 at 7 years; and 13 at 8 years, where testing has now begun. We have had high rates of testing completion (Table 1). We are now targeting testing for the 8 year visit and for those at 6 years who missed their 5 year visit. Catching up on the collection of these lung function data is core to completing comprehensive phenotyping longitudinally from birth through age 8 years to meet the study aims.

Excellent progress has been made in developing models of wheezing phenotypes in children from birth through 5 years of age. A doctoral student was appointed and has been working closely with the UK group to develop models, analyse these data and prepare them for publication. Four distinct wheezing phenotypes have been identified, as well as factors associated with each, providing unique data in an LMIC context. The original proposal was for the student to initially spend time in the UK with direct supervision and training, working closely with the UK STELAR consortium, but this has not been possible due to COVID restrictions. However, through regular online meetings and contact, much progress has been made.

Comprehensive analyses of early-life risk factors associated with wheezing phenotypes have been done, with ongoing longitudinal measures as outlined below:
Tobacco smoke exposure and indoor air pollution: Exposure to indoor air pollution (particulate matter, carbon monoxide and volatile organic compounds) has been measured antenatally and postnatally with devices placed in homes. Tobacco smoke exposure has been measured through maternal self-report, and by urine cotinine antenatally (in mothers) and postnatally in children (at birth, annually, and at the time of LRTI); we have continued to measure maternal smoking and child exposure annually and at the time of LRTI.
Microbiome: We have completed analysis of the nasopharyngeal microbiome in children through the first year of life using 16S rRNA analysis, as well as targeted approaches (e.g. culture of respiratory samples, multiplex PCR), on samples obtained every 2 weeks from infants through this period. These data will enable us to investigate associations between diversity, composition and changes in the early respiratory microbiome and the development of wheezing or lung function trajectories; this work is being undertaken by a second PhD student.
Growth and nutrition: We have continued to measure growth including bioimpedance measures, height, weight, mid-upper arm circumference and skin fold thickness annually and at intercurrent visits. A third PhD student is working on models of growth trajectories from birth through childhood, which will enable us to investigate the association of growth with wheezing phenotypes and lung function trajectories.
Psychosocial measurements: Measurements of maternal and child mental health (including maternal depression/anxiety, alcohol/drug use, exposure to psychosocial stressors, and childhood adversity) are ongoing, although these too were impacted by COVID. Therefore, in an NCE over the next 18 months, we plan to catch up on missed measurements, enabling us to investigate the association between exposure to psychosocial stressors, other factors and wheezing or lung function trajectories in children.
Measures of atopy: Blood samples for measurement of IgE and allergen-specific IgE have been taken from most children and stored in our biorepository. Testing of these has been delayed due to COVID; current laboratory testing capacity is extremely limited due to the huge need for COVID diagnostic testing. We therefore propose to maintain samples in our biorepository (which is well curated and monitored) until there is laboratory capacity to undertake testing. We anticipate this will be towards the end of 2021. We therefore request an NCE to enable us to also undertake this aspect of the work.

A paper entitled "Wheezing phenotypes in a South African birth cohort study and early life determinants" has been prepared and will be submitted shortly to the Lancet.

We have processed samples for genotyping on all mother and child participants using the Global Screening Array; this data will be available to investigate polygenic risk scores associated with wheezing or lung function trajectories. RNA samples have been collected annually from birth to age 8 years; we have obtained NIH funding to measure RNA expression in a sub-sample of the DCHS.

In summary, there has been good progress with high cohort retention over this period; the results on wheezing phenotypes and lung function are novel and of direct relevance to child health, especially in LMICs. However, COVID has impacted directly on the ability to do key measurements and follow-up of the cohort.
Exploitation Route Better understanding of wheezing illness in low and middle income countries.
Sectors Healthcare

 
Description The development of the lung function team is a further example of such capacity development - formerly there was no capacity for doing child lung function testing in such a context, but the systems established have paved the way for models in other LMIC settings. The collaboration with Imperial College and the UNICORN consortium has been fundamental to capacity development, in enabling the study aims to be met and the local research team to develop much needed statistical and analytical skills. Development of models for wheezing phenotypes has built on the extensive expertise and experience of the UK team, working together with the local team to undertake this work. Finally, comparison of findings with those from the 5 UK cohorts is planned, to better understand differences and determinants of lung health in children in LMICs compared to those in high-income settings, for which an NCE is also needed. This is key to developing improved strategies to strengthen child health in LMIC contexts.
First Year Of Impact 2022
Sector Healthcare
Impact Types Societal,Policy & public services

 
Description EAACI guidelines on environmental science in allergic diseases and asthma
Geographic Reach Multiple continents/international 
Policy Influence Type Participation in a guidance/advisory committee
 
Description European Academy of Allergy and Clinical Immunology (EAACI) Strategic Forum
Geographic Reach Europe 
Policy Influence Type Influenced training of practitioners or researchers
Impact The European Academy of Allergy and Clinical Immunology (EAACI) organized the first European Strategic Forum on Allergic Diseases and Asthma. The main aim was to bring together all relevant stakeholders and decision-makers in the field of allergy, asthma and clinical immunology around an open debate on contemporary challenges and potential solutions for the next decade. The Strategic Forum was an upscaling of the EAACI White Paper aiming to integrate the Academy's output with the perspective offered by EAACI's partners. This collaboration is fundamental for adapting and integrating allergy and asthma care into the context of real-world problems. The Strategic Forum on Allergic Diseases brought together all partners who have the drive and the influence to make positive change: national and international societies, patients' organizations, regulatory bodies and industry representatives. An open debate focused on drug development and biomedical engineering, on big data and information technology, and on allergic diseases and asthma in the context of environmental health. It concluded that connecting science with the transformation of care, and a joint agreement between all partners on priorities and needs, are essential to ensure better management of allergic diseases and asthma in the advent of precision medicine, together with global access to innovative and affordable diagnostics and therapeutics.
URL https://eaaci.org/about-eaaci/advocacy/#latest-statement-on-covid-19
 
Description Early Life Exposures And Development Of Non-communicable Diseases In Adolescence: The Drakenstein Child Health Study
Amount £2,128,448 (GBP)
Funding ID MR/W028352/1 
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 09/2022 
End 09/2027
 
Description Predicting asthma presence, development and persistence: The Drakenstein Child Health Study (DCHS)
Amount £70,000 (GBP)
Organisation Thermo Fisher Scientific 
Sector Private
Country United States
Start 01/2021 
End 06/2022
 
Title FAIR-ified data sets 
Description The main goal was to develop informatics solutions for the management, integration and harmonisation of heterogeneous data sources, by making them interoperable and fit for re-use on a larger scale beyond the remit of each individual study. This implicit goal became more obvious as the FAIR Data Principles became widely accepted and adopted by the research community. The approach is a two-stage process. First, sourced data are consolidated into semantically annotated FAIR datasets as the target state for data. Mapper files are created for each data source, against which the data are FAIRified by software. These FAIR datasets are interoperable and richly annotated, allowing future users to discover and re-use them for different purposes. The second step is to integrate and harmonise data across these semantically annotated FAIR datasets and load them into an integrated data model that allows cross-study data exploration and analysis. These two states of data, (1) structured and semantically annotated FAIR datasets and (2) cross-study integrated data, are stored separately in two databases but linked and interconnected as part of the overall UNICORN FAIR Data Platform. eLab's internal data model is formed of pre-defined templates that correspond to different data types generated from the STELAR birth cohorts. In total there are 57 templates covering clinical and questionnaire data. Each template consists of a number of parameters that correspond to the different data variables measured or observed about the subject for that particular data type. For each submitted data file, a metadata (descriptor) file was created by the respective study data manager to annotate all data files generated by the study. For each imported dataset, a descriptor file was created to annotate each column in the dataset to its respective template and parameter.
When data were imported into the eLab, all the data were organised into Variable Reports. A Variable Report co-locates a given variable with its metadata according to the relevant template listed in the descriptor file. Each Variable Report becomes a self-explanatory data object, containing common elements (such as age in weeks and variable descriptions) and the semantics necessary for analysis. Common templates and parameters for clinical data supported alignment of these data across the different studies. However, the templates and parameters for questionnaire data aimed to capture the data as collected, and alignment of questions with similar semantics across studies was only supported through tagging. We are building on the STELAR data by aligning the questionnaire data, representing these using common and semantically rich models as part of a FAIRification process. For clinical data (Tests Data), "Mapper Files" were manually created for each data type (e.g., spirometry, reversibility, bronchial challenges, skin allergy tests, etc.), mapping eLab templated data to their respective CDISC domain specification. For questionnaire data, we developed a different pipeline, which involved creating mapper files that mapped the original questionnaire data directly to the observation-based semantic model. There are a total of 34 datasets covering clinical and questionnaire data for Breathing Together (BT). Each template consists of a number of data fields that correspond to the different data variables measured or observed about the subject for that particular data type. The level of FAIR data maturity required for the goal of UNICORN meant we had to standardise and structure data from the whole BT study, and not just annotate the variables.
The focus therefore was not on building another Extract, Transform, Load (ETL) process that would migrate data from one database to another, but on creating a FAIRification process that would produce structurally defined and semantically annotated FAIR datasets. Consequently, such datasets are interoperable and re-usable for the longer term. As with the questionnaire data from the STELAR birth cohorts, we decided to use the observation-based semantic model to annotate patient cohort data from BT.
Type Of Material Improvements to research infrastructure 
Year Produced 2022 
Provided To Others? Yes  
Impact The mapping process from the Breathing Together (BT) data to UNICORN FAIR Datasets is implemented using the dataset metadata specifications. This is a similar process to the one adopted for STELAR birth cohort data mappings. Data exports from InForm were individually mapped with the help of the annotations provided by the Study Annotation Book, a PDF document that provides metadata descriptions for each InForm dataset and its related fields. Creating a mapper file for each dataset was a laborious manual process, in which each data field in the original data was mapped by hand to an observation-feature model as prescribed by the Biomedical-observation semantic model. These mapper files and their respective original data files are loaded into software developed by the DSI-ICL team to transform the data and create the FAIRified UNICORN datasets.
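The mapper-file step described above can be illustrated with a minimal sketch. It is not the DSI-ICL software: the mapper layout, the column names (`fev1_l`, `fvc_l`), the observation record fields and the sample values are all hypothetical, chosen only to show how a per-dataset mapper might turn each source field into one observation-feature record.

```python
import csv
import io

# Hypothetical mapper: one entry per column of the source dataset,
# naming the feature it represents and the unit it is recorded in.
MAPPER = {
    "fev1_l": {"feature": "FEV1", "unit": "L"},
    "fvc_l": {"feature": "FVC", "unit": "L"},
}


def fairify(rows, mapper, subject_col="subject_id", age_col="age_weeks"):
    """Turn wide source rows into one observation record per mapped field."""
    observations = []
    for row in rows:
        for column, target in mapper.items():
            raw = row.get(column, "")
            if raw == "":
                continue  # a missing value produces no observation
            observations.append({
                "subject": row[subject_col],
                "age_weeks": int(row[age_col]),
                "feature": target["feature"],
                "value": float(raw),
                "unit": target["unit"],
            })
    return observations


# Made-up source export, for illustration only.
SOURCE = """subject_id,age_weeks,fev1_l,fvc_l
S001,260,1.12,1.30
S002,261,,1.25
"""

rows = list(csv.DictReader(io.StringIO(SOURCE)))
observations = fairify(rows, MAPPER)
for obs in observations:
    print(obs)
```

Keeping the mapping in data rather than in code is what would let one transformation routine serve every dataset; only the mapper changes from one source to the next.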
 
Title UNICORN FAIR Data Platform 
Description The ICL-DSI UNICORN Data Repository (now the UNICORN FAIR Data Platform) was designed and developed as a full-stack web application with a server-based (back-end) application exposing an API layer that communicates with a client-based (front-end) application. The back-end is a .NET WebAPI application designed according to the multi-layered onion architecture. The front-end comprises a web application based on the Angular framework, providing end-user access to the application.
Type Of Material Improvements to research infrastructure 
Year Produced 2022 
Provided To Others? Yes  
Impact Research data management according to the FAIR (Findability, Accessibility, Interoperability, and Reusability) data principles is a data-science-driven approach to data management which aims to enable efficient and error-free data analysis from multiple sources. Since the initiation of the FAIR principles in 2016, FAIR metrics, FAIR infrastructure, and FAIR tools have been developed to aid in making data FAIR (the "FAIRification" process). Importantly, data management according to the FAIR principles is becoming an expectation of the major funding bodies and publishers. The FAIR approach to data management means that research data are well described, preserved, and enabled for long-term use and re-purposing. One of the key advantages of FAIR data is a major increase in reusability beyond the first and original purpose.
 
Title UNICORN eLab 
Description The UNICORN eLab has been established as part of this project. This involved developing a new FHIR database that is used to manage the STELAR FHIR data. This has allowed the migration of data from a proprietary system to one based on open standards that are strongly supported by an international community. The pre-existing STELAR eLab has been re-architected to offer significant improvements. 
Type Of Material Improvements to research infrastructure 
Year Produced 2022 
Provided To Others? Yes  
Impact Ease of deployment, upgrades and maintenance; extensibility; auditability; confidentiality/security; availability.
 
Title D11 ISO11179 compliant MDR web application for metadata management 
Description A sub-module in the UNICORN Data Platform to store and serve the standard dataset templates that different UNICORN datasets are mapped and transformed into. Shifting to the FAIRification of datasets meant we had to manage dataset templates as a whole, rather than individually managed Common Data Elements, which would need a Metadata Registry to store and manage them. Therefore, we implemented this feature in the Metadata Governance Module, to store the standard dataset templates and to provide a user interface that enables the UNICORN data manager to associate the various datasets imported into the platform with their respective dataset templates for validation and quality checking.
Type Of Material Data handling & control 
Year Produced 2022 
Provided To Others? Yes  
Impact Enables the UNICORN data manager to associate the various datasets imported into the platform with their respective dataset templates for validation and quality checking.
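As a rough illustration of what checking an imported dataset against its template could involve, the sketch below validates records against a list of expected fields. The `Field` class, the template contents and the sample records are assumptions made for this example; they do not reproduce the Metadata Governance Module or any ISO 11179 data structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One expected field in a (hypothetical) dataset template."""
    name: str
    dtype: type
    required: bool = True


# Hypothetical template for a spirometry dataset.
SPIROMETRY_TEMPLATE = [
    Field("subject_id", str),
    Field("age_weeks", int),
    Field("fev1_l", float, required=False),
]


def validate(records, template):
    """Return a list of problems; an empty list means the dataset conforms."""
    problems = []
    known = {field.name for field in template}
    for i, record in enumerate(records):
        for field in template:
            value = record.get(field.name)
            if value is None:
                if field.required:
                    problems.append(
                        f"row {i}: missing required field '{field.name}'")
                continue
            if not isinstance(value, field.dtype):
                problems.append(
                    f"row {i}: field '{field.name}' expected "
                    f"{field.dtype.__name__}, got {type(value).__name__}")
        # Fields present in the data but absent from the template.
        for extra in sorted(set(record) - known):
            problems.append(f"row {i}: unexpected field '{extra}'")
    return problems


# Made-up records, for illustration only.
records = [
    {"subject_id": "S001", "age_weeks": 260, "fev1_l": 1.12},
    {"subject_id": "S002", "fev1_l": "n/a", "notes": "retest"},
]
problems = validate(records, SPIROMETRY_TEMPLATE)
for problem in problems:
    print(problem)
```

Returning a list of problems, rather than raising on the first one, suits a quality-checking workflow in which a data manager reviews everything wrong with a dataset in one pass.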
 
Title Modelling evolution of eczema, wheeze and rhinitis from infancy to early adulthood 
Description We used different temporal and analytic frameworks, including longitudinal sequence mining and Latent Markov Modelling (LMM), to investigate the patterns of the natural progression of eczema, wheeze, and rhinitis from infancy to adolescence/adulthood, and to model disease transitions, using data from four UK birth cohort studies in the STELAR/UNICORN consortium. Cross-sectionally, single diseases were much more prevalent at each time point than multi-morbidity. This demonstrated for the first time the vast heterogeneity and complexity in individual-level sequences. All methods led to similar conclusions, including the observation that most children with early-life eczema did not develop wheeze and/or rhinitis, and that very few followed a sequence described as the "atopic march".
Type Of Material Data analysis technique 
Year Produced 2022 
Provided To Others? Yes  
Impact • We have shown that there is a considerable heterogeneity in the sequence of development of atopic diseases through childhood using different temporal and analytic frameworks. • We described a potentially important cluster of multimorbidity. The "atopic march" sequence likely exists within the multimorbidity cluster, but as one of many other sequences. • Only 20% of children with early-life eczema progressed to multimorbidity, i.e. most children with eczema in early life do not develop subsequent multimorbidity. 
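To make the idea of modelling disease transitions concrete, the sketch below estimates first-order transition probabilities directly from observed state sequences. This is a deliberately simpler stand-in for the Latent Markov Modelling named above, in which the states are hidden and estimated from the data. The state labels and sequences are invented for illustration and are not study results.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate P(next state | current state) from observed sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Pair each time point with the one that follows it.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    probs = {}
    for state, next_counts in counts.items():
        total = sum(next_counts.values())
        probs[state] = {nxt: n / total for nxt, n in next_counts.items()}
    return probs


# Made-up sequences: one list per child, one state per time point.
SEQUENCES = [
    ["eczema", "none", "none"],
    ["eczema", "eczema", "none"],
    ["none", "none", "none"],
    ["eczema", "multi", "multi"],
]

probs = transition_probabilities(SEQUENCES)
for state, row in probs.items():
    print(state, row)
```

In this toy data only one of four transitions out of "eczema" leads to "multi", which mirrors the shape of the question asked in the study (how often does early eczema progress to multimorbidity) without saying anything about the real answer.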
 
Description The Children's Respiratory and Environmental Workgroup (CREW) birth cohort consortium 
Organisation University of Wisconsin-Madison
Country United States 
Sector Academic/University 
PI Contribution Provision of Asthma eLab as an open source software to our US collaborators.
Collaborator Contribution Upgraded eLab (to FHIR standard) made available to the UNICORN consortium
Impact Joint GWASs currently under way
Start Year 2022
 
Description UNICORN (Unified Cohorts Research Network) 
Organisation University of Manchester
Department Manchester Museum
Country United Kingdom 
Sector Academic/University 
PI Contribution PI
Collaborator Contribution MAAS cohort
Impact MRC Programme Grant MR/S025340/1
Start Year 2020
 
Description Dairy, yeast, pollen, nuts, dander; Imperial Magazine 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact It's easy to dismiss allergy as just another trend. But as my work demonstrates, that could not be further from the truth.
Year(s) Of Engagement Activity 2022
URL https://www.imperial.ac.uk/Stories/dairy-yeast-pollen/
 
Description N1 TV interview (CNN affiliate), Sarajevo BiH; 22/02/2023 
Form Of Engagement Activity A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact TV Interview
Year(s) Of Engagement Activity 2023
 
Description Surveillance for SARS-CoV2 and for COVID illness in Drakenstein cohort 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Study participants or study members
Results and Impact We have kept close contact with participating families throughout this time for surveillance for SARS-CoV2 and for COVID illness. This ensured that mother-child pairs were closely followed through the pandemic, and that the impact of the pandemic can be evaluated in our cohort. This has also provided an opportunity for community engagement to promote public health messages around COVID prevention (e.g. social distancing, masking, self-isolation when symptomatic), coupled with socially responsive work such as providing masks for all members of families and poverty alleviation initiatives.
Year(s) Of Engagement Activity 2020,2021