Supersaturated Multi-Stratum Designs: Construction and Statistical Modelling

Lead Research Organisation: University of Southampton
Department Name: Statistical Sciences Research Institute

Abstract

Design of experiments (DOE) is one of the most important tools in scientific research and an important aid to innovation in industry and science (applications can be found, for example, in the food industry, materials science, robotics, medicine and biology). Designing the experiment appropriately allows investigators to improve the efficiency of their methods and maximize the information obtained from their experiments. Failure to pay attention to the experimental design can lead to a waste of resources, needless repetition and poor inference. Better designed experiments will lead to reductions in the number of participants involved in clinical trials, savings in the amount of expensive components needed, decreases in the number of prototypes required in engineering experiments, and shorter product and process development times.

However, there is an important gap in the state-of-the-art knowledge in DOE. Consider, for example, the following scenario: in a tribocorrosion experiment, the experimenter wants to optimise DLC (diamond-like carbon) coatings for use in orthopedic implants. The experiment involves 6 variables, including the coating structure, the interlayer and the substrate, with 3 levels each. Different level combinations of these variables give different coatings. However, a fully randomized experiment requires these levels to be reset for every run, which is very costly and time consuming, so ideally they will be kept constant for a sequence of runs. In addition, there is 1 easy-to-change variable (immersion time). The experimenter wants to study the main effects of the 7 variables and their interactions. Three samples for every condition are needed to perform all the tests and at most 30 samples can be immersed in the bath, whereas the number of effects that require estimation is greater than 30. In this situation, not only is there no efficient design available, but the notion of an efficient design has not even been defined in a meaningful manner. In addition, novel methodology will be needed to analyse the data in order to draw the correct conclusions. There is no guidance for practitioners on how to plan and analyse such an experiment, which potentially slows scientific progress in application areas. The proposed research will fill this important gap. Our approach will provide a new general methodology for setting up and analysing informative experiments with both restricted randomization (multi-stratum designs) and a number of factors larger than the number of observations (supersaturated designs). The class of multi-stratum supersaturated designs has only very recently begun to be explored, and there is no general methodology for the construction and analysis of these experiments.
Multi-stratum designs are very effective in reducing the cost of an experiment in the presence of hard-to-change factors and/or multi-stage processes. In addition, supersaturated designs (SSDs) form a large class of factorial designs that can be used to screen out the important factors from a large set of potentially active ones.
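
To make the supersaturation in the scenario above concrete, the following minimal sketch counts the parameters of a model with an intercept, all main effects and all two-factor interactions, treating every factor as categorical. The function name `count_model_terms` is hypothetical, and the number of levels of immersion time is not given in the text, so the value 2 used here is purely an assumption for illustration.

```python
from itertools import combinations


def count_model_terms(levels):
    """Count parameters for intercept + main effects + two-factor interactions.

    Each factor with l levels contributes l - 1 main-effect parameters, and
    each pair of factors contributes (l1 - 1) * (l2 - 1) interaction parameters.
    """
    main = sum(l - 1 for l in levels)
    inter = sum((a - 1) * (b - 1) for a, b in combinations(levels, 2))
    return 1 + main + inter


# Six hard-to-change factors at 3 levels each (from the text), plus immersion
# time, ASSUMED here to have 2 levels (the text does not state its levels).
levels = [3] * 6 + [2]
n_terms = count_model_terms(levels)

max_samples = 30  # bath capacity quoted in the text
is_supersaturated = n_terms > max_samples
print(n_terms, is_supersaturated)
```

Under these assumptions the model has far more parameters than the 30 samples available, even if every sample gave a separate observation, which is the situation the supersaturated multi-stratum designs are intended to address.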

We will develop new Bayesian optimality criteria for designing good experiments, and Bayesian modelling tools to analyse data from supersaturated multi-stratum experiments. This will be interesting and challenging from a methodological point of view, and will also increase scientific understanding in numerous application areas, such as materials and surface engineering, tribological and chemical experiments where similar problems arise. Applying our methodology in real world scenarios will lead to important synergies between the different applied disciplines and Statistics. In particular, it will deliver insights into these important areas, demonstration of the effectiveness of the methodology, and exemplars to aid in dissemination.

Planned Impact

The design of multi-stratum supersaturated experiments is an important unexplored research area. Mixed effects models are increasingly being used in many scientific disciplines (e.g. biology and medicine), and so the project will not only have a large impact on statistics, but also on scientific research in many disciplines. Thus, the results of this project will have an immediate and sustainable benefit for national and international science, business and industry.

Design of experiments (DOE) is a well-established and important area of research in the UK, with applications in many scientific fields such as materials science, pharmacology and chemistry. This project will advance the field in a new and important direction. DOE has long been a strong area of research and application in the UK compared with the rest of the world, and this project will help to maintain that traditional national strength. The work will have academic beneficiaries in other disciplines, as well as an economic impact through the cost-efficient designs studied here.

The PI has already successfully completed an FP7-PEOPLE-2009-IEF project on "Design of experiments for variance component estimation". The techniques developed by the PI and her collaborators can be applied successfully to split-plot designs, which are a special case of multi-stratum designs. Based on the work undertaken in the previous project, we have good reason to believe that there is a definite benefit in investigating the proposed methods.

To maximise the impact of the research, the results will be made available on open-access repositories, including the arXiv, prior to their publication in leading international journals. The programs for the construction of the optimal designs and the statistical analysis methods will be available on the PI's web page. The PI and PDRA will also publicise the results by giving seminars and colloquia at universities and conferences, both nationally and internationally. The PI will discuss with Dr. Bradley Jones, Principal Research Fellow in the Research and Development team of the JMP division of the SAS Institute, the possible incorporation of the developed methodology into the statistical software JMP. This would be an excellent opportunity for establishing a Knowledge Transfer Partnership (more information online at: http://www.ktponline.org.uk/academics/) with SAS/JMP at the end of the project, possibly including the PDRA. The PI is also in contact with the team of SETsquared (Southampton), a partnership between five universities and several companies, in order to discuss possible collaborations with member companies that are interested in performing experiments.

An important aspect of this grant proposal is the funding of a post-doctoral researcher to collaborate with the PI. This will provide the PDRA with an opportunity to further his/her academic or industrial career. The Statistics Group within the School of Mathematical Sciences provides an excellent environment for the development of young researchers, as evidenced by the number of PDRAs currently here and the past successes of junior researchers within the group. Supporting an early career researcher in this exciting and growing area of statistics is very important to the continuing position of the UK at the forefront of mathematical and statistical research.

Publications

Description The outcome of this project is twofold. First, we proposed a methodology for the construction of supersaturated split-plot designs, based on a new Bayesian composite optimality criterion that can be used to produce efficient designs with good estimation and prediction properties. These designs are cost-efficient because they are small and can be run under restricted randomisation. For the very first time, two very useful classes of designs (supersaturated designs and split-plot designs) are combined in such a general way. An R script will be made available with the publication, which practitioners with a basic understanding of R will be able to apply.

Second, we proposed an effective method (evaluated via a simulation study) for the analysis of data from these experiments. The method combines hierarchical Bayes techniques and shrinkage methodology. As expected, this was a difficult task because of the small number of observations and the bias induced by the split-plot structure. However, the proposed method works well, and our R scripts will also be made available with the publication of our results.

We therefore believe that we met our original objectives, since we provide a concrete methodology with which scientists can now design cost-efficient experiments and analyse the resulting data correctly. In addition, we believe that research on the proposed class of designs has only just begun, and that this work will motivate other researchers, and us, to propose efficient designs under these constraints as well as alternative analysis methods.
Exploitation Route We believe that these designs will provide a very good alternative for industrial experimenters and materials scientists. We already have contacts in the tribology department at the University of Southampton, who have problems for which these designs can offer a solution. The feedback we received when presenting our results at conferences and seminars was very useful: our peers told us that we provide an effective solution to several situations for which there was previously no correct answer.
Sectors Aerospace, Defence and Marine; Agriculture, Food and Drink; Chemicals; Construction; Environment; Healthcare; Manufacturing, including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology

 
Description During my invited talk at RBRAS/SEAGRO, Lavras, Brazil, in July 2017 (title: Applying supersaturated split-plot designs to a tribocorrosion experiment; presenter K. Mylona), a researcher from Janssen Pharmaceutical in the Czech Republic asked me to send him my presentation, saying that these designs might be useful for the experiments they perform. I sent it to him, so I hope that these designs will be applied in the industrial sector in the near future. Since then we have had the opportunity to collaborate on a real case where a split-plot design was needed, although it was not a supersaturated case. More recently, during my plenary talk at the JEDE 5 conference in Spain in October 2021 (title: Supersaturated split-plot experiments in industry: Construction and data analysis; presenter K. Mylona), an academic in the audience asked me to send my presentation afterwards and recognised the potential importance of this type of design. This project has given rise to several important research questions, and for this reason I have added two relevant PhD projects to the PhD project list in our department. This list is shared with potential candidates who express an interest in doing a PhD in statistics at King's College London.
Sector Education; Pharmaceuticals and Medical Biotechnology
Impact Types Economic

 
Description Multi-objective optimal design of experiments
Amount £814,578 (GBP)
Funding ID EP/T021624/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 12/2020 
End 03/2024
 
Title Analysis method 
Description Through this project we proposed a statistical method for analysing data from supersaturated split-plot experiments. This was a challenging task, since the sample size is very small and the experiment is run under restricted randomisation. However, our proposed method performs well with respect to correct model and variable selection. Specifically, the method applies empirical Bayes to the estimation of the variance components and a shrinkage method that uses the SCAD penalty function for the estimation of the fixed effects.
Type Of Material Data analysis technique 
Year Produced 2016 
Provided To Others? No  
Impact Supersaturated split-plot designs are very cost-efficient, but until now there was no reliable technique in the literature for analysing data from this kind of experiment. Our method will give practitioners the confidence to use these cost-efficient experiments to collect data, since they will now also be able to analyse the data efficiently. When we publish the method, all our R programs for applying the analysis will be made available as well.
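
As a small illustration of the shrinkage component mentioned in the description above, the sketch below evaluates the SCAD penalty of Fan and Li (2001) for a single coefficient. It shows only the penalty function, not the project's analysis method; the function name `scad_penalty` is hypothetical, and the default a = 3.7 is the value commonly suggested for SCAD rather than one stated in this record.

```python
def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty for one coefficient `beta` with tuning parameters lam, a.

    The penalty grows like the lasso (lam * |beta|) for small coefficients,
    bends quadratically for lam < |beta| <= a * lam, and is constant beyond
    a * lam, so large effects are not shrunk further.
    """
    if lam < 0 or a <= 2:
        raise ValueError("SCAD requires lam >= 0 and a > 2")
    b = abs(beta)
    if b <= lam:
        return lam * b
    if b <= a * lam:
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


# Penalty for a small, a moderate and a large coefficient (lam = 1).
penalties = [scad_penalty(b, lam=1.0) for b in (0.5, 2.0, 10.0)]
print(penalties)
```

The flat tail is what distinguishes SCAD from a lasso-type penalty: large active effects are left nearly unbiased while small ones are shrunk towards zero, which suits screening with very few runs.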
 
Title Construction method 
Description A computationally efficient coordinate exchange algorithm was proposed as an outcome of this project. Our algorithm incorporates our new Bayesian composite criterion for the construction of Bayesian D-optimal and D_s-optimal supersaturated split-plot designs.
Type Of Material Computer model/algorithm 
Year Produced 2016 
Provided To Others? No  
Impact An R script will be made available with publication of the corresponding paper, which practitioners with a basic understanding of R will be able to apply. Practitioners will then be able to construct a supersaturated split-plot design easily. Currently, there is no program available for this purpose.
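
The general idea of a coordinate exchange search for a split-plot design can be sketched as follows. This is not the project's algorithm or its composite criterion, which are not specified in this record. As a stand-in, the sketch maximises a Bayesian D-type criterion log|X'V^-1 X + K/tau2| for an intercept plus main-effects model, with V = I + eta ZZ' for equal-sized whole plots. Two-level factors, the values of `eta` and `tau2`, and all function names are assumptions made for illustration.

```python
import math
import random


def log_det_spd(m):
    """Log-determinant of a symmetric positive definite matrix via Cholesky."""
    p = len(m)
    chol = [[0.0] * p for _ in range(p)]
    total = 0.0
    for i in range(p):
        for j in range(i + 1):
            s = m[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return -math.inf  # not positive definite
                chol[i][i] = math.sqrt(s)
                total += 2.0 * math.log(chol[i][i])
            else:
                chol[i][j] = s / chol[j][j]
    return total


def bayes_d(design, plot_size, eta=1.0, tau2=5.0):
    """log|X' V^-1 X + K/tau2| with K = diag(0, 1, ..., 1) (assumed criterion)."""
    x = [[1.0] + [float(v) for v in row] for row in design]
    p = len(x[0])
    m = [[0.0] * p for _ in range(p)]
    for j in range(1, p):
        m[j][j] = 1.0 / tau2  # prior information on the main effects only
    shrink = eta / (1.0 + plot_size * eta)  # from inverting I + eta * J per plot
    for start in range(0, len(x), plot_size):
        block = x[start:start + plot_size]
        sums = [sum(r[j] for r in block) for j in range(p)]
        for a in range(p):
            for b in range(p):
                cross = sum(r[a] * r[b] for r in block)
                m[a][b] += cross - shrink * sums[a] * sums[b]
    return log_det_spd(m)


def coordinate_exchange(n_plots, plot_size, n_whole, n_sub, seed=0):
    """Greedy sign-flip search from a random start; returns (design, criterion)."""
    rng = random.Random(seed)
    design = []
    for _ in range(n_plots):
        whole = [rng.choice((-1, 1)) for _ in range(n_whole)]
        for _ in range(plot_size):
            design.append(whole + [rng.choice((-1, 1)) for _ in range(n_sub)])
    best = bayes_d(design, plot_size)

    def try_flip(rows, col):
        nonlocal best
        for r in rows:
            design[r][col] = -design[r][col]
        value = bayes_d(design, plot_size)
        if value > best + 1e-9:
            best = value
            return True
        for r in rows:  # no improvement: undo the flip
            design[r][col] = -design[r][col]
        return False

    improved = True
    while improved:
        improved = False
        for w in range(n_plots):
            rows = range(w * plot_size, (w + 1) * plot_size)
            for col in range(n_whole):  # whole-plot factor: flip the whole plot
                improved |= try_flip(rows, col)
            for r in rows:  # subplot factors: flip one run at a time
                for col in range(n_whole, n_whole + n_sub):
                    improved |= try_flip([r], col)
    return design, best


# 6 runs in 3 whole plots, 8 factors plus intercept: more parameters than runs.
design, value = coordinate_exchange(n_plots=3, plot_size=2, n_whole=2, n_sub=6, seed=1)
```

Whole-plot factor settings are always flipped for all runs of a whole plot at once, which is how the restricted randomisation is preserved during the search. In practice such a search would be repeated from many random starts, since a single run only reaches a local optimum.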
 
Description Conferences and invited seminars 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact We attended several international conferences at which we presented the outcomes of the project, receiving useful feedback from top researchers around the world and motivating practitioners to design their experiments in a cost-efficient way.

Selected conferences:

1. ISI 2015, Rio de Janeiro, Brazil - August 2015.
Contributed Talk Title: Supersaturated Split-Plot Designs.
2. Seminar at Industrial Engineering School, UCLM, Spain - November 2015.
Invited Talk Title: Supersaturated Designs with Restricted Randomisation: An Application to a Tribology Experiment.
3. DEMA 2015, Sydney, Australia - December 2015.
Invited Talk Title: Supersaturated Split-Plot Designs: Construction and Statistical Modelling.
4. ERCIM2015, London, UK - December 2015.
Invited Talk Title: Supersaturated split-plot screening experiments (Presenter: Emily S Matthews).
5. mODa11, Hamminkeln-Dingden, Germany - June 2016.
Invited Talk Title: Supersaturated Multi-Stratum Designs: Construction and Modelling.
6. ICISE2016, Palermo, Italy - June 2016.
Invited Talk Title: Applying supersaturated designs under restricted randomisation to a tribocorrosion experiment.
7. ENBIS-16, Sheffield, UK - September 2016.
Invited Talk Title: Supersaturated Split-Plot Screening Experiments (Presenter: Emily S Matthews).
8. RBRAS/SEAGRO, Lavras, Brazil - July 2017.
Invited Talk Title: Applying supersaturated split-plot designs to a tribocorrosion experiment.
9. Seminar at Department of Biostatistics, UNESP Botucatu, Brazil - August 2017.
Invited Talk Title: Supersaturated split-plot designs and industrial applications.
10. SMBD2018, Madrid, Spain - June 2018.
Contributed Talk Title: Supersaturated split-plot experiments and industrial applications.
11. ISBIS2018, Piraeus, Greece - July 2018.
Invited Talk Title: Supersaturated split-plot experiments.
12. Seminar at MSG seminar series, University of Manchester - December 2018.
Invited Talk Title: Supersaturated split-plot designs for industrial experimentation.
13. AISC 2021, Virtual Conference, The University of North Carolina at Greensboro - October 2021.
Invited Talk Title: Supersaturated split-plot experiments in industry.
14. JEDE V, University of Almería, Almería, Spain - October 2021.
Plenary Talk Title: Supersaturated split-plot experiments in industry: Construction and data analysis.

I was also invited to give two seminar presentations specifically on this topic, at Queen Mary University, London and at Bicocca University in Milan. The talks were postponed because of COVID-19.
Year(s) Of Engagement Activity 2015,2016,2017,2018,2021
 
Description Meet the fellow 2016 event (Madrid, Spain) 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Schools
Results and Impact We presented our research projects in a dynamic and interactive way, including hands-on experiments, presentations and projections.
Year(s) Of Engagement Activity 2016