Gene Expression Programming - a new machine learning technique for supervised and unsupervised classification

Lead Research Organisation: Brunel University London
Department Name: Sch of Engineering and Design

Abstract

Many scientific, engineering and business fields, such as genetics, medicine, environmental science and engineering, physics, astronomy, finance, and marketing, face common challenges in extracting field-specific knowledge from complex data. Efficient data analysis techniques are needed to assist the user intelligently in extracting this knowledge. This project will address this need by using the basic ideas of a recently developed computer algorithm, Gene Expression Programming, to develop novel evolutionary algorithm techniques and novel supervised and unsupervised data classification algorithms. The project will develop and exploit novel homologous genetic operators, together with mechanisms that control the redundant information in the solutions provided by the algorithm, in order to increase its efficiency. These developments will be combined with state-of-the-art statistical methods such as boosting in order to create efficient data classification algorithms. The methods and algorithms developed in the project will be implemented in software applications released as open source, in order to maximize the range of beneficiaries of the project outcomes.

Publications

 
Description The project developed novel data analysis techniques based on a specialised version of an evolutionary algorithm, together with theoretical studies of this algorithm.
Exploitation Route The findings are implemented in software applications which are in the process of being released as open source. They will be available not only to specialists in evolutionary algorithms but also to a wide range of practitioners, from scientists and engineers to financial and health service specialists.
Sectors Aerospace, Defence and Marine; Agriculture, Food and Drink; Chemicals; Digital/Communication/Information Technologies (including Software); Financial Services, and Management Consultancy; Healthcare; Manufacturing, including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology; Retail; Transport

 
Description The findings were used mainly to advance the field of evolutionary algorithms and related data analysis techniques. Some of the data analysis techniques developed were applied to particle physics data analysis.
First Year Of Impact 2008
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Cultural

 
Title Boosted GEP 
Description The data analysis technique is based on a hybrid algorithm which combines the Gene Expression Programming and AdaBoost algorithms for classification problems. 
Type Of Material Data analysis technique 
Provided To Others? No  
Impact This technique provides improved solutions to classification problems while requiring fewer iterations of the search through the solution space. 
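
The boosting half of such a hybrid can be illustrated with a minimal AdaBoost sketch. This is not the project's implementation: the record does not describe how the two algorithms are coupled, so the weak learner below is a simple threshold rule (a decision stump) standing in for a GEP-evolved classifier, and all function names are hypothetical. Labels are assumed to be +1 or -1.

```python
import math

def train_stump(X, y, w):
    """Stand-in weak learner (assumption): the best single-feature threshold
    rule under sample weights w. In a GEP/AdaBoost hybrid, a classifier
    evolved by GEP would play this role instead."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for sign in (1, -1):
                # Weighted error of the rule "predict sign if x[j] >= t".
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if (sign if xi[j] >= t else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best

def stump_predict(stump, x):
    _, j, t, sign = stump
    return sign if x[j] >= t else -sign

def adaboost(X, y, rounds=5):
    """Standard AdaBoost loop: reweight samples so that later weak learners
    concentrate on the examples earlier ones misclassified."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-12)  # guard against log of a zero error
        if err >= 0.5:              # no better than chance: stop
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of all weak learners."""
    score = sum(alpha * stump_predict(s, x) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Toy usage on a one-feature, linearly separable data set.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [-1, -1, 1, 1]
model = adaboost(X, y)
```

The only design point carried over from the description above is the division of labour: the boosting loop handles sample reweighting and voting, and is indifferent to how each weak classifier is produced.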
 
Title Enhanced GEP 
Description The data analysis technique extends the capabilities of the Gene Expression Programming algorithm for classification problems by using an alternative representation of the candidate solution and a truncated evolution process. 
Type Of Material Data analysis technique 
Year Produced 2008 
Provided To Others? Yes  
Impact The data analysis technique was used on experimental data from particle physics experiments, providing alternative methods for separating signal from background in such experiments. 
 
Title BGEP 
Description The software implements the hybrid algorithm based on Gene Expression Programming and AdaBoost algorithms developed in this project. 
Type Of Technology Software 
Year Produced 2012 
Impact The impact is on this project only at this time. The software is in the process of being released as open source. 
 
Title GEP 
Description The software is a new implementation of the Gene Expression Programming algorithm. 
Type Of Technology Software 
Year Produced 2010 
Impact The impact is on this project only at this time. The software is in the process of being released as open source.
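
For readers unfamiliar with the representation behind Gene Expression Programming, the sketch below shows how a fixed-length linear gene is commonly decoded into an expression tree, following the breadth-first (Karva) reading described in the published GEP literature. It is not taken from the software listed above; the function set, the names `decode` and `evaluate`, and the example gene are all assumptions for illustration. It also shows why a gene can carry redundant information: symbols beyond the decoded portion are simply never used.

```python
import operator

# Assumed function set; every other symbol is treated as a terminal (arity 0).
ARITY = {"+": 2, "-": 2, "*": 2}
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def decode(gene):
    """Read the gene left to right and fill the tree level by level.
    Returns (root, used), where used is the number of symbols consumed."""
    root = [gene[0], []]
    queue = [root]
    pos = 1
    while queue:
        node = queue.pop(0)
        for _ in range(ARITY.get(node[0], 0)):
            child = [gene[pos], []]
            pos += 1
            node[1].append(child)
            queue.append(child)
    return root, pos

def evaluate(node, env):
    """Evaluate the decoded tree with terminal values taken from env."""
    symbol, children = node
    if symbol in OPS:
        return OPS[symbol](*(evaluate(c, env) for c in children))
    return env[symbol]

# Hypothetical gene: head "+*ca" (length 4), tail "bbbaa" (length 5).
gene = "+*cabbbaa"
tree, used = decode(gene)          # encodes (a * b) + c
value = evaluate(tree, {"a": 2, "b": 3, "c": 4})
unused = gene[used:]               # trailing symbols that are never expressed
```

Here only the first five symbols are expressed; the remaining four are carried along in the gene but do not affect the result until a genetic operator changes the symbols ahead of them.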