SeMaMatch: Semantic Malware Matching

Lead Research Organisation: University College London
Department Name: Computer Science

Abstract

Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.

Publications

Menéndez HD (2019) Mimicking Anti-Viruses with Machine Learning and Entropy Profiles, in Entropy (Basel, Switzerland)

 
Description We have shown that compression programs, together with an initial zoo of malware, can detect malware with 98% accuracy, and a series of experiments establishes the robustness of this claim. We have developed a tool (EnTS) that is more lightweight than compression-based detection and has equivalent accuracy, again backed by robust experimental evidence. We have engineered a further tool (EEE) that successfully attacks the classification ability of EnTS and of other compression-based and information-theoretic detection methods. EEE uses search algorithms to exploit weaknesses in those detection methods. We have applied EEE to modify malware submitted to the VirusTotal website and analysed the co-evolution of malware detection and disguise in a cutting-edge context. We have applied a similar approach to the classification of Android malware into families, using search algorithms to force misclassification during malware triage.
Exploitation Route These findings are strong enough to warrant both further research and commercialisation.
Sectors Aerospace, Defence and Marine; Agriculture, Food and Drink; Chemicals; Construction; Creative Economy; Digital/Communication/Information Technologies (including Software); Energy; Manufacturing, including Industrial Biotechnology; Retail; Security and Diplomacy; Transport; Other

URL https://github.com/hdg7/IagoDroid
 
Description One previous Research Associate on SeMaMatch has moved to Facebook and has been using ideas based on Kolmogorov complexity, especially the Normalised Compression Distance, as a similarity measure for solving code similarity problems there. The other previous Research Associate is now an academic at Middlesex University.
First Year Of Impact 2019
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Economic
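
The Normalised Compression Distance mentioned in the description above can be illustrated with a minimal sketch. It uses the standard formula NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size. The choice of zlib as the compressor and the function names `compressed_size` and `ncd` are assumptions made for illustration; the record does not say which compressor or implementation the project or the later industrial work used.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length of `data` after zlib compression (an assumed stand-in for C)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalised Compression Distance between two byte strings.

    Values near 0 suggest the inputs share most of their structure;
    values near 1 suggest they share little.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    # Hypothetical inputs: a repetitive code-like string and unrelated bytes.
    a = b"push ebp; mov ebp, esp; " * 50
    b = bytes(range(256)) * 5
    print("NCD(a, a) =", round(ncd(a, a), 3))
    print("NCD(a, b) =", round(ncd(a, b), 3))
```

Because real compressors are imperfect, NCD(x, x) is small but generally not exactly zero, so the measure is best used for ranking candidates rather than as an exact distance.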

 
Title EnTS ML database 
Description This dataset contains the entropy profiles for a subset of the Kaggle malware competition files, for VirusShare Win32 malware packed with different packing systems, and for Windows benign-ware. The data serves as a training and test set for new machine learning techniques that use entropy time series for malware detection and classification. 
Type Of Material Database/Collection of data 
Year Produced 2016 
Provided To Others? Yes  
Impact None as yet. The data is not openly available until 
URL https://data.mendeley.com/datasets/rxnx8rzwph/draft
 
Description Collaboration with Charles III University, Madrid 
Organisation Charles III University of Madrid
Country Spain 
Sector Academic/University 
PI Contribution We have written a paper together (DOI below) and planned to collaborate further. The collaboration is now dormant.
Collaborator Contribution As above (the contributions are symmetric).
Impact DOI: 10.1016/j.eswa.2017.11.032
Start Year 2016
 
Title EEE: the evolutionary entropy based packer 
Description A modified packer (based on UPX) that uses search algorithms to produce programs that can evade entropy-based virus detection. 
Type Of Technology Software 
Year Produced 2017 
Impact None so far 
URL https://github.com/hdg7/EEE
 
Title EnTS: Entropy Time Series Analysis tool 
Description This tool generates a simplified entropy profile of a file as a fixed length time series. 
Type Of Technology Webtool/Application 
Year Produced 2016 
Impact None yet. Note that, since the associated paper is under anonymous submission, we cannot openly publish the URL of the tool except via application to the PI. This will change once the paper is accepted. 
URL https://www.mendeley.com/sign-in/?routeTo=https%3A%2F%2Fapi.mendeley.com%2Foauth%2Fauthorize%3Fredir...
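
The idea of an entropy profile as a fixed-length time series can be sketched as follows. This is only an illustration under stated assumptions, not the EnTS implementation: it assumes the file is split into a fixed number of roughly equal chunks and that the Shannon entropy of each chunk's byte distribution forms one point of the series. The function names `shannon_entropy` and `entropy_profile` and the default of 16 points are hypothetical.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not chunk:
        return 0.0
    total = len(chunk)
    # Sum p * log2(1/p) over the observed byte values.
    return sum((c / total) * math.log2(total / c) for c in Counter(chunk).values())


def entropy_profile(data: bytes, points: int = 16) -> List[float]:
    """Return `points` entropy values, one per roughly equal chunk of `data`.

    The output length is fixed regardless of the input size, so files of
    different sizes yield comparable series.
    """
    if points <= 0:
        raise ValueError("points must be positive")
    n = len(data)
    profile = []
    for i in range(points):
        start = i * n // points
        end = (i + 1) * n // points
        profile.append(shannon_entropy(data[start:end]))
    return profile


if __name__ == "__main__":
    # Hypothetical input: a low-entropy region followed by a high-entropy one.
    sample = bytes(1024) + bytes(range(256)) * 4
    print(entropy_profile(sample, points=8))
```

A fixed series length means the profiles can be fed directly to standard machine learning classifiers, at the cost of coarser resolution on large files.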
 
Title IagoDroid 
Description This is an academic tool that forces misclassification of Android malware. 
Type Of Technology Software 
Year Produced 2017 
Impact None as yet. 
URL https://github.com/hdg7/IagoDroid
 
Title MimicAV 
Description Allows mimicking of the behaviour of anti-virus programs 
Type Of Technology Software 
Year Produced 2019 
Open Source License? Yes  
Impact None so far 
URL https://github.com/hdg7/MimickAV
 
Description COW 27: workshop on Malware 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Workshop for academics involved in malware research, also attended by some industry and government representatives. The main outcomes were knowledge and information sharing, some new collaborations, and the reinforcement of existing collaborations.
Year(s) Of Engagement Activity 2013
URL http://crest.cs.ucl.ac.uk/cow/27/
 
Description COW 41: workshop on Software Engineering and Computer Science using information theory 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Workshop for software engineering researchers to learn about different applications of information theory to software engineering and computer science problems. The previous RA gave a talk on the Journal of Computer Security submission, "Detecting Malware with Information Complexity".
Year(s) Of Engagement Activity 2015
URL http://crest.cs.ucl.ac.uk/cow/41/
 
Description Malware MSc course 2014-2015 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact Presented an overview of the Journal of Computer Security submission entitled "Detecting Malware with Information Complexity" in the UCL MSc malware module. Students were later asked to complete a coursework assignment based on the available tools.
Year(s) Of Engagement Activity 2014
 
Description Malware MSc course 2015-16 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact Presented an overview of the ideas in the Journal of Computer Security submission entitled "Detecting Malware with Information Complexity" to a UCL MSc module.
Year(s) Of Engagement Activity 2015
 
Description Malware MSc module 2016-2017 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Knowledge and Technology Transfer
Year(s) Of Engagement Activity 2016