MobSec: Malware and Security in the Mobile Age

Lead Research Organisation: Royal Holloway University of London
Department Name: Information Security

Abstract

With more than 1 billion activations reported in September 2013, Android mobile devices have become ubiquitous, with trends showing that this pace is unlikely to slow down. Android devices are extremely appealing: powerful, with a functional and easy-to-use user interface to access sensitive user and enterprise data, they can easily replace traditional computing devices, especially when information is consumed rather than produced. Application marketplaces, such as Google Play, drive the entire economy of mobile applications. For instance, with more than 1 million installed apps and a share of 35%, Google Play has generated revenues exceeding 9 billion USD. Such a wealthy and quite unique ecosystem, with high turnovers and access to sensitive data, has unfortunately also attracted the interest of cybercriminals, with malware now hitting Android devices at an alarmingly rising pace. Privacy breaches (e.g., access to address book and GPS coordinates), monetization through premium SMS and calls, and colluding malware to bypass 2-factor authentication schemes have become real threats. Recent studies report how mobile marketplaces have been abused to host malware or seemingly legitimate applications embedding malicious components. This clearly reflects the shift from an environment in which malware was developed for fun to the current situation, where malware is spread for financial profit.

Given the limitations of the state-of-the-art just outlined and according to the security roadmap provided by the European Network of Excellence SysSec, it is clear that "[...] more research focused on the development of defensive tools and techniques that can be deployed to the current smartphone systems to detect and prevent attacks against the device and its applications is needed". MobSec wants to fill this gap with a well-rounded practical research proposal.

The goal of MobSec is to improve the security of mobile devices by reducing the risk from installing and using third party applications.

Our research objectives build on each other to achieve this goal. First, we will develop dynamic analyses to automatically, faithfully and comprehensively construct models of application behavior. We will address the problem of incompleteness in dynamic analysis by replaying human interaction traces and complementing them with systematic exploration using symbolic execution. Once we are able to build models containing the interesting behavioral traits of mobile malware, we will focus on detecting and containing malicious behavior. We will initially target information leakage by investigating evasion-resistant information leakage detection techniques, and later generalize to distinguishing malicious from benign apps. To handle cases in which detection is not possible, we will contain potential threats by decomposing apps into logical components: this enables the enforcement of security policies and the characterization of per-component behaviors, which, being more specific, allow us to detect the behavior of malicious components embedded in seemingly legitimate apps. Finally, MobSec aims to explore virtualization extensions of CPUs to open up the possibility of in-device implementation of the aforementioned analyses.

Planned Impact

The goal of MobSec is to improve the security of mobile devices by
reducing the risk from installing and using third party applications.

We aim to publish the results of MobSec in top or well-known venues
and to organize a two-day workshop on the subject of mobile security.
The workshop aims to bring together all the project collaborators,
academic researchers and industry practitioners with an interest in
MobSec's topic and research objectives. The goal of the workshop is to
narrow the gap that currently exists between security research carried
out in academia and in industry to face common threats.

We likewise expect the MobSec project to generate technologies and
tools that we would deploy in industry (e.g., at McAfee, a MobSec
project partner), raising the quality of app analysis and resulting in
improved protection for users: more true positives and fewer false
positives. The results of the project should also assist in building
better defenses in future operating systems (see the statement of
support for more details). Moreover, MobSec results will also be
beneficial to a number of institutions and professional networks
interested in research outcomes in the field, such as Imperial College
London, Ruhr University Bochum, FORTH-ICS, Politecnico di Milano, and
National University Singapore, the EPSRC-funded Network in Internet
and Mobile Malicious Software (NIMBUS), the EU FP7 NoE SysSec, and the
EU FP7 CSA CyberROAD aimed at the development of a cybercrime and
cyber-terrorism research roadmap, with whom we have strong
professional and collaborative links.

We likewise plan to open the MobSec analysis framework, results, and
data [*] not only to industry and academia, but to society at large,
offering the opportunity to submit and analyze mobile apps for which a
deeper understanding or behavioral detection model is wanted,
according to the research objectives of MobSec.

In short, we hope that the links outlined above will foster impact in
three ways: they will enable us to promote the results of MobSec to
industry, academia, and society at large; they will provide valuable
real-world data for evaluating the effectiveness of MobSec in
real-world settings; and they will further strengthen collaborative
research efforts between academia and industry to address challenging
current and upcoming problems.

[*] for all the non-NDA data.

Publications

 
Description We have developed a system to perform dynamic analysis of Android apps. The novelty of our research lies in the fact that the code to reconstruct the behavior of such apps is automatically generated and works seamlessly across all vanilla Android versions. The behavioral profiles are then fed to machine learning algorithms to classify Android malware into families of threats. Initial experiments on concept drift detection show that malicious objects that are too dissimilar to those observed so far are properly identified, which opens the possibility of planning a proper machine learning retraining strategy. We have been able to understand and detect when ML models start decaying and to measure the effect of concept drift in security contexts. The main output has been published in top-tier conferences (e.g., NDSS 2015, USENIX Security 2017 and 2019) or well-known specialized workshops (e.g., AISec).
Exploitation Route The dynamic analysis system we have developed can be integrated by vendors into Android markets to reconstruct apps' actions and identify suspicious behaviors. Similar reasoning applies to our approach to detecting and measuring the effect of concept drift. We have had fruitful conversations with Google (Android Security Team), Facebook (Anti-abuse Team) and Huawei (Cloud Security Europe).
Sectors Digital/Communication/Information Technologies (including Software)

URL https://s2lab.kcl.ac.uk
 
Description The findings of this research project helped to better understand the effect of evolving threats in computer security contexts. In the process, the project identified several experimental biases that affect the performance of learning-based models in non-stationary contexts. It highlighted the importance of evaluating computer security tasks from a time-aware perspective, both to understand how quickly performance decays over time and to compare alternative options with a sound methodology. Several institutions in academia and industry have adopted the framework developed as one of the outcomes of this research project. See https://s2lab.cs.ucl.ac.uk and https://s2lab.cs.ucl.ac.uk/projects/tesseract for further details.
First Year Of Impact 2019
Sector Digital/Communication/Information Technologies (including Software),Education,Other
Impact Types Societal
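
The time-aware evaluation described above can be illustrated with a minimal sketch. It is not the framework produced by the project: the toy data, the single numeric feature, the midpoint-threshold classifier and all function names are assumptions made purely for illustration. The sketch trains only on apps observed before a cutoff period and then reports the F1 score for each later period, so that any decay over time becomes visible.

```python
from collections import defaultdict

# Hypothetical toy data: (period, feature, label), where label 1 = malware,
# 0 = benign, and period is the month in which the app was first observed.
RAW = [
    (0, 0.9, 1), (0, 0.8, 1), (0, 0.1, 0), (0, 0.2, 0),
    (1, 0.85, 1), (1, 0.15, 0),
    (2, 0.9, 1), (2, 0.7, 1), (2, 0.1, 0), (2, 0.2, 0),
    (3, 0.9, 1), (3, 0.4, 1), (3, 0.1, 0), (3, 0.2, 0),
    (4, 0.4, 1), (4, 0.3, 1), (4, 0.6, 0), (4, 0.2, 0),
]
SAMPLES = [{"period": p, "feature": f, "label": y} for p, f, y in RAW]


def temporal_split(samples, cutoff):
    """Train on samples observed strictly before `cutoff`, test on the rest."""
    train = [s for s in samples if s["period"] < cutoff]
    test = [s for s in samples if s["period"] >= cutoff]
    return train, test


def fit_threshold(train):
    """Toy classifier (an assumption): midpoint between the two class means."""
    malware = [s["feature"] for s in train if s["label"] == 1]
    benign = [s["feature"] for s in train if s["label"] == 0]
    return (sum(malware) / len(malware) + sum(benign) / len(benign)) / 2


def f1_score(pairs):
    """F1 of the malware class from (true_label, predicted_label) pairs."""
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def evaluate_over_time(samples, cutoff):
    """Return {period: F1} for every test period at or after `cutoff`."""
    train, test = temporal_split(samples, cutoff)
    threshold = fit_threshold(train)
    by_period = defaultdict(list)
    for s in test:
        predicted = 1 if s["feature"] >= threshold else 0
        by_period[s["period"]].append((s["label"], predicted))
    return {period: f1_score(pairs) for period, pairs in sorted(by_period.items())}


if __name__ == "__main__":
    for period, score in evaluate_over_time(SAMPLES, cutoff=2).items():
        print(f"period {period}: F1 = {score:.2f}")
```

In this toy setting the malware feature drifts towards the benign range in later periods, so the per-period F1 drops. A random split that mixes past and future samples would hide this decay.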

 
Description GCHQ Small Grants scheme 2015-2016
Amount £39,000 (GBP)
Organisation Government Communications Headquarters (GCHQ) 
Sector Public
Country United Kingdom
Start 01/2016 
End 03/2016
 
Description McAfee Labs donation
Amount $80,000 (USD)
Organisation McAfee 
Sector Private
Country United States
Start 06/2014 
End 06/2016
 
Description NCSC Small Grants scheme 2017-2018
Amount £20,000 (GBP)
Organisation National Cyber Security Centre 
Sector Public
Country United Kingdom
Start 11/2017 
End 01/2018
 
Description NVIDIA GPU donation
Amount £3,000 (GBP)
Organisation NVIDIA 
Sector Private
Country Global
Start  
 
Title Conformal Evaluator 
Description We have developed a statistical machine learning evaluation framework to provide a quantifiable assessment of the quality of a given machine learning classification. Not only does this make it possible to understand how well an approach may perform in real-life deployment, but it also provides metrics that can be leveraged to detect concept drift, and thus decay in classifier performance (suggesting retraining strategies), in realistic settings. 
Type Of Material Improvements to research infrastructure 
Year Produced 2017 
Provided To Others? Yes  
Impact N/A yet. We are filing a patent and we are using this approach in the output generated by MobSec: Security in the Mobile Age EPSRC research grant. We are in the process of publishing the results of this approach, including source code, to enable other research groups to build on this outcome. 
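
As a rough illustration of the kind of statistical assessment described above, the following sketch computes an empirical p-value in the style of inductive conformal prediction and uses it to flag objects that look too dissimilar from the training data. It is not the Conformal Evaluator code or its API: the nonconformity measure (distance from the mean of the reference values), the integer toy feature, the 0.2 threshold and all names are assumptions chosen for illustration only.

```python
def nonconformity(x, reference):
    """Hypothetical nonconformity measure: distance of x from the reference mean."""
    centroid = sum(reference) / len(reference)
    return abs(x - centroid)


def p_value(x, reference):
    """Share of objects at least as nonconforming as x, counting x itself.

    This is the (count + 1) / (n + 1) form used in inductive conformal
    prediction: a low value means x is unlike the reference objects.
    """
    alpha_x = nonconformity(x, reference)
    alphas = [nonconformity(r, reference) for r in reference]
    at_least_as_strange = sum(1 for a in alphas if a >= alpha_x)
    return (at_least_as_strange + 1) / (len(alphas) + 1)


def is_drifting(x, reference, threshold=0.2):
    """Flag x as potentially drifting when its p-value falls below `threshold`.

    The threshold value is an arbitrary assumption for this sketch.
    """
    return p_value(x, reference) < threshold


if __name__ == "__main__":
    # Hypothetical feature: number of sensitive API calls seen for apps of
    # one malware family during training.
    family_reference = [8, 9, 10, 11, 12]
    for candidate in (10, 11, 30):
        print(candidate, p_value(candidate, family_reference),
              is_drifting(candidate, family_reference))
```

An object with a low p-value for every known class is a candidate for manual inspection or for inclusion in a retraining set, which is the kind of signal a drift-aware deployment would monitor.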
 
Title CopperDroid 
Description We have developed the infrastructure to enable dynamic analysis and classification of Android applications at scale; we are in the process of finalizing a RESTful API to provide free use of the service to practitioners and researchers. 
Type Of Material Improvements to research infrastructure 
Provided To Others? No  
Impact We are in discussions with Google and McAfee about a potential integration of our analysis system into their backend infrastructure. 
 
Title Conformal Evaluator 
Description We have developed a statistical machine learning evaluation framework to provide a quantifiable assessment of the quality of a given machine learning classification. Not only does this make it possible to understand how well an approach may perform in real-life deployment, but it also provides metrics that can be leveraged to detect concept drift, and thus decay in classifier performance (suggesting retraining strategies), in realistic settings. 
Type Of Material Data analysis technique 
Provided To Others? No  
Impact N/A yet. We are filing a patent and we are using this approach in the output generated by MobSec: Security in the Mobile Age EPSRC research grant. We are in the process of publishing the results of this approach, including source code, to enable other research groups to build on this outcome. 
 
Description TU Munich 
Organisation Aston University
Department Computer Science
Country United Kingdom 
Sector Academic/University 
PI Contribution We are working on a joint research proposal to be submitted to DFG (TU Munich side) and EPSRC (RHUL side) to address coverage issues faced when analyzing Android apps. We bring our expertise in dynamic analysis and symbolic execution, while TU Munich brings its expertise in static analysis.
Collaborator Contribution We are working on a joint research proposal to be submitted to DFG (TU Munich side) and EPSRC (RHUL side) to address coverage issues faced when analyzing Android apps. We bring our expertise in dynamic analysis and symbolic execution, while TU Munich brings its expertise in static analysis.
Impact Joint research proposal to acquire additional research funding (RHUL is going to budget for 2 PDRAs)
Start Year 2016
 
Description University of Luxembourg 
Organisation University of Luxembourg
Department Interdisciplinary Centre for Security, Reliability and Trust (SnT)
Country Luxembourg 
Sector Academic/University 
PI Contribution We have been discussing themes revolving around Android app analysis and pitfalls in current datasets. This has resulted in co-authoring a research paper that aims to provide insights into the landscape of repackaged (piggybacked) Android apps. Contributions have been based solely on brainstorming, discussions, and feedback on the manuscript.
Collaborator Contribution We have been discussing themes revolving around Android app analysis and pitfalls in current datasets. This has resulted in co-authoring a research paper that aims to provide insights into the landscape of repackaged (piggybacked) Android apps. Contributions have been based solely on brainstorming, discussions, and feedback on the manuscript.
Impact No output yet. We are at an early stage, but we have been discussing research directions and are currently co-authoring one research paper.
Start Year 2016
 
Title METHOD OF MONITORING THE PERFORMANCE OF A MACHINE LEARNING ALGORITHM 
Description A crucial requirement for building sustainable learning models is to train on a wide variety of samples. Unfortunately, the objects on which the learned models are used may evolve, and the learned models may no longer work well. The invention provides a framework to identify aging classification models in vivo during deployment (concept drift), well before the machine learning model's performance starts to degrade. A statistical comparison of samples seen during deployment with those used to train the model is used, thereby building metrics for classification quality. 
IP Reference WO2019002603 
Protection Patent application published
Year Protection Granted 2019
Licensed No
Impact We have been contacted by several academic institutions that want access to our code; we are working on a license suitable for academic researchers as well as for industrial partners. To this end, we are in contact with Huawei Technologies to pursue research impact, potentially through licensing.
 
Title Conformal evaluator 
Description This is the Python library that implements conformal evaluator, a framework to statistically assess the quality of a broad range of machine learning algorithms. 
Type Of Technology Software 
Year Produced 2016 
Impact We are using this evaluation internally across projects, and we plan to release the Python library as open source so that the community can apply statistical evaluation to machine learning tasks. 
 
Title CopperDroid and related machine learning infrastructure 
Description CopperDroid is a dynamic analysis framework to reconstruct the behavior of Android apps. Besides providing information to analysts, the reconstructed behaviors are fed to machine learning algorithms to enable automated classification of Android apps and malware. 
Type Of Technology Software 
Year Produced 2015 
Impact We are engaged in a number of conversations with industrial partners (e.g., McAfee Labs, Qualcomm, and Google) and academia (e.g., University of Luxembourg, National University Singapore, TU Munich) to further monetize the analysis capabilities of CopperDroid. 
URL http://copperdroid.isg.rhul.ac.uk
 
Title Tesseract: Eliminating Experimental Bias in Malware Classification across Space and Time 
Description See https://s2lab.kcl.ac.uk/projects/tesseract/ 
Type Of Technology Software 
Year Produced 2019 
Open Source License? Yes  
Impact See https://s2lab.kcl.ac.uk/projects/tesseract/ 
URL https://s2lab.kcl.ac.uk/projects/tesseract/
 
Title Transcend - Detection of Concept Drift in Malware Classifiers 
Description See https://s2lab.kcl.ac.uk/papers/files/usenixsec2017.pdf 
Type Of Technology Software 
Year Produced 2017 
Open Source License? Yes  
Impact We have refactored and open-sourced Transcend in 2019 
 
Description A number of talks given to industry and academia 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact I have given a number of talks to further disseminate the outcome of the research carried out under the 2 EPSRC grants for which I am currently PI. The following is just an excerpt, but I will update the list with proper entries soon. The talks often cover topics from both grants:

OWASP AppSec EU Keynote 2014, Dagstuhl 2014, National Cyber Crime Unit 20145, BlackHat London Mobile Summit 2015, Georgia Tech 2015, Stony Brook University 2015, Qualcomm Inc. 2015, Google 2015, University of Catania 2015, University of Luxembourg 2015, IMDEA Software 2016, Kyushu University 2016, NIMBUS (EPSRC) 2016, Polytechnic University of Hong Kong 2016
Year(s) Of Engagement Activity 2014,2015,2016
 
Description Automatic Analysis and Classification of Android Malware (Concept Drift Detection) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact This talk outlines how the analysis we have been exploring in the "MobSec: Security in the Mobile Age" EPSRC project influenced the "Mining the Network Behaviour of Bots" EPSRC grant and vice versa. In particular, we have introduced a statistical machine learning evaluation framework to identify concept drift. This is applicable not only to the domains explored in these research projects, but to other fields as well.
Year(s) Of Engagement Activity 2016,2017
 
Description Invited talks at Huawei Germany, Huawei Finland, University of Bologna, King's College London 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Several talks promoting the dissemination of the results of the USENIX Sec 2018 paper. The aim is to pursue research impact through licensing of this research.
Year(s) Of Engagement Activity 2017