MobSec: Malware and Security in the Mobile Age

Lead Research Organisation: Royal Holloway University of London

Department Name: Information Security

Abstract

With more than 1 billion activations reported as of September 2013, Android mobile devices have become ubiquitous, and trends suggest that this pace is unlikely to slow down. Android devices are extremely appealing: powerful, with a functional and easy-to-use user interface to access sensitive user and enterprise data, they can easily replace traditional computing devices, especially when information is consumed rather than produced. Application marketplaces, such as Google Play, drive the entire economy of mobile applications. For instance, with more than 1 million installed apps and a share of 35%, Google Play has generated revenues exceeding 9 billion USD. Such a wealthy and quite unique ecosystem, with high turnovers and access to sensitive data, has unfortunately also attracted the interest of cybercriminals, with malware now hitting Android devices at an alarmingly rising pace. Privacy breaches (e.g., access to address book and GPS coordinates), monetization through premium SMS and calls, and colluding malware to bypass 2-factor authentication schemes have become real threats. Recent studies report how mobile marketplaces have been abused to host malware or seemingly legitimate applications embedding malicious components. This clearly reflects the shift from an environment in which malware was developed for fun to the current situation, where malware is spread for financial profit.

Given the limitations of the state-of-the-art just outlined and according to the security roadmap provided by the European Network of Excellence SysSec, it is clear that "[...] more research focused on the development of defensive tools and techniques that can be deployed to the current smartphone systems to detect and prevent attacks against the device and its applications is needed". MobSec wants to fill this gap with a well-rounded practical research proposal.

The goal of MobSec is to improve the security of mobile devices by reducing the risk from installing and using third party applications.

Our research objectives build on each other to achieve this goal. First, we will develop dynamic analyses to automatically, faithfully and comprehensively construct models of application behavior. We will address the problem of incompleteness in dynamic analysis by replaying human interaction traces and complementing them with systematic exploration using symbolic execution. Once we are able to build models containing the interesting behavioral traits of mobile malware, we will focus on detecting and containing malicious behavior. We will initially target information leakage by investigating evasion-resistant information leakage detection techniques, and later generalize to distinguishing malicious from benign apps. To handle cases in which detection is not possible, we will contain potential threats by decomposing apps into logical components: this enables the enforcement of security policies and the characterization of per-component behaviors, which, being more specific, allow us to detect the behavior of malicious components embedded in seemingly legitimate apps. Finally, MobSec aims to explore the virtualization extensions of CPUs to open up the possibility of in-device implementation of the aforementioned analyses.

Planned Impact

The goal of MobSec is to improve the security of mobile devices by
reducing the risk from installing and using third party applications.

We aim to publish the results of MobSec in top or well-known venues
and to organize a two-day workshop on the subject of mobile security.
The workshop aims at bringing together all the project collaborators,
academic researchers and industry practitioners with an interest in MobSec's
topic and research objectives. The goal of the workshop is to narrow
the gap that nowadays exists between security research carried out in
academia and industry to face common threats.

We likewise expect the MobSec project to generate technologies and tools
that we would deploy in industry (e.g., at McAfee, a MobSec project
partner), and to raise the quality of app analysis, resulting in
improved protection for users: more true positives and
fewer false positives. The results of the project should also
assist in building better defenses in the future operating systems
(see the statement of support for more details). Moreover, MobSec
results will also be beneficial to a number of institutions and
professional networks interested in research outcomes in the field,
such as Imperial College London, Ruhr University Bochum, FORTH-ICS,
Politecnico di Milano, and National University Singapore, the
EPSRC-funded Network in Internet and Mobile Malicious Software
(NIMBUS), the EU FP7 NoE SysSec, and the EU FP7 CSA CyberROAD aimed at
the development of a cybercrime and cyber-terrorism research roadmap,
with whom we have strong professional and collaborative links.

We likewise plan to open the MobSec analysis framework, results, and data
[*] not only to industry and academia, but to society at large,
offering the opportunity to submit and analyze mobile apps for which a
deeper understanding or behavioral detection model is wanted,
according to the research objectives of MobSec.

In short, we hope that the links outlined above will foster impact in
three ways: they will enable us to promote the results of MobSec to
industry, academia, and society at large; they will provide
valuable real-world data of great importance for evaluating the
effectiveness of MobSec in real-world settings; and they will further
strengthen the collaborative research efforts between academia and
industry to address challenging current and upcoming problems.

[*] for all the non-NDA data.

Funded Value:

£747,776

Funded Period:

Nov 14 - Aug 18

Funder:

EPSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

EP/L022710/1

Principal Investigator:

Lorenzo Cavallaro

Research Subject:

Info. & commun. Technol. (100%)

Research Topic:

Fundamentals of Computing (25%)

Mobile Computing (25%)

Modelling & simul. of IT sys. (25%)

Networks & Distributed Systems (25%)

Organisations

People	ORCID iD
Lorenzo Cavallaro (Principal Investigator)
Johannes Kinder (Co-Investigator)

Publications

Repel D (2017) Modular Synthesis of Heap Exploits

Suarez-Tangil G (2017) DroidSieve

Tam K (2015) CopperDroid: Automatic Reconstruction of Android Malware Behaviors

Tam K (2017) The Evolution of Android Malware and Android Analysis Techniques in ACM Computing Surveys

Wagner J (2015) High System-Code Security with Low Overhead

Key Findings
Impact Summary
Further Funding
Research Databases and Models
Research Tools and Methods
Collaboration
Intellectual Property
Software and Technical Products
Engagement Activities


Description	We have developed a system to perform dynamic analysis of Android apps. The novelty of our research lies in the fact that the code to reconstruct the behavior of such apps is automatically generated and works seamlessly across all vanilla Android versions. The behavioral profiles are then fed to machine learning algorithms to classify Android malware into threat families. Initial experiments on concept drift detection show that malicious objects too dissimilar to those observed so far are properly identified, which opens the possibility of planning a proper machine learning retraining strategy. We have been able to understand and detect when ML models start decaying and to measure the effect of concept drift in security contexts. The main output has been published in top-tier conferences (e.g., NDSS 2015, USENIX Security 2017 and 2019) or well-known specialized workshops (e.g., AISec).
Exploitation Route	The dynamic analysis system we have developed can be integrated by vendors into Android app markets to reconstruct apps' actions and identify suspicious behaviors. Similar reasoning applies to our approach to detecting and measuring the effect of concept drift. We have had fruitful conversations with Google (Android Security Team), Facebook (Anti-abuse Team) and Huawei (Cloud Security Europe).
Sectors	Digital/Communication/Information Technologies (including Software)
URL	https://s2lab.kcl.ac.uk


Description	The findings of this research project helped to better understand the effect of evolving threats in computer security contexts. In the process, the project identified several experimental biases that affect the performance of learning-based models in non-stationary contexts. It highlighted the importance of evaluating computer security tasks from a time-aware perspective, both to understand how quickly performance decays over time and to compare alternative options with a sound methodology. Several institutions in academia and industry have been adopting the framework that was developed as one of the outcomes of this research project. See https://s2lab.cs.ucl.ac.uk and https://s2lab.cs.ucl.ac.uk/projects/tesseract for further details.
First Year Of Impact	2019
Sector	Digital/Communication/Information Technologies (including Software),Education,Other
Impact Types	Societal
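
The time-aware evaluation described above can be illustrated with a minimal sketch. It is not the project's framework: the function names (temporal_split, f1_per_month), the (timestamp, features, label) sample layout, and the monthly test windows are all assumptions chosen for illustration. The idea shown is only that training data must precede test data in time, and that performance is reported per time window so that decay becomes visible.

```python
from collections import defaultdict
from datetime import date


def temporal_split(samples, cutoff):
    """Split (timestamp, features, label) samples so training strictly precedes testing."""
    train = [s for s in samples if s[0] < cutoff]
    test = [s for s in samples if s[0] >= cutoff]
    return train, test


def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1); returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1_per_month(test, predict):
    """Group test samples by (year, month) and score each window separately."""
    buckets = defaultdict(list)
    for ts, features, label in test:
        buckets[(ts.year, ts.month)].append((label, predict(features)))
    return {
        month: f1_score([y for y, _ in pairs], [p for _, p in pairs])
        for month, pairs in sorted(buckets.items())
    }
```

A classifier would be fitted on the first list returned by temporal_split and passed, as a predict callable, to f1_per_month together with the second list; a downward trend across the returned windows is the kind of decay the description refers to.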


Description	GCHQ Small Grants scheme 2015-2016
Amount	£39,000 (GBP)
Organisation	Government Communications Headquarters (GCHQ)
Sector	Public
Country	United Kingdom
Start	01/2016
End	03/2016


Description	McAfee Labs donation
Amount	$80,000 (USD)
Organisation	McAfee
Sector	Private
Country	United States
Start	06/2014
End	06/2016


Description	NCSC Small Grants scheme 2017-2018
Amount	£20,000 (GBP)
Organisation	National Cyber Security Centre
Sector	Public
Country	United Kingdom
Start	11/2017
End	01/2018


Description	NVIDIA GPU donation
Amount	£3,000 (GBP)
Organisation	NVIDIA
Sector	Private
Country	Global
Start


Title	Conformal Evaluator
Description	We have developed a statistical machine learning evaluation framework to provide a quantifiable assessment of the quality of a given machine learning classification. Not only does this make it possible to understand how well an approach may perform in real-life deployment, but it also provides metrics that can be leveraged to detect concept drift, and thus decay in classifier performance (suggesting retraining strategies), in realistic settings.
Type Of Material	Improvements to research infrastructure
Year Produced	2017
Provided To Others?	Yes
Impact	N/A yet. We are filing a patent and we are using this approach in the output generated by MobSec: Security in the Mobile Age EPSRC research grant. We are in the process of publishing the results of this approach, including source code, to enable other research groups to build on this outcome.


Title	CopperDroid
Description	We have developed the infrastructure to enable dynamic analysis and classification of Android applications at scale; we are in the process of finalizing a RESTful API to provide free use of the service to practitioners and researchers.
Type Of Material	Improvements to research infrastructure
Provided To Others?	No
Impact	We are in discussions with Google and McAfee about a potential integration of our analysis system into their backend infrastructure.


Title	Conformal Evaluator
Description	We have developed a statistical machine learning evaluation framework to provide a quantifiable assessment of the quality of a given machine learning classification. Not only does this make it possible to understand how well an approach may perform in real-life deployment, but it also provides metrics that can be leveraged to detect concept drift, and thus decay in classifier performance (suggesting retraining strategies), in realistic settings.
Type Of Material	Data analysis technique
Provided To Others?	No
Impact	N/A yet. We are filing a patent and we are using this approach in the output generated by MobSec: Security in the Mobile Age EPSRC research grant. We are in the process of publishing the results of this approach, including source code, to enable other research groups to build on this outcome.


Description	TU Munich
Organisation	Aston University
Department	Computer Science
Country	United Kingdom
Sector	Academic/University
PI Contribution	We are working on a joint research proposal to be submitted to DFG (TU Munich side) and EPSRC (RHUL side) to address coverage issues faced when analyzing Android apps. We bring our expertise in dynamic analysis and symbolic execution, while TU Munich brings its expertise in static analysis.
Collaborator Contribution	We are working on a joint research proposal to be submitted to DFG (TU Munich side) and EPSRC (RHUL side) to address coverage issues faced when analyzing Android apps. We bring our expertise in dynamic analysis and symbolic execution, while TU Munich brings its expertise in static analysis.
Impact	Joint research proposal to acquire additional research funding (RHUL is going to budget 2 PDRA)
Start Year	2016


Description	University of Luxembourg
Organisation	University of Luxembourg
Department	Interdisciplinary Centre for Security, Reliability and Trust (SnT)
Country	Luxembourg
Sector	Academic/University
PI Contribution	We have been discussing themes revolving around Android app analysis and pitfalls in current datasets. This has resulted in co-authoring a research paper that aims at providing insights into the landscape of repackaged (piggybacked) Android apps. Contributions so far have consisted of brainstorming, discussions, and feedback on the manuscript.
Collaborator Contribution	We have been discussing themes revolving around Android app analysis and pitfalls in current datasets. This has resulted in co-authoring a research paper that aims at providing insights into the landscape of repackaged (piggybacked) Android apps. Contributions so far have consisted of brainstorming, discussions, and feedback on the manuscript.
Impact	No output yet. We are at an early stage, but we have been discussing research directions and are currently co-authoring one research paper.
Start Year	2016


Title	METHOD OF MONITORING THE PERFORMANCE OF A MACHINE LEARNING ALGORITHM
Description	A crucial requirement for building sustainable learning models is to train on a wide variety of samples. Unfortunately, objects on which the learned models are used may evolve and the learned models may no longer work well. The invention provides a framework to identify aging classification models in vivo during deployment (concept drift), well before the machine learning model's performance starts to degrade. A statistical comparison of samples seen during deployment with those used to train the model is used, thereby building metrics for classification quality.
IP Reference	WO2019002603
Protection	Patent application published
Year Protection Granted	2019
Licensed	No
Impact	We have been contacted by several academic institutions that want access to our code; we are working on a license suitable for academic researchers as well as for industrial partners. To this end, we are in contact with Huawei Technologies to pursue research impact, potentially through licensing.
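
The statistical comparison mentioned in the description can be illustrated with a simplified, conformal-style p-value. This is a sketch under stated assumptions, not the patented method or the released library: the nonconformity measure (Euclidean distance to the centroid of the training points), the 0.05 threshold, and the names p_value and flag_drift are all hypothetical choices made for illustration.

```python
import math


def p_value(calibration_scores, test_score):
    """Fraction of calibration nonconformity scores at least as large as the test score."""
    if not calibration_scores:
        raise ValueError("need at least one calibration score")
    at_least = sum(1 for s in calibration_scores if s >= test_score)
    return at_least / len(calibration_scores)


def centroid(points):
    """Component-wise mean of equal-length numeric tuples."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def flag_drift(train_points, new_point, threshold=0.05):
    """Flag a deployment-time sample whose p-value against the training data is below threshold.

    Nonconformity here is the distance to the training centroid (an assumption);
    any score where larger means "less like the training data" could be used instead.
    """
    center = centroid(train_points)
    calibration = [math.dist(p, center) for p in train_points]
    return p_value(calibration, math.dist(new_point, center)) < threshold
```

A low p-value means that few training samples look as unusual as the new one, which is the signal that the deployed model is being applied to objects unlike those it was trained on. In practice the scores would be computed per predicted class and with a measure tied to the underlying classifier.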


Title	Conformal evaluator
Description	This is the Python library that implements conformal evaluator, a framework to statistically assess the quality of a broad range of machine learning algorithms.
Type Of Technology	Software
Year Produced	2016
Impact	We are using this evaluation internally across projects, but we plan to release the Python library as open source so that the community can apply statistical evaluation to machine learning tasks.


Title	CopperDroid and related machine learning infrastructure
Description	CopperDroid is a dynamic analysis framework to reconstruct the behavior of Android apps. Besides providing information to analysts, the reconstructed behaviors are fed to machine learning to enable automated classification of Android apps and malware.
Type Of Technology	Software
Year Produced	2015
Impact	We are engaged in a number of conversations with industrial partners (e.g., McAfee Labs, Qualcomm, and Google) and academia (e.g., University of Luxembourg, National University Singapore, TU Munich) to further monetize the analysis capabilities of CopperDroid.
URL	http://copperdroid.isg.rhul.ac.uk


Title	Tesseract: Eliminating Experimental Bias in Malware Classification across Space and Time
Description	See https://s2lab.kcl.ac.uk/projects/tesseract/
Type Of Technology	Software
Year Produced	2019
Open Source License?	Yes
Impact	See https://s2lab.kcl.ac.uk/projects/tesseract/
URL	https://s2lab.kcl.ac.uk/projects/tesseract/


Title	Transcend - Detection of Concept Drift in Malware Classifiers
Description	See https://s2lab.kcl.ac.uk/papers/files/usenixsec2017.pdf
Type Of Technology	Software
Year Produced	2017
Open Source License?	Yes
Impact	We refactored and open-sourced Transcend in 2019.


Description	A number of talks given to industry and academia
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Postgraduate students
Results and Impact	I have given a number of talks to further disseminate the outcome of the research carried out by the 2 EPSRC grants I am currently PI for. The following is just an excerpt, but I will update the list with proper entries soon. The talks often cover topics carried out by both grants: OWASP AppSec EU Keynote 2014, Dagstuhl 2014, National Cyber Crime Unit 20145, BlackHat London Mobile Summit 2015, Georgia Tech 2015, Stony Brook University 2015, Qualcomm Inc. 2015, Google 2015, University of Catania 2015, University of Luxembourg 2015, IMDEA Software 2016, Kyushu University 2016, NIMBUS (EPSRC) 2016, Polytechnic University of Hong Kong 2016
Year(s) Of Engagement Activity	2014,2015,2016


Description	Automatic Analysis and Classification of Android Malware (Concept Drift Detection)
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Postgraduate students
Results and Impact	This talk outlines how the analysis we have been exploring in the "MobSec: Security in the Mobile Age" EPSRC project influenced the "Mining the Network Behaviour of Bots" EPSRC grant and vice versa. In particular, we have been introducing a statistical machine learning evaluation framework to identify concept drift. This is applicable not only to the domains explored in these two research projects, but to other fields as well.
Year(s) Of Engagement Activity	2016,2017


Description	Invited talks at Huawei Germany, Huawei Finland, University of Bologna, King's College London
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Industry/Business
Results and Impact	Several talks promoting the results of the USENIX Sec 2018 paper. The aim is to pursue research impact through licensing of this research.
Year(s) Of Engagement Activity	2017
