Fuzzing for Information Leakage

Lead Research Organisation: University College London
Department Name: Computer Science

Abstract

The plan for my thesis is to apply the system testing technique of fuzzing to the domain of information
leakage. Working with my primary supervisor Dr David Clark, we have devised a rough plan to work
toward this. The following sections describe how I plan to focus my research over the coming 3 years,
split into three distinct stages:
i. Detecting Information Leakage using Fuzzing (10 Months)
Firstly, I plan to research and develop a technique for applying fuzzers to the task of searching for
information leakage. My initial idea is to use hypertesting, which here refers to pairs of tests with
matching low security inputs but differing high security inputs. Fuzzing
traditionally uses regular testing, so I will either need to modify an existing fuzzer or write a new fuzzer in
order to generate hypertests.
I currently participate in a research group that is investigating automated repair of information leakage,
and we have recently had a short paper accepted to ASE 2021 - HyperGI: Automated Detection and
Repair of Information Flow Leakage - documenting a genetic improvement based approach to repair. In
order to evaluate the fitness of candidate repairs, an approximate measure of information leakage is
required, and for this a set of hypertests is run against the candidate. Currently the hypertests are
generated using a heuristic, which works well for known leaks; however, a fuzzing based approach would
certainly improve the system. The work that I have done in my first year individual project will also be
useful for generating more diverse test sets that can improve the functional testing of the system.
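
As a minimal sketch of the hypertest idea described in this stage, the Python below runs a program twice with the same low security input and two different high security inputs, and reports a leak when the observable outputs differ. The toy programs `leaky_check` and `safe_check`, the 8-bit input ranges, and the purely random generation are all assumptions made for illustration; they are not taken from HyperGI or from any existing fuzzer.

```python
import random


def leaky_check(low: int, high: int) -> int:
    """Hypothetical program under test: the output depends on the secret."""
    return low + (1 if high > 100 else 0)


def safe_check(low: int, high: int) -> int:
    """Hypothetical program under test: the output ignores the secret."""
    return low * 2


def random_hypertest(rng: random.Random):
    """Build one hypertest: a shared low input and two distinct high inputs."""
    low = rng.randrange(256)
    high_a, high_b = rng.sample(range(256), 2)
    return low, high_a, high_b


def find_leak(program, trials: int = 1000, seed: int = 0):
    """Return a witness hypertest whose two outputs differ, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        low, high_a, high_b = random_hypertest(rng)
        if program(low, high_a) != program(low, high_b):
            return low, high_a, high_b
    return None
```

A witness returned by `find_leak` shows that the output varies with the secret while the low input is held fixed. A result of `None` only means that no leak was observed within the given number of trials, not that the program is free of leakage.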
ii. Measuring Information Leakage using Fuzzing (10 Months)
Measuring information leakage is subtly distinct from detecting it, and relies a little
more heavily on information theory. Measuring information leakage (known as quantitative information
flow, or QIF, in the literature) is difficult to do exactly, and in fact infeasible for either large input spaces or very
complex programs. There is some older work using statistics to instead estimate a bound for the mutual
information between high security secrets and low security observables, yielding an estimate for the
information leakage. This work relied on generating random inputs, and the proof of concept worked on
small, integer only programs. More recently, those researching machine learning have taken an interest in
estimating mutual information, as it is directly useful for feature selection. By building on either the older
information leakage specific work or the more recent machine learning work, I should be able to extend
the leakage detection technique to also measure the quantity of said leakage.
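
To illustrate what estimating mutual information from samples can look like, the sketch below computes the simple plug-in (empirical frequency) estimate of I(H; O) in bits from a list of (high input, observable output) pairs. This is only one basic estimator, chosen here for brevity; it is not claimed to be the method of the older statistical work or of the machine learning estimators mentioned above, and the function name `mutual_information` is hypothetical.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Plug-in estimate of I(H; O) in bits from (high, observable) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    highs = Counter(h for h, _ in pairs)
    outs = Counter(o for _, o in pairs)
    mi = 0.0
    for (h, o), count in joint.items():
        # Each term is p(h, o) * log2(p(h, o) / (p(h) * p(o))).
        mi += (count / n) * math.log2(count * n / (highs[h] * outs[o]))
    return mi


# A 3-bit secret whose parity is revealed by the output.
parity_samples = [(h, h % 2) for h in range(8)]
print(mutual_information(parity_samples))  # 1.0
```

With every 3-bit secret sampled once, revealing the parity gives an estimate of exactly one bit, while a constant output gives zero. On small random samples the plug-in estimate tends to overestimate the true mutual information, which is one reason more careful estimators and bounds are of interest.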
iii. Improving Fuzzing for Information Leakage (10 Months)
There are a number of interesting techniques that may prove useful in aiding the search for leakage.
As a start, there is the concept that my first year project has been based on, which increases the input
diversity of generated tests. I could also look into a grammar based fuzzing approach, or apply some of Marcel
Boehme's ideas around cost estimation and/or coverage entropy (prioritising fuzzing of inputs with rarely
seen coverage properties) to improve the technique. I should explore as many ideas as possible, and this
should be feasible given the experience I will have in evaluating the effectiveness of fuzzing for
information leakage by this point.
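
The idea of prioritising inputs with rarely seen coverage properties can be sketched very loosely as follows. Each seed is weighted by the inverse of how many seeds in the corpus share its coverage signature, so seeds with unusual coverage are picked more often. This is an illustrative simplification under stated assumptions, not a reproduction of any published power schedule; `rarity_weights`, `pick_seed`, and the `coverage_of` callback are hypothetical names.

```python
import random
from collections import Counter


def rarity_weights(seeds, coverage_of):
    """Weight each seed by the inverse frequency of its coverage signature."""
    counts = Counter(coverage_of(s) for s in seeds)
    return [1.0 / counts[coverage_of(s)] for s in seeds]


def pick_seed(seeds, coverage_of, rng):
    """Choose the next seed to mutate, favouring rare coverage signatures."""
    weights = rarity_weights(seeds, coverage_of)
    return rng.choices(seeds, weights=weights, k=1)[0]


# Stand-in coverage signature (an assumption): the length of the input.
corpus = ["a", "b", "c", "zz"]
next_seed = pick_seed(corpus, len, random.Random(0))
```

In this toy corpus the three one-character seeds share a signature and each receive weight 1/3, while the single two-character seed receives weight 1. A real fuzzer would derive the signature from instrumented coverage rather than from input length.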
This initial plan leaves time for further review of specifically relevant literature at the time that it is useful
(rather than budgeting a single block at the beginning of the PhD). There is also time set aside for
formatting the work into a thesis.

Planned Impact

The EPSRC Centre for Doctoral Training in Cybersecurity will train over 55 experts in multi-disciplinary aspects of cybersecurity, from engineering to crime science and public policy.

Short term impacts are associated with the research outputs of the 55+ research projects that will be undertaken as part of the doctoral studies of CDT students. Each project will tackle an important cybersecurity problem, and propose and evaluate solutions, interventions and policy options. Students will publish their results in international peer-reviewed journals, and also disseminate them through blog posts and material geared towards decision makers and experts in adjacent fields. Through industry placements relating to their projects, all students will have the opportunity to implement and evaluate their ideas within real-world organizations, to achieve short term impact in solving cybersecurity problems.

In the longer term, graduates of the CDT will assume leading positions within industry, government, law enforcement, the third sector and academia to increase the capacity of the UK in being a leader in cybersecurity. From those leadership positions they will assess options and formulate effective interventions to tackle cybercrime, secure the UK's infrastructure, establish norms of cooperation between industries and government to secure IT systems, and become leading researchers and scholars, further increasing the UK's capacity in cybersecurity in the years to come. The last impact is likely to be significant given that currently many higher education training programs do not have capacity to provide cybersecurity training at undergraduate or graduate levels, particularly in non-technical fields.

The full details of our plan to achieve impact can be found in the "Pathways to Impact" document.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S022503/1                                   01/04/2019  23/11/2028
2401210            Studentship   EP/S022503/1  01/10/2020  30/09/2024  Daniel Blackwell