Towards Practical Federated Analytics and Multi-Target Privacy Enhancing Technologies (PETs)

Lead Research Organisation: University of Warwick
Department Name: Computer Science

Abstract

Privacy Enhancing Technologies (or PETs) are techniques that allow for private data collection, private data analytics and privacy-preserving machine learning. Popular PETs include Differential Privacy, Secure Multiparty Computation and Homomorphic Encryption.
While these tools have much to offer, their adoption in practice is low, and they do not easily apply to modern scenarios with many participants, often with limited processing power and small communication bandwidth. These issues pose a barrier to widespread industry adoption and to the large-scale deployments that would make privacy-preserving data science possible. There is therefore an urgent demand for breakthroughs in PETs to deliver on the promise of privacy-preserving machine learning and private data analysis.
Recent advances in the area are based on the idea of combining PETs with the concept of Federated Learning (FL). The core idea of FL is simple: to enable multiple parties to jointly train a machine learning model without sharing any local (private) data. More recently, the broader notion of Federated Analytics (FA) has emerged. The aim of FA is to take the core ideas of FL, with clients performing local computations over their data and making only the aggregated results available to a central server, and apply them to analytics. Unlike FL, the focus is less on learning and more on collecting direct analytics about clients.
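The core idea of FL described above can be illustrated with a minimal sketch. It assumes a one-parameter model y ≈ w·x and a plain weighted-averaging server step in the style of federated averaging; the names `local_update` and `federated_round`, the learning rate and the toy data are all hypothetical and are not taken from the project. No privacy mechanism is applied here: the sketch only shows that raw data stays on the clients while model parameters are exchanged.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held locally by one client


def local_update(w: float, data: Dataset, lr: float = 0.05, steps: int = 5) -> float:
    """Client side: a few gradient-descent steps on local data for y ~ w * x."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w: float, clients: List[Dataset]) -> float:
    """Server side: average the client models, weighted by local dataset size.

    The server only ever sees each client's updated parameter, never (x, y).
    """
    total = sum(len(d) for d in clients)
    return sum(len(d) * local_update(w, d) for d in clients) / total


# Toy data: every client's points lie on y = 2x (an assumption for the demo).
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0)],
]

w = 0.0
for _ in range(20):
    w = federated_round(w, clients)
```

After the loop, `w` should be very close to 2.0, the slope shared by all clients' data. In a privacy-preserving deployment the per-client parameters would themselves need protection, which is where the PETs discussed in this abstract come in.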
The objectives for this PhD are to substantially extend what is possible with privacy-preserving federated computation in both FL and FA settings and to contribute towards making end-to-end privacy-preserving data science pipelines a viable solution. The main focus of this project will be to develop new FA techniques. As FA is a relatively new field, there are plenty of open directions, from computing specific analytics such as quantiles and range queries to addressing arbitrary aggregate computations more generally.
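As a rough illustration of the analytics mentioned above, the following sketch answers range queries and approximate quantiles from a histogram that is summed across clients. It is one simple approach under stated assumptions, not the project's method: the fixed bucket edges, the function names and the toy values are hypothetical, and the aggregation is done in the clear, whereas a private deployment would sum the histograms under a PET.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List, Sequence


def local_histogram(values: Sequence[float], edges: Sequence[float]) -> List[int]:
    """Client side: count local values per bucket [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        # Clamp out-of-range values into the first or last bucket.
        i = min(max(bisect_right(edges, v) - 1, 0), len(counts) - 1)
        counts[i] += 1
    return counts


def aggregate(histograms: Sequence[Sequence[int]]) -> List[int]:
    """Server side: element-wise sum of the client histograms."""
    return [sum(column) for column in zip(*histograms)]


def range_count(counts: Sequence[int], lo_bucket: int, hi_bucket: int) -> int:
    """Range query: number of values in buckets lo_bucket..hi_bucket inclusive."""
    return sum(counts[lo_bucket:hi_bucket + 1])


def quantile_bucket(counts: Sequence[int], q: float) -> int:
    """Approximate quantile: first bucket whose cumulative count reaches q * total."""
    target = q * sum(counts)
    for i, cumulative in enumerate(accumulate(counts)):
        if cumulative >= target:
            return i
    return len(counts) - 1


edges = [0, 10, 20, 30, 40]  # assumed public bucket boundaries
client_data = [[3, 12, 25], [8, 15, 18, 33], [22, 27, 39]]

total = aggregate([local_histogram(d, edges) for d in client_data])
in_10_to_30 = range_count(total, 1, 2)   # values in [10, 30)
median_bucket = quantile_bucket(total, 0.5)
```

With the toy data, `total` is `[2, 3, 3, 2]`, `in_10_to_30` is 6, and `median_bucket` is 1, meaning the median falls in [10, 20). The answers are only as precise as the bucket width, which is one reason finer-grained quantile and range-query techniques remain an open direction.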
A key direction will be creating techniques that allow us to combine FA methods with FL to form end-to-end private learning systems and private data analytics pipelines. This is a vital direction for making these technologies viable for industry use and for eventual widespread adoption. Research directions include:
1. Collecting analytics over time: An important application of analytics is collecting data on a group of users over time and analysing temporal trends. Repeatedly collecting data on the same user can degrade privacy if not handled carefully. Recent results have allowed for private temporal statistics in dynamic databases and time series data. Extending this to the FA case would be another important step in making these data collection techniques practical for industry.
2. Secure Aggregation and Distributed DP: FL and FA are blanket terms that can be instantiated within a variety of privacy models. Examples include secure aggregation, where cryptographic methods are used to combine the inputs from all users to compute the exact answer, and distributed differential privacy, where each user adds a "share" of DP noise to their input. One research direction would be to study these privacy models and their limitations and to propose new techniques to deal with them.
3. Supporting FL compression methods with PETs: Effective FL methods hinge on keeping the communication overhead for participating clients small, since this cost can be a primary bottleneck for FL systems. Methods often combine the standard FL techniques with quantization of model updates to compress data. However, it is less clear how the aforementioned privacy models can be extended to support compressed or quantized communications. Hence there is a need to design compression methods that are compatible with these privacy models.
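The privacy models in directions 2 and 3 can be sketched together in a toy pipeline: each client adds a share of noise, quantizes its value, and hides it behind pairwise masks that cancel in the sum. This is a simplified illustration only and carries no formal privacy guarantee. The modulus, the function names and the use of continuous Gaussian noise before rounding are assumptions made for brevity; practical designs typically derive masks from pairwise key agreement between clients rather than a shared random generator, and use integer-valued noise.

```python
import math
import random
from typing import List

MODULUS = 2**32  # assumed size of the integer group used for masking


def stochastic_round(x: float, scale: int, rng: random.Random) -> int:
    """Quantize x to a multiple of 1/scale, rounding up with probability
    equal to the fractional part, so the result is unbiased."""
    scaled = x * scale
    low = math.floor(scaled)
    return low + (1 if rng.random() < scaled - low else 0)


def pairwise_masks(n: int, rng: random.Random) -> List[int]:
    """Net mask per client: for each pair i < j, client i adds a random m
    and client j subtracts it, so all masks cancel in the total."""
    masks = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            m = rng.randrange(MODULUS)
            masks[i] = (masks[i] + m) % MODULUS
            masks[j] = (masks[j] - m) % MODULUS
    return masks


def secure_sum(quantized: List[int], rng: random.Random) -> int:
    """Server adds up masked values only; the masks cancel modulo MODULUS."""
    masks = pairwise_masks(len(quantized), rng)
    masked = [(q + m) % MODULUS for q, m in zip(quantized, masks)]
    total = sum(masked) % MODULUS
    # Map the residue back to a signed integer.
    return total - MODULUS if total >= MODULUS // 2 else total


def private_sum(values: List[float], scale: int, sigma: float,
                rng: random.Random) -> float:
    """Each of n clients adds Gaussian noise with std sigma / sqrt(n), so the
    summed noise has std sigma, then quantizes and joins the masked sum."""
    n = len(values)
    noisy = [v + rng.gauss(0.0, sigma / math.sqrt(n)) for v in values]
    quantized = [stochastic_round(x, scale, rng) for x in noisy]
    return secure_sum(quantized, rng) / scale


estimate = private_sum([0.5, 1.25, -0.75], scale=4, sigma=1.0,
                       rng=random.Random(0))
```

With `sigma=0.0` and inputs that are exact multiples of `1/scale`, `private_sum` returns the exact sum, which shows that the masks cancel. With noise enabled, the interaction between rounding, modular wrap-around and the noise distribution is exactly the kind of compatibility question that direction 3 raises.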
Alignment with EPSRC research themes: ICT networks and distributed systems; Artificial Intelligence (Trustworthy Autonomous Systems); Digital Economy (Trust, identity, privacy and security); Global Uncertainties (Cybersecurity)

Studentship Projects

Project Reference | Relationship | Related To   | Start      | End        | Student Name
EP/W523793/1      |              |              | 01/10/2021 | 30/09/2025 |
2598750           | Studentship  | EP/W523793/1 | 04/10/2021 | 27/12/2025 | Samuel Maddock