PERSEVERE: A Rigorous Foundation for Persistent Verification
Lead Research Organisation:
Imperial College London
Department Name: Computing
Abstract
When your PC loses power, or your mobile phone runs out of battery, anything you were working on at the time (e.g. a game of Solitaire) is lost forever. This is because the RAM, the memory your device uses to store data temporarily, is wiped clean whenever it loses power. However, documents you saved on your PC or the photos you took on your phone are still on your device when it switches back on. This is because they are saved to the hard drive, which is permanent storage that keeps its data even if it loses power.
Recently, technology companies have manufactured a new type of RAM, non-volatile memory or NVM, that does not lose its data when it loses power. This means that when a device switches back on after losing power or something catastrophic like a crash, its data is still available on NVM; we say that its data persists. This means that with careful engineering we may recover the data and not lose our work.
However, this is not quite so straightforward. Modern devices are very fast, and they achieve this speed through clever methods for getting work done efficiently.
For example, they do multiple tasks all at once or do a list of tasks in a different order. Sometimes this means leaving your data in a bad state temporarily and fixing it later, e.g. deleting your old data before saving a new version. If we lose power during a bad temporary state (e.g. after deleting old data but before saving its replacement), then when we restart the device we may recover bad data. This has many unfortunate consequences, from simply losing data to causing errors in our software.
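The danger of such 'bad temporary states' can be sketched in a few lines. The toy simulation below (the names and the dict-as-storage model are purely illustrative, not any real system) contrasts a delete-then-write update, which loses data if power is cut mid-update, with a write-elsewhere-then-switch update, which always leaves either the old or the new data recoverable:

```python
# Toy simulation of a "bad temporary state". 'disk' stands in for
# persistent storage, and returning early simulates a power cut.

def unsafe_update(disk, key, new_value, crash=False):
    """Delete the old value, then write the new one (crash-unsafe)."""
    del disk[key]          # bad temporary state: the data is gone
    if crash:
        return             # power cut: the key is lost entirely
    disk[key] = new_value

def safe_update(disk, key, new_value, crash=False):
    """Write the new value elsewhere first, then switch over."""
    disk[key + ".tmp"] = new_value    # old value still intact
    if crash:
        return             # power cut: the old value is still readable
    disk[key] = disk.pop(key + ".tmp")

disk = {"doc": "v1"}
unsafe_update(disk, "doc", "v2", crash=True)
print(disk.get("doc"))     # None: the document is lost

disk = {"doc": "v1"}
safe_update(disk, "doc", "v2", crash=True)
print(disk.get("doc"))     # v1: the old version survives
```

This is the same idea behind atomic file replacement (write a temporary file, then rename it over the original): the switch-over is the only step that changes what a reader sees.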
My research project will solve these problems, by studying NVM use from the perspective of hardware (e.g. our phones), software (e.g. our phone apps), and theoretical analysis. I will develop new tools and techniques that will help us build persistent technology, and then use formal (mathematical) methods to prove that these tools and techniques are safe and correct.
My proposed research has three key components. First, I will create NVM 'persistency models', which are rigorous ways of describing exactly what NVM can/cannot do, with mathematical precision. I will then use specialised tools to test Intel and ARM microchips (in our PCs and phones) against my models, and see how they behave when using NVM. Verifying that real-world hardware behaves as expected is an important step towards safe and reliable NVM, as it provides a safe foundation to write software on top of.
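To give a flavour of what a persistency model pins down, here is a deliberately tiny, illustrative model (not the model of any real architecture): pending writes may reach persistent memory in any order unless a barrier forces earlier writes to persist first, and a crash exposes exactly the writes that have persisted so far. Enumerating the reachable crash states shows why barriers matter:

```python
from itertools import permutations

# A toy persistency model: pending writes may persist in any order,
# unless a barrier forces all earlier writes to persist before any
# later one; a crash keeps exactly the writes persisted so far.

def crash_states(writes, barriers=frozenset()):
    """All memory states observable after a crash.

    writes: ordered list of (variable, value) pairs.
    barriers: indices b such that writes 0..b must persist
    before any write after b does (a flush barrier after write b).
    """
    states = set()
    n = len(writes)
    for order in permutations(range(n)):      # one possible persist order
        pos = {idx: p for p, idx in enumerate(order)}
        if any(pos[i] > pos[j]                # does it violate a barrier?
               for b in barriers
               for i in range(b + 1)
               for j in range(b + 1, n)):
            continue
        for k in range(n + 1):                # crash after k writes persist
            mem = {}
            for idx in order[:k]:
                var, val = writes[idx]
                mem[var] = val
            states.add(frozenset(mem.items()))
    return states

writes = [("x", 1), ("y", 1)]                 # program: x := 1; y := 1
relaxed = crash_states(writes)
ordered = crash_states(writes, barriers={0})  # flush barrier after x := 1

bad = frozenset({("y", 1)})                   # y persisted but x did not
print(bad in relaxed)   # True: allowed without the barrier
print(bad in ordered)   # False: the barrier rules it out
```

Real persistency models (e.g. for Intel-x86 or ARM) are far richer than this, but they have the same shape: a precise, checkable rule for which crash states are and are not allowed.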
Second, I will extend modern programming languages to enable writing programs (software) that can control how data persists to NVM, which in turn makes it easier and safer to recover NVM data. Currently it is impossible to write such programs: NVM is so new that persistence control is not yet part of modern programming languages. I will extend these languages and provide example programs and tests. I will then prove that these extensions are correct, so that software companies can rely on them to build their future products.
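As a rough illustration of the kind of language-level control this could enable, the sketch below implements a toy 'failure-atomic block' in Python; the construct and its names are hypothetical stand-ins, not an existing API or the project's actual design. Updates inside the block either all take effect or are rolled back when a simulated crash interrupts them:

```python
# A toy 'failure-atomic block' (hypothetical construct, illustrative
# only). Updates inside the block take effect all-or-nothing: an
# exception stands in for a crash and triggers rollback via an undo log.

class FailureAtomic:
    def __init__(self, store):
        self.store = store

    def __enter__(self):
        self.log = dict(self.store)   # undo log: snapshot the old state
        return self.store

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:      # "crash" mid-block: roll back
            self.store.clear()
            self.store.update(self.log)
        return False                  # let the exception propagate

store = {"balance": 100}
try:
    with FailureAtomic(store) as s:
        s["balance"] -= 30
        raise RuntimeError("power failure")   # simulated crash mid-update
except RuntimeError:
    pass
print(store["balance"])   # 100: the partial update was rolled back
```

A real language extension would of course have to persist the undo log itself to NVM and reason about when each write reaches persistent memory; the sketch only illustrates the all-or-nothing guarantee such a construct would give programmers.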
Finally, I will develop ways to test and verify that programs safely recover NVM data. Testing is an important part of hardware and software development, but testing NVM persistency is currently infeasible: the only way to do this today is to run thousands of tests, each time cutting the power at a different point. However, forcing such frequent power losses is both impractical and inefficient. I will develop new ways to test NVM persistency, which is the final key step for widespread NVM adoption.
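One alternative to physical power cuts is to enumerate crash points systematically: replay a program's persist steps, simulate a crash after each one, run recovery, and check that it always yields valid data. Below is a minimal sketch of such a harness, using an illustrative two-step update protocol (write the new data, then bump a version marker); the protocol and names are invented for the example:

```python
# Crash-point enumeration: simulate a crash after every persist step
# and check that recovery always returns valid data. The two-step
# update protocol here is an illustrative toy.

def steps(mem):
    # the protocol: first write the new data, then bump the marker
    return [
        lambda: mem.__setitem__("data_v2", "new"),
        lambda: mem.__setitem__("version", 2),
    ]

def recover(mem):
    # recovery rule: trust whichever version the marker says is complete
    return mem["data_v2"] if mem.get("version") == 2 else mem["data_v1"]

def check_all_crash_points():
    results = []
    for crash_after in range(3):      # crash after 0, 1, or 2 steps
        mem = {"data_v1": "old", "version": 1}
        for step in steps(mem)[:crash_after]:
            step()                    # this write reached NVM in time
        results.append(recover(mem))  # what does recovery see?
    return results

print(check_all_crash_points())   # ['old', 'old', 'new']
```

The point is coverage: every possible crash point is exercised deterministically, with no physical power cuts, and each one must recover either the old or the new data, never a mixture.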
NVM could save untold amounts of data, money and time every year. Data loss affects not only people who use computers every day, but also data centres and safety-critical technologies worldwide. NVM can make data loss a thing of the past, but it requires a rigorous, safe foundation to be built on, to avoid trading one kind of unpredictability for another. This research project will provide that foundation and unleash the potential of this new technology.
Organisations
- Imperial College London (Fellow, Lead Research Organisation)
- Bloomberg (Collaboration)
- The National Institute for Research in Computer Science and Control (INRIA) (Collaboration)
- Korea Advanced Institute of Science and Technology (KAIST) (Collaboration)
- Facebook (Collaboration)
- University of Cambridge (Project Partner)
- ARM (United Kingdom) (Project Partner)
- Max Planck Institutes (Project Partner)
- Amazon (United States) (Project Partner)
- Facebook UK (Project Partner)
People
- Azalea Raad (Principal Investigator / Fellow)
Publications
- Cho K (2023) Memento: A Framework for Detectable Recoverability in Persistent Memory, in Proceedings of the ACM on Programming Languages
- D'Osualdo E (2023) The Path to Durable Linearizability, in Proceedings of the ACM on Programming Languages
- Le Q (2022) Finding real bugs in big programs with incorrectness logic, in Proceedings of the ACM on Programming Languages
- Raad A (2022) Extending Intel-x86 consistency and persistency: formalising the semantics of Intel-x86 memory types and non-temporal stores, in Proceedings of the ACM on Programming Languages
- Raad A (2022) Concurrent incorrectness separation logic, in Proceedings of the ACM on Programming Languages
- Scodras S (2022) Methodological approaches for identifying competencies for the physiotherapy profession: a scoping review, in Discover Education
Description | Automated Generation and Detection of Exploits via Incorrectness Logic |
Amount | $50,000 (USD) |
Funding ID | 354270840232561 |
Organisation | |
Sector | Private |
Country | United States |
Start | 10/2022 |
Description | Research Institute on Verified Trusted Software Systems (VeTSS) |
Amount | £749,986 (GBP) |
Funding ID | RFA 29229/ Agreement No 4234671 |
Organisation | National Cyber Security Centre |
Sector | Public |
Country | United Kingdom |
Start | 11/2022 |
End | 11/2025 |
Description | Collaboration with INRIA |
Organisation | The National Institute for Research in Computer Science and Control (INRIA) |
Country | France |
Sector | Public |
PI Contribution | I developed a formal consistency and persistency model for non-temporal stores and memory types on the Intel-x86 architectures. |
Collaborator Contribution | A colleague at Inria developed an automated testing infrastructure to empirically validate our formal models. |
Impact | This work led to the following publication: https://doi.org/10.1145/3498683 |
Start Year | 2021 |
Description | Collaboration with KAIST |
Organisation | Korea Advanced Institute of Science and Technology (KAIST) |
Country | Korea, Republic of |
Sector | Academic/University |
PI Contribution | Colleagues at KAIST and I developed a new way of automatically converting volatile lock-free (concurrent) algorithms to persistent ones that behave correctly in the context of non-volatile memory. |
Collaborator Contribution | Colleagues at KAIST and I developed a new way of automatically converting volatile lock-free (concurrent) algorithms to persistent ones that behave correctly in the context of non-volatile memory. |
Impact | This work resulted in a publication at PLDI 2023 (to appear).
Start Year | 2022 |
Description | Collaboration with Lacework and Meta |
Organisation | |
Department | Facebook, UK |
Country | United Kingdom |
Sector | Private |
PI Contribution | We worked on several projects, including a concurrent formalism for bug catching, as well as a new tool for detecting memory safety bugs. |
Collaborator Contribution | Testing infrastructure and codebases. |
Impact | This work led to two publications: https://doi.org/10.1145/3498695 https://doi.org/10.1145/3527325 |
Start Year | 2021 |
Description | Consultancy at Bloomberg |
Organisation | Bloomberg |
Country | United States |
Sector | Private |
PI Contribution | We developed a novel theoretical framework based on the idea of under-approximation for detecting security vulnerabilities in software. Our theoretical framework guarantees that any such bugs detected are indeed security vulnerabilities and not false positives. |
Collaborator Contribution | My collaborator provided many interesting and realistic case studies of security vulnerabilities in industry code. This was particularly important, as it made the contributions highly relevant to industry.
Impact | The outcomes are under submission for publication.
Start Year | 2022 |