Explainability of Machine Learning models for Adversarial Robustness

Lead Research Organisation: King's College London
Department Name: Informatics

Abstract

Adversarial attacks no longer target only the classifier; they can now also bypass explanation methods. Once thought to be an effective defense, explanation methods now give a false sense of security for many high-performing deep learning models in the computer vision domain. This has made security practitioners wary that these flaws could transfer to the security domain. However, we believe that explanation methods play a major role in the robustness of classifiers against adversarial attacks. With a deeper understanding of these explainers, we can better use them as a defense and forensics tool, not only for security but for all machine learning problems. Our central question is "How can explainability of machine learning models improve adversarial robustness in the security domain?". This is broken down into four smaller research questions: "Is the terminology in the literature for the explanation's robustness appropriate?", "Can explanations for computer vision be transferred to the security domain?", "Can explanation methods help us better understand and tackle practical challenges in security?", and "Do adversarial attacks reveal a similar explainer behavior to drift?". We aim to answer these with four main projects: Dataset bias, Explanation Affinity triangle, Drift forensics and Adversarial attacks.

People

Hoifung Chow (Student)

Publications

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/T517963/1                                   30/09/2020  29/09/2025
2608271            Studentship   EP/T517963/1  30/09/2021  30/03/2025  Hoifung Chow