Exploring Trust in AI-Enabled Systems

Lead Research Organisation: University College London
Department Name: UCL Interaction Centre

Abstract

While methods of evaluating the efficacy of explainability techniques in AI systems are numerous and varied, measures of explainability or interpretability (a common taxonomy is lacking) rarely consider the issue of trust. Previous studies by Pieters (2010) and Miller (2019) show that this relationship is not as straightforward as it seems: complete but unsound explanations improve the user's mental model yet reduce trust, while overly detailed explanations fail to elicit any trust at all. The aim of this study is to explore more effective ways of measuring trust in AI systems and to extend the scope of the research to the distinction between cognitive and affective trust in service relationships. The former is a user's confidence in, or willingness to rely on, a service provider's competence and reliability, derived from accumulated knowledge; the latter is the confidence one places in an entity based on feelings generated by the level of care and concern the entity demonstrates. Given this distinction, should the evaluation of user trust in the safety and security of an artificially intelligent system, such as an autonomous agent, adopt techniques from the literature on affective trust? Trust in automation is addressed explicitly by Rempel et al. (1985), who claim that trust evolves along the three dimensions of predictability, dependability, and faith. Other works, however, disagree on what these dimensions are, pointing instead to trial-and-error experience, understanding, and faith (Zuboff, 1988); dependability and faith (McKnight et al., 2002); and experience, understandability, and observability (Rogers, 2003). There are therefore multiple factors to take into account, including the question of to whom a machine learning system might be interpretable. One stakeholder particularly relevant to safety and security, and largely missing from the existing explainability literature, is the attacker. Here, further research will focus on building uncertainty awareness into models in order to improve the robustness of the system. However, since transparency-based explanation techniques are potentially more exploitable by attackers than post-hoc techniques, the study will also explore and evaluate the security of XAI techniques themselves, framed as a trade-off with explainability. There are many directions this project could take; the current plan is to evaluate and compare existing XAI techniques and to design novel formal approaches to the measurement of trust in human-AI systems that address its many stakeholders.
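
As a concrete illustration of what "uncertainty awareness" inside a model could look like, the sketch below uses Monte Carlo dropout: dropout is kept active at prediction time and the spread of repeated stochastic forward passes is reported alongside the prediction. This is an illustrative assumption, not the method proposed in the studentship; the class, layer sizes, and parameters (MCDropoutClassifier, p_drop, n_samples) are hypothetical.

    # Minimal sketch of uncertainty-aware prediction via Monte Carlo dropout (PyTorch).
    import torch
    import torch.nn as nn

    class MCDropoutClassifier(nn.Module):
        def __init__(self, n_features: int, n_classes: int, p_drop: float = 0.5):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(n_features, 64),
                nn.ReLU(),
                nn.Dropout(p_drop),   # kept active during MC sampling below
                nn.Linear(64, n_classes),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.net(x)

        @torch.no_grad()
        def predict_with_uncertainty(self, x: torch.Tensor, n_samples: int = 50):
            """Mean class probabilities and their std over n_samples stochastic passes."""
            self.train()  # leaves dropout on; in practice one would enable only the Dropout layers
            probs = torch.stack(
                [torch.softmax(self.forward(x), dim=-1) for _ in range(n_samples)]
            )
            return probs.mean(dim=0), probs.std(dim=0)

    if __name__ == "__main__":
        model = MCDropoutClassifier(n_features=10, n_classes=3)
        x = torch.randn(4, 10)  # small batch of dummy inputs
        mean_p, std_p = model.predict_with_uncertainty(x)
        print("predicted classes:", mean_p.argmax(dim=-1))
        print("per-class uncertainty (std):", std_p)

A high standard deviation across the sampled passes flags inputs on which the model is unsure, which is one way such a signal could be surfaced to users or used to harden the system against adversarial inputs.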

Publications


Studentship Projects

Project Reference   Relationship   Related To     Start        End          Student Name
EP/N509577/1                                      30/09/2016   24/03/2022
2407974             Studentship    EP/N509577/1   30/09/2020   29/09/2024   Federico Milana
EP/T517793/1                                      30/09/2020   29/09/2025
2407974             Studentship    EP/T517793/1   30/09/2020   29/09/2024   Federico Milana