Antecedents and Consequences of Trust in Artificial Agents
Lead Research Organisation:
University of Kent
Department Name: Sch of Psychology
Abstract
Machines powered by artificial intelligence (AI) are revolutionising the social world. We rely on AI when we check the traffic on Google Maps, when we connect with a driver on Uber, or when we apply for a credit check. But as the technological sophistication of AI increases, so too do the number and type of tasks that we rely on AI agents for - for example, to allocate scarce medical resources and assist with decisions about turning off life support, to recommend criminal sentences, and even to identify and kill enemy soldiers. AI agents are approaching a level of complexity that increasingly requires them to embody not just artificial intelligence but also artificial morality, making decisions that would readily be described as moral or immoral if made by humans.
The increased use of AI agents has the potential for tremendous economic and social benefits, but for society to reap these benefits, people need to be able to trust these AI agents. While we know that trust is critical, we know very little about the specific antecedents and consequences of such trust in AI, especially when it comes to the increasing use of AI in morally-relevant contexts. This is important because morality is far from simple: We live in a world replete with moral dilemmas, with different ethical theories favouring different mutually exclusive actions. Previous work in humans shows that we use moral judgments as a cue for trustworthiness, so that it is not enough to just ask whether we trust someone to make moral decisions: we have to consider the type of moral decision they are making, how they are making it, and in what context. If we want to understand trust in AI, we need to ask the same questions - but there is no guarantee that the answers will be the same.
We need to understand how trust in AI depends on what kind of moral decision the agent is making (e.g. consequentialist or deontological judgments: Research Question #1), how it is making it (e.g. based on a coarse and interpretable set of decision rules or "black box" machine learning: Research Question #2), and in what relational and operational context (e.g. whether the machine performs close, personal tasks or abstract, impersonal ones: Research Question #3).
In this project I will conduct 11 experiments to investigate how trust in AI is sensitive to what moral decisions are made, how they are made, and in what relational contexts. I will use a number of different experimental approaches tapping both implicit and explicit trust, and recruit a range of populations (British laypeople; trained philosophers and AI industry experts; a study with a convenience sample of participants from all around the world; and an international experiment with participants representative for age and gender recruited simultaneously in 7 countries). At the end of the grant period, I will host a full-day interdisciplinary conference/workshop with both academic and non-academic attendees, bringing together experts working in AI to consider the psychological challenges of programming trustworthy AI and the philosophical issues of using public preferences as a basis for policy relating to ethical AI.
This work will have important theoretical and methodological implications for research on the antecedents and consequences of trust in AI, highlighting the necessity of moving beyond simply asking whether we could trust AI to instead asking what types of decisions we will trust AI to make, what kinds of AI system we want making moral decisions, and in what contexts. These findings will have significant societal impact in helping experts working on AI understand how, when, and why people trust AI agents, allowing us to reap the economic and social benefits of AI that are fundamentally predicated on these agents being trusted by the public.
Publications
Capraro V (2024) The impact of generative artificial intelligence on socioeconomic inequalities and policy making. PNAS Nexus.
Myers S (2024) People expect artificial moral advisors to be more utilitarian and distrust utilitarian moral advisors. Cognition.
| Description | In this New Investigator Grant, I sought to conduct research on the antecedents and consequences of trust in artificial agents. This grant was written in 2020-2021, and since I began this work the landscape of AI has changed dramatically - and correspondingly, so has some of the work. While the specific content has developed in line with the fast-moving progress in the field, the spirit of the research has stayed the same, and I have successfully met many of my planned objectives. I have received a 6-month no-cost extension, after which I am confident that all objectives will be met. My first research question concerned how trust in artificial agents is sensitive to what kind of moral decision is made (RQ1). Our results - published in the journal Cognition - show that people have a significant aversion to AI moral advisors (vs humans) giving moral advice, and that this is particularly the case when advisors - human and AI alike - give advice based on utilitarian principles. Across four experiments, we found that participants expect AI to make utilitarian decisions, and that even when participants agreed with a decision made by an AI advisor, they still expected to disagree with an AI more than with a human in future. In line with RQ1, then, it is clear that the kinds of moral decisions that AI systems are fine-tuned to make will have significant consequences for trust. This suggests challenges for the adoption of artificial moral advisors, particularly those that draw on and endorse utilitarian principles - however normatively justifiable those principles may be. We are now in the process of testing this across 10 countries (translations and legal contracts with recruitment platforms are nearly completed), and I expect to be able to share these results in the final grant report. My second research question concerned how trust in AI is sensitive to how the system operates (RQ2), and my third research question concerned how trust in AI might differ across contexts (RQ3). In part due to the rapid shift in AI technology during the grant, we decided to investigate these questions together across four projects. First, we conducted three experiments using the planned new Budget Allocation task, in which we drew on normative accounts of what should matter for trustworthy AI and then explored to what extent people thought these factors would be important for them to trust an AI designed for either moral or non-moral ends (RQ2). To examine the effect of context (RQ3), we tested this in healthcare, criminal justice, and military contexts. We found that, contrary to the focus of "Explainable AI", participants' own judgments do not prioritise interpretability but instead prioritise different ethical principles in different contexts (RQs 2-3). While we have written up a manuscript, we received feedback raising concerns about whether these different facets can be fully distinguished psychologically, and this has held us back from having AI ethicists and developers complete the task and from proceeding further at this stage. I still hope to revisit this work in a revised form, building on the developments and insights gained during the grant to address some of the theoretical and methodological concerns raised. Second, we looked at the role of interpretability (i.e. being able to see how/why the AI made a given decision) in predicting trust in a more concrete, fleshed-out situation (RQ2).
We have conducted one study, with a second study to be launched and analysed before the end of the grant. In this work, we had participants read about an AI (vs human) decision maker in either a moral or non-moral context (RQ3), and varied whether the procedure by which the decision was made was interpretable or not (RQ2). We found that, across both types of context, the procedure was rated as less fair and less ethical when the decision was made by an AI and when the decision was uninterpretable - though there was no interaction between the two, so it was not seen as particularly bad for an AI (vs a human) to be uninterpretable. This suggests that while interpretability might be important in general, it is not seen as especially important for AI, in contrast to previous suggestions. Third, we looked at the increasingly important - yet understudied - context of AI-assisted policy making. Building on advances in generative AI and the corresponding increase in discussion of AI-assisted policy, across four studies we presented participants with various "features" of AI that prioritised either effectiveness or ethicality in policy-making (RQ2), and varied the specific policy domain the AI would be used for - healthcare funding, immigration, and climate change (RQ3). We found that, across these policy contexts, participants both prioritise features that indicate ethicality and (mistakenly) presume that an effective policy-making AI would also be ethical. This work is currently under review. Fourth and finally, we looked at how basic social-cognitive processes shaping how people think AI works (RQ2) lead people to misunderstand AI capacities and can cause unwarranted trust. This work was not planned in the original grant, but it addresses RQ2 and became important with the release of publicly available large language models and the rapidly evolving face of AI over the course of the grant. In AI safety research, the orthogonality thesis refers to the idea that morality (or moral alignment) and competence (or general intelligence) are orthogonal: an AI becoming more intelligent does not guarantee it also becomes more moral and therefore less dangerous. In this line of research, developed during the grant, we examined the psychological basis of the orthogonality thesis, exploring how people perceive the relationship between morality and intelligence in both humans and AI (RQ2), and what consequences this has for trust and perceptions of danger. Across 9 experiments (total n = 3,895) using different methods, we found a persistent link between perceptions of the intelligence and morality of AI agents that, while shared with perceptions of humans, did not have the same consequences for trust. We use these findings to highlight the concerning implications for AI safety and the epistemic risk of uncritical narratives around the technological progress of AI. This work is currently under review. Finally, I co-wrote a review, published in PNAS Nexus with an interdisciplinary group of authors (including two Nobel prize winners), on the impact of generative artificial intelligence on socioeconomic inequalities and policy making, which is already generating a significant number of citations and considerable impact. My overarching objective for the grant was "To become established as a pre-eminent researcher in the moral psychology of AI by forming, training, and leading a team to implement the project's scientific and impact objectives". I believe I have met this objective.
I successfully supervised a PDRA, helping him obtain his first first-authored publication and his first conference presentation, and subsequently mentored him into a new postdoctoral position. I have completed a Research Leadership training course from AdvanceHE, and have also completed the planned engagement and networking events. I completed a visit to the University of British Columbia in Canada, visited the Centre for Humans and Machines in Berlin to share my research, and hosted a successful workshop on Moral Psychology and AI at the University of Kent with over 70 attendees from different academic disciplines and industry. I have broadly improved my research skills, though, due to how the design of the Budget Allocation task developed, I did not take the course in multinomial choice statistical modelling as originally planned. Finally, working on this grant and developing my research has led to my receiving an ERC Starting Grant (fulfilled through the EPSRC) to conduct a bigger and longer-term investigation of trust in moral machines. |
| Exploitation Route | The findings can be taken forward to advance our understanding of trust in AI and of the associated psychological and ethical risks. |
| Sectors | Digital/Communication/Information Technologies (including Software) |
| Description | I am still developing impact from the grant, and expect to be able to provide more specifics in the coming year once the grant has finished. I organised a workshop at Kent with 70+ attendees from different academic disciplines as well as people working in industry. This allowed my postdoc to present some of our results to people working in the field, and has subsequently sparked new collaborations and connections across disciplines. I have also presented findings from this project in five invited talks in five different countries, as well as at the largest international society in my field. To build on this impact and organise a larger, longer, and more public-facing event to share findings from the grant, I have submitted a conference grant application to the British Academy, which is currently under review. I fully expect these findings to make a strong academic impact, and I am also hopeful that this work - especially the work on trust in policy-making AI and the work on misleading AI narratives - will influence public discussion about these important issues. To do this, I will write opinion pieces and popular summaries based on the research to reach a larger audience. |
| First Year Of Impact | 2025 |
| Sector | Digital/Communication/Information Technologies (including Software) |
| Impact Types | Societal |
| Description | A Person-Centred Approach to Understanding Trust in Moral Machines |
| Amount | £1,447,644 (GBP) |
| Funding ID | EP/Y00440X/1 |
| Organisation | United Kingdom Research and Innovation |
| Sector | Public |
| Country | United Kingdom |
| Start | 01/2024 |
| End | 12/2028 |
| Description | Moral Psychology of AI Workshop at Kent |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Other audiences |
| Results and Impact | This single-day workshop/conference brought together academics and industry researchers from around the world, representing disciplines including psychology, philosophy, and computer science, to share research and provoke discussion around the moral psychology of artificial intelligence. |
| Year(s) Of Engagement Activity | 2023 |
| URL | https://blogs.kent.ac.uk/moralpsychai/ |
