SAI: Social Explainable Artificial Intelligence

Lead Research Organisation: University of Sheffield
Department Name: Computer Science

Abstract

The recent wave of machine-learning (ML)-based Artificial Intelligence (AI) technologies is having a huge societal and economic impact, with AI being (often silently) embedded in most of our everyday experiences (such as virtual assistants, tracking devices, social media, and recommender systems). The research community (and society in general) has already realised that the current centralised approach to AI, whereby our personal data are centrally collected and processed through opaque ML systems ("black boxes"), is neither an acceptable nor a sustainable model in the long run. We posit that the "next wave" of ML-driven AI should be (i) human-centric, (ii) explainable, and (iii) more distributed and decentralised (i.e., not centrally controlled). These principles address societal and ethical expectations for trustworthy, privacy-respectful AI, such as those recommended at the European level (e.g., human agency, transparency and explainability, as included in the AI HLEG report on Ethics Guidelines for Trustworthy AI). They also fit a clear trend towards developing decentralised ML for strictly technical reasons, e.g., performance, scalability and real-time constraints.
SAI will develop the scientific foundations for novel ML-based AI systems ensuring (i) individuation: in SAI each individual is associated with their own "Personal AI Valet" (PAIV), which acts as the individual's proxy in a complex ecosystem of interacting PAIVs; (ii) personalisation: PAIVs process individuals' data via explainable AI models tailored to the specific characteristics of their human twins; (iii) purposeful interaction: PAIVs interact with each other, to build global AI models and/or come up with collective decisions starting from the local (i.e., individual) models; (iv) human-centricity: novel AI algorithms and the interaction between PAIVs are driven by (quantifiable) models of the individual and social behaviour of their human users; (v) explainability: explainable ML techniques are extended through quantifiable human behavioural models and network science analysis to make both local and global AI models explainable-by-design.

The ultimate goal of SAI is to provide the foundational elements enabling a decentralised collective of explainable PAIVs to evolve local and global AI models, whose processes and decisions are transparent, explainable and tailored to the needs and constraints of individual users.

To this end, the project will deliver (i) the PAIV, a personal digital platform, where every person can privately and safely integrate, store, and extract meaning from their own digital tracks, as well as interact with PAIVs of other users; (ii) human-centric local AI models; (iii) global, decentralised AI models, emerging from human-centric interactions between PAIVs; (iv) personalised explainability at the level of local and global AI models; and (v) concrete use cases to validate the SAI design principles, based on real datasets complemented, when needed, by synthetic datasets obtained from well-established models of human behaviour, in the areas of private traffic management, opinion diffusion/fake news detection in social media, and pandemic tracking and control.
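The "global, decentralised AI models emerging from interactions between PAIVs" deliverable can be illustrated with a minimal federated-averaging sketch. Everything here is a hypothetical illustration under simple assumptions (a logistic-regression model, weight averaging proportional to data size); the function names are not part of the SAI platform:

```python
# Minimal sketch: per-user ("local") models are combined into a global model
# by weighted averaging of parameters, one plausible mechanism for PAIV
# interaction. Only model weights, never raw personal data, leave a user.
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One user's PAIV fits a logistic-regression model on its private data."""
    w = weights.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-X @ w))   # sigmoid predictions
        grad = X.T @ (preds - y) / len(y)      # logistic-loss gradient
        w -= lr * grad
    return w

def federated_average(local_weights, sizes):
    """Combine local models, weighting each user by their data size."""
    return np.average(np.stack(local_weights), axis=0, weights=np.asarray(sizes, float))

# Two users train locally; the global model is the weighted average.
rng = np.random.default_rng(0)
w_global = np.zeros(3)
users = [(rng.normal(size=(20, 3)), rng.integers(0, 2, 20)) for _ in range(2)]
for _ in range(10):  # communication rounds
    local_models = [local_update(w_global, X, y) for X, y in users]
    w_global = federated_average(local_models, [len(y) for _, y in users])
```

This keeps each user's data on their own device, matching the project's decentralisation and privacy aims, though the real PAIV ecosystem would of course involve far richer models and interaction protocols.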

Publications

Description Research on model interpretability in natural language processing makes extensive use of feature scoring methods to identify which parts of the input are most important for a model's prediction (i.e. the explanation or rationale). However, previous research has shown that there is no single best scoring method across text classification tasks, while practitioners typically have to make several other ad-hoc choices regarding the length and the type of the rationale (e.g. short or long, contiguous or not). Inspired by this, we proposed a simple yet effective and flexible method that selects the feature scoring method, rationale length and rationale type optimally for each data instance. Evaluation on four standard text classification datasets showed that our proposed method provides more faithful, comprehensive and highly sufficient explanations compared to using a fixed feature scoring method, rationale length and type.
More importantly, we demonstrate that a practitioner is not required to make any ad-hoc choices in order to extract faithful rationales using our approach.
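The per-instance selection idea above can be sketched as follows. This is a toy illustration, not the paper's implementation: the linear "classifier" and the two scoring functions are stand-ins, and a simple sufficiency proxy (how well the rationale alone preserves the model's confidence) ranks each scorer/length combination:

```python
# Hedged sketch: try several feature scoring methods and rationale lengths,
# and keep, per instance, the combination whose rationale-only input best
# preserves the full-input prediction ("sufficiency").
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=10)            # toy linear classifier over 10 features
x = rng.normal(size=10)            # one "document" as a feature vector

def predict(v):
    """Model confidence for the positive class (sigmoid of the logit)."""
    return 1.0 / (1.0 + np.exp(-v @ W))

scorers = {
    "gradient": lambda v: np.abs(W),          # |d logit / d input_i|
    "input_x_grad": lambda v: np.abs(v * W),  # attribution-style score
}

full_conf = predict(x)
best = None
for name, scorer in scorers.items():
    scores = scorer(x)
    for k in (2, 4, 6):                        # candidate rationale lengths
        keep = np.argsort(scores)[-k:]         # top-k most important features
        masked = np.zeros_like(x)
        masked[keep] = x[keep]                 # rationale-only input
        sufficiency = 1.0 - abs(full_conf - predict(masked))
        if best is None or sufficiency > best[0]:
            best = (sufficiency, name, k)      # best choice for THIS instance
```

Because the winning scorer and length can differ from instance to instance, no single ad-hoc choice has to be fixed in advance, which is the point the finding above makes.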

To the best of our knowledge, no previous work has investigated how different pre-training objectives affect what BERT learns about linguistic properties. We find strong evidence that there are only small differences in linguistic inductive bias between the representations learned by linguistically and non-linguistically motivated objectives.

Recent work in Natural Language Processing has focused on developing approaches that extract faithful explanations, either by identifying the most important tokens in the input (i.e. post-hoc explanations) or by designing inherently faithful models that first select the most important tokens and then use them to predict the correct label (i.e. select-then-predict models). Currently, these approaches are largely evaluated in in-domain settings. Yet, little is known about how post-hoc explanations and inherently faithful models perform in out-of-domain settings or under temporal concept drift. Our work has demonstrated that post-hoc rationale extraction methods are not robust out-of-domain or under temporal drift.
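The select-then-predict structure mentioned above can be sketched minimally. The token scoring and the "classifier" here are toy stand-ins, not any specific published architecture; the point is only that the predictor sees nothing but the selected tokens, so the rationale is faithful by construction:

```python
# Minimal sketch of a select-then-predict pipeline: a selector keeps the k
# highest-scoring tokens, and the predictor classifies using ONLY those
# tokens, so its decision provably depends on the selected rationale alone.
import numpy as np

rng = np.random.default_rng(2)
vocab_w = rng.normal(size=50)              # toy token-id -> class weight table

def select(token_ids, scores, k=3):
    """Selector: keep the k tokens with the highest importance scores."""
    keep = np.argsort(scores)[-k:]
    return token_ids[np.sort(keep)]        # the rationale, in input order

def predict(token_ids):
    """Predictor: classifies from the selected tokens only."""
    logit = vocab_w[token_ids].sum()
    return int(logit > 0)

tokens = rng.integers(0, 50, size=12)      # one tokenised input
scores = np.abs(vocab_w[tokens])           # toy per-token importance scores
rationale = select(tokens, scores)
label = predict(rationale)                 # depends only on the rationale
```

Post-hoc explanation, by contrast, would score tokens *after* a full-input model has predicted, which is exactly the setting our work found fragile out-of-domain.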
Exploitation Route We have conducted fundamental research on explanation extraction for natural language processing systems.

Non-academic routes: Our methods can be used in any setting where text data assists human decision making (e.g. finance, law, journalism, the tech industry).

Academic routes: Our methods can be used and extended by academics who conduct research in NLP and machine learning.
Sectors Digital/Communication/Information Technologies (including Software); Financial Services, and Management Consultancy; Healthcare; Government, Democracy and Justice

 
Title ECtHR-NAACL2021 
Description This dataset is part of the article: Paragraph-level Rationale Extraction through Regularization: A case study on European Court of Human Rights Cases. Ilias Chalkidis, Manos Fergadiotis, Dimitris Tsarapatsanis, Nikolaos Aletras, Ion Androutsopoulos and Prodromos Malakasiotis. In the Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2021). June 6-11, 2021. Mexico City, Mexico. The court (ECtHR) hears allegations regarding breaches in human rights provisions of the European Convention of Human Rights (ECHR) by European states. The Convention is available at https://www.echr.coe.int/Documents/Convention_ENG.pdf. The court rules on a subset of all ECHR articles, which are predefined (alleged) by the applicants (plaintiffs). Our dataset comprises 11k ECtHR cases and can be viewed as an enriched version of the ECtHR dataset of Chalkidis et al. (2019), which did not provide ground truth for alleged article violations (articles discussed) and rationales. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
Impact Part of the LexGLUE benchmark published at ACL 2022 (https://arxiv.org/abs/2110.00976). 
URL http://archive.org/details/ECtHR-NAACL2021/