Relaxed Semantics Across the Data Analytics Stack

Lead Research Organisation: Queen's University Belfast

Department Name: Sch of Electronics, Elec Eng & Comp Sci

Abstract

The RELAX European Doctoral Network aims to train a cohort of highly mobile and adaptable researchers to become experts in the design of scalable and efficient data-intensive software systems. These experts will master the specific skill of navigating the semantics or correctness conditions of applications, with the goal of enhancing scalability, response times, and availability. Working across the disciplinary specialisms of data science, data management, distributed computing and computing systems, the Fellows will develop knowledge of the broad issues underpinning data analytics systems. The bespoke training programme fosters intellectual enquiry and combines technical and scientific research training with courses in innovation, management and leadership. The training network addresses a critical skills gap in data analytics expertise, which urgently needs to be addressed to support innovation and employment in a fast-growing European data economy. The 8 partner organisations representing 7 countries will benefit first-hand through intersectoral collaboration and an Open Innovation model.

Funded Value:

£795,754

Funded Period:

Mar 23 - Feb 27

Funder:

Horizon Europe Guarantee

Project Status:

Active

Project Category:

Research Grant

Project Reference:

EP/X029174/1

Principal Investigator:

Hans Vandierendonck

Research Subject:

Info. & commun. Technol. (96%)

Research Topic:

Artificial Intelligence (32%)

Networks & Distributed Systems (32%)

Parallel Computing (32%)

Organisations

People	ORCID iD
Hans Vandierendonck (Principal Investigator)	http://orcid.org/0000-0001-5868-9259


Key Findings
Further Funding
Collaboration
Engagement Activities


Description	Key findings are grouped by doctoral candidate.
DC10 has been investigating decision making in urban analytics, exploring the trade-offs between effectiveness, efficiency and responsible operation. As a first step, the scheduling of taxis within ride-hailing services was considered. Most of the current literature focuses on optimising utilitarian metrics such as total time spent waiting. Empirical evaluations showed that optimising such utilitarian metrics alone can lead to highly unfair load allocations between drivers. Two mitigation pathways were attempted, both of which are at the initial-results stage. The first includes a fairness metric alongside the utilitarian optimisation objective. The second injects randomness into the allocation, whereby the ride is not necessarily allocated to the closest driver, but to one of several drivers in the vicinity. Some promising results were obtained in terms of achieving a good trade-off between allocation fairness and utilitarian metrics, but more work is needed to bring this to a publishable form.
DC11 aims to investigate a novel interactive and intelligent method for data exploration. The current focus is on explaining the outcomes of deep time series classification models. The most common existing techniques rely on an attribute map, which explains the role of every single time point in the final prediction output. However, this produces a huge amount of information that does not suit human interpretation. More importantly, temporal relationships, a key characteristic of time series, are entirely ignored. A few works segment the time series before explaining the roles of the segments, in order to preserve temporal relationships. However, these predefined segments can be misleading because they may conflict with the behaviour of the model. Hence, we introduce InteDisUX, which takes an entirely different approach. A key idea of InteDisUX is to actively search for an optimal explanation, guided by the behaviour of the model and the shape of the time series itself. In this way, the best explanation segment fits better not only with the internal mechanism of the model but also with the internal characteristics of the time series. InteDisUX also employs Integrated Gradients, which has a well-founded theory, to back up its explanation mechanism. As a result, it outperforms all existing work on 12 different time series datasets and 5 barebones models in terms of explanation robustness and faithfulness. This research has been accepted at AAAI 2025, a top-tier conference in AI.
DC12 is investigating the application of relaxed synchronisation in large-scale graph processing. The work focuses on the single-source shortest path (SSSP) problem, a key graph analytics problem that is representative of many others. SSSP algorithms typically sort vertices in order of priority (distance to the source) and process the highest-priority vertices first. We identified that state-of-the-art approaches follow one of two designs. Either they process one priority level at a time and run into a barrier synchronisation bottleneck, causing delays and thread idle time, or they order all vertices in a common concurrent priority queue and incur a substantial overhead from frequent but short synchronisation operations each time they access the queue. We proposed a solution that avoids these two performance-degrading synchronisation constructs. Instead, vertices are sorted by priority locally, in the thread that modifies the priority of a vertex. Work is shared using work stealing: when a thread has processed its highest-priority slice of vertices, it inspects the highest-priority slices of the other threads and steals those at the highest priority across all threads. If none are found, the thread proceeds with its own lower-priority vertices. This allows us to relax the priority order and keep threads busy processing vertices, while performing the bulk of operations on a thread-local data structure, which is more efficient. Substantial speedups are observed. This research underpinned a submission to the FastCode Programming Challenge, where it achieved a 3x higher throughput than the runner-up.
Exploitation Route	The research of the various PhD students differs widely in its exploitation potential. The research on graph processing is relevant to web and social media applications and advertising, but also to the social sciences and bioinformatics. The research on fairness and explainability of AI has broad relevance wherever AI is used. The demonstrated applications of this research to disease outbreak prediction, cybersecurity and sediment transportation have societal relevance.
Sectors	Digital/Communication/Information Technologies (including Software)
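
The randomised allocation pathway described for DC10 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the project's implementation: drivers are static points, distance is Euclidean, the ride goes to one of the `k` nearest drivers chosen uniformly at random, and fairness is summarised by a simple max-minus-min load gap. The names `nearest_drivers`, `allocate`, `load_spread` and `simulate` are hypothetical.

```python
import math
import random


def nearest_drivers(ride, drivers, k):
    """Return the ids of the k drivers closest to the ride (Euclidean distance)."""
    return sorted(drivers, key=lambda d: math.dist(ride, drivers[d]))[:k]


def allocate(ride, drivers, k, rng):
    """Pick one of the k nearest drivers uniformly at random.

    With k = 1 this reduces to the closest-driver rule.
    """
    return rng.choice(nearest_drivers(ride, drivers, k))


def load_spread(loads):
    """Gap between the busiest and the least busy driver; 0 means an even load."""
    return max(loads.values()) - min(loads.values())


def simulate(rides, drivers, k, seed=0):
    """Allocate every ride in turn and return the number of rides per driver."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    loads = {d: 0 for d in drivers}
    for ride in rides:
        loads[allocate(ride, drivers, k, rng)] += 1
    return loads
```

Calling `simulate(rides, drivers, k=1)` and `simulate(rides, drivers, k=3)` on the same rides and comparing `load_spread` of the two results gives a rough view of the trade-off: a larger `k` tends to even out driver load, at the cost of longer pick-up distances.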


Description	QUB-Kerala symposium on AI and co-operativism
Amount	£18,000 (GBP)
Organisation	Department for the Economy, Northern Ireland
Sector	Public
Country	United Kingdom
Start	03/2024
End	03/2024


Description	Participation in the FastCode Programming Challenge (FCPC 2025)
Organisation	Chalmers University of Technology
Department	Department of Computer Science and Engineering
Country	Sweden
Sector	Academic/University
PI Contribution	The FastCode Challenge is a student programming competition followed by a workshop at PPoPP, where the authors of outstanding submissions are invited to submit papers and give invited talks. The goal of the competition is to engage more students in learning parallel algorithms and programming, offering them resources and support to enhance their skills, and cultivating students' interest in writing fast code. We hope these efforts will inspire more students to study parallel programming and parallel computing research, thereby making a positive impact on their future careers. We encourage students from all levels to participate (see https://fastcode.org/events/fastcode-challenge/). We participated in a collaboration with researchers at Chalmers University, Sweden. The submission won first place in the track on the Single-Source Shortest Path problem, where we achieved a 3x higher throughput than the second-placed submission. A short publication on this will follow, but it will be published only after this year's submission deadline. The proposed solution is based on research performed by PhD student Marco d'Antonio at Queen's University Belfast. Marco additionally developed a formula to set the "delta" parameter, which is a common priority coarsening parameter in single-source shortest path algorithms.
Collaborator Contribution	Chalmers University contributed to the submission by exploring the selective use of different algorithms depending on characteristics of the input data set, and with general performance analysis and tuning.
Impact	A paper in the FastCode Programming Challenge workshop, co-located with the ACM International Symposium on Principles and Practice of Parallel Programming, 2025. M. d'Antonio, K. von Geijer, T. Son Mai, P. Tsigas, H. Vandierendonck, "Relax and don't Stop: Graph-aware Asynchronous SSSP", In: FastCode Programming Challenge workshop, 2025 (to appear).
Start Year	2024
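
The ideas behind the SSSP submission (per-thread priority slices, stealing of the best slice, and the "delta" coarsening parameter) can be sketched as a sequential simulation. This is a simplified illustration under stated assumptions, not the submitted code: workers take turns instead of running as real threads, a slice is one bucket of width `delta`, and `delta` is a plain input because the formula mentioned above is not reproduced here. The function name `relaxed_sssp` and its parameters are hypothetical.

```python
import math
from collections import defaultdict


def relaxed_sssp(graph, source, delta, workers=2):
    """Shortest distances from source; graph maps u -> [(v, weight), ...], weights >= 0."""
    dist = {source: 0.0}
    # One bucket map per simulated worker: bucket index -> [(vertex, distance), ...]
    local = [defaultdict(list) for _ in range(workers)]
    local[0][0].append((source, 0.0))

    def top(w):
        """Index of worker w's highest-priority (lowest) non-empty bucket."""
        return min(local[w]) if local[w] else math.inf

    while any(local):  # an empty bucket map is falsy
        for w in range(workers):
            # Take the best slice over all workers; on ties prefer the own slice.
            owner = min(range(workers), key=lambda o: (top(o), o != w))
            if top(owner) == math.inf:
                break  # no work left anywhere
            work = local[owner].pop(top(owner))  # own slice, or a stolen one
            for u, d in work:
                if d > dist.get(u, math.inf):
                    continue  # stale entry: a shorter path to u was found later
                for v, weight in graph.get(u, ()):
                    nd = d + weight
                    if nd < dist.get(v, math.inf):
                        dist[v] = nd
                        # New work always lands in the worker's own local buckets.
                        local[w][int(nd // delta)].append((v, nd))
    return dist
```

Because a vertex is re-queued whenever its distance improves and stale entries are skipped, the result does not depend on the exact processing order. That property is what makes it safe to relax the priority order; with real threads the order would be relaxed further than in this turn-based simulation.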


Description	NI Science Festival exhibition "AI and the Computing that Drives It"
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	Local
Primary Audience	Public/other audiences
Results and Impact	Event description: "Artificial Intelligence is everywhere. We find it in algorithms prioritising your social media feed, Chat-GPT revising your writing, or your satnav taking you where you need to be. But how does it work? In this session, we will explore some of the principles behind AI, and some of the ways it can fail. We will also explore the computing infrastructure that underpins AI and its hunger for high-end computing and extensive energy consumption." We had about 40-50 visitors from the general public, ranging from school-going youth to pensioners. There were some practitioners as well as colleagues from different faculties at Queen's University Belfast. The primary outcomes were raised awareness and changes of opinion in the audience.
Year(s) Of Engagement Activity	2024
URL	https://nisciencefestival.com/events/artificial-intelligence-and-the-computing-that-drives-it
