Relaxed Semantics Across the Data Analytics Stack
Lead Research Organisation:
Queen's University Belfast
Department Name: Sch of Electronics, Elec Eng & Comp Sci
Abstract
The RELAX European Doctoral Network aims to train a cohort of highly mobile and adaptable researchers to become experts in the design of scalable and efficient data-intensive software systems. These experts will master the specific skill of navigating the semantics or correctness conditions of applications, with the goal of enhancing scalability, response times, and availability. Working across the disciplinary specialisms of data science, data management, distributed computing and computing systems, the Fellows will develop knowledge of the broad issues underpinning data analytics systems. The bespoke training programme fosters intellectual enquiry and combines technical and scientific research training with courses in innovation, management and leadership. The training network addresses a critical skills gap in data analytics expertise, which urgently needs to be addressed to support innovation and employment in a fast-growing European data economy. The 8 partner organisations representing 7 countries will benefit first-hand through intersectoral collaboration and an Open Innovation model.
| Description | Key findings are grouped by doctoral candidate. DC10 has focused on decision making within urban analytics, exploring trade-offs between effectiveness, efficiency and responsible operation. As a first step, the scheduling of taxis within ride-hailing services was considered. Most of the current literature was found to optimise utilitarian metrics such as total waiting time. Empirical evaluations showed that such utilitarian metrics can lead to highly unfair load allocations between drivers. To mitigate this, two pathways were attempted, both of which are at the initial results stage. The first includes a fairness metric alongside the utilitarian optimisation objective. The second injects randomness into the allocation, whereby the ride is not allocated to the closest driver but to one of several drivers in the vicinity. Some promising results were obtained in terms of achieving a good trade-off between allocation fairness and utilitarian metrics, but more work is needed to bring this to a publishable form. DC11 aims to investigate a novel interactive and intelligent method for data exploration. In particular, we are currently aiming to explain the outcomes of deep time series classification models. The most common existing techniques rely on an attribute map that explains the role of every single time point in the final prediction output. However, this produces a huge amount of information that does not suit human interpretation. More importantly, temporal relationships, a key characteristic of time series, are entirely ignored. A few works segment the time series first and then explain the roles of the segments, in order to preserve temporal relationships. However, these predefined segments can be misleading because they may conflict with the model behaviour. Hence, we introduce InteDisUX, which takes an entirely different approach. 
A key idea of InteDisUX is to actively search for an optimal explanation guided by the behaviour of the model and the shape of the time series itself. In this way, the best explanation segment fits better with both the internal mechanism of the model and the internal characteristics of the time series. InteDisUX also employs Integrated Gradients, which has a well-founded theory, to back up its explanation mechanism. As a result, it outperforms all existing work on 12 different time series datasets and 5 barebones models in terms of explanation robustness and faithfulness. Our research has been accepted at AAAI 2025, a top-tier conference in AI. DC12 is investigating the application of relaxed synchronisation in large-scale graph processing, focussing on the single-source shortest path (SSSP) problem, a key graph analytics problem that is representative of many others. SSSP algorithms typically sort vertices in order of priority (distance to the source) and process the highest-priority vertices first. We identified that state-of-the-art approaches either process one priority level at a time and run into a barrier synchronisation bottleneck, causing delays and thread idle time, or order all vertices in a common concurrent priority queue and incur a substantial overhead from frequent but short synchronisation operations each time they access the priority queue. We proposed a solution that avoids these two performance-degrading synchronisation constructs. Instead, we sort the vertices by priority locally for each thread that modifies the priority of a vertex. Work is shared using work stealing, i.e., when a thread has processed its highest-priority slice of vertices, it inspects the slices of highest-priority vertices of other threads and steals those at the highest priority across all threads. If none are found, the thread proceeds with its lower-priority vertices. 
This allows us to relax the priority order, keep threads busy processing vertices, and perform the bulk of operations on a thread-local data structure, which is more efficient. Substantial speedups are noted. This research underpinned a submission to the FastCode Programming Challenge, where it achieved 3x higher throughput than the runner-up. |
| Exploitation Route | The research of the various PhD students has widely differing exploitation potential. The research on graph processing is relevant to web and social media applications and advertising, but also to the social sciences and bioinformatics. The research on fairness and explainability of AI has broad relevance wherever AI is used. The demonstrated applications of this research to disease outbreak prediction, cybersecurity and sediment transportation have societal relevance. |
| Sectors | Digital/Communication/Information Technologies (including Software) |
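The second DC10 pathway described in the key findings (allocating a ride to one of several drivers in the vicinity rather than always to the closest one) can be illustrated with a minimal sketch. This is not the project's implementation: the function name `allocate_ride`, the parameter `k`, the Euclidean distance and the uniform random choice are all assumptions made for illustration.

```python
import math
import random


def allocate_ride(ride_pos, driver_positions, k=3, rng=None):
    """Return the id of one of the k nearest drivers, chosen uniformly at random.

    Hypothetical helper for illustration only.
    ride_pos: (x, y) pickup location.
    driver_positions: dict mapping driver id -> (x, y) location.
    k=1 reduces to the closest-driver baseline.
    """
    if not driver_positions:
        raise ValueError("no drivers available")
    if rng is None:
        rng = random.Random()
    # Rank drivers by straight-line distance to the pickup point (an assumption;
    # a real system would more likely use travel time).
    ranked = sorted(
        driver_positions,
        key=lambda d: math.dist(driver_positions[d], ride_pos),
    )
    # Relax "closest driver" to "any of the k closest drivers".
    candidates = ranked[: max(1, k)]
    return rng.choice(candidates)
```

A larger `k` spreads rides over more drivers at the cost of longer pickups on average, which is the fairness versus utilitarian trade-off the findings refer to.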
| Description | QUB-Kerala symposium on AI and co-operativism |
| Amount | £18,000 (GBP) |
| Organisation | Department for the Economy, Northern Ireland |
| Sector | Public |
| Country | United Kingdom |
| Start | 03/2024 |
| End | 03/2024 |
| Description | Participation in the FastCode Programming Challenge (FCPC 2025) |
| Organisation | Chalmers University of Technology |
| Department | Department of Computer Science and Engineering |
| Country | Sweden |
| Sector | Academic/University |
| PI Contribution | The FastCode Challenge is a student programming competition with a subsequent workshop at PPoPP, where outstanding submissions are invited to submit papers and give invited talks. The goal of the competition is to engage more students in learning parallel algorithms and programming, offering them resources and support to enhance their skills, and cultivating students' interest in writing fast code. We hope these efforts will inspire more students to study parallel programming and parallel computing research, thereby making a positive impact on their future careers. We encourage students from all levels to participate. See https://fastcode.org/events/fastcode-challenge/ for details. We participated in collaboration with researchers at Chalmers University, Sweden. The submission won first place in the track on the Single-Source Shortest Path problem, where we achieved 3x higher throughput than the second-placed submission. A short publication on this will follow, but it will be published only after this year's submission deadline. The proposed solution is based on research performed by PhD student Marco d'Antonio at Queen's University Belfast. Marco additionally developed a formula to set the "delta" parameter, which is a common priority coarsening parameter in single-source shortest path algorithms. |
| Collaborator Contribution | Chalmers University contributed to the submission by exploring the selective use of different algorithms depending on characteristics of the input data set, and with general performance analysis and tuning. |
| Impact | A paper in the FastCode Programming Challenge workshop, co-located with the ACM International Symposium on Principles and Practice of Parallel Programming, 2025. M. d'Antonio, K. von Geijer, T. Son Mai, P. Tsigas, H. Vandierendonck, "Relax and don't Stop: Graph-aware Asynchronous SSSP", In: FastCode Programming Challenge workshop, 2025 (to appear). |
| Start Year | 2024 |
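As background to the "delta" parameter mentioned in the PI contribution, the sketch below shows how a bucket width `delta` coarsens priorities in a sequential, bucketed SSSP. It is a simplified illustration under stated assumptions, not the asynchronous work-stealing algorithm of the submission, and it does not reproduce the formula developed for choosing delta. The function name `bucketed_sssp` and the adjacency-list format are hypothetical.

```python
import math
from collections import defaultdict


def bucketed_sssp(graph, source, delta):
    """Shortest distances from source, processing vertices bucket by bucket.

    Hypothetical helper for illustration only.
    graph: dict mapping vertex -> list of (neighbour, non-negative weight).
    delta: bucket width; vertices whose tentative distance lies in
           [i * delta, (i + 1) * delta) share priority level i.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    dist = {source: 0.0}
    buckets = defaultdict(set)
    buckets[0].add(source)
    while buckets:
        i = min(buckets)           # lowest non-empty priority level
        frontier = buckets.pop(i)  # vertices inside a bucket are left unordered
        for u in frontier:
            if int(dist[u] // delta) != i:
                continue  # stale entry: u has since moved to a lower bucket
            for v, w in graph.get(u, ()):
                candidate = dist[u] + w
                if candidate < dist.get(v, math.inf):
                    dist[v] = candidate
                    # Re-queue v at its new coarsened priority; this may
                    # recreate bucket i, which is then processed again.
                    buckets[int(candidate // delta)].add(v)
    return dist
```

A small `delta` approaches a strict priority order with many buckets, while a large `delta` places most vertices in one bucket and relaxes the order at the cost of repeated relaxations. Only reachable vertices appear in the returned dictionary.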
| Description | NI Science Festival exhibition "AI and the Computing that Drives It" |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | Local |
| Primary Audience | Public/other audiences |
| Results and Impact | Event description: "Artificial Intelligence is everywhere. We find it in algorithms prioritising your social media feed, Chat-GPT revising your writing, or your satnav taking you where you need to be. But how does it work? In this session, we will explore some of the principles behind AI, and some of the ways it can fail. We will also explore the computing infrastructure that underpins AI and its hunger for high-end computing and extensive energy consumption." We had about 40-50 visitors from across the general public, ranging from school-going youth to pensioners. There were some practitioners as well as colleagues from different faculties at Queen's University Belfast. The primary outcomes were raised awareness and changes of opinion among the audience. |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://nisciencefestival.com/events/artificial-intelligence-and-the-computing-that-drives-it |
