New Horizons in Multivariate Preprocessing (MULTIPROCESS)

Lead Research Organisation: University of Warwick
Department Name: Computer Science

Abstract

Data preprocessing, or data compression, is a ubiquitous strategy for coping with computational hardness in various real-world applications. The goal is to efficiently compute a succinct representation of the original data with only a small loss in the accuracy of the information it represents. This is achieved through various steps such as simplifying the structure of the input, reducing its size or dimension, and removing redundant constraints.

The widespread recognition of the importance of preprocessing algorithms in practice has motivated the challenging task of developing mathematical frameworks within which one can conduct rigorous analysis, provide performance guarantees, and compare the quality of the various preprocessing algorithms proposed for specific computational problems. A partial answer to this challenge was provided in the area of parameterized complexity (or multivariate complexity) through the framework of kernels. This framework has led to a vibrant new subfield of algorithmics -- Kernelization. Over the last two decades, kernelization has proved a resounding success for the analysis and understanding of 'lossless' preprocessing of NP-hard decision problems (computational problems that are not expected to admit theoretically efficient, i.e., so-called polynomial-time, algorithms). However, in practice, preprocessing is heavily used for large data sets of problems that do have polynomial-time algorithms, as well as for NP-hard problems where one wants to optimize some objective function over the input data. In these situations, the framework of kernelization falls woefully short and does not combine well with approximation algorithms or heuristics.
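To give a concrete feel for what a lossless kernel looks like, the sketch below applies the classic Buss reduction rules to k-Vertex Cover. This is textbook material included purely for illustration (it is not an output of this project), and the function name and interface are our own.

```python
def vertex_cover_kernel(edges, k):
    """A minimal sketch of Buss's kernelization for k-Vertex Cover.

    Rules applied exhaustively:
      1. A vertex of degree > k must be in every cover of size <= k:
         take it into the cover and decrease the budget.
      2. If more than k * k edges remain afterwards, answer NO.
    Returns (reduced_edges, remaining_budget), or None if the answer is NO.
    """
    edges = {frozenset(e) for e in edges if len(set(e)) == 2}  # drop loops/duplicates
    while True:
        if k < 0:
            return None
        degree = {}
        for e in edges:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        high = next((v for v, d in degree.items() if d > k), None)
        if high is None:
            break
        # Rule 1: 'high' must be in any cover of size <= k.
        edges = {e for e in edges if high not in e}
        k -= 1
    # Rule 2: a graph with max degree <= k and a vertex cover of size <= k
    # has at most k * k edges.
    if len(edges) > k * k:
        return None
    return edges, k


if __name__ == "__main__":
    star_plus_edge = [(0, i) for i in range(1, 6)] + [(7, 8)]
    # the star centre is forced into the cover; the kernel is a single edge with budget 1
    print(vertex_cover_kernel(star_plus_edge, 2))
```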

This proposal aims to make major advances in the mathematical theory of preprocessing by delivering new formulations of efficient preprocessing that overcome the aforementioned fundamental limitations that plague the classic theory of polynomial-time preprocessing. This project will lead to novel preprocessing heuristics and analyses for basic computational problems and extend the scope of rigorous preprocessing analysis to high-impact big data paradigms such as streaming algorithms. This will be achieved by delivering new preprocessing design techniques as well as new mathematical machinery that allows one to formally characterize the limitations of preprocessing for various problems. This project will bring about a paradigm shift by laying the foundations for a new theory of preprocessing that will be able to abstract and unify all efficient preprocessing. This is but the beginning of a timely journey into a mostly unexplored universe teeming with possibilities and undiscovered connections between diverse subfields of computer science.
 
Description

In the paper "Approximately Interpolating Between Uniformly and Non-Uniformly Polynomial Kernels" (appeared at FSTTCS 2023), we explore two different approaches to simplifying parameterized problems: uniformly polynomial kernels (where the simplified problem size is bounded by a fixed polynomial independent of input specifics) and non-uniformly polynomial kernels (where the simplified problem size is bounded by a polynomial whose degree can depend on the input specifics). We introduce a new method that bridges these two approaches, creating a theoretical framework that smoothly transitions between them. This helps researchers better understand the relationship between the two kernelization methods and contributes valuable insights to the theory of parameterized complexity.
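As a rough schematic of the distinction (the notation below is illustrative, not the paper's exact formalism): think of a family of parameterized problems indexed by some part of the problem specification H, with parameter k and a kernelization map that sends an instance x to K(x, k).

```latex
\begin{align*}
  \text{uniformly polynomial: }     & \; |K(x,k)| \le c_H \cdot k^{c}
      \quad\text{(one exponent $c$ for the whole family),}\\
  \text{non-uniformly polynomial: } & \; |K(x,k)| \le k^{\,g(H)}
      \quad\text{(the exponent may grow with $H$).}
\end{align*}
```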

In "FPT Approximations for Packing and Covering Problems Parameterized by Elimination Distance and Even Less" (appeared at FSTTCS 2023), we authors develop general results showing that many classical graph problems can be efficiently approximated when using certain structural measures (like elimination distance to H or H-treewidth) as parameters. These structural measures can be much smaller than traditional parameters, making it possible to efficiently approximate solutions even when standard approaches aren't practical. This significantly expands the range of problems where efficient approximations are possible through parameterized complexity techniques.

The paper "Meta-theorems for Parameterized Streaming Algorithms" (appeared at SODA 2024) introduces foundational results at the intersection of parameterized complexity and streaming algorithms, focusing on semi-streaming algorithms for parameterized graph problems. We develop a framework that combines ideas from both fields, enabling efficient algorithms for a variety of problems in the streaming model. In the process, we demonstrate the power of combining structural graph properties with streaming paradigms, opening up new directions for research in both areas.

The paper "Exponential-Time Approximation Schemes via Compression" (appeared at ITCS 2024) presents a novel framework for designing exponential-time approximation schemes for graph partitioning problems, such as k-way cut, Multiway Cut, Steiner k-cut, and Multicut, where one aims to minimize the number of edges crossing between different parts of a partition. The framework combines classic techniques such as randomized contractions and greedy choices with more recent techniques originating in the area of exact algorithms, to achieve efficient approximations. It exploits structural properties of graphs to reduce problem complexity while maintaining near-optimal solutions. This work advances the field of exponential-time algorithms by demonstrating how compression techniques can be leveraged to design efficient approximation schemes.

The paper "On the Parameterized Complexity of Eulerian Strong Component Arc Deletion" (appeared at IPEC 2024) studies the problem of modifying a directed graph by deleting the smallest number of arcs so that every strongly connected part of the graph becomes Eulerian (where each node has the same number of incoming and outgoing edges). We show that it is unlikely we can obtain an efficient (FPT) algorithm for this problem if we only focus on the size of the solution (the number of arcs to delete) as a parameter. However, we demonstrate that certain structural properties of the graph, like sparsity (measured by treewidth) or simplicity (graphs without multiple arcs between nodes), can make the problem relatively easy to solve. This work provides new theoretical insights into the complexity of the problem and contributes to research on modifying directed graphs efficiently.
Exploitation Route

The current set of outputs from the project primarily has academic impact.

The scientific output presented at high-impact venues in the field, collaborative research efforts, and dissemination activities have contributed to strengthening UK research in algorithms and complexity.
Sectors

Other