DiPET: Distributed Stream Processing on Fog and Edge Systems via Transprecise Computing
Lead Research Organisation:
Queen's University Belfast
Department Name: Sch of Electronics, Elec Eng & Comp Sci
Abstract
The DiPET project investigates models and techniques that enable distributed stream processing applications to seamlessly span and redistribute across fog and edge computing systems. The goal is to utilize devices dispersed through the network that are geographically closer to users to reduce network latency and to increase the available network bandwidth. However, the network that user devices are connected to is dynamic. For example, mobile devices connect to different base stations as they roam, and fog devices may be intermittently unavailable for computing. In order to maximally leverage the heterogeneous compute and network resources present in these dynamic networks, the DiPET project pursues a bold approach based on transprecise computing. Transprecise computing states that computation need not always be exact and proposes a disciplined trade-off of precision against accuracy, which affects computational effort, energy efficiency, memory usage, and communication bandwidth and latency. Transprecise computing allows the precision of computation to be adapted dynamically depending on the context and available resources. This creates new dimensions to the problem of scheduling distributed stream applications in fog and edge computing environments and will lead to schedules with superior performance, energy efficiency and user experience. The DiPET project will demonstrate the feasibility of this unique approach by developing a transprecise stream processing application framework and transprecision-aware middleware. Use cases in video analytics and network intrusion detection will guide the research and underpin technology demonstrators.
Planned Impact
n/a
Publications
Giménez N
(2024)
The Effects of Weight Quantization on Online Federated Learning for the IoT: A Case Study
in IEEE Access
Koohi Esfahani M
(2021)
Exploiting in-Hub Temporal Locality in SpMV-based Graph Processing
Koohi Esfahani M
(2021)
Thrifty Label Propagation: Fast Connected Components for Skewed-Degree Graphs
Koohi Esfahani M
(2021)
How Do Graph Relabeling Algorithms Improve Memory Locality?
Koohi Esfahani M
(2022)
SAPCo Sort: Optimizing Degree-Ordering for Power-Law Graphs
Koohi Esfahani M
(2021)
Locality Analysis of Graph Reordering Algorithms
| Description | Various video analytics tasks are highly time-consuming and require significant power to complete. This places minimum requirements on the computing hardware that must be used. However, highly capable computing devices reduce battery life, increase heat dissipation, and increase weight and form factor. Many video analytics tasks are most accurately solved using deep neural networks, a popular machine learning technique that requires a lot of compute power. We were able to demonstrate that demanding video analytics tasks can be performed by devices with limited computational capacity. We achieved this by creating a new piece of software that controls which out of multiple algorithms is used to analyse a particular frame. This controller selects simple algorithms when the video's content can be analysed easily, and selects complex algorithms only when the video content requires this. Moreover, the system can adapt seamlessly to background workloads, and hence to dynamically varying availability of compute resources. As a result, analytics can be achieved on cheaper, less powerful devices. The system even achieves a slightly higher accuracy. Technically, we have three iterations of this work. 1. Transprecise Object Detection (TOD): We established that bounding box size is a primary predictor for which object detection neural network to select out of a choice of 4 YOLOv4 models. Large bounding boxes admit high accuracy with small networks or networks with high amounts of downsampling. This is not surprising in itself; however, this observation allowed us to design a selection technique that operates strictly on bounding box sizes. TOD demonstrated much improved speed, reduced energy consumption and slightly improved detection accuracy. 2. Runtime object detection to maximise accuracy (ROMA): It is important for real-time video analytics to be able to process video frames at the rate at which they arrive. 
A detector that is too slow will require dropping frames to keep up with the arrival of new frames. For the dropped frames, we assume that the detections of the previous frame (bounding box, detection type) are copied over (alternatively, it is possible to make predictions on movement of objects by adding object tracking, but we did not explore this). As complex networks require more time to process, they incur more dropped frames, which counterbalances their improved accuracy. ROMA designed an intricate performance (accuracy) model for object tracking that factors in bounding box size, relative detection accuracy of different networks, and object movement and decay rate to determine the best detector to use for any frame. Contrary to TOD, which is tuned to specific hardware, we have shown that ROMA is hardware agnostic and can adapt the detector accurately also in the presence of background workloads. 3. We started to explore an extension of ROMA that aims to co-schedule multiple object tracking applications on the same hardware, which would be useful for large-scale CCTV deployments such as in airports. The purpose of this extension was to weigh up the relative benefit of using complex detection networks for different video streams, applying them only where they make most impact. We have encouraging initial results; however, we have not found the time and resources to complete this work since the end of the DiPET grant. In addition to the object detection work, we have explored resource-efficient federated learning with a partner, Polytechnic University of Catalunya (UPC), Spain. Our contribution to this space is to design a system for efficient sharing of model updates based on reduced precision (representing non-integral numbers with few digits of precision) in order to save time and reduce energy consumption. 
We have characterised the circumstances under which reduced precision is beneficial and elucidated the conditions of the machine learning model that enable this. Another piece of work associated with this grant is the design of reduced-precision number formats for graph analytics applications, such as Google's PageRank algorithm, belief propagation and the shortest-path problem. We have designed the number formats that are used to store data in memory, as well as efficient schemes to transform these formats into a format supported by the hardware, such that we can leverage native hardware for efficient computations. We have shown substantial speedups with this technique, for instance achieving 80% speedup with a mixed-precision PageRank algorithm that uses a mixture of FP64, FP32 and a custom 16-bit format. |
| Exploitation Route | Any company or individual developing applications for environments with constrained computations, such as distributed edge computing, internet of things, mobile devices and autonomous vehicles, may benefit from these findings. We have follow-on funding that could not be found in the relevant section: a research project funded through the US-Ireland tri-partite scheme between the National Science Foundation (NSF, USA), Science Foundation Ireland (SFI, Ireland) and the Department for the Economy (DfE, Northern Ireland), on "SWEET: Hardware and Software for Sustainable Wearable Edge InTelligence", NSF site: https://www.nsf.gov/awardsearch/showAward?AWD_ID=2315851 In the SWEET project, we continue to investigate the application of transprecision in AI model inference in edge computing environments. We are partnering with various industry partners, including AMD/Xilinx, Amazon and B-Secur. B-Secur specialises in health monitoring and we will explore the relevance of our technology to their applications throughout this project. |
| Sectors | Digital/Communication/Information Technologies (including Software) |
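The frame-dropping behaviour described in the findings above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the project's software: the function names, the fixed detector latency, and the rule that a detector starts on a frame only if it is idle when the frame arrives are all hypothetical simplifications. Dropped frames reuse the detections of the most recently processed frame, as the description states.

```python
def schedule_frames(num_frames, frame_interval_ms, detector_latency_ms):
    """For each frame, return the index of the frame whose detections it uses.

    Assumptions (hypothetical simplifications):
    - frames arrive at a fixed interval, starting at time 0;
    - the detector takes a fixed time per frame;
    - a frame arriving while the detector is busy is dropped and copies
      the detections of the most recently processed frame.
    """
    source = []
    busy_until = 0.0
    last_processed = None
    for i in range(num_frames):
        arrival = i * frame_interval_ms
        if last_processed is None or arrival >= busy_until:
            # Detector is idle: process this frame.
            last_processed = i
            busy_until = arrival + detector_latency_ms
        # Dropped frames inherit detections from last_processed.
        source.append(last_processed)
    return source


def dropped_fraction(source):
    """Fraction of frames that reuse detections from an earlier frame."""
    dropped = sum(1 for i, s in enumerate(source) if s != i)
    return dropped / len(source)


if __name__ == "__main__":
    # Hypothetical numbers: 25 frames per second (40 ms interval).
    fast = schedule_frames(6, 40, 30)
    slow = schedule_frames(6, 40, 100)
    print(fast, dropped_fraction(fast))
    print(slow, dropped_fraction(slow))
```

In this sketch a slower detector leaves more frames with stale detections, which is the effect that counterbalances the higher per-frame accuracy of complex networks.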
| Description | (SoftNum) - Software-Defined Number Formats: Bridging the Gap between Performance, Accuracy, and Security |
| Amount | € 224,934 (EUR) |
| Funding ID | 101031148 |
| Organisation | European Commission |
| Sector | Public |
| Country | Belgium |
| Start | 08/2022 |
| End | 08/2024 |
| Description | Asynchronous Scientific Continuous Computations Exploiting Disaggregation (ASCCED) |
| Amount | £202,212 (GBP) |
| Funding ID | EP/X01794X/1 |
| Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
| Sector | Public |
| Country | United Kingdom |
| Start | 03/2023 |
| End | 06/2024 |
| Description | RAPID: ReAl-time Process ModellIng and Diagnostics: Powering Digital Factories |
| Amount | £403,636 (GBP) |
| Funding ID | EP/V02860X/1 |
| Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
| Sector | Public |
| Country | United Kingdom |
| Start | 05/2022 |
| End | 05/2025 |
| Description | SWEET: Hardware and Software for Sustainable Wearable Edge Intelligence |
| Amount | £299,750 (GBP) |
| Funding ID | USI-226 |
| Organisation | Department for the Economy, Northern Ireland |
| Sector | Public |
| Country | United Kingdom |
| Start | 01/2024 |
| End | 09/2028 |
| Title | Dataset for Anomaly Detection in a Production Wireless Mesh Community Network |
| Description | CSV dataset generated by gathering data from a production wireless mesh community network. Data is gathered every 5 minutes during the interval 2021-04-13 00:00:00 to 2021-04-16 00:00:00. During the interval 2021-04-14 01:55:00 to 2021-04-14 18:10:00, a gateway in the mesh (node 24) fails. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2022 |
| Provided To Others? | Yes |
| Impact | None yet. |
| URL | https://zenodo.org/record/6169917#.Yh-rQy-l1pS |
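For readers who want to label samples from this dataset for anomaly detection, the intervals stated above can be encoded directly. The following is a minimal sketch: the interval boundaries come from the dataset description, while the helper names, the inclusive treatment of the boundaries, and the assumption that samples fall on an exact 5-minute grid are hypothetical. The dataset's actual column names are not given here, so the sketch works on timestamps only.

```python
from datetime import datetime, timedelta

# Intervals taken from the dataset description.
START = datetime(2021, 4, 13, 0, 0, 0)
END = datetime(2021, 4, 16, 0, 0, 0)
FAIL_START = datetime(2021, 4, 14, 1, 55, 0)
FAIL_END = datetime(2021, 4, 14, 18, 10, 0)
STEP = timedelta(minutes=5)


def is_failure(ts):
    """True if ts lies in the gateway-failure window (boundaries assumed inclusive)."""
    return FAIL_START <= ts <= FAIL_END


def sample_times():
    """Yield the nominal sampling instants, assuming an exact 5-minute grid."""
    ts = START
    while ts <= END:
        yield ts
        ts += STEP


if __name__ == "__main__":
    times = list(sample_times())
    labels = [is_failure(ts) for ts in times]
    print(len(times), "nominal samples,", sum(labels), "in the failure window")
```

Real timestamps in the CSV may deviate from the nominal grid, so `is_failure` is better applied to the parsed timestamps of each row than to the generated grid.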
| Description | Real-time video analytics for animal welfare monitoring - FlockFocus 2.B |
| Organisation | Queen's University Belfast |
| Country | United Kingdom |
| Sector | Academic/University |
| PI Contribution | We are a partner in the FlockFocus 2.B project, which is a multi-disciplinary collaboration between biological sciences and computer science, in particular video analytics (Centre for Secure Information Technologies, or CSIT) and our high-performance computing group. The project is sponsored by the Foundation for Food and Agriculture Research (FFAR) and McDonald's. Our role in the project is to enable execution of the full video analytics pipeline developed in previous iterations of the FlockFocus project, consisting of object detection, tracking and behavioural analysis, in an edge computing environment. Our task is also to determine what a reasonable hardware specification for that environment should be. |
| Collaborator Contribution | The partner in biological sciences provides insight on the required analytics and their accuracy. They also liaise with farms where cameras are set up to collect footage to use in the project. The partner in CSIT develops the video analytics pipeline. |
| Impact | This project started in January 2025 and no reportable outcomes are available yet. It is a multi-disciplinary collaboration between computer science (video analytics, edge computing) and biological sciences (animal welfare). |
| Start Year | 2025 |
| Description | Real-time video analytics on edge devices for animal welfare monitoring |
| Organisation | Queen's University Belfast |
| Department | Centre for Secure Information Technologies (CSIT) |
| Country | United Kingdom |
| Sector | Academic/University |
| PI Contribution | We are investigating the application of the transprecise object detection (TOD) step in video analytics that is being developed in DiPET to the use case of monitoring animal welfare using a video analytics pipeline. The pipeline consists of object detection, object tracking and behavioural analysis. The pipeline is to run in real-time on an edge computing environment or a smart camera. |
| Collaborator Contribution | The Centre for Secure Information Technologies (CSIT) is a partner in the FlockFocus project, sponsored by the Foundation for Food and Agriculture Research (FFAR) and McDonald's (https://www.qub.ac.uk/ecit/News/Queensacademicreceivesfundingtoenhancefarmedchickenwelfareresearch.html). The project aims to develop a vision-based system that leverages novel crowd analysis research and applies it to the tracking and behavioural analysis of a flock of chickens. This will enable researchers to monitor large numbers of birds, track their activity patterns and gather welfare indicators such as gait, feather cleanliness and incidents of play behaviour. CSIT is developing the object tracking and behavioural analysis techniques. |
| Impact | We have aided colleagues with applying the techniques developed in the DiPET project to improve object tracking in the use case of animal welfare monitoring. The outcome is a more efficient version of the object tracking algorithm that they previously developed. |
| Start Year | 2021 |
| Title | Real-time transprecise object detection in constrained computing environments |
| Description | Object detection in a video stream is the act of identifying specific objects and describing them by their coordinates in a video frame, typically a bounding box. A transprecise object detector uses a variety of object detection techniques with varying accuracy of identification and varying amounts of computation. We have developed a transprecise object detector that uses the size of bounding boxes in the frame to predict which object detection algorithm can provide sufficient accuracy with minimal computation and thus minimal energy consumption. When algorithms require too much computation, it will not be possible to complete them by the time the next frame arrives (soft real-time deadline) and frames need to be dropped. This results in reduced identification accuracy. Our transprecise object detector varies the underlying algorithm from frame to frame in order to meet deadlines as much as possible while maintaining sufficient accuracy (contradictory constraints). Over a video stream of several minutes, our method achieves higher accuracy than any of the underlying algorithms in isolation. The software is not available yet, but will be made available when sufficiently mature. |
| Type Of Technology | Software |
| Year Produced | 2021 |
| Impact | The software has not been shared publicly yet. We are looking into licensing the technology. We also have an updated version called ROMA which has additional features. |
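Since the software is not public, the selection idea described above can only be sketched. The example below is a hedged illustration, not the project's implementation: the detector names, the area thresholds, the use of the median relative box area, and the fallback to the most expensive detector when the previous frame has no detections are all hypothetical choices. It captures only the stated principle that large bounding boxes can be handled by cheaper detectors.

```python
from statistics import median

# Hypothetical tiers: (minimum median relative box area, detector name),
# ordered from cheapest to most expensive detector.
TIERS = [
    (0.10, "detector_small"),
    (0.03, "detector_medium"),
    (0.01, "detector_large"),
]
FALLBACK = "detector_full"  # hypothetical most capable, most expensive detector


def relative_area(box, frame_w, frame_h):
    """Area of an (x1, y1, x2, y2) box as a fraction of the frame area."""
    x1, y1, x2, y2 = box
    return max(0, x2 - x1) * max(0, y2 - y1) / (frame_w * frame_h)


def select_detector(boxes, frame_w, frame_h):
    """Pick a detector for the next frame from the previous frame's boxes.

    Larger boxes map to cheaper detectors. With no boxes to go on, this
    sketch assumes the most capable detector should be used.
    """
    if not boxes:
        return FALLBACK
    m = median(relative_area(b, frame_w, frame_h) for b in boxes)
    for threshold, name in TIERS:
        if m >= threshold:
            return name
    return FALLBACK


if __name__ == "__main__":
    print(select_detector([(0, 0, 50, 50)], 100, 100))  # large object
    print(select_detector([(0, 0, 5, 5)], 100, 100))    # small object
```

In practice the thresholds would have to be calibrated per detector set and target hardware; the values here are placeholders.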
| Description | Invited Talk - HiPEAC Computing Systems Week - Lyon |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Other audiences |
| Results and Impact | Presenting ideas on managing services in an edge computing or IoT context to an audience of academics and industry practitioners. |
| Year(s) Of Engagement Activity | 2021 |
| URL | https://www.hipeac.net/csw/2021/lyon/#/program/ |
