The Internet of Silicon Retinas (IoSiRe): Machine to machine communications for neuromorphic vision sensing data

Lead Research Organisation: University College London
Department Name: Electronic and Electrical Engineering


This proposal starts from the observation that, for future visual sensing in next-generation Internet-of-Things surveillance, drone technology, and robotics, sampling and processing raw pixels is going to be extremely inefficient in terms of energy consumption and reaction time. After all, the most efficient visual computing systems we know, i.e., biological vision and perception in mammals, do not use pixels or frame-based sampling. IOSIRE therefore argues that we need to explore the feasibility of advanced machine-to-machine (M2M) communications systems that directly capture, compress and transmit neuromorphically-sampled visual information to cloud computing services, producing content classification or retrieval results with extremely low power and low latency.

IOSIRE aims to build on recently-devised hardware for neuromorphic sensing, a.k.a. dynamic vision sensors (DVS) or silicon retinas. Unlike conventional global-shutter (frame-based) sensors, DVS cameras capture asynchronous on/off events triggered by brightness changes in the observed scene. Remarkably, DVS cameras achieve this with (i) a 10-fold reduction in power consumption (10-20 mW instead of hundreds of milliwatts) and (ii) a 100-fold increase in speed (e.g., when the events are rendered as video frames, rates of 700-2000 frames per second can be achieved).

In more detail, the IOSIRE project proposes a fundamentally new paradigm in which DVS sensing and processing produce a layered representation that can be used locally to derive actionable responses via edge processing, while selected parts can also be transmitted to a server in the cloud to derive advanced analytics and services. The classes of services considered by IOSIRE require a scalable and hierarchical representation for multipurpose usage of DVS data, rather than a fixed representation suited to an individual application (such as motion analysis or object detection). Indeed, this is where IOSIRE radically departs from existing DVS approaches: instead of constraining applications to on-board processing, we propose layered data representations and adaptive M2M transmission frameworks for DVS data, which are mapped to each application's quality metrics, response times, and energy consumption limits, and will enable a wide range of services by selectively offloading the data to the cloud. The breakthrough targeted by IOSIRE is a framework with extreme scalability: in comparison to conventional designs for visual data processing and transmission over M2M networks, and under comparable reconstruction, recognition or retrieval accuracy in applications, up to a 100-fold decrease in energy consumption (and associated transmission delay/reaction time) will be pursued. Such ground-breaking performance gains will be demonstrated via proof-of-concept designs and will influence the design of future commercial systems.



Anarado I (2017) Mitigating Silent Data Corruptions in Integer Matrix Products: Toward Reliable Multimedia Computing on Unreliable Hardware, in IEEE Transactions on Circuits and Systems for Video Technology

Bi Y (2020) Graph-based Spatio-Temporal Feature Learning for Neuromorphic Vision Sensing, in IEEE Transactions on Image Processing

Chadha A (2019) Video Classification With CNNs: Using the Codec as a Spatio-Temporal Activity Sensor, in IEEE Transactions on Circuits and Systems for Video Technology

Chadha A (2019) Improved Techniques for Adversarial Discriminative Domain Adaptation, in IEEE Transactions on Image Processing

Jubran M (2020) Rate-Accuracy Trade-Off in Video Classification With Deep Convolutional Neural Networks, in IEEE Transactions on Circuits and Systems for Video Technology

Description Unlike existing projects in IoT visual sensing that enhance compression or transmission aspects for conventional video data, IOSIRE took a foundational approach by reconsidering the essentials of visual sensing and working out new representation, compaction and transmission schemes based on recently-developed dynamic vision sensing technologies. This complements existing work by other academic and industrial groups in the UK and worldwide in IoT, sensing and low-power systems.

We have now begun to derive results showing that, at the same accuracy (e.g., confusion-matrix values for classification or mean average precision (mAP) in visual search), we obtain a 10 to 100-fold reduction in energy consumption and delay compared to conventional video streaming, thereby enabling applications that would be impossible with conventional video-based IoT systems. As outlined in our proposal, this stems from the aggregate of: (i) up to 10-fold reduction in sensing power from DVS vs. conventional cameras; (ii) up to 5-fold reduction in bandwidth due to T1.2 (in comparison to video coding); (iii) 2 to 5-fold reduction of redundant transmissions due to T2.2, T2.3, T3.1 and T3.2 (fountain coding, adaptive modulation and full-duplex (FD) capabilities, and combining their advantages via adaptive NET-application co-learning); (iv) up to 3-fold reduction in traffic volume sent to the cloud via the adaptive edge processing of WP3.
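As a rough illustration, the stage-wise "up to" factors above compound multiplicatively. The sketch below assumes the factors are independent (an optimistic assumption, since the savings partly overlap in practice), which is why the project's stated 10 to 100-fold target sits well inside the resulting upper bound.

```python
# Upper-end reduction factors quoted in the text; treating them as
# independent multiplicative gains is an illustrative assumption.
factors = {
    "DVS sensing vs. conventional camera": 10,   # (i)
    "T1.2 compaction vs. video coding":     5,   # (ii)
    "T2.x/T3.x transmission reduction":     5,   # (iii), quoted range 2-5
    "WP3 edge processing (cloud traffic)":  3,   # (iv)
}

upper_bound = 1
for stage, gain in factors.items():
    upper_bound *= gain

print(upper_bound)  # 750: optimistic bound; the stated target is 10-100x
```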
Exploitation Route Possibilities for including the practical outcomes of this research in future chip designs by MediaTek and iniLabs will be assessed, along with commercial licensing and further collaboration possibilities. Thales can also assess the feasibility of including DVS and M2M communications in its roadmap for visual surveillance and monitoring systems. Our DVS-IoT system simulations/emulations may also be deployed in advancing the LinkIt development platform and the EDA design-automation software offered to this project by MediaTek and Keysight, respectively. Beyond such activities, by open-sourcing the derived software for artificial generation of DVS triggering events from conventional frame-based video footage (WP1), large video datasets that have been manually annotated for classification and retrieval tests can be converted into DVS streams. This will substantially facilitate the training of deep neural networks in different contexts, which can become a catalyst for further R&D work in the field.
Sectors Aerospace, Defence and Marine; Digital/Communication/Information Technologies (including Software)

Amount £30,878 (GBP)
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 08/2017 
End 08/2021
Description Enabling Visual IoT Applications with Advanced Network Coding Algorithms
Amount € 195,454 (EUR)
Funding ID 750254 
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 01/2018 
End 01/2020
Description Leverhulme Trust Senior Research Fellowship
Amount £51,380 (GBP)
Funding ID LTSRF1617/13/28 
Organisation The Royal Society 
Department Royal Society Leverhulme Trust Senior Research Fellowship
Sector Charity/Non Profit
Country United Kingdom
Start 09/2017 
End 09/2018