Dynamic Adaptation in Heterogeneous Multicore Embedded Processors
Lead Research Organisation:
University of Edinburgh
Department Name: Sch of Informatics
Abstract
The overall objective of this project is to investigate novel methods of automating the design of both the hardware and the software of embedded systems, to enable the timely creation of future generations of high-performance, low-power digital appliances. This is a vertically-integrated project, which brings together research in compilers, architectures, signal processing, and an economically-important emerging application area. Embedded processors are an integral part of our everyday lives, from smart phones and flash memory sticks to wireless communications, automotive computing, bio-medical devices, and many more. Future embedded processors will require significantly higher performance than the processors we have today. However, this must be achieved whilst also increasing their energy efficiency, as such systems are increasingly used in mobile or battery-operated devices. Performance cannot be increased simply by clocking devices at a higher frequency, as this significantly reduces energy efficiency. Previous research has shown that customizing a processor according to its application can provide a significant performance boost whilst simultaneously reducing energy consumption. Similarly, the use of multi-core processors, which can be specialized in heterogeneous ways, offers additional performance in a more energy-efficient way than can be achieved simply by the homogeneous replication of a fixed processor. The first challenge with application-specific processors, which is compounded in heterogeneous multi-core systems, is the vast array of possible designs from which to choose. This increasing complexity of the design space of computer systems, coupled with the drive for lower energy consumption, means that manual approaches to design are no longer feasible. Instead, by automating the process of searching the design space, it becomes possible to find the best designs.
However, this approach is computationally intractable, due to the sheer number of designs that must be considered. There is now strong evidence, from our prior work and that of others, that machine learning can provide a fast track to design-space exploration in both processor design and compiler design. The second challenge addressed by this project is variability in behaviour. For example, a broadband modem may wish to adapt its behaviour to the environmental conditions affecting signal quality. At the silicon level, factors such as temperature, process variation and operating voltage will affect the performance and energy consumption of the device. Devices that are able to adapt their hardware and software behaviour to meet these changing circumstances will not be constrained by worst-case analysis at design time, but will be able to tune their behaviour dynamically to meet actual real-time constraints. It is widely accepted that variability is a growing concern that requires a new approach. This project examines how dynamic adaptation in software and hardware can solve this problem. This will involve a combination of just-in-time compilation, to create more dynamic software, and just-in-time instruction set re-synthesis, to create dynamic processors. A key aspect of this project is the synergy between new design methods and an emerging application; in this case LED-based Visible Light Communication (VLC). The use of LED lighting is growing rapidly, due to its low energy consumption. LED light can also be modulated to carry a digital payload at speeds even higher than 100 Mbps. However, this presents a major computational challenge, which we aim to address using the dynamically adaptable customized multi-core processors and compilers outlined above. We aim to extend the use of machine learning from off-line (i.e. performed at design time) to on-line (i.e. performed during system operation).
Designs will be fabricated in silicon to demonstrate the impact of our research, and to enable real-time experimentation.
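As a rough illustration of what "finding the best designs" means when performance and energy pull in different directions, the sketch below filters a set of candidate designs down to its Pareto-optimal subset (the term used in the key findings later in this record). It is not the project's tool flow: the `DesignPoint` type, the configuration names and all delay and energy figures are hypothetical values invented purely for the example, and a machine-learning predictor of the kind described above would serve to avoid evaluating most candidates in the first place.

```python
from typing import List, NamedTuple


class DesignPoint(NamedTuple):
    """One candidate design; every field value used below is made up."""
    name: str      # hypothetical configuration label
    delay: float   # execution time in arbitrary units, lower is better
    energy: float  # energy consumed in arbitrary units, lower is better


def dominates(a: DesignPoint, b: DesignPoint) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.delay <= b.delay and a.energy <= b.energy
    strictly_better = a.delay < b.delay or a.energy < b.energy
    return no_worse and strictly_better


def pareto_front(points: List[DesignPoint]) -> List[DesignPoint]:
    """Keep only the points that no other point dominates (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative candidates only; these are not measurements from the project.
candidates = [
    DesignPoint("1-core-baseline", 10.0, 5.0),
    DesignPoint("2-core-custom", 6.0, 4.0),
    DesignPoint("4-core-homog", 4.0, 9.0),
    DesignPoint("4-core-hetero", 4.0, 6.0),
    DesignPoint("8-core-homog", 3.5, 14.0),
]

front = pareto_front(candidates)
front_names = [p.name for p in front]
print(front_names)
```

The quadratic scan is adequate for a handful of points. The difficulty described in the abstract is that real design spaces are far too large to enumerate and evaluate in this way, which is the motivation for predicting promising regions rather than evaluating every design.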
Planned Impact
This project will have two main areas of impact: (1) in the design of next-generation embedded systems, and (2) in the realization of system-on-chip solutions for free-space optical communications. Our work in system synthesis will enable the optimization of systems that would previously have been considered impractical. If successful, the project will also provide new design tools that will reduce NRE costs for embedded systems, and open up new application areas. The beneficiaries of this will be the embedded systems industry, from processor IP companies and compiler vendors, to fabless semiconductor companies and system integrators who build electronic devices for use in consumer, automotive, medical, telecommunications and energy industries. By applying dynamic adaptation to embedded systems we hope to solve the fundamental problem of how to cope with on-chip variation at silicon technologies below 65nm. At present this is an unsolved problem, requiring manufacturers to design-in expensive performance margins. Potential beneficiaries will be any future user of a battery-operated device, who will see better performance and longer battery life. The worldwide economic and environmental benefit of a switch to LED lighting is huge, and will drive the uptake of LED lighting. If this project is able to deliver a system-on-chip solution capable of high-bandwidth digital communication through LED-based visible light, this would have a far-reaching impact. Such LED light fittings would operate as both sources of low energy lighting and optical wireless access points. This would have a huge impact across a wide range of end-user products spanning the domestic, business, medical, and transportation domains. If the availability of VLC stimulates LED lighting uptake, then a potential future environmental benefit could be a significant reduction in CO2 emissions.
This project will also extend and help to sustain the UK skill base in high-performance processor design and nanometre-scale silicon implementation. This project will benefit the UK skills base by training new doctoral students in these highly-specialized skills. The project contains a 10-point plan for maximizing the impact of the research:
1. We shall build demonstrator systems capable of showcasing the theories and algorithms underpinning our work.
2. In the final year of the project we will organise an Innovation Workshop, in order to disseminate our research results to UK and European SMEs.
3. We will engage with potential industrial beneficiaries, to share technologies for research purposes during the project.
4. We will leverage our recent experience in technology licensing to ensure that new technologies emerging from our research are transferred to industry.
5. The formation of a spin-out company will also be considered as a route to industrial exploitation of the VLC demonstrator.
6. We will continue to use online media to communicate to the academic community via http://groups.inf.ed.ac.uk/pasta/
7. Postgraduate skill sets will be enhanced through training in nanometre-scale chip design.
8. To maximize the academic impact we shall publish our research in the most respected journals and the top conferences in the area.
9. Elements of the demonstrator platform will be offered to academic collaborators to stimulate exchange of ideas.
10. To maximize the academic benefit of our work we shall promote bilateral meetings between our group and others.
The timescales for realising these benefits range from 1 to 5 years. The technology demonstrators will act as proof-of-concept, reducing the time needed to mature the ideas before further exploitation to a year or two. The economic benefits of LED-based communications will be realised when uptake grows, which is impossible to predict.
However, these devices are driven by Moore's law, suggesting rapid adoption and potential for widespread use within a few years.
Publications
Almer O
(2011)
Architecture of Computing Systems - ARCS 2011
Almer O
(2012)
A Parallel Dynamic Binary Translator for Efficient Multi-Core Simulation
in International Journal of Parallel Programming
Edler Von Koch T
(2013)
Limits of region-based dynamic binary parallelization
in ACM SIGPLAN Notices
Franke B
(2012)
Statistical Performance Modeling in Functional Instruction Set Simulators
in ACM Transactions on Embedded Computing Systems
Ghimire B
(2012)
Self-organising interference coordination in optical wireless networks
in EURASIP Journal on Wireless Communications and Networking
Hanzo L
(2012)
Wireless Myths, Realities, and Futures: From 3G/4G to Optical and Quantum Wireless
in Proceedings of the IEEE
Murray A
(2012)
Compiling for automatically generated instruction set extensions
Nigel Topham (Author)
(2012)
Novel Unipolar Orthogonal Frequency Division Multiplexing (U-OFDM) for Optical Wireless
Nigel Topham (Author)
(2011)
Enhanced Subcarrier Index Modulation (SIM) OFDM
Nigel Topham (Author)
(2012)
Optimal Power Allocation in Spatial Modulation OFDM for Visible Light Communications
Popoola W
(2013)
Error Performance of Generalised Space Shift Keying for Indoor Visible Light Communications
in IEEE Transactions on Communications
Popoola W
(2012)
Generalised space shift keying for visible light communications
Spink T
(2014)
Efficient code generation in a region-based dynamic binary translator
in ACM SIGPLAN Notices
Sundararajan K
(2012)
The Smart Cache: An Energy-Efficient Cache Architecture Through Dynamic Adaptation
in International Journal of Parallel Programming
Sundararajan K
(2011)
Smart cache: A self adaptive cache architecture for energy efficiency
Sundararajan K
(2012)
Cooperative partitioning: Energy-efficient cache partitioning for high-performance CMPs
Sundararajan K
(2011)
A reconfigurable cache architecture for energy efficiency
Description | This project achieved all four main objectives. These were: (1) to explore the hardware/software design of dynamically-adaptable embedded processors; (2) to advance our knowledge of DSP algorithms and physical interfaces required to implement visible light communication (VLC); (3) to explore the synergy between DSP algorithms and VLC within the first two objectives; and (4) to investigate the practical viability of our research results through the construction of demonstrator systems for VLC. Under objective (1) we investigated a range of topics from dynamic micro-architecture adaptation to hardware-software co-design of multi-core embedded systems. This resulted in the development of new techniques for dynamic cache adaptation and design space exploration. For example, we showed that our Smart Cache approach reduced the energy-delay product of last-level data caches by up to 70% for 2-core systems, and 12% for 4-core systems. We also developed a cooperative caching mechanism, yielding average dynamic and static energy savings of 35% and 25% respectively, compared to a fixed partitioning scheme. One of our aims was to explore the use of machine learning to design embedded systems for VLC. We developed a strategy, based on machine learning techniques, to model how the parameterization of a custom instruction-set extension tool affects the target objectives, given the characteristics of the application. Our predictor was able to suggest a subset of parameters that are likely to lead to optimal hardware implementations. This method was evaluated on a resource sharing problem, which is typical in high-level synthesis, where the trade-offs between area and performance need to be explored. In a case study, we showed that the technique can reduce the number of design points that need to be explored in order to find the Pareto-optimal solutions by two orders of magnitude.
We also used a machine-learning-based approach in the context of on-chip interconnect design. We were able to demonstrate the feasibility of such models in predicting good network configurations, based on sample data from a single profiling run of an embedded multi-core application on a reference platform. Our results showed that it was possible to determine optimum network configurations up to 280 times faster than methods based on exhaustive search. Under objective (2) we developed novel VLC/LiFi transmission algorithms with the aims of minimising computational complexity, maximising energy efficiency and delivering high spectrum efficiency. These algorithms were implemented on the 32-core test chip developed within the project. One of the primary coding schemes for VLC is orthogonal frequency division multiplexing (OFDM). Under objective (3) we explored new ways to efficiently map OFDM computations for VLC onto a novel heterogeneous system-on-chip (SoC) architecture. This resulted in the development of a collection of energy-efficient fixed-function cores, each targeting a specific phase of the OFDM algorithm. These were then integrated within a novel streaming data engine that operated in parallel with the processors of the SoC, to maximize throughput and overall energy efficiency. We developed extended versions of the RISC processors used in the design, in order to streamline the handling of Fast Fourier Transforms (FFT), one of the core computations within OFDM. We found that this heterogeneous approach, involving a combination of general-purpose RISC processors, specialized RISC processors, and fixed-function streaming data engines, leads to a VLC solution that is highly efficient in terms of both silicon die area and energy consumption. Under objective (4) we brought together the results from across the project to create a demonstration system for visible light communications.
At the centre of this was a custom silicon chip, designed and implemented within the project. This was a 16 sq.mm chip implemented in a 65nm CMOS technology. It contained 32 embedded processor cores, 7 streaming data engines, four high-speed serial I/O links, some fast on-chip memory, and an AXI-based interconnect fabric. The integration of our research results within a silicon chip has allowed us to evaluate the cost/performance trade-offs with greater realism than would have been possible through simulation and modelling alone. Our most significant achievements were: (1) The creation of a 32-core, 50-million-transistor chip within an academic research project. This demonstrated the capability of the research team to build large-scale silicon devices at comparatively low cost. (2) The development of multiple input multiple output (MIMO) algorithms for intensity modulation / direct detection based on space shift keying and spatial modulation (the latter was invented by the Co-I, and investigated in EPSRC grant EP/G011788/1, Spatial Modulation). (3) The development of novel energy-efficient and spectrum-efficient data encoding techniques for intensity modulation / direct detection systems. The new technique is referred to as enhanced unipolar orthogonal frequency division multiplexing (eU OFDM), and it doubles the data rate compared to state-of-the-art techniques. |
Exploitation Route | These findings have been taken forward in a proof-of-principle follow-on project, funded by Edinburgh Research and Innovation, to enable further development of the silicon implementation of the VLC chip that was carried out in this project. In the future, this and other findings will be taken forward through further collaborative research projects in emerging LiFi technologies. Our own, and other, research groups working on LiFi will benefit from the research publications on VLC-related topics from this project. The construction of realistic demonstration systems has acted as a catalyst for further research in this area, and continues to provide an on-going research facility for follow-on research projects in VLC at the University of Edinburgh. The University of Edinburgh has established the LiFi R&D Centre, in part to take forward the findings of this project, and has invested in excess of £2M in funding for an experimental officer, two PDRAs, two junior academics (Dr. Popoola and Dr. Safari), and a senior business development executive. The two RAs are working to integrate the LiFi receiver chip, the LiFi transmitter chip (both developed in EP/K00042X/1, UP-VLC) and the PASTA-2 baseband chip (EP/I013539/1) into a single reference platform with a small form factor. The target TRL (technology readiness level) of the reference platform is 6, and the goal is to license the reference platform to industry. To that end, the LiFi R&D Centre is currently engaging with various industry sectors. |
Sectors | Aerospace, Defence and Marine; Digital/Communication/Information Technologies (including Software); Electronics; Healthcare |
URL | http://groups.inf.ed.ac.uk/pasta/ |
Description | The eU OFDM invention has been patented, and licensed to a spin-out company, pureLiFi Ltd. The knowhow generated by producing a 32-core heterogeneous SoC design is being used to further the development of subsequent many-core chips, in academia and in the semiconductor industry, both in the UK and beyond. This project has helped to propel the LiFi concept, through the achievement of the objectives outlined in the key findings. In particular, economic impact has been created through the formation of a spin-out company, pureLiFi Ltd. Many team members contributed to this project, and in doing so acquired exceptional skill sets. Some of these are now being deployed in UK industry, contributing to the economic and social well-being of the nation. The societal and economic impact from these advancements in LiFi will emerge over the next few years. Early signs of this include the recent announcement that future Apple phones will be LiFi capable, and that the IEEE is now beginning to standardise LiFi (IEEE 802.15.7). Society as a whole is beginning to become aware of LiFi, with a reported media reach of 1.8bn people (November 2015; source: Meltwater). Articles have appeared in the popular media, including BBC, CNN, TIME, The Times, El País, etc. Prof Haas has presented two TED Global talks on LiFi, in July 2011 (2.3m views) and September 2015 (1.3m views). The investigators continue to work in this area, and will continue to build on the research results from this project to ensure that the impact from this work materializes as expected. |
First Year Of Impact | 2014 |
Sector | Electronics, Energy |
Impact Types | Societal; Economic; Policy & public services |