Power-Adaptive Computing: Run-Time Management Design on FPGA SoC Devices
Lead Research Organisation:
Newcastle University
Department Name: Sch of Engineering
Abstract
Traditional low-power design methodologies for embedded systems have focused on delivering performance-driven designs, where energy savings are achieved by relaxing the execution time of tasks while still meeting real-time deadlines. Conventional design strategies can no longer ensure computational certainty in emerging ubiquitous systems that operate under varying levels of power, such as those supplied by ambient energy sources. Because of this dynamic power behaviour, the stability and computational progress of systems cannot be guaranteed, particularly when recurring power fluctuations or insufficient power budgets prevent tasks from completing and degrade the quality of applications. This necessitates new runtime management solutions that enable versatile systems, in which software and hardware handles effectively convert the available energy into computing power and guarantee task retention or completion under a dynamic power profile.
Heterogeneous CPU-GPU based systems can achieve various performance and energy trade-offs, but when energy is scarce, the same level of functionality can be achieved through highly customized hardware, such as FPGAs or ASICs. The increasing capabilities of high-level synthesis tools for logic design and shorter application development times make FPGA SoCs appealing for custom designs. Offline power estimation tools make it possible to determine probabilistic system power consumption by simulating circuits. However, such an approach cannot provide enough information about a system's power behaviour at run-time. Complex designs require in-depth knowledge of sub-system level power usage. Providing such information at the high-level application layer, combined with adaptive run-time algorithms and power management techniques, would therefore facilitate the design of energy-efficient and power-adaptive applications.
The key aim of this research is to establish intelligent power-adaptive run-time management, which autonomously makes computing decisions in order to mitigate computational uncertainties and ensure continuous functionality modulated by the incoming levels of power. For this purpose, power-awareness will be introduced by analysing and modelling data obtained from built-in hardware monitors in order to formulate power budgets. The decision to schedule computing tasks at run-time will be facilitated by machine learning algorithms coupled with power management techniques, such as voltage-frequency scaling. The research project is intended to develop a tool to support the design of power-adaptive systems, in which computing actions are governed by run-time routines that modulate future computations within the energy or power constraints.
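As a rough illustration of how a power budget could drive voltage-frequency scaling, the Python sketch below picks the fastest operating point whose estimated power fits the budget. It is not the project's implementation: the `OperatingPoint` type, the function names, the effective capacitance, the static power figure and the textbook model P = C * V^2 * f + P_static are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class OperatingPoint:
    """Hypothetical voltage-frequency pair exposed by a platform."""
    voltage_v: float
    freq_mhz: float


def estimated_power_mw(op: OperatingPoint,
                       c_eff_nf: float = 1.0,
                       static_mw: float = 50.0) -> float:
    """Textbook CMOS estimate: P = C * V^2 * f + P_static.

    With C in nF and f in MHz the dynamic term comes out in mW.
    Both default parameters are illustrative assumptions.
    """
    return c_eff_nf * op.voltage_v ** 2 * op.freq_mhz + static_mw


def select_operating_point(points: Sequence[OperatingPoint],
                           budget_mw: float) -> Optional[OperatingPoint]:
    """Return the highest-frequency point that fits the power budget.

    Returns None when no point fits, in which case a real system
    would have to suspend or checkpoint the task instead.
    """
    feasible = [p for p in points if estimated_power_mw(p) <= budget_mw]
    if not feasible:
        return None
    return max(feasible, key=lambda p: p.freq_mhz)


# Assumed operating points, for illustration only.
POINTS = [
    OperatingPoint(0.9, 200.0),
    OperatingPoint(1.0, 400.0),
    OperatingPoint(1.1, 800.0),
]

if __name__ == "__main__":
    for budget in (100.0, 500.0, 2000.0):
        print(budget, select_operating_point(POINTS, budget))
```

In a real system the power estimate would come from on-chip monitors rather than a fixed formula, and the budget would be updated as the incoming power changes.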
Description | Heterogeneous System-on-Chip (SoC) devices, consisting of multi-purpose processors, have enabled the development of high-performance applications in domains including big data analytics, artificial intelligence and machine vision. Managing energy consumption in these devices is highly challenging due to their architectural complexity and the dynamic workloads of applications. Continuous management of the available hardware and software resources is required to guarantee the expected performance or quality of service (QoS) while minimizing the energy consumption of a system. In this work, a programming model-based framework has been developed on Intel's FPGA SoC device. It makes it possible to specify high-level annotations of power, energy and performance requirements in applications through a purpose-designed software interface. The framework incorporates monitoring and profiling features to model the system's characteristics through online machine learning. The underlying model of the system is coupled with a runtime control scheme, which finds an optimal system configuration that meets the required performance while minimizing energy consumption at runtime. In addition to the development of the power-adaptive framework for FPGA SoCs, the research efforts have shifted towards resource management of distributed many-core servers. An important requirement for servers is the ability to scale and support user traffic demands at the required QoS with minimal energy costs. Resource management of distributed systems is challenging due to the complex decision space, the dynamic environments of applications and often limited modelling data. While control decisions can be learnt at runtime, e.g., using model-free reinforcement learning, this typically requires extensive exploration time and could disrupt application or service performance during the learning phase.
To provide scalability and low-cost continuous adaptation to workloads while minimizing energy consumption, a runtime controller has been developed using the principle of transfer Q-learning. The results show that transfer learning reduces learning requirements and achieves better QoS than model-free reinforcement learning solutions. In addition, the ability to transfer already learnt control decisions improves portability and operational optimality when the system scales out. |
Exploitation Route | The established expertise in runtime management methods is applicable to a wide range of applications and systems, including low-power devices such as wearable electronics and mobile devices, as well as larger-scale distributed many-core servers. For example, to deliver optimized computational capability on devices powered by minuscule batteries or energy harvesters with limited and variable power levels, runtime management is required to control the available software and hardware resources so that the system adapts to varying power and energy levels. In systems where power is always available, e.g., servers, the key aim is to minimize energy and power consumption to lower operating costs while delivering the desired quality of service. |
Sectors | Electronics Energy |
URL | https://github.com/nclaes/pacl |
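The transfer Q-learning principle named in the description above can be illustrated with a minimal tabular sketch in Python. This is not the code from the linked repository: the two-level load states, the `slow`/`fast` actions, the reward values and the `QController` class are hypothetical, and exhaustive sweeps stand in for the exploration policy a real controller would need.

```python
from collections import defaultdict

LOADS = ["low", "high"]      # assumed workload states
ACTIONS = ["slow", "fast"]   # assumed frequency settings


def reward(load: str, freq: str) -> float:
    """Negative energy cost, with a penalty when QoS is violated."""
    energy = {"slow": 1.0, "fast": 3.0}[freq]
    qos_met = not (load == "high" and freq == "slow")
    return -energy if qos_met else -10.0


class QController:
    """Tabular Q-learning controller (illustrative only)."""

    def __init__(self, actions, alpha: float = 0.5, gamma: float = 0.9):
        self.actions = list(actions)
        self.alpha = alpha
        self.gamma = gamma
        self.q = defaultdict(float)  # (state, action) -> value

    def greedy(self, state: str) -> str:
        """Best known action; ties resolve to the first action."""
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, r, next_state) -> None:
        """Standard one-step Q-learning update."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = r + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])

    def transfer_from(self, source: "QController") -> None:
        """Seed this controller with values learnt by another one."""
        self.q.update(source.q)


def train(controller: QController, sweeps: int) -> None:
    """Visit every (load, action) pair; the load is held fixed per step."""
    for _ in range(sweeps):
        for load in LOADS:
            for freq in controller.actions:
                controller.update(load, freq, reward(load, freq), load)


if __name__ == "__main__":
    source = QController(ACTIONS)
    train(source, sweeps=200)

    # A newly added node starts from the source's table, not from zero.
    target = QController(ACTIONS)
    target.transfer_from(source)
    print({load: target.greedy(load) for load in LOADS})
```

The transferred controller can act on the source's learnt decisions immediately and keep refining them with further updates, which is the intuition behind the reduced learning requirements reported above.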
Description | Collaborative work and research presentation at Nokia Bell Labs |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Industry/Business |
Results and Impact | I spent one month at Nokia Bell Labs (Cambridge) in July 2019, working with the Pervasive Systems research group. The work focused on power-adaptive runtime management for artificial intelligence applications running on low-power devices. During this time, I gave a presentation about my university research to a research group at the Nokia Bell Labs offices in Cambridge. The work carried out during my visit was also presented, covering the energy-efficient design of audio processing for audio event recognition applications on low-power microcontrollers. |
Year(s) Of Engagement Activity | 2019 |
Description | Design, Automation and Test in Europe (DATE 2020) Conference |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | I attended the virtual Design, Automation and Test in Europe (DATE 2020) Conference. I had the chance to listen to high-quality presentations in the fields of adaptive computing and energy-constrained systems, which allowed me to learn about new research directions and interesting methodologies that can be used in future research work. |
Year(s) Of Engagement Activity | 2020 |
URL | https://past.date-conference.com/proceedings-archive/2020/ |
Description | Design, Automation and Test in Europe (DATE 2021) Conference |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | I attended the virtual Design, Automation and Test in Europe (DATE 2021) Conference and participated in the workshop on System-level Design Methods for Deep Learning on Heterogeneous Architectures (SLOHA 2021). I published a research paper and presented it both live and as a recorded presentation. This allowed me to have useful discussions and receive detailed feedback on my work from the community. |
Year(s) Of Engagement Activity | 2021 |
URL | https://www12.cs.fau.de/ws/sloha2021/ |
Description | Design, Automation and Test in Europe (DATE 2022) Conference |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Participation in the virtual Design, Automation and Test in Europe (DATE 2022) Conference, which took place on 14-23 March 2022. The conference involved attending presentations related to my research work and learning about the latest research directions in the field. I published a paper on Runtime Energy Minimization of Distributed Many-Core Systems using Transfer Learning, which was presented as a 20-minute video on the conference platform and at a live virtual event. |
Year(s) Of Engagement Activity | 2022 |
URL | https://www.date-conference.com/ |