Realistic fault modelling to enable optimization of low power IoT and Cognitive fault-tolerant computing systems

Lead Research Organisation: University of Glasgow
Department Name: School of Engineering

Abstract

For the future ICT industry, the elephants in the room are the Internet of Things (IoT) and Artificial Intelligence (AI). They are driving the fourth industrial revolution, which is profoundly changing how we live and interact. The main issues for IoT and AI have been identified as power, security, and cost. This project is co-created with the industrial partners and focuses on the power issue.

One of the most effective ways of reducing power is to lower the operation voltage, Vg, towards the transistor threshold voltage, Vth. This has recently motivated extensive research in near-threshold-voltage computing. As Vg approaches Vth, the operation window (Vg-Vth) shrinks and the system becomes increasingly vulnerable to instability in Vth: a small rise in Vth can effectively switch off a transistor. Instability causes faults in operation, such as read and write errors in SRAM and timing errors in digital circuits. It is a limiting factor for how far (Vg-Vth), and in turn power consumption, can be reduced.

One of the critical tasks in low power system optimization is to find the minimum operation voltage and power consumption that will still deliver a specified yield 'Y' after 'X' years at a temperature below 'T'. To complete this optimization, designers need a fault analysis model that gives the time evolution of the probability distributions of Vth and the driving current, Id, and hence the probability of finding them at a given distance from their target values. The further Vth and Id depart from their target values, the more likely a circuit is to fail.
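
To make the optimization task above concrete, the following is a minimal sketch and not the project's model. It assumes, purely for illustration, that Vth is Gaussian, that its mean and spread drift with a power law in time, that a device fails when (Vg-Vth) falls below a fixed minimum, and that devices fail independently. Temperature dependence is omitted. All function names and numerical values are hypothetical.

```python
import math


def fail_probability(vg, mean_vth, sigma_vth, min_overdrive):
    """P(Vg - Vth < min_overdrive) for a Gaussian Vth (illustrative assumption)."""
    z = (vg - min_overdrive - mean_vth) / sigma_vth
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def aged_vth(t_years, vth0=0.30, sigma0=0.02, drift=0.02, spread=0.005, n=0.2):
    """Hypothetical power-law ageing: returns (mean, sigma) of Vth in volts."""
    growth = t_years ** n
    mean = vth0 + drift * growth
    sigma = math.sqrt(sigma0 ** 2 + (spread * growth) ** 2)
    return mean, sigma


def circuit_yield(vg, t_years, n_devices, min_overdrive=0.05):
    """Probability that none of n_devices independent devices has failed."""
    mean, sigma = aged_vth(t_years)
    p = fail_probability(vg, mean, sigma, min_overdrive)
    if p >= 1.0:
        return 0.0
    # (1 - p) ** n_devices, computed via log1p for numerical stability.
    return math.exp(n_devices * math.log1p(-p))


def min_voltage(target_yield, t_years, n_devices, lo=0.2, hi=1.2, tol=1e-4):
    """Bisect for the lowest Vg in [lo, hi] that still meets target_yield."""
    if circuit_yield(hi, t_years, n_devices) < target_yield:
        raise ValueError("target yield is not reachable within the voltage range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if circuit_yield(mid, t_years, n_devices) >= target_yield:
            hi = mid  # mid is safe, so try a lower voltage
        else:
            lo = mid  # mid fails, so a higher voltage is needed
    return hi


if __name__ == "__main__":
    for years in (0, 10):
        vg = min_voltage(target_yield=0.99, t_years=years, n_devices=1_000_000)
        print(f"Y=0.99 after {years} years needs Vg of about {vg:.3f} V")
```

In this sketch the required voltage rises with the target lifetime because the assumed Vth distribution shifts and widens with time. A realistic fault model would replace `aged_vth` with measured, time-dependent distributions of Vth and Id.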

Despite decades of research, a reliable fault model is still not available. Indeed, a recent review places the lack of a realistic fault model at the top of the list of challenges for Cognitive Computing System design. Although the need for this model is clear, even world-leading EDA suppliers and foundries cannot deliver it, and current SPICE models simply do not include jitter. This is related to weaknesses of previous research, including a statistically inconsistent bottom-up methodology, a limited time window, weak model verification criteria, and neglect of the interaction between different instability sources.

The fabless UK IC-design companies use foundries for their chip fabrication, and software is the essential bridge between designers and foundries. As there are no generally accepted realistic fault models at present, designers have to rely on adding a guard-band (design margin) obtained from an empirical 'worst case guess'. This contributes to the substantial discrepancy between design and Si performance. As CMOS nodes are downscaled to the nanometre range, the stochastic spreading of device parameters dramatically increases this discrepancy, which has been identified as a major challenge for optimizing the design of low power IoT and Cognitive Computing Systems.

The aim of this project is to provide the world's first test-proven fault model that enables statistical, dynamic, and quantitative analysis of fault rate and, in turn, the optimization of low power IoT and Cognitive Computing Systems. Novel techniques and methodologies will be employed to overcome the weaknesses of earlier work, including a top-down approach to remove device selection, an advanced data acquisition method for a long time window, qualification of the model by its prediction capability, and coverage of the interactions between different sources of instability. The developed model will be tested against the Si performance of real circuits together with the industrial project partners. If successful, it will deliver a paradigm shift from one-size-fits-all to application-specific fault analysis and optimization, reducing power and time-to-market.

Planned Impact

The impact of this project will be delivered to three groups: the project partners, the microelectronics industry, and the public in society. The impact on each of them is given below.

The industrial project partners: The microelectronics industry in the UK is dominated by IP-driven design houses, with ARM as the world's largest provider of semiconductor IP. Since they are fabless, they rely heavily on simulation to verify and optimize their designs. This proposal aims at developing the world's first test-proven fault model that enables statistical, dynamic, and quantitative analysis of fault rate and, in turn, the optimization of low power IoT and Cognitive Computing Systems.

In addition to taking steering advice from the partners, it has been agreed that the team will work on test samples supplied by the partners, so that the results and knowledge gained from the project will be of direct interest to them. The developed models and technology will be implemented in the project partners' EDA tools. It is planned to test the simulation accuracy and success together with the partners on their real designs and circuits. To strengthen the impact, it has been agreed that some progress meetings will be held at the partners' organizations.

The broad microelectronics industry: Research outcomes that do not involve IP rights will be disseminated to a wide range of companies, mainly through the Logic Devices Industrial Consortium at IMEC and the global customer base of Synopsys. Since IMEC is a partner of this project, the research outcomes will be disseminated to the consortium through progress meetings. The customers of Synopsys should be particularly interested in the outcome of this project, as they already use Synopsys simulators.

Dissemination to the national microelectronics community will be through IET and Sensor City initiatives, which also involve numerous members of the National Microelectronics Institute (NMI) and are facilitated by Innovate UK. As IoT develops rapidly, the UK's national interest in it grows, and IET, NMI and Innovate UK are organizing various events and workshops in this area. The team will contribute to these events and workshops by reporting the findings of this project. Dissemination to the international community will be through high impact publications, networks, and committees.

Impact on the public in society: In the DTI's 'Electronics 2015' report authored by the 'Electronics innovation and growth team', the Secretary of State for Trade and Industry states 'Electronics today tends to be invisible but in fact is all pervasive, and is commonly an enabler in most sectors - from retail to defence and healthcare.' IoT and AI will profoundly affect our daily life, and microelectronics will play an enabling role in both. By providing a simulation and verification tool for optimising IoT and AI devices and circuits, this project will impact the IoT and AI sector and, in turn, the public in society.

Publications
