11.8 Special Session: Self-aware, biologically-inspired adaptive hardware systems for ultimate dependability and longevity

Session Start
Thu, 14:00
Session End
Thu, 15:30

State-of-the-art electronic design allows the integration of complex electronic systems comprising thousands of high-level functions on a single chip. This has become possible and feasible because of the combination of atomic-scale semiconductor technology allowing VLSI of billions of transistors, and EDA tools that can handle their useful application and integration by following strictly hierarchical design methodology. This results in many layers of abstraction within a system that makes it implementable, verifiable and, ultimately, explainable. However, while many layers of abstraction maximise the likelihood of a system to function correctly, this can prevent a design from making full use of the capabilities of current technology. Making systems brittle at a time where NoC- and SoC-based implementations are the only way to increase compute capabilities as clock speed limits are reached, devices are affected by variability and ageing, and heat-dissipation limits impose "dark silicon" constraints. Design challenges of electronic systems are no longer driven by making designs smaller but by creating systems that are ultra-low power, resilient and autonomous in their adaptation to anomalies including faults, timing violations and performance degradation. This gives rise to the idea of self-aware hardware, capable of adaptive behaviours or features taking inspiration from, e.g., biological systems, learning algorithms, factory processes. The challenge is to adopt and implement these concepts while achieving a "next- generation" kind of electronic system which is considered at least as useful and trustworthy as its "classical" counterpart—plus additional essential features for future system design and operation. The goal of this Special Session is to present research from world-leading experts addressing state-of-the-art techniques and devices demonstrating the efficacy of concepts of self-awareness, adaptivity and bio-inspiration in the context of real-world hardware systems and applications with a focus on autonomous resource management at runtime, robustness and performance, and new computing architecture in embedded hardware systems."

Time Label Presentation Title
Authors
14:00 11.8.1 EMBEDDED SOCIAL INSECT-INSPIRED INTELLIGENCE NETWORKS FOR SYSTEM-LEVEL RUNTIME MANAGEMENT
Speaker:
Matthew R. P. Rowlings, University of York, GB
Authors:
Matthew Rowlings, Andy Tyrrell and Martin Albrecht Trefzer, University of York, GB
Abstract
Large-scale distributed computing architectures such as, e.g. systems on chip or many-core devices, offer ad- vantages over monolithic or centralised single-core systems in terms of speed, power/thermal performance and fault tolerance. However, these are not implicit properties of such systems and runtime management at software or hardware level is required to unlock these features. Biological systems naturally present such properties and are also adaptive and scalable. To consider how these can be similarly achieved in hardware may be beneficial. We present Social Insect behaviours as a suitable model for enabling autonomous runtime management (RTM) in many-core architectures. The emergent properties sought to establish are self-organisation of task mapping and system- level fault tolerance. For example, large social insect colonies accomplish a wide range of tasks to build and maintain the colony. Many thousands of individuals, each possessing relatively little intelligence, contribute without any centralised control. Hence, it would seem that social insects have evolved a scalable approach to task allocation, load balancing and robustness that can be applied to large many-core computing systems. Based on this, a self-optimising and adaptive, yet fundamentally scalable, design approach for many-core systems based on the emergent behaviours of social-insect colonies are developed. Experiments capture decision-making processes of each colony member to exhibit such high-level behaviours and embed these decision engines within the routers of the many-core system.
14:20 11.8.2 OPTIMISING RESOURCE MANAGEMENT FOR EMBEDDED MACHINE LEARNING
Authors:
Lei Xun, Long Tran-Thanh, Bashir Al-Hashimi and Geoff Merrett, University of Southampton, GB
14:40 11.8.3 EMERGENT CONTROL OF MPSOC OPERATION BY A HIERARCHICAL SUPERVISOR / REINFORCEMENT LEARNING APPROACH
Speaker:
Florian Maurer, TUM, DE
Authors:
Florian Maurer1, Andreas Herkersdorf1, Bryan Donyanavard2, Amir M. Rahmani2 and Nikil Dutt3
1TUM, DE; 2University of California, Irvine, US; 3University of California, US
Abstract
MPSoCs increasingly depend on adaptive resource management strategies at runtime for efficient utilization of resources when executing complex application workloads. In particular, conflicting demands for adequate computation perfor- mance and power-/energy-efficiency constraints make desired ap- plication goals hard to achieve. We present a hierarchical, cross- layer hardware/software resource manager capable of adapting to changing workloads and system dynamics with zero initial knowledge. The manager uses rule-based reinforcement learning classifier tables (LCTs) with an archive-based backup policy as leaf controllers. The LCTs directly manipulate and enforce MPSoC building block operation parameters in order to explore and optimize potentially conflicting system requirements (e.g., meeting a performance target while staying within the power constraint). A supervisor translates system requirements and application goals into per-LCT objective functions (e.g., core instructions-per-second (IPS). Thus, the supervisor manages the possibly emergent behavior of the low-level LCT controllers in response to 1) switching between operation strategies (e.g., maximize performance vs. minimize power; and 2) changing application requirements. This hierarchical manager leverages the dual benefits of a software supervisor (enabling flexibility), together with hardware learners (allowing quick and efficient optimization). Experiments on an FPGA prototype confirmed the ability of our approach to identify optimized MPSoC oper- ation parameters at runtime while strictly obeying given power constraints.
15:00 11.8.4 ASTROBYTE: A MULTI-FPGA ARCHITECTURE FOR ACCELERATED SIMULATIONS OF SPIKING ASTROCYTE NEURAL NETWORKS
Speaker:
Shvan Karim, Ulster University, GB
Authors:
Shvan Haji Karim, Jim Harkin, McDaid Liam, Gardiner Bryan and Junxiu Liu, Ulster University, GB
Abstract
Spiking astrocyte neural networks (SANN) are a new computational paradigm that exhibit enhanced self-adapting and reliability properties. The inclusion of astrocyte behaviour increases the computational load and critically the number of connections, where each astrocyte typically communicates with up to 9 neurons (and their associated synapses) with feedback pathways from each neuron to the astrocyte. Each astrocyte cell also communicates with its neighbouring cell resulting in a significant interconnect density. The substantial level of parallelisms in SANNs lends itself to acceleration in hardware, however, the challenge in accelerating simulations of SANNs firmly resides in scalable interconnect and the ability to inject and retrieve data from the hardware. This paper presents a novel multi-FPGA acceleration architecture, AstroByte, for the speedup of SANNs. AstroByte explores Networks-on-Chip (NoC) routing mechanisms to address the challenge of communicating both spike event (neuron data) and numeric (astrocyte data) across significant interconnect pathways between astrocytes and neurons. AstroByte also exploits the NoC interconnect to inject data and retrieve runtime data from the accelerated SANN simulations. Results show that AstroByte can simulate SANN applications with speedup factors of between x162 -x188 over Matlab equivalent simulations.
15:30   End of session