HOCOS - Current Research Projects
Cryptographic circuits are employed in mobile and embedded systems to protect sensitive information from unauthorized access and manipulation. Fault attacks circumvent the protection by injecting faults into the hardware implementation of the cryptographic function, thus manipulating the calculation in a controlled manner and allowing the attacker to derive protected data such as secret keys.
The Algebraic Fault Attacks project focuses on the class of algebraic fault attacks, where the information used for cryptanalysis is represented by systems of polynomials.
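As a rough, self-contained illustration of the idea (a toy example with hypothetical values, not one of the project's benchmarks), the sketch below writes the fault-free and faulty observations of a single 4-bit S-box lookup as equations over GF(2) and keeps only the key guesses that satisfy both. The S-box values are those of the PRESENT cipher, used here merely as a convenient nonlinear example.

```python
# Toy algebraic/differential fault attack on a single S-box lookup
# (hypothetical scenario, not one of the project's benchmarks).
# The device computes c = S(x) XOR k for an unknown intermediate x and a
# secret key nibble k; a fault flips a known bit of x.  A key guess is
# kept only if it satisfies both observation equations over GF(2).

# 4-bit S-box of the PRESENT cipher, used merely as a nonlinear example.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
SBOX_INV = [SBOX.index(v) for v in range(16)]

def consistent(k, c_ok, c_faulty, fault_mask):
    """Key guess k satisfies the system
         S_inv(c_ok     XOR k) = x
         S_inv(c_faulty XOR k) = x XOR fault_mask
       for some intermediate value x."""
    x_guess  = SBOX_INV[c_ok ^ k]
    xf_guess = SBOX_INV[c_faulty ^ k]
    return (x_guess ^ xf_guess) == fault_mask

# Device side (unknown to the attacker): secret key and intermediate value.
secret_k, x, fault_mask = 0xA, 0x3, 0x1
c_ok     = SBOX[x] ^ secret_k
c_faulty = SBOX[x ^ fault_mask] ^ secret_k

# Attacker side: keep only the key guesses consistent with both equations.
candidates = [k for k in range(16) if consistent(k, c_ok, c_faulty, fault_mask)]
print("surviving key candidates:", candidates, " secret key:", secret_k)
```

In a full algebraic fault attack, the complete cipher together with the fault model is encoded as a large polynomial system and handed to an algebraic or SAT-based solver instead of being enumerated as above.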
Benchmarks for algebraic fault attacks
We are working on creating a comprehensive set of benchmarks for algebraic fault attacks. These will be published here as soon as they are available.
Fault Attack Benchmarks for Small Scale AES
Memristive devices offer enormous advantages for non-volatile memories and neuromorphic computing, but there is also a rising interest in using memristive technologies for security applications. The MemCrypto project aims at the development and investigation of memristive cryptographic implementations and at the assessment and improvement of their security against physical attacks. This work focuses on combinational and sequential realizations of complete cryptographic circuits and complements earlier research on memristive physical unclonable functions and random number generators.
This project aims at developing methods to realize low-cost and power-efficient hardware circuits for near-sensor computing following the Stochastic Computing paradigm. Stochastic computing provides extremely compact, error-tolerant and low-power implementations of complex functions, but at the expense of longer computation times and some degree of inaccuracy. This makes stochastic circuits (SCs) especially attractive for near-sensor computing, where the processed sensor data are inaccurate anyway and computations tend to occur infrequently. A special focus of this project will be the SC realization of neural networks (NNs) used for classification tasks, from lightweight NNs to fully-fledged convolutional NNs for deep learning.
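As a minimal, hypothetical illustration of the paradigm (not code from the project), the sketch below encodes two values as unipolar bitstreams and multiplies them with a bitwise AND; the statistical error shrinks as the streams get longer, which is exactly the accuracy/latency trade-off mentioned above.

```python
# Minimal sketch of the Stochastic Computing idea (illustrative only):
# a value in [0, 1] is encoded as a random bitstream whose fraction of 1s
# equals the value; multiplying two such values then needs only a single
# AND gate per bit, at the price of a long stream and statistical error.

import random

def encode(value, length, rng):
    """Unipolar SC encoding: each bit is 1 with probability `value`."""
    return [1 if rng.random() < value else 0 for _ in range(length)]

def decode(stream):
    """Estimate the encoded value as the fraction of 1s."""
    return sum(stream) / len(stream)

def sc_multiply(a_stream, b_stream):
    """Bitwise AND implements multiplication of unipolar SC values."""
    return [a & b for a, b in zip(a_stream, b_stream)]

rng = random.Random(42)
a, b, n = 0.8, 0.5, 4096            # operands and stream length
product = decode(sc_multiply(encode(a, n, rng), encode(b, n, rng)))
print(f"exact {a * b:.3f}  vs  stochastic estimate {product:.3f}")
```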
Test quality, defined as the absence of test escapes (defective circuits that have passed post-manufacturing test), is the ultimate target of testing. Customers apply system-level test (SLT) to circuits that have already been tested after fabrication and reportedly still identify test escapes. The objective of this project is to understand the nature of such hard-to-detect failures. Establishing a better understanding of SLT and making it more effective and efficient could drastically improve the economics of circuit design and manufacturing.
CA - Current Research Projects
Project Description
Computer systems have reached a point where significant improvements in computational performance and energy efficiency have become very hard to achieve. The main reason is the power and efficiency wall that CMOS technology is facing. Physical limitations such as high power densities and a variety of reliability degradations now enforce larger design margins, which reduce efficiency.
Approximate Computing trades off precision against power, energy, storage, bandwidth or performance, and can be applied to hardware, software and algorithms. It enables much more efficient computing by providing additional, adjustable design and runtime parameters to find Pareto optimal solutions. However, its application is still rather limited and a significant extension of the scope of applications is required, including applications that are not necessarily inherently error-tolerant.
The ACCROSS project will tackle this challenge with a cross-layer approach to analysis and optimization, which considers the system stack from the application down to the hardware. At the higher levels, ACCROSS covers the analysis of applications from different computational problem classes, which will act as enablers for mainstream approximate computing. This includes the development of new methods for the analysis of approximation potentials in applications, the adaptation of existing applications to approximation and the quantification of efficiency gains. Moreover, new methods for combining suitable approximation techniques at different system layers during runtime will be provided to maximize efficiency with respect to performance and energy. New error metrics and methods for lightweight runtime monitoring of accuracy will be developed to ensure the usefulness of the targeted applications. At the lower levels, ACCROSS covers the systematic evaluation of the impact of removing design margins which will lead to approximate behavior and improved efficiency. Abstract but accurate models linking the hardware and software will be provided, enabling designers to accurately quantify the error and efficiency impact of approximation across the system stack.
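The basic trade-off can be illustrated with a deliberately simple, hypothetical sketch (not an ACCROSS result): an exact reduction over sensor samples is replaced by a perforated one that skips inputs, and the quality loss is quantified with a straightforward relative-error metric.

```python
# Illustrative sketch of the approximation trade-off (hypothetical data):
# loop perforation skips part of the input to save work, and the resulting
# quality loss is measured with a simple relative-error metric.

def exact_mean(samples):
    return sum(samples) / len(samples)

def approx_mean(samples, skip=2):
    """Loop perforation: process only every `skip`-th sample."""
    used = samples[::skip]
    return sum(used) / len(used)

def relative_error(exact, approx):
    return abs(exact - approx) / abs(exact)

samples = [0.1 * i for i in range(1, 1001)]      # synthetic sensor data
exact   = exact_mean(samples)
for skip in (2, 4, 8):                           # growing degree of approximation
    approx = approx_mean(samples, skip)
    print(f"skip={skip}: ~{100 * (1 - 1 / skip):.0f}% less work, "
          f"relative error {relative_error(exact, approx):.4f}")
```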
Early life failures are an important problem in modern nano-electronic technology nodes; they often cause recalls of shipped products and incur high costs. An important root cause of such failures is marginal circuit structures, which pass a conventional manufacturing test but cannot cope with the later workload and stress in the field. Such structures can be identified on the basis of non-functional indicators, in particular by testing the timing behavior. For an effective and cost-efficient test of these indicators, the FAST project investigates novel scan designs and built-in self-test strategies that let circuits operate at frequencies beyond the functional specification, so that small deviations from the nominal timing behavior, and thus potential early life failures, can be detected.
since 02.2017, DFG-Project: WU 245/19-1
The project in detail:
State-of-the-art nanoscale technologies allow for the integration of billions of transistors with feature sizes of 14 nm or below on a single chip. This enables innovative approaches and solutions in many application domains, but it also comes with fundamental challenges. Early life failures are particularly critical, as they can cause product recalls associated with losses of billions of dollars. A major cause of early life failures are "weak" devices that operate correctly during manufacturing test but cannot withstand operational stress in the field. While other failure mechanisms, such as aging or external disturbances, may to some extent be compensated for by a robust design, potential early life failures must be detected by tests, and the respective systems have to be sorted out. This requires specific approaches far beyond today's state of the art.
As weak structures work properly in the beginning, they must be identified by analyzing the non-functional circuit behavior with the help of appropriate observables. Besides power consumption, the circuit timing is one of the most important reliability indicators. In particular, small delay faults may indicate marginal hardware that can degrade further under stress. However, such faults can be "hidden" at nominal frequency and only be detected at higher frequencies ("faster-than-at-speed test", FAST). Conventional test approaches therefore reach their limits, and new methods must be investigated and developed in the following three domains:
- Specific techniques for "design for test" (DFT) must be developed to deal with the challenges of testing beyond nominal frequency.
- Strategies for test scheduling must ensure that a maximum fault coverage is achieved with a minimum number of test frequencies and a short test time.
- Appropriate metrics are needed to quantify the coverage of weak devices. It is particularly challenging here to distinguish the behavior of weak devices from variations due to nanoscale integration.
Since FAST imposes extreme requirements on the automatic test equipment (ATE), it is very important to support an efficient implementation as a built-in self-test (BIST).
Within the framework of the project, strategies and solutions will be developed for the problems mentioned above. This way, the enormous cost of a traditional "burn-in" test can be reduced, thus enabling the introduction of nanoscale technologies into new application domains.
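The detection condition behind faster-than-at-speed testing can be summarized in a few lines (illustrative numbers only, not values from the project): a small extra delay on a path is observable only if the total path delay exceeds the capture clock period, which a fault hidden at nominal frequency may only do at a higher test frequency.

```python
# Why small delay faults call for faster-than-at-speed test (illustrative):
# a fault adds a small extra delay to a path; it causes an observable error
# only if path delay plus extra delay exceeds the test clock period.

NOMINAL_PERIOD_NS = 10.0                      # functional clock period

def detected(path_delay_ns, extra_delay_ns, test_period_ns):
    """A delay fault is observed iff the faulty path misses the capture edge."""
    return path_delay_ns + extra_delay_ns > test_period_ns

path_delay, small_fault = 7.5, 1.0            # hypothetical path and defect size
print("at-speed test detects fault:",
      detected(path_delay, small_fault, NOMINAL_PERIOD_NS))         # False: hidden
print("FAST at 1.25x frequency detects fault:",
      detected(path_delay, small_fault, NOMINAL_PERIOD_NS / 1.25))  # True
```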
since 08.2014, DFG-Project: WU 245/17-1, WU 245/17-2
Please visit our project page for detailed information.
Project Description (Phase 2)
Reconfigurable scan networks (RSNs) were initially introduced to manage the extensive amount of instrumentation in modern systems-on-chip and to facilitate cost-efficient bring-up and debug, test, diagnosis and maintenance. Recently, the reuse of RSNs at system runtime for online fault classification and fault management has moved into the center of research activities. The reasons are not only the increased complexity and dependability requirements in new technologies, but also the emerging application paradigms of self-aware and autonomous systems. Especially in safety-critical applications, online test, system monitoring and fault tolerance at low cost become mandatory. For example, the ISO 26262 standard requires critical faults to be detected within certain test intervals at runtime and allows only a maximum fault reaction time until the system has to be transferred into a safe state. The periodic test is usually structure-oriented and targets stuck-at, transition and delay faults.
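A simplified, back-of-the-envelope view of this timing constraint (hypothetical numbers, not values taken from ISO 26262 or the project) is sketched below: a fault that occurs right after a periodic test is detected at the latest one test interval later, and detection plus reaction must still complete within the overall fault-handling budget.

```python
# Simplified view of periodic in-system testing against a fault-handling
# deadline (all numbers are hypothetical): a fault occurring right after a
# test run is detected at the latest one test interval later, and the
# reaction must still finish before the deadline for reaching a safe state.

def worst_case_handling_ms(test_interval_ms, test_duration_ms, reaction_ms):
    """Worst case from fault occurrence to completed fault reaction."""
    return test_interval_ms + test_duration_ms + reaction_ms

budget_ms = 100.0                      # assumed deadline until safe state
for interval in (20.0, 50.0, 90.0):    # candidate periodic test intervals
    worst = worst_case_handling_ms(interval, test_duration_ms=5.0, reaction_ms=10.0)
    print(f"test every {interval:>4} ms -> worst case {worst:>5} ms "
          f"({'ok' if worst <= budget_ms else 'violates budget'})")
```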
It is common practice that the tasks required for initializing the periodic tests, for fault detection and for fault reaction are executed by the system functionality transparently in the background. The disadvantages of this approach are manifold: the periodic test and its evaluation constitute significant additional workload, reduce performance, consume a large amount of additional power, and may take too much time to avoid dangerous situations. Guaranteeing deadlines and verifying fault tolerance is extremely difficult, as the properties have to be proven in the presence of faults. An alternative is the use of the non-functional infrastructure for concurrent fault detection and fault management, and first approaches to employ RSNs for in-system runtime tests have already been proposed. These first attempts, however, still require a dedicated regular structure of the RSNs and their permanent background operation, which should be avoided in practice.
The results of the first phase of ACCESS provide an excellent basis for further research on the runtime use of RSNs. Since RSNs are integrated into the chip anyway, the cost of the modifications required for runtime use is affordable even for a mass market like automotive. The goal of the second phase of ACCESS is a technique for the robust online use of RSNs to support safety, fault tolerance and reliability management.
This comprises:
- In-system run-time test using RSNs
- System wide collection of diagnosis information
- Online diagnosis of RSNs
- Investigation of robust and fault-tolerant RSNs
This work is supported by the German Research Foundation (DFG) under grant WU 245/17-2 (2019-2021).
ES - Current Research Projects
Technology scaling makes it possible to implement systems with hundreds of processing cores on a single chip, and thousands in the future. Communication in such systems is enabled by Networks-on-Chip (NoCs). A downside of technology scaling is the increased susceptibility to failures emerging in NoC resources during operation. Ensuring reliable operation despite such failures degrades NoC performance and may even invalidate the performance benefits expected from scaling. It is therefore not enough to analyze performance and reliability in isolation, as is usually done. Instead, we research how both aspects can be treated together using the concept of performability and its analysis with Markov reward models. In addition to developing modelling and analysis techniques, we exemplify the methodology by applying it to compare various NoC topologies and fault-tolerant routing algorithms. We investigate how performability develops when scaling towards larger NoCs and explore the limits of scaling by determining the break-even failure rates under which scaling still achieves a net performability increase.
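As a minimal illustration of the modelling approach (with hypothetical rates and rewards, not project results), the sketch below evaluates a three-state Markov reward model of a NoC: each state carries a throughput reward, and performability is obtained as the expected reward under the steady-state distribution of the continuous-time Markov chain.

```python
# Minimal Markov reward model for NoC performability (hypothetical rates
# and rewards): states are "all links up", "one link failed, fault-tolerant
# rerouting active" and "NoC down"; performability is the expected
# throughput reward under the steady-state distribution of the CTMC.

import numpy as np

lam, mu = 1e-3, 1e-1          # assumed failure and repair/reconfiguration rates

# Generator matrix Q of the CTMC: rows sum to zero.
Q = np.array([
    [-lam,         lam,  0.0],   # healthy  -> degraded
    [  mu, -(mu + lam),  lam],   # degraded -> healthy or down
    [ 0.0,          mu,  -mu],   # down     -> degraded (repair)
])

reward = np.array([1.0, 0.6, 0.0])   # relative throughput per state

# Steady state: solve pi @ Q = 0 together with the normalization sum(pi) = 1.
A = np.vstack([Q.T, np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

print("steady-state probabilities:", np.round(pi, 6))
print("performability (expected throughput):", float(pi @ reward))
```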