Intermittent and Transient Fault Diagnosis on Sparse Code Signatures. Michael Kochte; Atefe Dalirsani; Andrea Bernabei; Martin Omana; Cecilia Metra and Hans-Joachim Wunderlich. In
Proceedings of the 24th IEEE Asian Test Symposium (ATS′15), Mumbai, India, 2015, pp. 157–162. DOI: https://doi.org/10.1109/ATS.2015.34
Abstract
Failure diagnosis of field returns typically requires high quality test stimuli and assumes that tests can be repeated. For intermittent faults with fault activation conditions depending on the physical environment, the repetition of tests cannot ensure that the behavior in the field is also observed during diagnosis, causing field returns diagnosed as no-trouble-found. In safety critical applications, self-checking circuits, which provide concurrent error detection, are frequently used. To diagnose intermittent and transient faulty behavior in such circuits, we use the stored encoded circuit outputs in case of a failure (called signatures) for later analysis in diagnosis. For the first time, a diagnosis algorithm is presented that is capable of performing the classification of intermittent or transient faults using only the very limited amount of functional stimuli and signatures observed during operation and stored on chip. The experimental results demonstrate that even with these harsh limitations it is possible to distinguish intermittent from transient faulty behavior. This is essential to determine whether a circuit in which failures have been observed should be subject to later physical failure analysis, since intermittent faulty behavior has been diagnosed. In case of transient faulty behavior, it may still be operated reliably.
Optimized Selection of Frequencies for Faster-Than-at-Speed Test. Matthias Kampmann; Michael A. Kochte; Eric Schneider; Thomas Indlekofer; Sybille Hellebrand and Hans-Joachim Wunderlich. In
Proceedings of the 24th IEEE Asian Test Symposium (ATS′15), Mumbai, India, 2015, pp. 109–114. DOI: https://doi.org/10.1109/ATS.2015.26
Abstract
Small gate delay faults (SDFs) are not detectable at-speed if they can only be propagated along short paths. These hidden delay faults (HDFs) do not influence the circuit’s behavior initially, but they may indicate design marginalities leading to early-life failures, and therefore they cannot be neglected. HDFs can be detected by faster-than-at-speed test (FAST), where typically several different frequencies are used to maximize the coverage. A given set of test patterns P potentially detects an HDF if it contains a test pattern sensitizing a path through the fault site, and the efficiency of FAST can be measured as the ratio of actually detected HDFs to potentially detected HDFs. The paper at hand targets maximum test efficiency with a minimum number of frequencies. The procedure starts with a test set for transition delay faults and a set of preselected equidistant frequencies. Timing-accurate simulation of this initial setup identifies the hard-to-detect faults, which are then targeted by a more complex timing-aware ATPG procedure. For the yet undetected HDFs, a minimum number of frequencies is determined using an efficient hypergraph algorithm. Experimental results show that with this approach, the number of test frequencies required for maximum test efficiency can be reduced considerably. Furthermore, test set inflation is limited as timing-aware ATPG is only used for a small subset of HDFs.
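Illustrative sketch (not taken from the paper): the abstract states only that "an efficient hypergraph algorithm" selects the frequencies and does not say which one. As a stand-in, the Python below uses a plain greedy covering heuristic, which gives a small but not necessarily minimum set. The function names, the fault labels, and the frequency values (read here as multiples of the nominal clock) are all hypothetical. The second function computes the efficiency ratio defined in the abstract.

```python
from typing import Dict, List, Set


def select_frequencies(detects: Dict[float, Set[str]]) -> List[float]:
    """Greedy cover: pick frequencies until every detectable fault is covered.

    `detects` maps a candidate test frequency to the set of HDFs it detects.
    This is a heuristic stand-in, NOT the hypergraph algorithm of the paper.
    """
    # Faults detectable by at least one candidate frequency.
    uncovered: Set[str] = set().union(*detects.values())
    chosen: List[float] = []
    while uncovered:
        # Frequency covering the most still-uncovered faults; iterating in
        # sorted order makes ties resolve to the lowest frequency.
        best = max(sorted(detects), key=lambda f: len(detects[f] & uncovered))
        chosen.append(best)
        uncovered -= detects[best]
    return chosen


def fast_efficiency(detected: Set[str], potentially_detected: Set[str]) -> float:
    """Ratio of actually detected HDFs to potentially detected HDFs."""
    if not potentially_detected:
        raise ValueError("no potentially detected faults")
    return len(detected & potentially_detected) / len(potentially_detected)


# Hypothetical example data: four candidate frequencies, five faults.
example = {
    1.2: {"f1", "f2"},
    1.5: {"f2", "f3", "f4"},
    1.8: {"f1", "f5"},
    2.0: {"f4", "f5"},
}
print(select_frequencies(example))  # [1.5, 1.8]
```

In this toy instance two of the four candidate frequencies already cover all five faults, which mirrors the kind of reduction the abstract reports.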
Efficient Observation Point Selection for Aging Monitoring. Chang Liu; Michael A. Kochte and Hans-Joachim Wunderlich. In
Proceedings of the 21st IEEE International On-Line Testing Symposium (IOLTS′15), Elia, Halkidiki, Greece, 2015, pp. 176–181. DOI: https://doi.org/10.1109/IOLTS.2015.7229855
Abstract
Circuit aging causes a performance degradation and eventually a functional failure. It depends on the workload and the environmental conditions of the system, which are hard to predict in early design phases, resulting in pessimistic worst-case design. Existing delay monitoring schemes measure the remaining slack of paths in the circuit, but cause a significant hardware penalty including global wiring. More importantly, the low sensitization ratio of long paths in applications may lead to a very low measurement frequency or even an unmonitored timing violation. In this work, we propose a delay monitor placement method by analyzing the topological circuit structure and sensitization of paths. The delay monitors are inserted at meticulously selected positions in the circuit, named observation points (OPs). This OP monitor placement method can reduce the number of inserted monitors by up to 98% compared to a placement at the end of long paths. The experimental validation shows the effectiveness of this aging indication, i.e. a monitor always issues a timing alert earlier than any imminent timing failure.
On-Line Prediction of NBTI-induced Aging Rates. Rafal Baranowski; Farshad Firouzi; Saman Kiamehr; Chang Liu; Mehdi Tahoori and Hans-Joachim Wunderlich. In
Proceedings of the ACM/IEEE Conference on Design, Automation and Test in Europe (DATE′15), Grenoble, France, 2015, pp. 589–592. DOI: https://doi.org/10.7873/DATE.2015.0940
Abstract
Nanoscale technologies are increasingly susceptible to aging processes such as Negative-Bias Temperature Instability (NBTI), which undermine the reliability of VLSI systems. Existing monitoring techniques can detect the violation of safety margins and hence make the prediction of an imminent failure possible. However, since such techniques can only detect measurable degradation effects which appear after a relatively long period of system operation, they are not well suited to early aging prediction and proactive aging alleviation. This work presents a novel method for the monitoring of NBTI-induced degradation rate in digital circuits. It enables the timely adoption of proper mitigation techniques that reduce the impact of aging. The developed method employs machine learning techniques to find a small set of so-called Representative Critical Gates (RCG), the workload of which is correlated with the degradation of the entire circuit. The workload of RCGs is observed in hardware using so-called workload monitors. The output of the workload monitors is evaluated on-line to predict system degradation experienced within a configurable (short) period of time, e.g. a fraction of a second. Experimental results show that the developed monitors predict the degradation rate with an average error of only 1.6% at 4.2% area overhead.