PARSIVAL

PARSIVAL: Parallel High-Throughput Simulations for Efficient Nanoelectronic Design and Test Validation

10.2014 - 12.2018, DFG-Projekt: WU 245/16-1

 

Project Description

In this project novel methods for versatile simulation-based VLSI design and test validation on high throughput data-parallel architectures will be developed, which enable a wide range of important state-of-the-art validation tasks for large circuits. In general, due to the nature of the design validation processes and due to the massive amount of data involved, parallelism and throughput-optimization are the keys for making design validation feasible for future industrial-sized designs. The main focus and key features lie in the structure of the simulation model, the abstraction level and the used algorithms, as well as their parallelization on data-parallel architectures. The simulation algorithms should be kept simple to run fast, yet accurate enough to produce acceptable and valuable data for cross-layer validation of complex digital systems.

Design and test validation is one of the most important and complex tasks within modern semi-conductor product development cycles. The design validation processes analyze and assess a developed design with respect to certain validation targets to ensure the compliance with given specifications and customer requirements. With thorough validation, test strategies can be assessed to increase test quality and the quality of the products delivered. The type of specification and requirements can range from the abstract high-level functional behavior of the circuit towards constraints of parameters at lower levels, such as peak power consumption or transistor stress. With the process scaling, more complex defect mechanisms arise and more and more low level parameters have to be considered, which is why validation at lower levels has become an essential part in current manufacturing processes. Yet, state-of-the-art algorithms rely on compute-intensive simulations being unable to scale on traditional computing architectures as a result of the increasing complexity of the designs and the required model accuracy. Over the past years, data-parallel architectures, such as Graphics Processing Units (GPUs), have evolved and introduced the many-core paradigm, which well established in general purpose computing. Current architectures nowadays provide thousands of processors on a single chip and are capable of achieving massive computational throughput of several Teraflops. By exploiting highly parallel hardware acceleration and careful abstraction, we strive for maximum throughput in order to enable a wide range of complex electronic design automation (EDA) applications applicable to industrial-sized designs.


Development of Simulation Models for accurate Validation

The abstraction levels to be considered during the first project phase cover descriptions from gate-level to electrical level and incorporate all the information required for an accurate evaluation of the circuit behavior and its functional properties. The evaluation models must be sufficiently comprehensive to describe all significant electrical parameters that have impact on the logic behavior and timing, but must still support an extremely efficient parallel algorithm environment. Instead of relying on continuous computation of differential equations as common in lower level simulation tools such as SPICE, the algorithm makes use of piecewise approximations of the electrical behavior through linearization in order to model functional properties and to compute the signal values. This offers an attractive alternative in terms of the tradeoff between achievable precision and computational effort.

In addition to the ideal logic and timing behavior, the functional model has to be extended to consider the impact of parasitic and external parameters including: Modeling the power grid, cross-talk, the impact of temperature and environmental influences.

Non-functional properties (NFPs) have to be evaluated over a very wide range of different time scales. Computing the vulnerability of circuit structures with respect to soft-errors is subject of single effects in the range of picoseconds. Circuit robustness is related to noise, inductivity or signal integrity whose duration is in the order of nanoseconds. Current and power-consumption have to be evaluated at this scale as well, while temperature may be evaluated over the range of milliseconds due to the heat capacitance and heat transfer functions of the device under evaluation. However, reliability with respect to wear-out mechanisms and aging has to be analyzed at the scale of months or years and requires a completely different approach. All these NFPs have in turn direct impact on the functional properties and have to be evaluated in an integrated way.


Massively Parallel Validation Algorithms

The developed simulation models will be evaluated by optimized algorithms for massively parallel compute structures like GPUs. On such architectures, high throughput is achieved by parallelizing computations and maximizing the number of occupied computational resources during runtime. Exploiting parallelism in various dimensions requires that the simulation algorithms will be tuned for the targeted data-parallel architectures. For an optimal result, this comprises not only a thorough understanding of the underlying architecture and instruction sets, but also requires a general and flexible algorithm design.

Parallelism will be exploited in many ways during evaluation:

  • Structural parallelism, induced by mutually independent nodes in a circuit, allows the concurrent evaluation of the independent nodes.
  • Multiple fault simulations can be simulated in parallel, e.g. when different fault propagation cones are involved such that interactions between the faults are prevented.
  • Pattern parallelism, a type of data-parallelism, allows the evaluation of a circuit for different patterns at once.
  • Instance parallelism takes advantage of circuit instances with different parameters, such as varying node delays, which can be evaluated at the same time for a given pattern or fault.
  • Task parallelism allows to execute different subtasks for an instance concurrently on the device to evaluate multiple parameters at the same time.

The validation tasks can be significantly accelerated by optimal combination and exploitation of these different dimensions of parallelism, yielding a maximized throughput. However, this requires a thorough scheduling of the computational tasks and an elaborate utilization of the computational resources of the underlying many-core GPU architecture.


Additional Information

This work is supported by the German Research Foundation (DFG) under grant WU 245/16-1.

Publications

  1. 2019

    1. SWIFT: Switch Level Fault Simulation on GPUs. Eric Schneider and Hans-Joachim Wunderlich. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD) 38, 1 (2019), pp. 122--135. DOI: https://doi.org/10.1109/TCAD.2018.2802871
    2. Multi-Level Timing and Fault Simulation on GPUs. Eric Schneider and Hans-Joachim Wunderlich. INTEGRATION, the VLSI Journal -- Special Issue of ASP-DAC 2018 64, (2019), pp. 78--91. DOI: https://doi.org/10.1016/j.vlsi.2018.08.005
  2. 2018

    1. Multi-Level Timing Simulation on GPUs. Eric Schneider; Michael A. Kochte and Hans-Joachim Wunderlich. In Proceedings of the 23rd Asia and South Pacific Design Automation Conference (ASP-DAC’18), Jeju Island, Korea, 2018, pp. 470--475. DOI: https://doi.org/10.1109/ASPDAC.2018.8297368
  3. 2017

    1. Probabilistic Sensitization Analysis for Variation-Aware Path Delay Fault Test Evaluation. Marcus Wagner and Hans-Joachim Wunderlich. In Proceedings of the 22nd IEEE European Test Symposium (ETS’17), Limassol, Cyprus, 2017, pp. 1--6. DOI: https://doi.org/10.1109/ETS.2017.7968226
    2. Aging Monitor Reuse for Small Delay Fault Testing. Chang Liu; Michael A. Kochte and Hans-Joachim Wunderlich. In Proceedings of the 35th VLSI Test Symposium (VTS’17), Caesars Palace, Las Vegas, Nevada, USA, 2017, pp. 1--6. DOI: https://doi.org/10.1109/VTS.2017.7928921
    3. GPU-Accelerated Simulation of Small Delay Faults. Eric Schneider; Michael A. Kochte; Stefan Holst; Xiaoqing Wen and Hans-Joachim Wunderlich. IEEE Transactions on Computer-Aided Design of Integrated  Circuits and Systems (TCAD) 36, 5 (2017), pp. 829--841. DOI: https://doi.org/10.1109/TCAD.2016.2598560
    4. Analysis and Mitigation of IR-Drop Induced Scan Shift-Errors. Stefan Holst; Eric Schneider; Koshi Kawagoe; Michael A. Kochte; Kohei Miyase; Hans-Joachim Wunderlich; Seiji Kajihara and Xiaoqing Wen. In Proceedings of the IEEE International Test Conference (ITC’17), Fort Worth, Texas, USA, 2017, pp. 1--8. DOI: https://doi.org/10.1109/TEST.2017.8242055
  4. 2016

    1. Timing-Accurate Estimation of IR-Drop Impact on  Logic- and Clock-Paths During At-Speed Scan Test. Stefan Holst; Eric Schneider; Xiaoqing Wen; Seiji Kajihara; Yuta Yamato; Hans-Joachim Wunderlich and Michael A. Kochte. In Proceedings of the 25th IEEE Asian Test Symposium (ATS’16), Hiroshima, Japan, 2016, pp. 19--24. DOI: https://doi.org/10.1109/ATS.2016.49
    2. High-Throughput Transistor-Level Fault Simulation on GPUs. Eric Schneider and Hans-Joachim Wunderlich. In Proceedings of the 25th IEEE Asian Test Symposium (ATS’16), Hiroshima, Japan, 2016, pp. 150--155. DOI: https://doi.org/10.1109/ATS.2016.9
  5. 2015

    1. Hochbeschleunigte Simulation von Verzögerungsfehlern unter Prozessvariationen. Eric Schneider; Michael A. Kochte and Hans-Joachim Wunderlich. In 27th GI/GMM/ITG Workshop “Testmethoden und Zuverlässigkeit von Schaltungen und Systemen” (TuZ’15), Bad Urach, Germany, 2015.
    2. Logic/Clock-Path-Aware At-Speed Scan Test Generation for Avoiding False Capture Failures and Reducing Clock Stretch. Koji Asada; Xiaoqing Wen; Stefan Holst; Kohei Miyase; Seiji Kajihara; Michael A. Kochte; Eric Schneider; Hans-Joachim Wunderlich and Jun Qian. In Proceedings of the 24th IEEE Asian Test Symposium (ATS’15), Mumbai, India, 2015, pp. 103–108. DOI: https://doi.org/10.1109/ATS.2015.25
    3. GPU-Accelerated Small Delay Fault Simulation. Eric Schneider; Stefan Holst; Michael A. Kochte; Xiaoqing Wen and Hans-Joachim Wunderlich. In Proceedings of the ACM/IEEE Conference onDesign, Automation and Test in Europe (DATE’15), Grenoble, France, 2015, pp. 1174--1179. DOI: https://doi.org/10.7873/DATE.2015.0077
    4. High-Throughput Logic Timing Simulation on GPGPUs. Stefan Holst; Michael E. Imhof and Hans-Joachim Wunderlich. ACM Transactions on Design Automation of Electronic Systems (TODAES) 20, 3 (2015), pp. 37:1--37:21. DOI: https://doi.org/10.1145/2714564
  6. 2014

    1. Data-Parallel Simulation for Fast and Accurate Timing Validation of CMOS Circuits. Eric Schneider; Stefan Holst; Xiaoqing Wen and Hans-Joachim Wunderlich. In Proceedings of the 33rd IEEE/ACM International Conferenceon Computer-Aided Design (ICCAD’14), San Jose, California, USA, 2014, pp. 17--23. DOI: https://doi.org/10.1109/ICCAD.2014.7001324

Workshop Contributions

  1. 2015

    1. Hochbeschleunigte Simulation von Verzögerungsfehlern unter Prozessvariationen. Eric Schneider; Michael A. Kochte and Hans-Joachim Wunderlich. In 27th GI/GMM/ITG Workshop “Testmethoden und Zuverlässigkeit von Schaltungen und Systemen” (TuZ’15), Bad Urach, Germany, 2015.

Contact

 

 

To the top of the page