Project partners

To the University of Stuttgart website

ROCK: Robust On-Chip Communication through Hierarchical Online Diagnosis and Reconfiguration

08.2011 - 12.2015, DFG project: WU 245/12-1

Project description

The goal of the ROCK project is to investigate and prototype robust architectures and associated design methods for Networks-on-Chip (NoC), in order to counter the susceptibility of the on-chip communication infrastructure to environmental radiation, crosstalk, manufacturing variability, and aging effects, which grows with increasing integration density. To this end, the project pursues an approach that performs fault diagnosis and targeted reconfiguration for fault recovery during operation (online), hierarchically across the network layers, and selects an optimal combination of measures across those layers. Optimality here means meeting guarantees on the performability of the network with minimal energy; performability itself has to be newly defined for the NoC research field, taking communication performance and fault statistics into account. Further requirements are a fault-tolerant design of the diagnosis and reconfiguration control and its transparency to the application processes that communicate over the NoC. The NoC architectures and methods are to be evaluated with respect to optimality and constraints in the presence of faults as well. This evaluation relies on functional fault models, still to be created, which are integrated with network models into an NoC fault simulation.
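As a rough illustration of the cross-layer selection described above, the following sketch enumerates one measure per network layer and keeps the cheapest combination that still meets a performability requirement. Everything in it is an assumption made for illustration: the layer names, the measures, the integer energy costs and performability gains, and the additive performability model are hypothetical and are not taken from the project.

```python
from itertools import product

# Hypothetical measures per layer: (name, energy cost, performability gain).
# Integer units are used to keep the arithmetic exact; all values are made up.
LAYER_MEASURES = {
    "link": [("none", 0, 0), ("retransmit", 10, 10), ("ecc", 25, 20)],
    "network": [("none", 0, 0), ("reroute", 12, 15)],
    "transport": [("none", 0, 0), ("end_to_end_retry", 30, 25)],
}


def select_measures(baseline, required):
    """Return (energy, measure names) of the cheapest feasible combination.

    Assumes, purely for illustration, that performability gains add up
    across layers. Returns None if no combination meets the requirement.
    """
    best = None
    # Exhaustively try one measure per layer.
    for combo in product(*LAYER_MEASURES.values()):
        energy = sum(cost for _, cost, _ in combo)
        performability = baseline + sum(gain for _, _, gain in combo)
        if performability >= required and (best is None or energy < best[0]):
            best = (energy, tuple(name for name, _, _ in combo))
    return best


if __name__ == "__main__":
    print(select_measures(baseline=60, required=80))
```

Exhaustive enumeration is only practical for a handful of measures per layer; a realistic design space would need a proper optimization method and a performability model derived from simulation.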

 


Activities

  • H.-J. Wunderlich: "Fault Tolerance Meets Diagnosis", Keynote at the 21st IEEE International On-Line Testing Symposium (IOLTS), Elia, Halkidiki, Greece, July 6-8, 2015

 

 

Publications

Journals and conference proceedings
7. Multi-Layer Diagnosis for Fault-Tolerant Networks-on-Chip
Schley, G., Dalirsani, A., Eggenberger, M., Hatami, N., Wunderlich, H.-J. and Radetzki, M.
IEEE Transactions on Computers
Vol. 66(5), 1 May 2017, pp. 848-861
2017
Keywords: Networks-on-Chip, NoC, Diagnosis, Performance, Multi-layer, Design Space Exploration
Abstract: In order to tolerate faults that emerge in operating Networks-on-Chip, diagnosis techniques are employed for fault detection and localization. On various network layers, diverse diagnosis methods can be employed which differ in terms of their impact on network performance (e.g. by operating concurrently vs. pre-empting regular network operation) and the quality of diagnostic results. In this contribution, we show how diagnosis techniques of different network layers of a Network-on-Chip can be combined into multi-layer solutions. We present the cross-layer information flow used for the interaction between the layers and show the resulting benefit of the combination compared to layer-specific diagnosis. For evaluation, we investigate the diagnosis quality and the impact on system performance to explore the entire design space of layer-specific techniques and their multi-layer combinations. We identify pareto-optimal combinations that offer an increase of system performance by a factor of four compared to the single-layer diagnosis.
BibTeX:
@article{SchleDEHWR2017,
  author = {Schley, Gert and Dalirsani, Atefe and Eggenberger, Marcus and Hatami, Nadereh and Wunderlich, Hans-Joachim and Radetzki, Martin},
  title = {{Multi-Layer Diagnosis for Fault-Tolerant Networks-on-Chip}},
  journal = {IEEE Transactions on Computers},
  year = {2017},
  volume = {66},
  number = {5},
  pages = {848--861},
  keywords = {Networks-on-Chip, NoC, Diagnosis, Performance, Multi-layer, Design Space Exploration},
  abstract = {In order to tolerate faults that emerge in operating Networks-on-Chip, diagnosis techniques are employed for fault detection and localization. On various network layers, diverse diagnosis methods can be employed which differ in terms of their impact on network performance (e.g. by operating concurrently vs. pre-empting regular network operation) and the quality of diagnostic results. In this contribution, we show how diagnosis techniques of different network layers of a Network-on-Chip can be combined into multi-layer solutions. We present the cross-layer information flow used for the interaction between the layers and show the resulting benefit of the combination compared to layer-specific diagnosis. For evaluation, we investigate the diagnosis quality and the impact on system performance to explore the entire design space of layer-specific techniques and their multi-layer combinations. We identify pareto-optimal combinations that offer an increase of system performance by a factor of four compared to the single-layer diagnosis. },
  doi = {http://dx.doi.org/10.1109/TC.2016.2628058},
  file = {http://www.iti.uni-stuttgart.de/fileadmin/rami/files/publications/2017/TC_SchleDEHWR2017.pdf}
}
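The notion of Pareto-optimal combinations in the abstract of entry 7 can be illustrated with a small sketch. It filters a list of diagnosis configurations down to those that no other configuration beats on both diagnosis quality (higher is better) and performance overhead (lower is better). The configuration names and numbers are hypothetical placeholders, not results from the paper.

```python
def pareto_front(configs):
    """Return names of configurations not dominated by any other.

    Each configuration is (name, quality, overhead); quality is maximized
    and overhead is minimized. A configuration is dominated if another one
    is at least as good in both criteria and strictly better in one.
    """
    front = []
    for name, quality, overhead in configs:
        dominated = any(
            q >= quality and o <= overhead and (q > quality or o < overhead)
            for _, q, o in configs
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical design points (quality in %, overhead in %); values are made up.
EXAMPLE_CONFIGS = [
    ("link_only", 60, 5),
    ("network_only", 70, 20),
    ("transport_only", 65, 30),
    ("link+network", 85, 12),
    ("all_layers", 95, 25),
]

if __name__ == "__main__":
    print(pareto_front(EXAMPLE_CONFIGS))
```

With these made-up numbers, the two single-layer points that are both slower and less precise than "link+network" drop out of the front.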
6. Multi-Layer Test and Diagnosis for Dependable NoCs
Wunderlich, H.-J. and Radetzki, M.
Proceedings of the 9th IEEE/ACM International Symposium on Networks-on-Chip (NOCS'15), Vancouver, BC, Canada, 28-30 September 2015
2015
Keywords: Test, diagnosis, fault tolerance, network-on-chip, cross-layer
Abstract: Networks-on-chip are inherently fault tolerant or at least gracefully degradable as both, connectivity and amount of resources, provide some useful redundancy. These properties can only be exploited extensively if test and diagnosis techniques support fault detection and error containment in an optimized way. On the one hand, all faulty components have to be isolated, and on the other hand, remaining fault-free functionalities have to be kept operational.
In this contribution, behavioral end-to-end error detection is considered together with functional test methods for switches and gate level diagnosis to locate and to isolate faults in the network in an efficient way with low time overhead.
BibTeX:
@inproceedings{WundeR2015,
  author = {Wunderlich, Hans-Joachim and Radetzki, Martin},
  title = {{Multi-Layer Test and Diagnosis for Dependable NoCs}},
  booktitle = {Proceedings of the 9th IEEE/ACM International Symposium on Networks-on-Chip (NOCS'15)},
  year = {2015},
  keywords = {Test, diagnosis, fault tolerance, network-on-chip, cross-layer},
  abstract = {Networks-on-chip are inherently fault tolerant or at least gracefully degradable as both, connectivity and amount of resources, provide some useful redundancy. These properties can only be exploited extensively if test and diagnosis techniques support fault detection and error containment in an optimized way. On the one hand, all faulty components have to be isolated, and on the other hand, remaining fault-free functionalities have to be kept operational. In this contribution, behavioral end-to-end error detection is considered together with functional test methods for switches and gate level diagnosis to locate and to isolate faults in the network in an efficient way with low time overhead.},
  doi = {http://dx.doi.org/10.1145/2786572.2788708},
  file = {http://www.iti.uni-stuttgart.de/fileadmin/rami/files/publications/2015/NOCS_WundeR2015.pdf}
}
5. On Covering Structural Defects in NoCs by Functional Tests
Dalirsani, A., Hatami, N., Imhof, M.E., Eggenberger, M., Schley, G., Radetzki, M. and Wunderlich, H.-J.
Proceedings of the 23rd IEEE Asian Test Symposium (ATS'14), Hangzhou, China, 16-19 November 2014, pp. 87-92
2014
Keywords: Network-on-Chip (NoC), Functional Test, Functional Failure Modeling, Fault Classification, Boolean Satisfiability (SAT)
Abstract: Structural tests provide high defect coverage by considering the low-level circuit details. Functional test provides a faster test with reduced test patterns and does not imply additional hardware overhead. However, it lacks a quantitative measure of structural fault coverage. This paper fills this gap by presenting a satisfiability based method to generate functional test patterns while considering structural faults. The method targets NoC switches and links, and it is independent of the switch structure and the network topology. It can be applied for any structural fault type as it relies on a generalized structural fault model.
BibTeX:
@inproceedings{DalirHIESRW2014,
  author = {Dalirsani, Atefe and Hatami, Nadereh and Imhof, Michael E. and Eggenberger, Marcus and Schley, Gert and Radetzki, Martin and Wunderlich, Hans-Joachim},
  title = {{On Covering Structural Defects in NoCs by Functional Tests}},
  booktitle = {Proceedings of the 23rd IEEE Asian Test Symposium (ATS'14)},
  year = {2014},
  pages = {87--92},
  keywords = {Network-on-Chip (NoC), Functional Test, Functional Failure Modeling, Fault Classification, Boolean Satisfiability (SAT)},
  abstract = {Structural tests provide high defect coverage by considering the low-level circuit details. Functional test provides a faster test with reduced test patterns and does not imply additional hardware overhead. However, it lacks a quantitative measure of structural fault coverage. This paper fills this gap by presenting a satisfiability based method to generate functional test patterns while considering structural faults. The method targets NoC switches and links, and it is independent of the switch structure and the network topology. It can be applied for any structural fault type as it relies on a generalized structural fault model.},
  doi = {http://dx.doi.org/10.1109/ATS.2014.27},
  file = {http://www.iti.uni-stuttgart.de/fileadmin/rami/files/publications/2014/ATS_DalirHIESRW2014.pdf}
}
4. Area-Efficient Synthesis of Fault-Secure NoC Switches
Dalirsani, A., Kochte, M.A. and Wunderlich, H.-J.
Proceedings of the 20th IEEE International On-Line Testing Symposium (IOLTS'14), Platja d'Aro, Catalunya, Spain, 7-9 July 2014, pp. 13-18
2014
Keywords: Network-on-Chip, self-checking, fault-secure, online testing, concurrent error detection
Abstract: This paper introduces a hybrid method to synthesize area-efficient fault-secure NoC switches to detect all errors resulting from any single-point combinational or transition fault in switches and interconnect links. Firstly, the structural faults that are always detectable by data encoding at flit-level are identified. Next, the fault-secure structure is constructed with minimized area such that errors caused by the remaining faults are detected under any given input vector. The experimental evaluation shows significant area savings compared to conventional fault-secure schemes. In addition, the resulting structure can be reused for test compaction. This reduces the amount of test response data and test time without loss of fault coverage or diagnostic resolution.
BibTeX:
@inproceedings{DalirKW2014,
  author = {Dalirsani, Atefe and Kochte, Michael A. and Wunderlich, Hans-Joachim},
  title = {{Area-Efficient Synthesis of Fault-Secure NoC Switches}},
  booktitle = {Proceedings of the 20th IEEE International On-Line Testing Symposium (IOLTS'14)},
  year = {2014},
  pages = {13--18},
  keywords = {Network-on-Chip, self-checking, fault-secure, online testing, concurrent error detection},
  abstract = {This paper introduces a hybrid method to synthesize area-efficient fault-secure NoC switches to detect all errors resulting from any single-point combinational or transition fault in switches and interconnect links. Firstly, the structural faults that are always detectable by data encoding at flit-level are identified. Next, the fault-secure structure is constructed with minimized area such that errors caused by the remaining faults are detected under any given input vector. The experimental evaluation shows significant area savings compared to conventional fault-secure schemes. In addition, the resulting structure can be reused for test compaction. This reduces the amount of test response data and test time without loss of fault coverage or diagnostic resolution.},
  doi = {http://dx.doi.org/10.1109/IOLTS.2014.6873662},
  file = {http://www.iti.uni-stuttgart.de/fileadmin/rami/files/publications/2014/IOLTS_DalirKW2014.pdf}
}
3. Structural Software-Based Self-Test of Network-on-Chip
Dalirsani, A., Imhof, M.E. and Wunderlich, H.-J.
Proceedings of the 32nd IEEE VLSI Test Symposium (VTS'14), Napa, California, USA, 13-17 April 2014
2014
Keywords: Network-on-Chip (NoC), Software-Based Self-Test (SBST), Automatic Test Pattern Generation (ATPG), Boolean Satisfiability (SAT)
Abstract: Software-Based Self-Test (SBST) is extended to the switches of complex Network-on-Chips (NoC). Test patterns for structural faults are turned into valid packets by using satisfiability (SAT) solvers. The test technique provides a high fault coverage for both manufacturing test and online test.
BibTeX:
@inproceedings{DalirIW2014,
  author = {Dalirsani, Atefe and Imhof, Michael E. and Wunderlich, Hans-Joachim},
  title = {{Structural Software-Based Self-Test of Network-on-Chip}},
  booktitle = {Proceedings of the 32nd IEEE VLSI Test Symposium (VTS'14)},
  year = {2014},
  keywords = {Network-on-Chip (NoC), Software-Based Self-Test (SBST), Automatic Test Pattern Generation (ATPG), Boolean Satisfiability (SAT)},
  abstract = {Software-Based Self-Test (SBST) is extended to the switches of complex Network-on-Chips (NoC). Test patterns for structural faults are turned into valid packets by using satisfiability (SAT) solvers. The test technique provides a high fault coverage for both manufacturing test and online test.},
  url = {http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6818754},
  doi = {http://dx.doi.org/10.1109/VTS.2014.6818754},
  file = {http://www.iti.uni-stuttgart.de/fileadmin/rami/files/publications/2014/VTS_DalirIW2014.pdf}
}
2. SAT-based Code Synthesis for Fault-Secure Circuits
Dalirsani, A., Kochte, M.A. and Wunderlich, H.-J.
Proceedings of the 16th IEEE Symp. Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT'13), New York City, NY, USA, 2-4 October 2013, pp. 38-44
2013
Keywords: Concurrent error detection (CED), error control coding, self-checking circuit, totally self-checking (TSC)
Abstract: This paper presents a novel method for synthesizing fault-secure circuits based on parity codes over groups of circuit outputs. The fault-secure circuit is able to detect all errors resulting from combinational and transition faults at a single node. The original circuit is not modified. If the original circuit is non-redundant, the result is a totally self-checking circuit. At first, the method creates the minimum number of parity groups such that the effect of each fault is not masked in at least one parity group. To ensure fault-secureness, the obtained groups are split such that no fault leads to silent data corruption. This is performed by a formal Boolean satisfiability (SAT) based analysis. Since the proposed method reduces the number of required parity groups, the number of two-rail checkers and the complexity of the prediction logic required for fault-secureness decreases as well. Experimental results show that the area overhead is much less compared to duplication and less in comparison to previous methods for synthesis of totally self-checking circuits. Since the original circuit is not modified, the method can be applied for fixed hard macros and IP cores.
BibTeX:
@inproceedings{DalirKW2013,
  author = {Dalirsani, Atefe and Kochte, Michael A. and Wunderlich, Hans-Joachim},
  title = {{SAT-based Code Synthesis for Fault-Secure Circuits}},
  booktitle = {Proceedings of the 16th IEEE Symp. Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT'13)},
  year = {2013},
  pages = {38--44},
  keywords = {Concurrent error detection (CED), error control coding, self-checking circuit, totally self-checking (TSC)},
  abstract = {This paper presents a novel method for synthesizing fault-secure circuits based on parity codes over groups of circuit outputs. The fault-secure circuit is able to detect all errors resulting from combinational and transition faults at a single node. The original circuit is not modified. If the original circuit is non-redundant, the result is a totally self-checking circuit. At first, the method creates the minimum number of parity groups such that the effect of each fault is not masked in at least one parity group. To ensure fault-secureness, the obtained groups are split such that no fault leads to silent data corruption. This is performed by a formal Boolean satisfiability (SAT) based analysis. Since the proposed method reduces the number of required parity groups, the number of two-rail checkers and the complexity of the prediction logic required for fault-secureness decreases as well. Experimental results show that the area overhead is much less compared to duplication and less in comparison to previous methods for synthesis of totally self-checking circuits. Since the original circuit is not modified, the method can be applied for fixed hard macros and IP cores.},
  url = {http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=6653580},
  doi = {http://dx.doi.org/10.1109/DFT.2013.6653580},
  file = {http://www.iti.uni-stuttgart.de/fileadmin/rami/files/publications/2013/DFTS_DalirKW2013.pdf}
}
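The masking condition behind the parity groups in the abstract of entry 2 can be shown with a brute-force check. A parity group detects an error only if the error flips an odd number of the outputs in that group, so an error is masked when every group sees an even number of flips. The sketch below is not the SAT-based analysis from the paper; the output indices, groups, and error patterns are hypothetical.

```python
def detects(groups, flipped):
    """True if at least one parity group sees an odd number of flipped outputs."""
    return any(len(group & flipped) % 2 == 1 for group in groups)


def undetected_errors(groups, error_patterns):
    """Return the error patterns that every parity group masks."""
    return [pattern for pattern in error_patterns if not detects(groups, pattern)]


# Hypothetical circuit with outputs 0..3 and three possible error patterns,
# each given as the set of outputs a fault would flip.
ERROR_PATTERNS = [frozenset({0}), frozenset({0, 1}), frozenset({1, 2, 3})]

# One parity bit over all outputs masks the double flip {0, 1}.
SINGLE_GROUP = [frozenset({0, 1, 2, 3})]

# Splitting the group separates outputs 0 and 1, so the double flip shows up.
SPLIT_GROUPS = [frozenset({0, 2}), frozenset({1, 3})]

if __name__ == "__main__":
    print(undetected_errors(SINGLE_GROUP, ERROR_PATTERNS))
    print(undetected_errors(SPLIT_GROUPS, ERROR_PATTERNS))
```

The example mirrors the splitting step only in spirit: it enumerates a fixed list of error patterns, whereas the paper reasons formally over all faults at a single node.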
1. Structural Test and Diagnosis for Graceful Degradation of NoC Switches
Dalirsani, A., Holst, S., Elm, M. and Wunderlich, H.-J.
Journal of Electronic Testing: Theory and Applications (JETTA)
Vol. 28(6), October 2012, pp. 831-841
2012
Keywords: Network-on-Chip, Graceful Degradation, Logic Diagnosis, Performability
Abstract: Networks-on-Chip (NoCs) are implicitly fault tolerant and due to their inherent redundancy they can overcome defective cores, links and switches. This effect can be used to increase yield at the cost of reduced performance. In this paper, a new diagnosis method based on the standard flow of industrial volume testing is presented, which is able to identify the intact functions of a defective network switch rather than providing only a pass/fail result for the complete switch. To achieve this, the new method combines for the first time the precision of structural testing with information on the functional behavior in the presence of defects. This allows to disable defective parts of a switch after production test and use the intact functions. Thereby, only a minimum performance decrease is induced while the yield is increased. According to the experimental results, the method improves the performability of NoCs since 56.86 % and 72.42 % of defects in two typical switch models only impair one switch port. Unlike previous methods for implementing fault tolerant switches, the developed technique does not impose any additional area overhead and is compatible with many common switch designs.
BibTeX:
@article{DalirHEW2012,
  author = {Dalirsani, Atefe and Holst, Stefan and Elm, Melanie and Wunderlich, Hans-Joachim},
  title = {{Structural Test and Diagnosis for Graceful Degradation of NoC Switches}},
  journal = {Journal of Electronic Testing: Theory and Applications (JETTA)},
  publisher = {Springer-Verlag},
  year = {2012},
  volume = {28},
  number = {6},
  pages = {831--841},
  keywords = {Network-on-Chip, Graceful Degradation, Logic Diagnosis, Performability},
  abstract = {Networks-on-Chip (NoCs) are implicitly fault tolerant and due to their inherent redundancy they can overcome defective cores, links and switches. This effect can be used to increase yield at the cost of reduced performance. In this paper, a new diagnosis method based on the standard flow of industrial volume testing is presented, which is able to identify the intact functions of a defective network switch rather than providing only a pass/fail result for the complete switch. To achieve this, the new method combines for the first time the precision of structural testing with information on the functional behavior in the presence of defects. This allows to disable defective parts of a switch after production test and use the intact functions. Thereby, only a minimum performance decrease is induced while the yield is increased. According to the experimental results, the method improves the performability of NoCs since 56.86 % and 72.42 % of defects in two typical switch models only impair one switch port. Unlike previous methods for implementing fault tolerant switches, the developed technique does not impose any additional area overhead and is compatible with many common switch designs.},
  doi = {http://dx.doi.org/10.1007/s10836-012-5329-9},
  file = {http://www.iti.uni-stuttgart.de/fileadmin/rami/files/publications/2012/JETTA_DalirHEW2012.pdf}
}
Workshop contributions