RA - Completed Student Theses

  • Master Thesis Nr. 00731-006: Evaluation of the fault tolerance of Artificial Neural Networks and Investigation of their precision requirements
    Nika Hamidi
    12.10.2016 - 12.04.2017

  • Master Thesis Nr. 00731-004: Development of a Model-based Characterization Software Framework for Approximate Computing Applications
    Alisa Kuzmina
    12.06.2016 - 13.12.2016

  • Master Thesis Nr. 00731-005: Inter-gate Fault Modeling for GPU-accelerated Fault Simulation
    Alice Frosi
    25.05.2016 - 24.11.2016

  • Master Thesis Nr. 00731-003: Adaptive Approximate Computing for Image Filtering using Dynamic Partial Reconfiguration
    Valentin Mihalcut
    03.05.2016 - 02.11.2016

  • Master Thesis Nr. 00731-001: Realistic Gate Model for Efficient Timing Analysis of Very Deep Submicron CMOS Circuits
    Deepthi Murali
    14.09.2015 - 15.03.2016

  • Project INF: Switching activity based estimation of IR-drop
    Pascal Hagemann, David Hardes, Moritz Knabben

  • Bachelor Thesis Nr. 183: Software basierter Selbsttest eingebetteter Speicher
    Felix Ebinger
    21.10.2014 - 21.04.2015

  • Bachelor Thesis Nr. 182: Software basierter Selbsttest von Peripherie-Komponenten
    Jochen Bäßler
    21.10.2014 - 06.05.2015

  • Bachelor Thesis Nr. 179: Adaptierung an Zeitverhalten-Variationen in rekonfigurierbaren Hardwarestrukturen
    Sebastian Brandhofer
    20.10.2014 - 21.04.2015

  • Master Thesis Nr. 8: SAT-basierte Überprüfung der Fehlersicherheit von Schaltungen
    Maren Tilk

  • Bachelor Thesis Nr. 1 (SimTech): Portierung und Optimierung einer GPU Simulationsumgebung zur Untersuchung des apoptotischen Rezeptor-Clustering auf OpenCL
    Stefan Simeonov

  • Project INF: Untersuchung von hardwarebeschleunigten Anwendungen in rekonfigurierbaren Network-on-a-Chip-basierten Systemen
    Sebastian Brandhofer, Philipp Göttlich, Adrian Lanksweirt

  • Master Thesis Nr. 3580: Machine Learning Methods for Fault Classification
    Siddharth Sunil Gosavi
    23.10.2013 - 24.04.2014

  • Diploma Thesis Nr. 3576: Integration von algorithmenbasierter Fehlertoleranz in grundlegenden Operationen der Linearen Algebra auf GPGPUs
    Sebastian Halder
    16.10.2013 - 17.04.2014

  • Master Thesis Nr. 3239: Fault Tolerant Routing Algorithm for Fully- and Partially-defective NoC Switches
    Seyyed Mahdi Najmabadi
    01.09.2011 - 01.03.2012

  • Study Thesis Nr. 2347: Parallele Partikelsimulation auf GPGPU-Architekturen zur Evaluierung von Apoptose-Signalwegen
    Alexander Schöll
    01.09.2011 - 02.03.2012

  • Bachelor Project Nr. 2334: Simulation of Realistic Defects for Validating Test and Diagnosis Algorithms
    Hossam Abouzeid Mohamed El Atali
    05.04.2011 - 31.08.2011

  • Diploma Thesis Nr. 3146: Strukturelle Feldtests bei komplexen ASICs
    Dominik Ull
    10.01.2011 - 09.08.2011

  • Study Thesis Nr. 2306: CUDA-accelerated Delay Fault Simulation
    Eric Schneider
    01.11.2010 - 03.05.2011

  • Diploma Thesis Nr. 3069: Simulation Framework for Built-In Diagnosis of Self-Checking Circuits
    Laura Rodriguez Gomez
    19.06.2010 - 18.01.2011

ES - Completed Student Theses

SS 2012

  • Michael Kaufmann

    Reliable Communication by Fault-Tolerant Multilayer Routing

    Modern supercomputers are highly parallel systems that scale up to several thousands of nodes. To provide fast communication in such systems, microprocessor vendors are integrating messaging units into their chips. These integrated network interfaces enable direct cache-to-cache communication between processor cores, providing low latency transmissions and high data throughput.

    Due to the high degree of parallelism, reliability and availability are becoming major concerns in supercomputer systems. Thus, mechanisms to tolerate component failures have to be provided. As the predominant topology of current supercomputers’ interconnection networks is that of a multidimensional torus, fault tolerance is implicitly supported by multiple redundant paths between nodes. This requires dynamic routing functions that can act on detected faults. However, area constraints and high clock frequencies restrict hardware-based routing functions to simple deterministic schemes. To circumvent these limitations, multilayer routing is used. Here, a second routing layer that is implemented in software is put on top of the simpler hardware routing.

    When resources like links or nodes fail, this second layer directs messages around faults by routing them over one or more intermediate nodes in software. The intermediate nodes are chosen such that they form a chain of valid hardware routing paths from source to destination. The solution developed here uses a compact representation of detected faults to minimize the overhead in terms of runtime and memory requirements. In addition, the selection process considers the additional load caused by re-routed traffic in order to keep the link load balanced. The implementation has been proven to work successfully on an IBM BlueGene/Q supercomputer.
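
    The two-layer idea described above can be illustrated with a small sketch. It is not the thesis implementation: it assumes a small 2D torus, dimension-order routing as the deterministic hardware scheme, and a plain breadth-first search in software that picks the fewest intermediate nodes. The compact fault representation and the load balancing mentioned in the abstract are not modeled, and all function names (ring_step, hw_path, path_ok, route) are hypothetical.

```python
from collections import deque
from itertools import product


def ring_step(a, b, n):
    """Return +1 or -1: the shorter direction from a to b on a ring of size n."""
    forward = (b - a) % n
    return 1 if forward <= n - forward else -1


def hw_path(src, dst, dims):
    """Assumed hardware layer: deterministic dimension-order path on a torus."""
    path = [src]
    cur = list(src)
    for d, size in enumerate(dims):
        step = ring_step(cur[d], dst[d], size)
        while cur[d] != dst[d]:
            cur[d] = (cur[d] + step) % size
            path.append(tuple(cur))
    return path


def path_ok(path, bad_nodes, bad_links):
    """True if the path touches no failed node and no failed (undirected) link."""
    if any(node in bad_nodes for node in path):
        return False
    return all(frozenset(link) not in bad_links for link in zip(path, path[1:]))


def route(src, dst, dims, bad_nodes=frozenset(), bad_links=frozenset()):
    """Software layer: chain of waypoints [src, ..., dst] such that each pair of
    consecutive waypoints is joined by a fault-free hardware path.
    Breadth-first search yields the fewest intermediate nodes; returns None
    if no such chain exists."""
    nodes = [n for n in product(*(range(k) for k in dims)) if n not in bad_nodes]
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            chain = []
            while u is not None:
                chain.append(u)
                u = parent[u]
            return chain[::-1]
        for v in nodes:
            if v not in parent and path_ok(hw_path(u, v, dims), bad_nodes, bad_links):
                parent[v] = u
                queue.append(v)
    return None


if __name__ == "__main__":
    dims = (4, 4)
    broken = {frozenset({(0, 0), (1, 0)})}  # one failed link, chosen for illustration
    print(route((0, 0), (2, 0), dims))                     # no faults: no intermediate node
    print(route((0, 0), (2, 0), dims, bad_links=broken))   # detour over an intermediate node
```

    Without faults the chain contains only source and destination, so the software layer adds no hops. With the failed link, the search inserts one intermediate node whose two hardware legs both avoid the fault.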

WS 2011/12

  • Zixuan Cheng

    Transaction-Level Instruction Set Simulator of An ATMEL AVR Microcontroller Core (Master Thesis)

    Modern design flows require the simulation of software running on a CPU in a larger system context. For this purpose, an instruction set simulator (ISS) specific to the ATMEL AVR processor architecture shall be developed. To interface with the rest of the system simulation model, the ISS shall have a transaction-level interface. To transform AVR assembler code (generated with a given cross compiler from, e.g., C/C++ sources) into a representation suitable for compiled instruction set simulation, a preprocessor has to be developed. As time permits, the implementation of an interface with an IDE / debugger (AVR Studio or GNU gdb) is desirable.
    The thesis is performed in our Embedded Systems Lab in close cooperation with ATMEL, Heilbronn, as part of the research project ROBUST. Post-thesis job opportunities with ATMEL exist.

  • Nikolaos Batzolis

    Fault-tolerant End-to-End Flow Control Protocol for Networks-On-Chip (NoC) (Master Thesis)

    On-chip networks (Networks-on-Chip, NoCs) are communication networks that provide predominantly packet-switched communication between the processing elements of an embedded system. With the ongoing decrease in feature size, complex systems with hundreds of processing elements can be implemented on a single chip. On the other hand, decreasing feature sizes bring the serious drawback of higher susceptibility to manufacturing tolerances and external influences, resulting in an increased chip fault probability. The presence of faulty components or communication links inside NoC-enabled chips can lead to data corruption or packet loss.
    In the near future, NoCs will be used to implement safety-critical applications. Packet loss or data corruption during communication between network elements may cause the system to deviate from its correct behavior or even to fail completely. Such deviation from the specified behavior can damage devices irreparably or may even result in loss of human life. For that reason, fault-free communication between processing elements is a primary concern; it can be achieved by ensuring that every packet reaches its destination even in the presence of permanent errors.


  • Adán Kohler

    Modellierung und Simulation von Networks-on-Chip auf der Transaktionsebene

    Networks-on-Chip (NoCs) provide communication between the processing elements of multiprocessor systems-on-chip (MPSoCs). When designing a NoC, the network topology, routing mechanisms, and other aspects of the network must be chosen so that the communication requirements of the applications to be implemented can be met. Assessing this requires a simulation of the network that includes the communication behavior of the processing elements. Transaction-level modeling and simulation was developed for bus-based systems; it groups communication operations into so-called transactions and achieves higher simulation performance by abstracting from protocol details (e.g., individual signals). In this diploma thesis, the transaction concept is to be applied to the modeling of NoCs and adapted where necessary. The work can build on the SystemC simulation library and the TLM 2.0 library for transaction-level simulation. A suitable framework, for example a NoC simulation library with defined interfaces, is to be created that allows users to define the details of a NoC architecture (topology, routing, etc.) themselves.

WS 2007/08

  • George Raju

    Transaction Level Modelling of H.264 Decoding Processes

    The H.264 / MPEG-4 Part 10 standard defines an encoded representation of digital video sequences and the corresponding decoding process. The decoding process is implemented in software in the JM reference model. Due to its sequential nature, the JM reference model is not well suited as a reference against which a parallel hardware implementation of an H.264 decoder could be verified. The subject of this thesis is the design of a parallel reference model of H.264 decoding in SystemC. The model shall be designed at the transaction level of abstraction.

SS 2007

  • Ms. Weining Hao

    Architecture and Implementation of a H.264 Deblocking Accelerator

    The H.264 / MPEG-4 Part 10 standard defines an encoded representation of digital video sequences and the corresponding decoding process. This process includes a deblocking sub-process to reduce the visual impact of block artifacts. Unlike in previous video coding standards, H.264 deblocking is part of the decoding loop ("in-loop filter"): the deblocked video frames serve as a reference for decoding later frames. Therefore, the deblocking process is time-critical. Furthermore, deblocking is known to contribute about one third to the performance requirements of H.264 decoding. The subject of this thesis is the design of a hardware accelerator for H.264 deblocking that can speed up the execution of an otherwise software-based decoder.

  • Thomas Bruni

    A Formalized Approach to Transaction Level Modeling

    In transaction level modeling (TLM), high simulation speed is achieved by modeling at higher levels of abstraction than signals and the RTL. The abstraction level at which modeling is performed depends on the context in which a model is used and on the required accuracy. The levels of accuracy required in most modeling activities have been identified and proposed by researchers and institutes active in the TLM field. For example, the OSCI TLM approach proposes the PV (Programmer's View), PVT (Programmer's View with Timing), CX (Cycle Approximate) and CA (Cycle Accurate) abstraction levels, in increasing order of precision and decreasing order of simulation speed. However, these definitions of the abstraction levels are informal, and the transition from one abstraction level to another is neither systematic nor automatable. For example, although transaction level models of a bus at different abstraction levels represent the same underlying communication protocol, the CX, CA and PVT models are often developed independently with little or no reuse. The objective of this thesis is the development of a more formal, generic approach to modeling buses, so that models at different abstraction levels can be generated from a single formal description (e.g. communicating state machines) in a systematic and potentially automatable manner. The proposed approach shall be validated using an existing bus protocol, and the final executable models shall be implemented in SystemC.

  • Muhammad Shaharyar Awan

    Transaction Level Power and Timing Exploration of Bus Architectures

    In modern embedded systems, low power consumption is an increasingly important factor that should be taken into account when exploring the design space. Limited energy resources such as batteries, size constraints and limited cooling possibilities have motivated power-aware design techniques, which take power consumption limits into account in addition to performance and timing. Low-power design at lower levels (i.e. physical, gate and transistor levels) has been extensively studied and successfully applied to complex integrated circuits such as microprocessors. A recent trend is system-level power-aware design, in which power consumption is analyzed and optimized at higher levels. One example is software optimization techniques that reduce cache misses and hence result in fewer external memory accesses and lower power consumption. Another example is the power consumption of buses, where factors such as the number of transitions on the address, data and control lines directly affect the power consumption. Therefore, arbitration policies and address/data coding schemes can be used to control the power consumption associated with a bus. The objective of this thesis is the conception and development of an OSCI-TLM based framework for unified power and timing exploration. The focus is on the bus model and the effect of different arbitration policies on timing and power consumption. A model of an existing bus protocol shall be developed. For masters and slaves, generic models with simple power models (e.g. simple traffic pattern generators for masters and memory modules for slaves) shall be implemented and used in the experiments.
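
    The link between coding schemes and bus transitions can be made concrete with a short sketch. It is only an illustration, not part of the thesis: it counts bit toggles between successive words on a bus and compares plain binary addresses with Gray-coded ones, a commonly cited address coding for sequential accesses. The function names and the 16-address trace are assumptions chosen for the example.

```python
def transitions(words):
    """Count bit toggles between successive words driven onto a bus."""
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))


def gray(n):
    """Reflected binary Gray code of n: neighbouring values differ in one bit."""
    return n ^ (n >> 1)


# Hypothetical trace: a master reading 16 consecutive addresses.
addresses = list(range(16))

binary_toggles = transitions(addresses)
gray_toggles = transitions([gray(a) for a in addresses])

if __name__ == "__main__":
    print("binary address toggles:", binary_toggles)
    print("Gray-coded address toggles:", gray_toggles)
```

    For a sequential trace the Gray-coded bus toggles exactly one line per access, so its transition count is lower than with binary addresses. A transaction level power model could, under the same assumption, multiply such counts by an energy cost per transition.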

  • Adán Kohler (Study Thesis)

    Portierung und Optimierung einer H.264-Dekodier-Software für ein eingebettetes System

    The H.264 / MPEG-4 Part 10 standard defines an encoded representation of digitized video sequences and an associated decoding process for various picture resolutions (levels) and various combinations of alternative coding methods (profiles). The decoding process is implemented (with restrictions regarding profiles and levels) by the open-source software X264. The task of this study thesis is to port this software, which was written for desktop computers, to an embedded development board (ARM Versatile Platform Board with an ARM926EJ-S processor). In addition, decoding is to be accelerated by delegating one sub-process, the so-called deblocking filter, to an application-specific integrated circuit.

WS 2006/07

  • Rauf Salimi Khaligh

    Transaktionsbasierte Simulation von ARM Plattformen

    ARM is a family of microprocessors that are frequently used in embedded systems. Such systems contain hardware accelerators, peripheral units and memories that are connected to the ARM processor via a bus system and together form a so-called platform. The topic of your diploma thesis will be the development of an efficient simulation system for such a platform, based on transaction level modeling with SystemC. The ARM instruction set simulator ("Armulator") is to be integrated into the simulation system. A library of models, for example for memories and the AMBA bus system, is to be developed. The simulation system is to be tested in a sample application.