Semiconductor technology scaling drives the integration of highly diverse computer architectures with different kinds of processing cores, communication channels, and embedded memories. Besides classic CPU and data-parallel GPU cores, runtime-reconfigurable units are emerging as an integral part of such architectures. These runtime-reconfigurable heterogeneous computer architectures promise very high computational performance for scientific computing and simulation applications, since they close the gap between serial or coarse-grained parallel tasks on CPU cores and highly data-parallel tasks on GPU cores.
To exploit the potential of these innovative computer architectures, flexible and intelligent runtime systems are required that manage system resources as well as the scheduling and execution of compute modules (CM) on the different processing units.
The goal of this thesis is the development of such a runtime system for scientific simulation applications on reconfigurable heterogeneous computer architectures. This comprises:
- Development of an appropriate runtime system architecture (stand-alone or client/server)
- Development of basic scheduling mechanisms and generic compute modules
- Development of control, scheduling and communication interfaces for compute modules (CM)
- Evaluation and comparison of different scheduling techniques (e.g., Work Stealing, Hungarian Scheduling)
- Selection and integration of fault tolerance measures, e.g. for redundant CM execution
- Implementation of a demonstrator consisting of 4-5 simple compute modules
- Evaluation of the developed runtime system with respect to performance overhead and communication latency
Conditions and requirements for this thesis:
- Solid programming skills in C/C++, NVIDIA CUDA, and OpenCL
- Basic programming skills in OpenMP and MPI
- Solid understanding of modern CPU and GPU many-core architectures
- Lectures on computer architecture, operating systems, and distributed systems are helpful