Erik Brockmeijer

Existing platforms nearly always have more than one layer in their memory subsystem. These layers are inserted to bridge the enormous performance, latency and energy consumption gap between the large off-chip memories and the processor. Memory hierarchy layers can contain normal (software-controlled) memories or caches. An application has to be mapped efficiently onto this memory hierarchy. Often this requires making smaller copies of parts of larger data arrays, so that these copies can be stored in the smaller layers. A sophisticated analysis is required to find the copy candidates. Those copies must then be selected such that they minimize the miss cost over all layers globally. This happens most efficiently under software control, because a compiler can take a global view. In the case of local memories, the copy operations must be explicitly present in the code. In the case of hardware-controlled caches, however, the cache controller copies signals at the moment they are accessed, if the copy is not yet present in the cache. The code must therefore be written such that the controller is forced to make the right decisions. A further refinement of the application can be obtained by exploiting parallel data accesses to different memories. This is crucial for meeting the real-time constraints with a customized memory organisation without counteracting the memory size and energy budget optimizations achieved in the earlier step. There is no single solution for defining a memory subsystem and mapping an application onto it. Many trade-offs exist and must be carefully evaluated.
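The effect of an explicit copy into a smaller layer can be illustrated with a minimal sketch. It is not the method developed in this work, only a toy model under stated assumptions: a sliding-window sum over a large array, a two-layer hierarchy, and hypothetical per-access energy costs (E_OFFCHIP, E_LOCAL) chosen purely for illustration. The function names are likewise hypothetical.

```python
from collections import deque

# Hypothetical per-access energy costs in arbitrary units (assumed, not measured).
E_OFFCHIP = 10.0
E_LOCAL = 1.0


def window_sums_direct(data, w):
    """Read every operand straight from the large (off-chip) array."""
    offchip_reads = 0
    sums = []
    for i in range(len(data) - w + 1):
        total = 0
        for j in range(w):
            total += data[i + j]  # one off-chip read per operand
            offchip_reads += 1
        sums.append(total)
    return sums, {"offchip": offchip_reads, "local": 0}


def window_sums_with_copy(data, w):
    """Keep a w-element copy in a small local buffer.

    Each element is fetched from off-chip exactly once; all reuse
    is served by the local buffer.
    """
    offchip_reads = 0
    local_accesses = 0
    buf = deque(maxlen=w)  # models the small software-controlled memory
    sums = []
    for x in data:
        buf.append(x)  # explicit copy: one off-chip read, one local write
        offchip_reads += 1
        local_accesses += 1
        if len(buf) == w:
            sums.append(sum(buf))  # w local reads
            local_accesses += w
    return sums, {"offchip": offchip_reads, "local": local_accesses}


def energy(counts):
    """Total energy of an access profile under the assumed cost model."""
    return counts["offchip"] * E_OFFCHIP + counts["local"] * E_LOCAL


if __name__ == "__main__":
    data = list(range(16))
    for fn in (window_sums_direct, window_sums_with_copy):
        _, counts = fn(data, 4)
        print(fn.__name__, counts, energy(counts))
```

Both variants compute the same result; only the access profile differs. Whether the copy pays off depends on the reuse factor and on the cost ratio between the layers, which is exactly the kind of trade-off that has to be evaluated per copy candidate.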

The PhD focuses on exposing the trade-offs, in terms of power, speed and area, in defining a memory hierarchy and in mapping applications onto a (possibly existing) one. Moreover, the required application analysis and exploration will be examined in depth. The goal is to come up with techniques and tools that assist a designer in exploring the huge design space.