Embedded Systems in Silicon (2005-2006)


Code : TD5102
Lecturer : Prof. Dr. Henk Corporaal
Email : H.Corporaal at tue.nl
Phone : TU/e: +31-40-247 5195 or 3653 (secr. TU/e) / 5462 (office TU/e)
NUS-ECE-DTI +65-6874 4188 (secr. DTI) / 4182 (office DTI)
Assistance: Dr. Yajun Ha: NUS-ECE E1-08-17, elehy at nus.edu.sg, tel 2258
MSc. Valentin Gheorghita (TU/e: LCC compiler): S.V.Gheorghita at tue.nl
Ir. Sander Stuijk (TU/e: SystemC and MIPS models): s.stuijk at tue.nl
Prerequisites: A course in computer architecture;
Programming experience in C, C++, or an equivalent language

News and Update


Information on the course:

Objective

When looking at future embedded systems and their design, especially (but not exclusively) in the multi-media domain, we observe several problems. To solve these problems we propose to use programmable multi-processor platforms with an advanced memory hierarchy, together with an advanced design flow for these platforms. The treated design flow starts with an executable specification of your application and ends with a very efficient mapping (in terms of performance and energy consumption) onto a certain platform. This course treats how to design these future embedded systems, solving the mentioned problems with the above solutions.
Note that we do not treat how to go from an idea to an executable specification. For that high-level part of the design flow, see e.g. the DTI course TD5101, Specification of Complex Hardware/Software Systems.

Topics

In this course we treat a selection of the following topics:

1. Overview

We start by discussing recent trends and platform developments, explain what we mean by mapping and design space exploration, and give an overview of the design flow trajectory for embedded systems; in particular for streaming-based systems, as found in the multi-media application domain.

2. Platform and platform components

As we already mentioned, we foresee that platforms will raise the abstraction level for future application and system designers. Platforms are essentially multi-processor systems, often realized on a single chip. Depending on the application domain you will see a variety of processors, including RISC and VLIW processors and domain-specific accelerators. We will treat the following:

3. Data Management

In this part the emphasis is on efficient data management: exploiting the advanced memory hierarchy to achieve high performance and low power. We distinguish the management of dynamic (heap-based) data structures and of (big) static data structures, like (multi-dimensional) arrays.
The student will learn how to apply a methodology for a step-wise (source code) transformation and mapping trajectory, going from an initial specification to an efficient and highly tuned implementation on a particular platform. The final implementation can be an order of magnitude more efficient in terms of cost, power, and performance.
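To give a flavour of such source code transformations (this is only an illustration, not material from the course; the array names and sizes are made up), a typical step is loop fusion followed by array contraction: a large intermediate array that is produced by one loop and immediately consumed by the next can often be reduced to a scalar, so the data never has to leave the registers or a small local memory.

    // Illustrative sketch only; array names and sizes are hypothetical.
    #include <cstdio>

    const int N = 1024;
    int in[N], out[N];

    // Before the transformation, an intermediate array 'tmp' of N elements
    // lives in (off-chip) memory:
    //   int tmp[N];
    //   for (int i = 0; i < N; i++) tmp[i] = in[i] * 3;
    //   for (int i = 0; i < N; i++) out[i] = tmp[i] + in[i];

    // After loop fusion and array contraction, 'tmp' becomes a scalar,
    // so each element is produced and consumed without a memory round trip.
    void transformed()
    {
        for (int i = 0; i < N; i++) {
            int tmp = in[i] * 3;      // produced ...
            out[i] = tmp + in[i];     // ... and consumed immediately
        }
    }

    int main()
    {
        for (int i = 0; i < N; i++) in[i] = i;
        transformed();
        printf("%d\n", out[N - 1]);   // simple check: 4 * (N - 1)
        return 0;
    }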

4. Task Concurrency Management

In this part we will show you how to partition applications, and in particular which metrics are suitable to evaluate, at a high level, the quality of a certain task partition. This will be based on the Yapi programming language/environment and the CAST tooling from Sander Stuijk (TU/e).
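As a purely illustrative sketch (the lectures define the actual metrics used in the course), two common high-level indicators of partition quality are the load balance over the processors and the ratio of inter-task communication to computation; the task workloads and communication volume below are made-up numbers.

    // Illustrative sketch; workloads and communication volumes are made up.
    #include <cstdio>
    #include <vector>
    #include <algorithm>
    #include <numeric>

    int main()
    {
        // Computation cost (e.g., cycles) assigned to each of two processors.
        std::vector<long> load = {1200, 800};
        // Data exchanged between the two partitions (e.g., bytes per frame).
        long communication = 300;

        long total   = std::accumulate(load.begin(), load.end(), 0L);
        long maxLoad = *std::max_element(load.begin(), load.end());

        // Load balance: 1.0 means perfectly balanced, larger means worse.
        double balance   = (double)maxLoad * load.size() / total;
        // Communication-to-computation ratio: lower is better.
        double commRatio = (double)communication / total;

        printf("load balance   : %.2f\n", balance);
        printf("comm/comp ratio: %.2f\n", commRatio);
        return 0;
    }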

5. Code generation

Platform-based design abstracts from the underlying processor components. This means we have to rely on high-quality compilers to bridge the gap between a high-level language (like C or C++) and the processor ISA (instruction set architecture). Since many processors exploit some kind of instruction-level parallelism, compilers have to be extended with a so-called scheduling phase. In this part we will discuss different scheduling algorithms, from basic block scheduling to modulo software pipelining.
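To give a feel for what such a scheduling phase does (a minimal sketch only, not one of the algorithms as treated in the lectures; the dependence graph, latencies and issue width are invented), a basic block list scheduler repeatedly picks, from the operations whose predecessors have finished, ones that can still be issued in the current cycle:

    // Minimal list-scheduling sketch for one basic block.
    #include <cstdio>
    #include <vector>

    struct Op {
        const char*      name;
        std::vector<int> deps;    // indices of operations this op depends on
        int              latency; // result ready 'latency' cycles after issue
        int              issueCycle; // -1 = not yet scheduled
    };

    int main()
    {
        // a = ld; b = ld; c = a + b; d = c * 2  (loads take 2 cycles, ALU 1)
        std::vector<Op> ops = {
            {"ld a",  {},     2, -1}, {"ld b",  {},  2, -1},
            {"add c", {0, 1}, 1, -1}, {"mul d", {2}, 1, -1},
        };
        const int issueWidth = 2;              // at most 2 ops per cycle

        unsigned scheduled = 0;
        for (int cycle = 0; scheduled < ops.size(); cycle++) {
            int issued = 0;
            for (auto& op : ops) {
                if (op.issueCycle >= 0 || issued == issueWidth) continue;
                bool ready = true;             // all predecessors finished?
                for (int d : op.deps)
                    if (ops[d].issueCycle < 0 ||
                        ops[d].issueCycle + ops[d].latency > cycle) ready = false;
                if (ready) {
                    op.issueCycle = cycle;
                    issued++;
                    scheduled++;
                    printf("cycle %d: %s\n", cycle, op.name);
                }
            }
        }
        return 0;
    }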


Book and Handouts

Computer Organization and Design
- The Hardware/Software Interface
3rd Edition


David A. Patterson and John L. Hennessy
Morgan Kaufmann Publishers
The above book will be used for the RISC architecture and processor implementation part (mainly chapters 2, 5 and 6). We also recommend the companion book by Hennessy and Patterson, Computer Architecture: A Quantitative Approach (currently in its 3rd edition). That book discusses recent trends in computer architecture and advanced architecture concepts.
Besides the above book, we will distribute handouts in the form of papers and slides (see below).

 

Slides

Missing slides will be put on the web as we proceed during the course.

Papers and other documentation to read

Lab exercises

These will be partly updated during the course!

A. MIPS assembly programming exercise

In this exercise we use the SPIM simulator to program the MIPS processor in assembly language. See the home page of the SPIM simulator (for the MIPS R2000/R3000 architectures). You are asked to write and demonstrate a program implementing an interesting algorithm which contains at least a non-leaf function and/or a recursive function. Hand in both the C code and the assembly code of your program.
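For instance (just to indicate the kind of program meant, not a prescribed assignment), a small recursive function such as the one below is automatically non-leaf and already exercises argument passing, the stack, and saving the return address; you would then translate it to MIPS assembly yourself and demonstrate it in SPIM.

    /* Illustrative example of the kind of program meant: a recursive,
       hence non-leaf, function.  Translate it to MIPS assembly yourself
       and run it under SPIM. */
    #include <stdio.h>

    int fib(int n)                      /* recursive: calls itself twice */
    {
        if (n < 2)
            return n;
        return fib(n - 1) + fib(n - 2); /* needs the stack to save $ra and n */
    }

    int main(void)
    {
        printf("fib(10) = %d\n", fib(10));   /* expected: 55 */
        return 0;
    }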

B. Design and mapping to FPGA

We organize two lab sessions. For details about these sessions, see here.

C. Design and implementation of an embedded RISC processor

Note: the mapping to FPGA in this exercise is optional, but highly recommended. This means you have a couple of options: 1) do everything using SystemC simulation only; 2) use the mapping tools to FPGA, giving you e.g. area and timing estimates (this can still be done on a PC); 3) really get things running on the FPGA board (giving you the tremendous speed of an FPGA). You have to demonstrate at least the first option.

For this exercise we use two small MIPS processors. Both processors are described in SystemC, a language in which you can describe both hardware and software; see www.systemc.org. Study the SystemC user manual, which you can find on this website, especially the example in chapter two. The exercise contains 4 assignments, as described below; for further details see the MIPS Lab exercises.
Note that for the NUS-specific instruction we had to adapt the documentation about the mMIPS, and specifically its design flow; therefore see MIPS Lab at NUS.
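If you have not used SystemC before, the following minimal module (a sketch only, in the spirit of the chapter-two example of the user manual; the module and signal names are our own and do not appear in the mMIPS sources) shows the basic ingredients you will encounter: ports, a process method, and a sensitivity list.

    // Minimal SystemC sketch (module and signal names are illustrative only):
    // a combinational adder whose output follows its inputs.
    #include <systemc.h>

    SC_MODULE(adder)
    {
        sc_in<int>  a, b;        // input ports
        sc_out<int> sum;         // output port

        void compute() { sum.write(a.read() + b.read()); }

        SC_CTOR(adder)
        {
            SC_METHOD(compute);  // process: re-run compute ...
            sensitive << a << b; // ... whenever a or b changes
        }
    };

    int sc_main(int argc, char* argv[])
    {
        sc_signal<int> a, b, sum;
        adder add1("add1");
        add1.a(a); add1.b(b); add1.sum(sum);

        a = 2; b = 3;
        sc_start(1, SC_NS);      // simulate long enough for the method to run
        cout << "sum = " << sum.read() << endl;   // expected: 5
        return 0;
    }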

D. Optimize the memory use of a C algorithm

In this exercise you are asked to optimize a C algorithm by applying the discussed data management techniques. This should result in an implementation with a much improved memory behavior, which improves both performance and energy consumption; in this exercise we mainly concentrate on reducing energy consumption. You need to download the following and follow the instructions:

Interesting links and other material

Examination

The examination will be oral; the dates will likely be in the second week of January 2006. Grading will be 50% based on your lab exercises and reports, and 50% on the theory discussed during the lectures.


Back to homepage of Henk Corporaal