Advanced Computer Architecture

Course year: 2010-2011

Code : 5MD00 (3 ECTS)  /  5Z033 (4 ECTS)
Lecturers : Prof. dr. Henk Corporaal
Prof. dr. ir. R.H.J.M. Otten
Email : H.Corporaal at tue.nl
Phone : TU/e: +31-40-247 5195 or 3653 (secr. TU/e) / 5462 (office TU/e)
Assistance: MSc. Yifan He: y.he at tue.nl
MSc. Dongrui She: d.she at tue.nl
Dr. ir. Sander Stuijk (for computer infrastructure): s.stuijk at tue.nl
Dr. Akash Kumar: a.kumar at tue.nl
Prerequisites: A course in computer architecture and processor design, e.g. Processor Design 5Z032 or Computation 5JJ70;
programming experience in C, C++, or an equivalent language

News

Information on the course:

Description and objectives

This course studies the architecture, organization and use of the newest general-purpose (micro)processors currently on the market, and the latest research developments in computer architecture. Architectures exploiting instruction-level parallelism (ILP), thread-level parallelism and task-level parallelism are treated. Starting from basic architecture concepts, we will end by discussing the latest commercial processors (e.g., Pentium 4 multi-core, EPIC processors like Itanium, and embedded processors such as the TriMedia) and academic processors (like TRIPS).
The course also treats how processors can be combined in a multiprocessing platform, e.g. by using a Network-on-Chip, and deals with interprocessor communication issues. Furthermore, new code generation techniques needed for exploiting ILP will be treated. Special emphasis will be on quantifying design decisions in terms of performance and cost.

The intention of the course is to give students the ability to understand the design principles and operation of new (multi-)processor architectures, and to evaluate them both qualitatively and quantitatively. Although we treat several examples, the emphasis will be on architecture concepts. Furthermore, one or more student teams will design, implement and test a Network-on-Chip.

Topics:

Basic principles (like instruction set design), pipelining and its consequences; VLIW (very long instruction word), superpipelined, superscalar, SIMD (single instruction, multiple data, used in vector and sub-word parallel processors) and MIMD (multiple instruction, multiple data) architectures; SMT (Simultaneous Multi-Threading); out-of-order and speculative execution; branch prediction; data (value) prediction; design of advanced memory hierarchies; memory coherency and consistency; multi-threading; exploiting task-level and instruction-level parallelism; inter-processor communication models; input and output; network communication architecture; and Networks-on-Chip.

Book and Handouts

Computer Architecture: A Quantitative Approach; 4th ed.

John L. Hennessy and David A. Patterson
Morgan Kaufmann Publishers
ISBN  9780123704900

Handouts (for slides see below):

Slides

Preliminary set of treated topics.
** will be added and updated during the course period **

Project

As part of this course you have to perform 2 lab exercises, as described below. For each lab exercise you must submit a report explaining the results obtained. Furthermore, you have to perform a self-study on a topic that is relevant and strongly related to the material discussed in this course. See below for details.

Lab exercise 1: Single General-Purpose Processors

This exercise makes you familiar with the highly configurable SimpleScalar simulation platform. The SimpleScalar tool set is a system software infrastructure used to build modeling applications for program performance analysis, detailed microarchitectural modeling, and hardware-software co-verification. Using the SimpleScalar tools, users can build modeling applications that simulate real programs running on a range of modern processors and systems.  The tool set includes sample simulators ranging from a fast functional simulator to a detailed, dynamically scheduled processor model that supports non-blocking caches, speculative execution, and state-of-the-art branch prediction.
It can emulate e.g. ARM, Alpha and x86 instruction sets.

Carefully follow the instructions, which can be found here. The instructions describe how to run the tools on our server, which you can access remotely. It is also possible to use SimpleScalar directly on your own desktop or laptop (under UNIX/Linux or Windows NT).

Lab exercise 2: Multi-Processors

State-of-the-art CPUs contain dozens of cores on a single die. The trend towards multi-core poses new challenges to both computer architects and programmers. In this assignment, we will try to tackle these challenges from the viewpoint of both computer architects and programmers. The purpose of this assignment is to
  1. Get an in-depth understanding of mainstream multi-core CPU architectures
  2. Learn how to develop parallel applications on such architectures, and how to analyze their performance in a real environment.
In this assignment you are asked to parallelize an application on 2 or more cores, using OpenMP. The performance will be measured and analyzed using VTune.
The following slides may give you a quick start:
Details about the assignment can be found here.

You have to send in a lab report about your results to Yifan He.

Self-study of relevant topic

Guidelines are as follows:

Examination

The examination will likely be oral.
The grade is based on your project results (and being able to explain and defend them) and the discussed theory (study all slides, book chapters and handouts).

Related material and other links


