Advanced Computer Architecture (5MD00)
Course year: 2014-2015 (1st semester, 2nd quarter)
Code: 5MD00 (3 ECTS) / 5Z033 (4 ECTS)
Lecturer: Prof. dr. Henk Corporaal
Email: H.Corporaal at tue.nl
Phone: TU/e: +31-40-247 5195 or 3653 (secr. TU/e) / 5462 (office TU/e)
Assistance:
MSc. Yifan He (y.he at tue.nl)
Luc Waeijen (L.J.W.Waeijen at tue.nl), Mark Wijtvliet (M.Wijtvliet at tue.nl)
Dr. ir. Sander Stuijk (for computer infrastructure): s.stuijk at tue.nl
Prerequisites: Course in computer architecture and processor design,
e.g. Processor Design 5Z032 or Computation 5JJ70;
programming experience in C, C++, or an equivalent language
News
- December 22: added slides about the 2 lab assignments.
- Good references on Superscalar Processors:
- Processor Microarchitecture, by Antonio Gonzalez, Fernando Latorre
and Grigorios Magklis,
Morgan & Claypool, 2011 (Highly recommended reading)
- Superscalar Microprocessor Design, Mike Johnson, book 1991,
or thesis 1989
- December 10: updated slides on Chapter 3 (Topic 4)
- The first lab exercise, on design space exploration of a superscalar
processor and memory hierarchy, is online. Follow the detailed
lab instructions.
- First lecture on Monday November 11, 2014
Information on the course:
Description and objectives
Studying the architecture, organization and use of the newest
general purpose (micro)processors currently on the market, and the
latest research developments in computer architecture. Architectures
exploiting instruction-level parallelism (ILP), data-level
parallelism (DLP), thread-level and task-level parallelism are
treated. Starting from basic architecture concepts we will end with
discussing the latest commercial processors (e.g., Intel Xeon, EPIC
processors like Itanium, AMD), and academic processors (like TRIPS).
This course also treats how processors can be combined in a
multiprocessing platform, e.g. by using a Network-on-Chip.
Interprocessor communication issues will be dealt with. Furthermore,
new code generation techniques needed for exploiting ILP will be
treated. Special emphasis will be on quantifying design decisions in
terms of performance and cost.
The intention of the course is to give students the ability to
understand the design principles and operation of new
(multi-)processor architectures, and evaluate them both
qualitatively and quantitatively. Although we treat several
examples, the emphasis will be on architecture concepts.
Furthermore, two intensive lab exercises are part of the course; they
will teach you the design space of single- and multi-core
superscalar processors.
Topics:
Basic principles (like instruction set design), pipelining and its
consequences; VLIW (very long instruction word) architectures,
Superpipelined, Superscalar, SIMD (single instruction, multiple
data, used in vector and sub-word parallel processors) and MIMD
(multiple instruction, multiple data) architectures; SMT
(Simultaneous Multi-Threading); Out-of-order and speculative
execution; Branch prediction; Data (value) prediction; Design of
advanced memory hierarchies; Memory coherency and consistency;
Multi-threading; Exploiting task-level and instruction-level
parallelism; Inter-processor communication models; Input and output;
Network Communication Architecture; and Networks-on-Chip.
Computer Architecture: A Quantitative Approach
5th edition
John L. Hennessy and David A. Patterson
Morgan Kaufmann Publishers
ISBN 9780123704900
This course covers the first 5 chapters of the above book on Computer
Architecture. In addition, an introduction to RISC processors is
given, in particular to the MIPS processor.
Handouts (for slides see below):
Slides
** Not all slides are up-to-date yet; they will
be updated during the course **
Project
As part of this course you have to perform 2 lab exercises: one on
the design of a single superscalar processor (using the SimpleScalar
tool), and one on multi-processor design. For each lab exercise a
report has to be submitted, explaining the results obtained.
Furthermore, you have to perform a self-study on a topic that is relevant
and strongly related to the material discussed in this course.
** Labs will be updated during the course **
Lab exercise 1: Single General-Purpose Processors
See the Lab-1 Introduction slides.
In this Lab you will learn (almost) everything about Superscalar
processors, i.e. Out-of-Order, Multi-Issue processors, including a
complete memory hierarchy, containing several levels of caching.
For this you will use the highly configurable SimpleScalar simulation
platform. The SimpleScalar tool set is a system software
infrastructure used to perform program performance analysis,
detailed microarchitectural modeling, and hardware-software
co-verification. Using the SimpleScalar tools, you can simulate real
programs running on a range of modern processors and systems.
The tool set includes sample simulators ranging from a fast
functional simulator to a detailed, dynamically scheduled processor
model that supports non-blocking caches, speculative execution, and
state-of-the-art branch prediction.
It can emulate, among others, the ARM, Alpha and x86 instruction sets.
Carefully follow the instructions, which can be found here.
The instructions describe how to run the tools on our server, which
you can access remotely. It is also possible to use SimpleScalar
directly on your own desktop or laptop (under UNIX/Linux or Windows
NT).
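As a rough illustration of what exploring a memory-hierarchy design space involves, the sketch below enumerates L1 data-cache configurations that fit a capacity budget. It is only a sketch under assumptions: the function names and the 16 KiB budget are hypothetical, and the configuration string format `<name>:<nsets>:<bsize>:<assoc>:<repl>` is assumed to match what the SimpleScalar simulators accept; check the lab instructions for the exact options to use.

```python
from itertools import product


def cache_capacity_bytes(num_sets, block_size, associativity):
    """Capacity of a set-associative cache: sets x block size x ways."""
    return num_sets * block_size * associativity


def enumerate_l1_configs(set_counts, block_sizes, associativities, max_bytes):
    """Yield (config_string, capacity) for every combination within budget."""
    for nsets, bsize, assoc in product(set_counts, block_sizes, associativities):
        capacity = cache_capacity_bytes(nsets, bsize, assoc)
        if capacity <= max_bytes:
            # Assumed format: <name>:<nsets>:<bsize>:<assoc>:<repl>, 'l' = LRU
            yield f"dl1:{nsets}:{bsize}:{assoc}:l", capacity


if __name__ == "__main__":
    # Hypothetical sweep: every printed line is one candidate to simulate.
    budget = 16 * 1024
    for cfg, cap in enumerate_l1_configs([64, 128, 256], [32, 64], [1, 2, 4], budget):
        print(f"{cfg:18s} {cap // 1024:3d} KiB")
```

Each surviving configuration would then be passed to a simulation run, and the reported performance compared against the cost of the cache.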
Lab exercise 2: Multi-Processors
State-of-the-art CPUs contain dozens of cores on a single die.
The trend of going multicore poses new challenges to both computer
architects and programmers. In this assignment, we will try to
tackle these challenges from the viewpoint of both computer
architects and programmers. The purpose of this assignment is to
- Get an in-depth understanding of
mainstream multi-core CPU architectures.
- Learn how to develop parallel
applications on such architectures, and how to analyze their
performance in a real environment.
In this lab assignment, you will be asked to map a C
program onto a multiprocessor system. With the help of the Snipersim
simulator, we will look at different configurations, e.g., the
number of processors, and the block size and associativity of the
different cache levels. The goal is to optimize the
Energy-Delay-Area-Product (EDAP) of the system. You can achieve this
goal by improving the original C code, using OpenMP, and/or using
any other creative methods.
Details about the assignment can be found here.
You have to send in a lab report about your results to Yifan He and
Dongrui She.
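To make the EDAP objective concrete, the following minimal sketch ranks a few design points by the product of energy, delay and area. It assumes EDAP is the plain product its name suggests; the lab may prescribe specific units or measurement rules. The class, the function and all numbers are hypothetical and are not results from the simulator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignPoint:
    """One simulated configuration (all values here are made up)."""
    name: str
    energy_j: float   # total energy in joules
    delay_s: float    # execution time in seconds
    area_mm2: float   # chip area in square millimetres

    @property
    def edap(self):
        # Assumed definition: energy x delay x area (lower is better).
        return self.energy_j * self.delay_s * self.area_mm2


def best_design(points):
    """Return the design point with the smallest EDAP."""
    return min(points, key=lambda p: p.edap)


if __name__ == "__main__":
    candidates = [
        DesignPoint("1 core", 2.0, 4.0, 10.0),
        DesignPoint("2 cores", 2.0, 2.0, 18.0),
        DesignPoint("4 cores", 3.0, 1.5, 34.0),
    ]
    for p in candidates:
        print(f"{p.name:8s} EDAP = {p.edap:.1f}")
    print("best:", best_design(candidates).name)
```

In this made-up example, adding cores shortens the delay but increases area and energy, so the fastest configuration is not the one with the lowest EDAP.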
Self-study of relevant topic
Guidelines are as follows:
- Choose a hot topic
which interests you and which is highly related to this course.
- Select a technical research paper from the web, based on this
topic; each student has to read and review 1 paper.
- The paper should have sufficient technical depth, i.e. it should
clearly explain all the details of the proposed method or solution.
You can also check whether the paper is from a well-regarded journal
or conference, such as those of the IEEE or ACM (see e.g. IEEE.org and ACM.org).
For a list of important journals and conferences look here. The paper should
be published in 2012 or later.
- The paper has to be presented with at most 8 slides during the
oral exam.
- Indicate on the last slide both the strong and the weak points
of the paper.
Examination
The examination will likely be oral.
The grade is based on your project results (and being able to
explain and defend them) and the discussed theory (study all slides,
book chapters and handouts).
Related material and other links
- Processor Microarchitecture, by Antonio Gonzalez, Fernando Latorre
and Grigorios Magklis,
Morgan & Claypool, 2011 (Highly recommended reading)
- Superscalar Microprocessor Design, Mike Johnson, book 1991, or
thesis 1989
- Reading material
- The 8-core 45 nm Intel Xeon processor (Nehalem-EX), by Stefan Rusu et al.:
check the IEEE Asian Solid-State Circuits Conference,
Nov 2009, Taipei
- PreMaDoNA project website.
Describes our MPSoC and Network-on-Chip activities. See also our
new NEST project.
- Slides about IA64 and Itanium:
- IA-64 Architecture Innovations by John Crawford and Jerry
Huck (ppt)
(IA-64 at the Intel Developers' Forum February '99)
- IA-64 Overview by David Fotland
(ppt)
(IA-64 at the IEEE Vail Computer Elements Workshop in June
'99)
- IA-64 Register Model: Stack and Rotation by Dale Morris (ppt)
(IA-64 at the IEEE Vail Computer Elements Workshop in June
'99)
- Compiling for IA-64 by Carol Thompson (ppt)
(IA-64 at the IEEE Vail Computer Elements Workshop in June
'99)
- Understanding the detailed Architecture of AMD's
64 bit Core by Hans de Vries