Advanced Computer Architecture (5MD00)
Course year: 2012-2013 (1st semester, 2nd quarter)
Code: 5MD00 (3 ECTS) / 5Z033 (4 ECTS)
Lecturers: Prof. dr. Henk Corporaal; Prof. dr. ir. R.H.J.M. Otten
Email: H.Corporaal at tue.nl
Phone: TU/e: +31-40-247 5195 or 3653 (secr. TU/e) / 5462 (office TU/e)
Assistance:
MSc. Yifan He: y.he at tue.nl
MSc. Dongrui She: d.she at tue.nl
Dr. ir. Sander Stuijk (for computer infrastructure): s.stuijk at tue.nl
Prerequisites: Course in computer architecture and processor design, e.g. Processor Design 5Z032 or Computation 5JJ70; programming experience in C, C++, or an equivalent language
News
- Jan 14: all slides are now online
- First lab assignment is now online
- First lecture on Monday November 12, 2012
Information on the course:
Description and objectives
Studying the architecture, organization and use of the newest
general purpose (micro)processors currently on the market, and the
latest research developments in computer architecture. Architectures
exploiting instruction-level parallelism (ILP), thread-level and
task-level parallelism are treated. Starting from basic architecture
concepts, we end by discussing the latest commercial
processors (e.g., Pentium 4 multi-core, EPIC processors like
Itanium, and embedded processors such as the TriMedia) and academic
processors (like TRIPS).
This course also treats how processors can be combined in a
multiprocessing platform, e.g. by using a Network-on-Chip.
Interprocessor communication issues are covered, as are the new
code generation techniques needed for exploiting ILP.
Special emphasis is on quantifying design decisions in
terms of performance and cost.
The intention of the course is to give students the ability to
understand the design principles and operation of new
(multi-)processor architectures, and evaluate them both
qualitatively and quantitatively. Although we treat several
examples, the emphasis will be on architecture concepts.
A further aim is for one or more student teams to design, implement
and test a Network-on-Chip.
Topics:
Basic principles (like instruction set design), pipelining and its
consequences; VLIW (very long instruction word) architectures,
Superpipelined, Superscalar, SIMD (single instruction, multiple
data, used in vector and sub-word parallel processors) and MIMD
(multiple instruction, multiple data) architectures; SMT
(Simultaneous Multi-Threading); Out-of-order and speculative
execution; Branch prediction; Data (value) prediction; Design of
advanced memory hierarchies; Memory coherency and consistency;
Multi-threading; Exploiting task-level and instruction-level
parallelism; Inter-processor communication models; Input and output;
Network Communication Architecture; and Networks-on-Chip.
Computer Architecture: A Quantitative Approach; 5th ed.
John L. Hennessy and David A. Patterson
Morgan Kaufmann Publishers
ISBN 9780123704900
Handouts (for slides see below):
Slides
Preliminary set of treated topics.
** will be added and updated during the course period **
- Overview slides (including preliminary schedule)
- Topic 1: Computer Systems Overview
  reading: ch1 of H&P and ch1 of Multi-Core Programming by Akhter and Jason Roberts
- Topic 2: Crash course on MIPS
- Topic 3: Instruction-Level Parallel (ILP) Architectures
  - Study chapter 3 of H&P
  - Dynamic Scheduling, Speculation, Branch Prediction
  - How much parallelism is there in applications?
- Topic 4: Software techniques to improve ILP
- Topic 5: Data-Level Parallel Architectures
  GPU part: guest lecture by Juan Gomez Luna (Cordoba Univ, Spain)
- Topic 6: SMT: Simultaneous Multi-Threading
- Topic 7: Multi-processing
Project
As part of this course you have to perform 2 lab exercises: one on
the design of a single superscalar processor (using the SimpleScalar
tool set), and one on multi-processor design. For each lab exercise
you have to submit a report explaining the results obtained.
Furthermore, you have to perform a self-study on a topic that is
relevant and strongly related to the material discussed in this course.
Details will follow.
Lab exercise 1: Single General-Purpose Processors
See Lab-1 Introduction slides.
In this lab you will learn (almost) everything about superscalar
processors, i.e. out-of-order, multi-issue processors, including a
complete memory hierarchy with several levels of caching.
For this you will use the highly configurable SimpleScalar simulation
platform. The SimpleScalar tool set is a system software
infrastructure used to perform program performance analysis,
detailed microarchitectural modeling, and hardware-software
co-verification. Using the SimpleScalar tools, you can simulate real
programs running on a range of modern processors and systems.
The tool set includes sample simulators ranging from a fast
functional simulator to a detailed, dynamically scheduled processor
model that supports non-blocking caches, speculative execution, and
state-of-the-art branch prediction.
It can emulate e.g. ARM, Alpha and x86 instruction sets.
Carefully follow the instructions, which can be found here. The
instructions describe how to run the tools on our server, which you
can access remotely. It is also possible to use SimpleScalar
directly on your own desktop or laptop (under UNIX/Linux or Windows
NT).
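To give a feel for the branch prediction mentioned above, here is a minimal sketch of a classic 2-bit saturating-counter predictor. It is an illustration only and not SimpleScalar's implementation; the class name `TwoBitPredictor`, the table size, the branch address `0x40` and the loop-branch trace are all assumptions made for this example.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters, indexed by the low bits of the branch address.

    Counter values 0-1 predict not-taken, 2-3 predict taken.
    (Hypothetical illustration; not taken from SimpleScalar.)
    """

    def __init__(self, table_bits=4):
        self.mask = (1 << table_bits) - 1
        self.table = [1] * (1 << table_bits)  # start in "weakly not-taken"

    def predict(self, pc):
        return self.table[pc & self.mask] >= 2

    def update(self, pc, taken):
        i = pc & self.mask
        if taken:
            self.table[i] = min(3, self.table[i] + 1)  # saturate at 3
        else:
            self.table[i] = max(0, self.table[i] - 1)  # saturate at 0


def accuracy(predictor, trace):
    """Fraction of correctly predicted branches in a (pc, taken) trace."""
    correct = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)  # train after each resolved branch
    return correct / len(trace)


# Assumed trace: one loop branch, taken 9 times and then not taken, repeated 10 times.
loop_trace = [(0x40, i % 10 != 9) for i in range(100)]

if __name__ == "__main__":
    print(f"accuracy: {accuracy(TwoBitPredictor(), loop_trace):.2f}")
```

On this trace the predictor mispredicts the loop exit once per pass plus the very first branch, which is why 2-bit counters handle loop branches better than a 1-bit scheme that also mispredicts on re-entering the loop.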
Lab exercise 2: Multi-Processors
State-of-the-art CPUs contain dozens of cores on a single die.
The trend of going multicore poses new challenges to both computer
architects and programmers. In this assignment, we will try to
tackle these challenges from the viewpoint of both computer
architects and programmers. The purpose of this assignment is to
- Get an in-depth understanding of mainstream multi-core CPU architectures.
- Learn how to develop parallel applications on such architectures, and how to analyze their performance in a real environment.
With the help of the gem5 full-system simulator and the McPAT modeling
framework, we will look at different configurations, e.g., the
number of processors, and the block size and associativity of the
different cache levels. The goal is to optimize the
Energy-Delay-Area-Product (EDAP) of the system. You can achieve this
goal by improving the original C code, using OpenMP, and/or using
any other creative methods.
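As a rough sketch of how configurations can be compared on EDAP, the snippet below multiplies energy, delay and area for a few design points and picks the lowest product. The `DesignPoint` type, the configuration names and all numbers are made up for illustration; in the assignment these values would come from your own simulation and modeling results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignPoint:
    """One simulated configuration (hypothetical values, not real measurements)."""
    name: str
    energy_j: float   # total energy in joules
    delay_s: float    # execution time in seconds
    area_mm2: float   # die area in square millimetres


def edap(p):
    """Energy-Delay-Area-Product: lower is better."""
    return p.energy_j * p.delay_s * p.area_mm2


def best_design(points):
    """Return the design point with the smallest EDAP."""
    return min(points, key=edap)


# Assumed example numbers, chosen only to show the trade-off.
candidates = [
    DesignPoint("1 core, 32 kB L1", energy_j=2.0, delay_s=4.0, area_mm2=10.0),
    DesignPoint("2 cores, 32 kB L1", energy_j=2.5, delay_s=2.0, area_mm2=18.0),
    DesignPoint("4 cores, 16 kB L1", energy_j=3.0, delay_s=1.0, area_mm2=25.0),
]

if __name__ == "__main__":
    for p in candidates:
        print(f"{p.name}: EDAP = {edap(p)}")
    print("best:", best_design(candidates).name)
```

Because the three factors are multiplied, a configuration that halves the delay only wins if its energy and area together grow by less than a factor of two.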
Details about the assignment can be found here.
You have to send a lab report about your results to Yifan He and
Dongrui She.
Self-study of relevant topic
Guidelines are as follows:
- Choose a hot topic which interests you and which is closely related to this course.
- Select a technical research paper from the web, based on this topic; each student has to read and review 1 paper.
- The paper should have sufficient technical depth, i.e. it should clearly explain all the details of the proposed method or solution. You can also check whether the paper is from a well-regarded journal or conference, such as those of IEEE or ACM (see e.g. IEEE.org and ACM.org). For a list of important journals and conferences look here. The paper should be published in 2010 or later.
- The paper has to be presented with at most 8 slides during the oral exam.
- Indicate on the last slide both the strong and the weak points of the paper.
Examination
The examination will likely be oral.
The grade is based on your project results (and being able to
explain and defend them) and the discussed theory (study all slides,
book chapters and handouts).
Related material and other links
- Reading material
- The 8-core 45 nm Intel Xeon processor (Nehalem-EX), by Stefan Rusu et al.: check IEEE Asian Solid-State Circuits Conference, Nov 2009, Taipei
- PreMaDoNA project website. Describes our MPSoC and Network-on-Chip activities. See also our new NEST project.
- Slides about IA64 and Itanium:
  - IA-64 Architecture Innovations by John Crawford and Jerry Huck (ppt)
    (IA-64 at the Intel Developers' Forum February '99)
  - IA-64 Overview by David Fotland (ppt)
    (IA-64 at the IEEE Vail Computer Elements Workshop in June '99)
  - IA-64 Register Model: Stack and Rotation by Dale Morris (ppt)
    (IA-64 at the IEEE Vail Computer Elements Workshop in June '99)
  - Compiling for IA-64 by Carol Thompson (ppt)
    (IA-64 at the IEEE Vail Computer Elements Workshop in June '99)
- Understanding the detailed Architecture of AMD's 64 bit Core by Hans de Vries