Code: 5LIL0
Credits: 5 ECTS
Lecturer: Prof. dr. Henk Corporaal
Tel.: +31-40-247 5195 (secr.), 5462 (office)
Email: H.Corporaal at tue.nl
Project assistance: Berk Ulker (b.ulker at tue.nl), Savvas Sioutas (s.sioutas at tue.nl), Kanishkan Vadivel (k.vadivel at tue.nl), and Martin Roa Villescas (m.roa.villescas at tue.nl)
Material: check oncourse.tue.nl/2019 and below.
News
Oct 1: make a choice (on oncourse.tue.nl) between lab2 and lab3
Machine learning, and in particular deep learning, has dramatically improved the state of the art in object detection, speech recognition, robotics, and many other domains. Whether it is superhuman performance in object recognition or beating human players in Go, the astonishing success of deep learning is achieved by deep neural networks trained on huge numbers of training examples with massive computing resources. Although already applied successfully in academic use-cases and several consumer products (e.g. machine translation), these data and computing requirements pose challenges for further market penetration.
This course on Intelligent Architectures first treats the most important deep learning networks. In particular, we treat how they operate, how they are implemented, and how they learn. We will use standard frameworks, such as TensorFlow or PyTorch, for building these networks.
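To give a flavor of what these frameworks automate, here is a minimal sketch of a single fully connected layer with a ReLU activation in plain Python. This is purely illustrative (the weights are made-up numbers); frameworks additionally provide automatic differentiation, GPU execution, and much more.

```python
# Minimal sketch of one fully connected layer with a ReLU activation.
# Frameworks like TensorFlow and PyTorch automate exactly this kind of
# computation (plus gradients for learning). Weights are illustrative.

def linear(x, weights, bias):
    """y = W x + b for a single input vector x."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(y):
    """Element-wise rectified linear unit: max(0, y)."""
    return [max(0.0, v) for v in y]

# Tiny example: 2 inputs -> 2 outputs
W = [[1.0, -1.0],
     [0.5,  0.5]]
b = [0.0, -1.0]
x = [1.0, 2.0]

out = relu(linear(x, W, b))  # ReLU clips the negative pre-activation to 0
```

In a real framework this whole computation is one layer object (a dense/linear layer followed by an activation), and deep networks stack many such layers.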
These networks require lots of computation and memory accesses, making them costly and very energy-consuming. Therefore, this Intelligent Architectures course gives an in-depth treatment of several network and implementation optimization steps, such as network pruning, quantization, and loop-nest transformations, to drastically reduce the computation and memory-traffic requirements.
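As a first taste of such optimization steps, the sketch below shows simple symmetric 8-bit weight quantization in plain Python. It is a simplification of the schemes treated in the course; the per-tensor scale-factor choice is an illustrative assumption.

```python
# Sketch of symmetric linear quantization of weights to signed 8-bit
# integers, one of the optimization steps treated in the course.
# The per-tensor scale factor below is an illustrative choice.

def quantize_int8(weights):
    """Map floats to int8 levels using a symmetric per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 levels."""
    return [v * scale for v in q]

weights = [0.42, -1.27, 0.05, 0.9]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# The int8 representation needs 4x less memory than float32, at the
# cost of a small reconstruction error per weight.
```

Real quantization schemes additionally handle activations, per-channel scales, and zero points, but the memory-traffic argument is the same.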
We also treat various processing and accelerator platforms tuned for deep learning algorithms, including (embedded) GPUs, the Tensor Processing Unit, and TTAs (Transport Triggered Architectures tuned for DNNs). Specialized hardware can lead to huge cost savings.
Finally, we will look into the future and hint at what other high-potential machine learning approaches, such as Bayesian learning and neuromorphic computing, can offer.
The course includes 2-3 lab assignments covering the above topics. The labs give you real hands-on experience in designing and implementing DNNs.
You will learn:
- to understand deep learning, including network architectures, inference, and learning methods.
- how to design Deep Neural Networks (DNNs).
- how to implement and optimize DNNs using various optimization methods.
- about state-of-the-art DNNs, including the newest types of operators.
- about special processing architectures and hardware that efficiently support deep learning.
- about alternative approaches to the ''classical'' DNNs, such as Bayesian learning and neuromorphic computing.
Topics:
The main emphasis is on deep learning, in particular on DNNs (Deep Neural Networks), their algorithms, and their efficient implementation using custom and off-the-shelf processors and accelerators.
In this course we treat, among others, the following topics:
CNN: Convolutional Neural Networks
Learning principles
Frameworks for designing DNNs (Deep Neural Networks)
Optimizations
Compact DNNs
Quantization of activations and weights in DNNs
Advanced mapping of DNNs exploiting data reuse for
activations and weights
General architecture support for DNNs
DNN accelerators
Beyond the classic neural networks
Neuromorphic computing
Bayesian computing
Most of the topics will be supplemented by very elaborate hands-on
exercises.
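To give an idea of the "advanced mapping" topic above, the sketch below shows loop tiling (blocking) of a matrix multiplication in plain Python, the kind of loop-nest transformation used to exploit data reuse for activations and weights. The tile size is an illustrative assumption; on real hardware it is matched to the on-chip buffer size.

```python
# Sketch of loop tiling (blocking) for matrix multiplication, a
# loop-nest transformation that exploits data reuse when mapping DNN
# layers to accelerators. TILE = 2 is an illustrative choice.

TILE = 2

def matmul_tiled(A, B, n):
    """C = A * B for n x n matrices, computed tile by tile."""
    C = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, TILE):          # loop over output tiles
        for jj in range(0, n, TILE):
            for kk in range(0, n, TILE):  # reuse the loaded sub-blocks
                for i in range(ii, min(ii + TILE, n)):
                    for j in range(jj, min(jj + TILE, n)):
                        for k in range(kk, min(kk + TILE, n)):
                            C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C = matmul_tiled(A, B, 2)
```

Each (ii, jj, kk) iteration touches only TILE x TILE sub-blocks of A and B, so those values can stay in a small local buffer and be reused instead of being re-fetched from main memory; the same idea underlies the dataflow mappings (e.g. row stationary) treated in the accelerator lectures.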
For a preliminary lecture overview see: schedule.
Topic 10: General architecture support for DNNs
Kanishkan Vadivel (TU/e)
Topic 11: Accelerators for DNNs, including a good overview of mapping DNNs to accelerators, exploiting data reuse.
Kanishkan Vadivel (TU/e), Pekka Jaaskelainen (Tampere University)
Note: for the row stationary mapping you should consult the
indicated slides from the Eyeriss - ISCA 2019 tutorial.
As part of this lecture you have to study a hot topic related to
this course, and make a short slide presentation about this topic. Details
will be announced during the lecture.
Guidelines are as follows:
Choose one hot topic which interests you and which is highly related to this course!
Select one technical (in-depth) research paper from the web, based on this topic.
The paper should be published in 2017 or later.
The paper should have sufficient technical depth; i.e. it should clearly explain all the details of the proposed method or solution. So, for example, do not choose company white papers or business papers. You can also check whether the paper is from well-regarded journals or conferences, such as IEEE or ACM conferences and journals (see e.g. IEEE.org and ACM.org).
PACT: Parallel Architectures and Compilation Techniques: www.eecg.toronto.edu/pact
A much larger list on computer architecture related
conferences and journals can be found here.
PRESENTATION:
You should make a PowerPoint presentation on your topic; max. 5 min. per presentation (e.g. 5 slides: one slide introducing the problem, then the approach and results of the paper, and a final conclusion and suggestions from your side on this topic; add clear pictures to explain the approach).
The presentation should contain at least the following:
- Summary of the paper's contributions (including technical details)
- Your evaluation of the paper and topic:
  - strong points
  - weak points
  - applicability of the proposed methodology / solution
  - new / future directions of research
In order to evaluate the paper you may wish to read related material on the same
topic.
Your presentation will be evaluated by us. This evaluation
will be taken into account for the final grading.
Hands-on lab work
Becoming an expert in deep learning and Deep Neural Networks requires that you get your hands dirty and do practical assignments. Therefore, as part of this course, we have put a lot of effort into preparing 3 very interesting lab assignments. Details will be presented during the course, and material will be placed on the oncourse 5LIL0 site.
** labs will be put online during the course **
Hands-on 1: DNN design
You will design a Deep Neural Network (DNN) using one of the well-known frameworks. After training, the network will be tuned.
Hands-on 2: DNN implementation on GPUs
Graphics processing units (GPUs) can contain up to thousands of Processing Engines (PEs). They achieve performance levels of teraFLOPS (10^12 floating-point operations per second). In the past, GPUs were very dedicated, not generally programmable, and could only be used to speed up graphics processing. Today, they have become more and more general purpose. Lately, they also support Deep Neural Networks (DNNs) by supporting smaller data sizes (e.g. 16-bit floats) and having special units that speed up learning and inference.
In this lab you are asked to map a DNN efficiently onto a GPU, using all the tricks you can play.
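The smaller 16-bit float data size mentioned above can be illustrated with Python's standard struct module, which can encode IEEE 754 half-precision (binary16) values. This is only a precision demonstration, not GPU code.

```python
# Illustration of the reduced-precision 16-bit float data type that
# modern GPUs support for DNN workloads, using only the standard
# library. Packing with format 'e' rounds the value to IEEE 754
# binary16 (half precision).
import struct

def to_float16(x):
    """Round-trip a Python float through a 16-bit half-precision encoding."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

w = 0.1
h = to_float16(w)
# Half precision keeps only ~3 decimal digits, which is often enough
# for DNN inference while halving memory traffic compared to float32.
```

The small rounding error per weight is usually tolerated by DNNs, which is why GPUs provide fast 16-bit (and smaller) arithmetic units for them.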
Hands-on 3: DNN implementation on an Embedded ASIP
In this lab we will map a Deep Neural Network (DNN) to an Application Specific Instruction-set Processor (ASIP). We will use the AivoTTA from Tampere University as the target platform. You can tune the platform by adding specific function units.
See the lab3 assignment.
Further files and details are on oncourse.tue.nl
Examination
The examination will be oral, covering the treated course theory, the lab report(s), and the studied articles.
Likely week: the 4th week of January. We will discuss the dates with you.
Grading depends on your results on the theory, the lab exercises and their defense, and your presentation.