Code: 5LIL0
Credits: 5 ECTS
Lecturer: Prof. dr. Henk Corporaal + several guest lecturers (see the actual schedule)
Tel.: +31-40-247 5195 (secr.), 5462 (office)
Email: H.Corporaal at tue.nl
Project assistance: Berk Ulker (b.ulker at tue.nl), Kanishkan Vadivel (k.vadivel at tue.nl), Martin Roa Villescas (m.roa.villescas at tue.nl), Wei Sun (w.sun at tue.nl), Ali Banagozar (a.banagozar at tue.nl), and Floran de Putter (f.a.m.d.putter at tue.nl)
Material: check Canvas and below. Previous year (2021): check here
News
The oral exam will be on Wednesday, April 20 / Thursday, April 21, 2022.
Machine learning, and in particular deep learning, has dramatically improved the state of the art in object detection, speech recognition, robotics, and many other domains. Whether it is superhuman performance in object recognition or beating human players in Go, the astonishing success of deep learning is achieved by deep neural networks trained with huge amounts of training examples and massive computing resources. Although deep learning has already been applied successfully in academic use cases and several consumer products (e.g. machine translation), these data and computing requirements pose challenges for further market penetration.
This course on Intelligent Architectures first treats the most important deep learning networks. In particular, we treat how they operate, how they are implemented, and how they perform learning. We will use standard frameworks, like TensorFlow or PyTorch, for building these networks.
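As a first impression of what building such a network looks like, here is a minimal PyTorch sketch; the layer sizes and class name are illustrative only and not part of the course material:

# Minimal PyTorch sketch of a small convolutional network.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Two convolution + pooling stages extract features from a 1-channel input.
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # A fully connected layer maps the flattened features to class scores.
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = SmallCNN()
scores = model(torch.randn(4, 1, 32, 32))   # a batch of four 32x32 inputs
print(scores.shape)                          # torch.Size([4, 10])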
These networks require lots of computation and memory accesses, making them costly and very energy consuming. Therefore this Intelligent Architectures course gives an in-depth treatment of several network and implementation optimization steps, like network pruning, quantization, and loop-nest transformations, for a drastic reduction of the computation and memory-traffic requirements.
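To give an idea of one such optimization, the sketch below applies magnitude-based weight pruning using PyTorch's built-in pruning utilities; the 50% sparsity target is just an example value:

# Minimal sketch of magnitude-based (L1) weight pruning in PyTorch.
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Conv2d(16, 32, kernel_size=3)

# Zero out the 50% of weights with the smallest absolute value.
prune.l1_unstructured(layer, name="weight", amount=0.5)

# Fold the pruning mask into the weight tensor, making the pruning permanent.
prune.remove(layer, "weight")

sparsity = (layer.weight == 0).float().mean().item()
print(f"fraction of zero weights: {sparsity:.2f}")   # about 0.50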
We also treat various processing and accelerator platforms tuned for deep learning algorithms, including (embedded) GPUs, the Tensor Processing Unit, and TTAs (Transport Triggered Architectures) tuned for deep ANNs (Artificial Neural Networks). Tuning the architecture and/or adding specific hardware can lead to huge cost savings.
Finally, we will look into the future and hint at what other high-potential machine learning approaches can offer, like Bayesian learning and neuromorphic computing.
The course includes 3 lab assignments, covering the above topics. The labs give you real hands-on experience in designing and implementing ANNs.
You will learn:
- to understand deep learning, including network architectures, inference, and learning methods
- how to design Deep Artificial Neural Networks (ANNs)
- how to implement and optimize ANNs using various optimization methods
- state-of-the-art ANNs, including the newest types of operators
- special processing architectures and hardware efficiently supporting Deep Learning
- alternative approaches to the "classical" ANNs, like Bayesian learning and Neuromorphic (SNN-based) computing
Topics:
The main emphasis is on Deep Learning, in particular on ANNs (Artificial Neural Networks), their algorithms, and their efficient implementation, using custom and off-the-shelf processors and accelerators. Note that we often talk about DNNs (Deep Neural Networks); by this we usually mean deep ANNs, not deep SNNs (Spiking Neural Networks).
In this course we treat, among others, the following topics:
CNN: Convolutional Neural Networks
Learning principles
Frameworks for designing DNNs (Deep Neural Networks)
Optimizations
Compact DNNs
Quantization of activations and weights in DNNs
Advanced mapping of DNNs exploiting data reuse for
activations and weights
General architecture support for DNNs
DNN accelerators
Beyond the classic neural networks
Neuromorphic computing
Bayesian computing
Most of the topics will be supplemented by very elaborate hands-on
exercises.
For a preliminary lecture overview see: schedule.
As part of this lecture you have a bonus option: study a hot topic related to this course and make a short slide presentation about it. Details will be announced during the lecture.
Guidelines are as follows:
- Choose one hot topic which interests you and which is highly related to this course!
- Select one technical (in-depth) research paper from the web, based on this topic. See the lists below.
- The paper should be published in 2019 or later.
- The paper should have sufficient technical depth; i.e. it should clearly explain all the details of the proposed method or solution. So do not choose, for example, company white papers or business papers. You can also check whether the paper is from well-regarded journals or conferences, like IEEE or ACM conferences and journals (see e.g. IEEE.org and ACM.org).
- Check the top conferences and top journals on Machine Learning and Artificial Intelligence.
You may also have a look at the following two lists:
Top architecture conferences, containing lots of Deep Learning Architecture and Implementation papers:
ISCA: International Symposium on Computer Architecture: iscaconf.org
IEEE MICRO: International Symposium on Microarchitecture: www.microarch.org
ASPLOS: Architectural Support for Programming Languages and Operating Systems: asplos-conference.org
ICS: International Conference on Supercomputing: www.ics-conference.org
ISSCC: International Solid-State Circuits Conference: isscc.org
DAC: Design Automation Conference: www.dac.com
DATE: Design, Automation and Test in Europe: www.date-conference.com
CODES+ISSS: International Conference on Hardware/Software Codesign and System Synthesis: www.codes-isss.org
CASES: Compilers, Architectures, and Synthesis for Embedded Systems: www.casesconference.org
Top conferences on Machine Learning & Artificial Intelligence, containing also Deep Learning Architecture and Implementation papers:
NeurIPS: Neural Information Processing Systems (NIPS)
ICML: International Conference on Machine Learning
CVPR: IEEE/CVF Conference on Computer Vision and Pattern Recognition
ICCV: IEEE/CVF International Conference on Computer Vision
ECCV: European Conference on Computer Vision
AAAI: AAAI Conference on Artificial Intelligence
A larger list of computer-architecture-related conferences and journals can be found here.
PRESENTATION:
You should make a PowerPoint presentation on your topic; max 5 minutes per presentation (e.g. 5 slides: one slide introducing the problem, then the approach and results of the paper, and finally your conclusions and suggestions on this topic; add clear pictures to explain the approach).
The presentation should contain at least the following:
- Summary of the paper contributions (including technical details)
- Your evaluation of the paper and topic:
  - strong points
  - weak points
  - applicability of the proposed methodology / solution
  - indication of new / future directions of research
In order to evaluate the paper you may wish to read related material on the same topic.
Your presentation will be evaluated by us. This evaluation
will be taken into account for the final grading.
Hands-on lab work (will be updated)
Becoming an expert in Deep Learning and Deep Neural Networks requires that you get your hands dirty and do practical assignments. Therefore, as part of this course we have put a lot of effort into preparing 3 very interesting lab assignments. Details will be presented during the course, and material will be placed on the oncourse 5LIL0 site.
** labs will be put online during the course **
Hands-on 1: DNN design
You will design a deep Convolutional Neural Network (CNN) using one of the well-known frameworks: PyTorch. The network has to recognize spoken words.
After learning, the network will be tuned: pruning, quantization and other optimizations.
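One of the post-training optimizations you will encounter is quantization. The sketch below shows post-training dynamic quantization in PyTorch; the tiny stand-in model only illustrates the API and is not the lab network:

# Minimal sketch of post-training dynamic quantization in PyTorch.
import torch
import torch.nn as nn

# Stand-in for a trained network; in the lab this would be your own CNN.
model = nn.Sequential(nn.Flatten(), nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 10))

# Replace the linear layers by versions that store their weights as 8-bit integers.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
print(quantized)   # the Linear layers now appear as DynamicQuantizedLinear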
Hands-on 2: DNN implementation on GPUs
Graphics processing units (GPUs) can contain up to thousands of Processing Engines (PEs). They achieve performance levels of TeraFLOPS (10^12 floating-point operations per second). In the past GPUs were very dedicated, not generally programmable, and could only be used to speed up graphics processing. Today they have become more and more general purpose. Lately they also support Deep Neural Networks (DNNs) by supporting smaller data sizes (e.g. 16-bit floats) and having special units that speed up learning and inference.
In this lab you are asked to map a DNN efficiently on a GPU, using all the tricks you can play.
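One of those tricks is exploiting the GPU's reduced-precision support. The sketch below runs inference in 16-bit floating point via PyTorch's automatic mixed precision; the model and input batch are stand-ins, not the lab's DNN:

# Minimal sketch of FP16 inference on a GPU with PyTorch autocast.
import torch
import torch.nn as nn

assert torch.cuda.is_available(), "this sketch assumes a CUDA-capable GPU"
device = torch.device("cuda")

# Stand-in model and input batch.
model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 10)).to(device)
x = torch.randn(64, 1024, device=device)

# Autocast executes the matrix multiplies in FP16, using tensor cores where available.
with torch.no_grad(), torch.cuda.amp.autocast(dtype=torch.float16):
    y = model(x)
print(y.dtype)   # torch.float16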
Hands-on 3: DNN implementation on an Embedded ASIP
In this lab we will map a Deep Neural Network (DNN) to an Application-Specific Instruction-set Processor (ASIP). We will use the AivoTTA from Tampere University as the target platform. You can tune the platform by adding specific function units.
See the lab3 assignment. Further files and details are on oncourse.tue.nl.
Examination
The examination will be oral, covering the treated course theory, the lab report(s), and the studied articles. We will discuss the dates with you.
Grading depends on your results on the theory, the lab exercises and their defense, and your presentation.