This course (HBCS102) is divided into two modules, Level "A" and Level "B", with Level "B" being the more advanced.
Level "A" is an introductory course on parallel programming, with about 20% of the time devoted to CUDA programming. This level does not require any prior knowledge of parallel computing; only a data structures course is required. The course starts with the C programming language and covers graphics card hardware in detail (GPU architecture, DRAM, PCIe, etc.). It also covers elementary CUDA programming concepts in Windows and Linux environments.
Learning Outcome: The course aims to teach the trainee how to write simple CUDA programs, such as one that squares the first 10000 integers. In short, the candidate learns to write simple CUDA programs and to understand basic hardware and software details, without being concerned with performance.
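As a rough illustration of the squaring exercise mentioned above, here is a minimal sketch written in plain C rather than CUDA, so that it compiles without a GPU or the CUDA toolkit. The names `square_one` and `square_first_n` are hypothetical, and "first 10000 integers" is assumed to mean 1 to 10000. In a CUDA version, the body of the loop would typically become the work done by a single thread.

```c
#include <assert.h>
#include <stddef.h>

#define N 10000  /* "first 10000 integers", taken from the course text */

/* Work for one element (hypothetical helper name). In a CUDA version this
   is roughly what a single thread would compute. */
static long square_one(long x) {
    return x * x;
}

/* Fill out[i] with (i + 1)^2 for i = 0 .. n-1, i.e. the squares of 1..n.
   Assumption: the integers start at 1, not 0. */
void square_first_n(long *out, size_t n) {
    assert(out != NULL);
    for (size_t i = 0; i < n; ++i) {
        out[i] = square_one((long)i + 1);
    }
}
```

Calling `square_first_n(out, N)` on an array of `N` longs leaves `out[0]` equal to 1 and `out[N - 1]` equal to 100000000.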
Level "B" is an advanced course on CUDA programming. The training covers most of the features of CUDA 2.3.
Learning Outcome: The trainee learns to implement and optimise various complex algorithms, such as reduction and prefix scan.
SYLLABUS
LEVEL A
Contents
1. Important C Concepts
C Basics
* Token
* First Program
* Variable (Data Type)
* Operator
Building C Program
* Compiling
* Running a Program
Control Flow
* if else
* Switch Case
* How to Use Loops
User Defined Data Type
* Array
* Pointer
* Dynamic Memory Allocation (malloc(), free())
Function
* Types of Functions
* Structure of Function
* Function Calling Techniques
Introduction to GPU
* GPU Architecture
* Types of Memory
* Difference between CPU and GPU
Getting Started with CUDA
Installation, Driver, SDK, Toolkit, Basic Programming Concepts, Modes of Parallel Programming
CUDA Programming Model, Kernel, Calling a Kernel on the Device, Compiling and Running a CUDA Program
LEVEL B
LEVEL "B" discusses parallel programming concepts in detail, with a specific focus on CUDA programming. Specifically, you are exposed to the following topics:
* Performance metrics - speed-up, utilization, efficiency
* Transparent Scalability
* Memory organization in CUDA, including pinned memory and texture memory
* Error Handling
* CUDA events
* Models of Parallel Computation: SIMD (Single Instruction Multiple Data), MIMD (Multiple Instruction Multiple Data)
* GPU Compute Architecture
* Memory Optimization
* Occupancy, CUDA profiler, debugger
* Performance Guidelines
* Implementation of fast matrix multiplication, scan (prefix sum) and reduction algorithms. These algorithms are basic building blocks of many of the complex applications being developed today.
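To make the reduction and scan items above concrete, the following is a minimal sequential sketch in plain C, not the optimised CUDA implementations the course develops. All function names are hypothetical. `reduce_sum_tree` is included only to show the pairwise, stride-doubling pattern on which parallel reductions are commonly based; here it still runs on a single thread.

```c
#include <assert.h>
#include <stddef.h>

/* Reduction: combine all n elements into a single value using +. */
long reduce_sum(const int *in, size_t n) {
    assert(in != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += in[i];
    }
    return total;
}

/* Inclusive prefix sum (scan): out[i] = in[0] + ... + in[i]. */
void inclusive_scan(const int *in, long *out, size_t n) {
    assert((in != NULL && out != NULL) || n == 0);
    long running = 0;
    for (size_t i = 0; i < n; ++i) {
        running += in[i];
        out[i] = running;
    }
}

/* Tree-shaped reduction, done in place on buf (buf is overwritten).
   In each pass, element i absorbs element i + stride. The additions
   within one pass touch disjoint elements, which is why this shape
   suits parallel hardware. */
long reduce_sum_tree(long *buf, size_t n) {
    if (n == 0) {
        return 0;
    }
    assert(buf != NULL);
    for (size_t stride = 1; stride < n; stride *= 2) {
        for (size_t i = 0; i + stride < n; i += 2 * stride) {
            buf[i] += buf[i + stride];
        }
    }
    return buf[0];
}
```

For the input {1, 2, 3, 4, 5}, both reductions return 15 and the inclusive scan produces {1, 3, 6, 10, 15}.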
In short, the complete course exposes you to almost all of the features of CUDA 2.3, preparing you to write your own applications. The course concludes with a project.
Trainees will also be awarded a certificate, provided they pass the test at the end of the training.
_____________________________________________________________________________
Target Audience:
Professionals, researchers and students with a background in Mathematics, Computer Science, IT, Electrical, Electronics & Communications and similar fields can enrol in this course.
___________________________________________________________________
Prerequisites: