This course (HBCS102) is divided into two modules, Level "A" and Level "B", with Level "B" being the more advanced.
Level "A" is an introductory course on parallel programming, with about 40% of the time devoted to CUDA programming. This level does not require any prior knowledge of parallel computing; the only prerequisite is a data-structures-level course. The course starts with the C programming language and covers graphics card hardware in detail (GPU architecture, DRAM, PCIe, etc.). Beyond these topics, we also cover elementary concepts in parallel programming and CUDA, including "computational thinking", algorithms, and some discussion of shared memory usage. Programming in both Windows and Linux environments is taught. An introduction to Linux is included so that students and professionals can get the full benefit of working on an open-source OS.
Learning Outcome: The course aims to teach the trainee how to write simple CUDA programs, such as squaring the first 10000 integers, and how to compile them on Linux and Windows. In short, the candidate learns to write simple CUDA programs and to understand basic hardware and software details, without worrying about performance. The discussion of algorithms also builds a solid base for getting ahead in the world of parallel programming.
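As a point of reference for the squaring exercise mentioned above, here is a minimal sequential sketch in plain C. It is not the course's own code: the function name `square_first_n` is hypothetical, and "the first 10000 integers" is assumed to mean 0 through 9999. Each loop iteration is independent of the others, which is why a CUDA version of this exercise can typically hand one element to each thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: store i*i in out[i] for i = 0 .. n-1.
   No iteration depends on another, so the loop is trivially parallel. */
void square_first_n(unsigned long *out, size_t n)
{
    assert(out != NULL || n == 0);   /* a NULL buffer is only valid when empty */
    for (size_t i = 0; i < n; ++i) {
        out[i] = (unsigned long)i * (unsigned long)i;
    }
}
```

Calling `square_first_n(buffer, 10000)` on an array of 10000 `unsigned long` values fills it with the squares; the largest, 9999 squared, fits comfortably in 32 bits.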
Level B is an advanced course on CUDA programming. The training covers most of the features of CUDA 2.3 and 3.0.
Learning Outcome: The trainee learns how to implement and optimize complex algorithms such as reduction, prefix sum (scan), and matrix transpose. This level aims to prepare candidates to develop their own applications. Please refer to the detailed syllabus below.
SYLLABUS
LEVEL A
Contents
1. Important C Concepts
C Basics
* Tokens
* First Program
* Variables (Data Types)
* Operators
Building C Program on Linux
* Linux basics
* Compiling
* Running a Program
Control Flow
* if else
* Switch Case
* How to Use Loops
User Defined Data Type
* Array
* Pointer
* Dynamic Memory Allocation: malloc(),free()
Functions
* Types of Functions
* Structure of a Function
* Function Calling Techniques
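The pointer, dynamic memory allocation, and function-calling topics listed above can be illustrated together in a short sketch. The function names (`make_sequence`, `increment_copy`, `increment_in_place`) are made up for illustration and are not taken from the course material.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate an array holding 0, 1, ..., n-1 on the heap.
   Returns NULL if n is 0 or the allocation fails; the caller must free(). */
int *make_sequence(size_t n)
{
    if (n == 0) {
        return NULL;
    }
    int *a = malloc(n * sizeof *a);
    if (a == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; ++i) {
        a[i] = (int)i;
    }
    return a;
}

/* Call by value: x is a copy, so the caller's variable is unchanged. */
void increment_copy(int x)
{
    x += 1;
    (void)x;   /* silence "set but not used" warnings */
}

/* Call by pointer: the function receives an address and updates the
   caller's variable through it. */
void increment_in_place(int *x)
{
    assert(x != NULL);
    *x += 1;
}
```

After `int *a = make_sequence(5);`, calling `increment_copy(a[4])` leaves `a[4]` at 4, while `increment_in_place(&a[4])` changes it to 5. Every successful `malloc()` should be paired with a `free(a)` once the array is no longer needed.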
2. Parallel Algorithms and Computational Thinking
3. Introduction to GPU Hardware
* Modern GPU Architecture
* Types of Memory
* Difference between CPU and GPU
* PCI Express vs. PCI
4. Getting Started with CUDA
Installation, Driver, SDK, Toolkit, Basic Programming Concepts, Modes of Parallel Programming
CUDA Programming Model, Kernel, Calling a Kernel on the Device, Compiling and Running a CUDA Program
5. Shared Memory Usage in CUDA
LEVEL B
Level "B" discusses parallel programming concepts in detail, with a specific focus on CUDA programming. Specifically, you are exposed to the following topics:
* Performance metrics - speed-up, utilization, efficiency
* Transparent Scalability
* Memory organization in CUDA; pinned memory, texture memory, and constant memory usage
* Error Handling
* CUDA events
* Models of Parallel Computation: SIMD (Single Instruction Multiple Data), MIMD (Multiple Instruction Multiple Data)
* GPU Compute Architecture
* Memory Optimization (Removing Bank Conflicts in Shared Memory, Partition camping, Global Memory Optimization)
* Using streams: Overlapping GPU and CPU tasks, Overlapping Computation with Memory Copy
* Atomic Operations and their limitations
* Using the occupancy calculator, CUDA profiler, and debugger
* Performance Guidelines
* CUBLAS & CUFFT usage
* Thrust Library & CUDA Data Parallel Primitives Library (CUDPP)
* Implementation of fast matrix multiplication, scan (prefix sum), reduction, and matrix transpose algorithms. These algorithms are basic building blocks of many of the complex applications being developed today.
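To make the building-block algorithms in the last item concrete, here is a sequential sketch in plain C of a sum reduction, an exclusive prefix sum, and a matrix transpose. These are not the optimized CUDA implementations the course covers. They are simple CPU baselines of the kind often used to check a GPU kernel's output, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Reduction: combine n values into one, here by summation. */
long reduce_sum(const int *in, size_t n)
{
    assert(in != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += in[i];
    }
    return total;
}

/* Exclusive prefix sum (scan): out[i] = in[0] + ... + in[i-1], out[0] = 0. */
void exclusive_scan(const int *in, long *out, size_t n)
{
    assert((in != NULL && out != NULL) || n == 0);
    long running = 0;
    for (size_t i = 0; i < n; ++i) {
        out[i] = running;      /* sum of everything before element i */
        running += in[i];
    }
}

/* Transpose a rows x cols row-major matrix into a cols x rows row-major
   matrix. The in and out buffers must not overlap. */
void transpose(const float *in, float *out, size_t rows, size_t cols)
{
    assert((in != NULL && out != NULL) || rows * cols == 0);
    for (size_t r = 0; r < rows; ++r) {
        for (size_t c = 0; c < cols; ++c) {
            out[c * rows + r] = in[r * cols + c];
        }
    }
}
```

For the input {3, 1, 7, 0}, `reduce_sum` returns 11 and `exclusive_scan` produces {0, 3, 4, 11}. Unlike the squaring of independent elements, the scan loop carries a running total from one iteration to the next, which is why its parallel version needs a more careful algorithm.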
Each concept is explained with practical examples. In short, the complete course will expose you to almost all of the features of CUDA 2.3 and 3.0, making you ready to write and optimise your own applications. The course concludes with a project.
Trainees will also be given a certificate, provided they pass the test at the end of the training.