Lecture Slides
Lecture 1 Introduction
Lecture 2 Introduction to CUDA C Part I
Lecture 3 Introduction to CUDA C Part II
Lecture 4 Kernel-Based Data Parallel Execution Model
Lecture 4 Code Example
Lectures 5 and 6 Memory Model and Locality
Lectures 7 and 8 Performance Considerations and GMAC
Lectures 8 and 9 DRAM Bandwidth
Lecture 10 Tiled Convolution Analysis
Lecture 11 Parallel Computation Patterns–Reduction Trees Part I
Lecture 12 Parallel Computation Patterns–Reduction Trees Part II
Lecture 13 Parallel Computation Patterns–Parallel Prefix Sum (Scan) Part I
Lecture 14 Parallel Computation Patterns–Parallel Prefix Sum (Scan) Part II
Lecture 15 Floating Point Considerations Part I
Lecture 16 Floating Point Considerations Part II
Lecture 17 Atomic Operations and Histogramming Part I
Lecture 18 Final Project Kickoff
Lecture 19 Atomic Operations and Histogramming Part II
Lecture 20 GPU as Part of the PC Architecture
Lecture 21 Data Transfer and CUDA Streams
Lecture 22 Application Case Study–Advanced MRI Reconstruction
Lecture 23 Application Case Study–Electrostatic Potential Calculation Part I
Lecture 24 Application Case Study–Electrostatic Potential Calculation Part II
Lecture 25 Generalization and Future Studies
Lecture 26 Joint CUDA-MPI Programming Part I
Lecture 27 Joint CUDA-MPI Programming Part II
Lecture 28 Joint CUDA-MPI Programming Part III
Lecture 29 Introduction to OpenCL
Lecture 30 Introduction to OpenACC