Course syllabus

Course-PM

DAT400 High-performance parallel programming lp1 HT19 (7.5 hp)

The course is offered by the Department of Computer Science and Engineering

Contact details

  • Miquel Pericas <miquelp@chalmers.se> (Examiner and Lecturer)
  • Mustafa Abduljabbar <musabdu@chalmers.se> (Lecturer)
  • Jing Chen <chjing@chalmers.se> (Teaching Assistant)
  • Pirah Noor Soomro <pirah@chalmers.se> (Teaching Assistant)

 

Course purpose

This course looks at parallel programming models, efficient programming methodologies and performance tools with the objective of developing highly efficient parallel programs.

Lecture rooms:

See: TimeEdit

Course literature

Course book: "Parallel Programming for Multicore and Cluster Systems", Thomas Rauber and Gudula Rünger (2nd edition, 2013). The course book can be accessed through the Chalmers Library: link to course book

For the CUDA part we use a different book: "Programming Massively Parallel Processors : A Hands-on Approach", David Kirk and Wen-mei Hwu (2nd edition, 2013). This book is also accessible via the Chalmers Library: link to CUDA book

Course design

The course consists of a set of lectures and laboratory sessions. The lectures start with an overview of parallel computer architectures and parallel programming models and paradigms. An important part of this discussion is the set of mechanisms for synchronization and data exchange. Next, performance analysis of parallel programs is covered. The course proceeds with a discussion of tools and techniques for developing parallel programs in shared address spaces. This section covers popular programming environments such as pthreads and OpenMP. Next, the course discusses the development of parallel programs for distributed address spaces. The focus in this part is on the Message Passing Interface (MPI). Finally, we discuss programming approaches for executing applications on accelerators such as GPUs. This part introduces the CUDA (Compute Unified Device Architecture) programming environment.

The lectures are complemented with a set of laboratory sessions in which participants explore the topics introduced in the lectures. During the lab sessions, participants parallelize sample programs over a variety of parallel architectures, and use performance analysis tools to detect and remove bottlenecks in the parallel implementations of the programs.

 

Learning objectives and syllabus

Learning objectives:

Knowledge and Understanding

  • List the different types of parallel computer architectures, programming models and paradigms, as well as different schemes for synchronization and communication.
  • List the typical steps to parallelize a sequential algorithm
  • List different methodologies for analyzing the performance of parallel programs

Competence and skills

  • Apply performance analysis methodologies to determine the bottlenecks in the execution of a parallel program
  • Predict the upper limit to the performance of a parallel program

Judgment and approach

  • Given a particular software, specify what performance bottlenecks are limiting the efficiency of parallel code and select appropriate strategies to overcome these bottlenecks
  • Design energy-aware parallelization strategies based on a specific algorithm's structure and computing system organization
  • Argue which performance analysis methods are important given a specific context

Link to the syllabus on Studieportalen.

Study plan

 

Course summary:
