Course syllabus

Course-PM

DAT400 / DIT431 High-performance parallel programming lp1 HT25 (7.5 hp)

This course is offered by the Department of Computer Science and Engineering.

Course meetings

All lectures, problem sessions, workshops and labs will be held on campus. Check TimeEdit for the rooms.

Contact details

  • Miquel Pericàs <miquelp@chalmers.se> (Examiner + Lecturer)
  • Hari Abram <hariv@chalmers.se> (Teaching Assistant)
  • Sonia Rani Gupta <soniar@chalmers.se> (Teaching Assistant)
  • Jing Chen <chjing@chalmers.se> (Teaching Assistant)

Course purpose

This course looks at parallel programming models, efficient programming methodologies and performance tools with the objective of developing highly efficient parallel programs.

Workshop links

Zoom link for hybrid participation: https://chalmers.zoom.us/j/62921710276?pwd=UnaMXRtcbHabbhlX6UORkqez20KLNw.1 

Instructions for connecting remotely to lab machines: https://chalmers.topdesk.net/tas/public/ssp/content/detail/knowledgeitem?unid=304967f9ad004d3293b986a976e39833

Schedule

Check TimeEdit for rooms. The schedule is shown below. 

Note: This schedule is under construction and subject to change. The first lecture will be on Tuesday, Sep 2nd, at 13:15 in room SB-H1.

 

| # | Time | Session | Topic | Book Correspondence | Responsible |
|---|------|---------|-------|---------------------|-------------|
| 1 | Sep 2nd, 13:15-16:00 | Lecture #1 | Introduction + Basic concepts of Parallelism | 1.1, 1.2, 1.3 | Miquel |
| 2 | | Lecture #2 (online) | Intro to Parallel Computer Architecture | 2.x | |
| 3 | Sep 4th, 13:15-16:00 | Lecture #3 | Parallel Programming Models (part 1) | 3.1, 3.2, 3.3 | Miquel |
| 4 | Sep 5th & Sep 9th, 8:00-11:45 | Lab #1 | Intro to tools and environment | | Hari, Sonia |
| 5 | Sep 9th, 13:15-16:00 | Lecture #4 | Parallel Programming Models (part 2) | 3.4, 3.5, 3.6 | Miquel |
| 6 | Sep 11th, 13:15-16:00 | Lectures #5 and #6 | Loops and scheduling + Performance Analysis and Roofline model | 4.2, 4.6 | Miquel |
| 7 | Sep 12th & Sep 16th, 8:00-11:45 | Lab #2 | Program parallelization | | Hari, Sonia |
| 8 | Sep 16th, 13:15-15:00 | Exercise Session | Parallel Programming Models | | Hari |
| 9 | Sep 18th, 13:15-15:00 | Exercise Session | Performance Analysis and Roofline Model | | Sonia |
| 10 | Sep 19th & Sep 23rd, 8:00-11:45 | Lab #3 | Performance Analysis / Roofline Model CPUs | | Hari, Sonia |
| 11 | Sep 23rd, 13:15-16:00 | Workshop #1 | OpenMP (session 1) | | Miquel |
| 12 | Sep 25th, 13:15-16:00 | Workshop #1 | OpenMP (session 2) | | Miquel |
| 13 | Sep 26th & Sep 30th, 8:00-11:45 | Lab #4 | OpenMP | | Hari, Sonia |
| 14 | Sep 30th, 13:15-15:00 | Workshop #2 | Message Passing Interface (session 1) | | Miquel |
| | Sep 30th, 15:15-17:00 | MPHPC lecture | Ethics Lecture | | Richard Torkar |
| 15 | Oct 2nd, 13:15-16:00 | Workshop #2 | Message Passing Interface (session 2) | | Miquel |
| 16 | Oct 3rd & Oct 7th, 8:00-11:45 | Lab #5 | Message Passing Interface | | Hari, Sonia |
| 17 | Oct 7th, 13:15-16:00 | Workshop #3 | CUDA (session 1) | | Miquel |
| 18 | Oct 9th, 13:15-16:00 | Workshop #3 | CUDA (session 2) | | Miquel |
| 19 | Oct 14th & Oct 17th, 8:00-11:45 | Lab #6 | CUDA | | Hari, Sonia |
| 20 | Oct 14th, 13:15-14:00 | Workshop #3 | CUDA (session 3) | | Miquel |
| 21 | Oct 14th, 14:15-15:00 | Wrap-up | MPI + OpenMP + CUDA | | Miquel |
| 22 | Oct 16th, 13:15-15:00 | Buffer slot | | | |
| 23 | Oct 21st & Oct 24th, 8:00-11:45 | Lab extra sessions | Extra sessions to support labs | | Hari, Sonia |
| 24 | Oct 21st, 13:15-15:00 | Exam preparation 1 | | | Miquel |
| 25 | Oct 23rd, 13:15-15:00 | Exam preparation 2 | | | Miquel |
| 26 | Oct 30th, 8:30-12:30 | Written Exam | | | |

Course literature

The theory part (part #1) of the course loosely follows the book "Parallel Programming for Multicore and Cluster Systems" by Thomas Rauber and Gudula Rünger (3rd edition, 2023). The book can be accessed through the Chalmers Library.

The practical part (part #2), which covers various programming models and libraries, is based on several online resources that will be published at a later point.

Course design

The course consists of a set of lectures and laboratory sessions. The lectures start with an overview of parallel computer architectures and parallel programming models and paradigms. An important part of the discussion covers mechanisms for synchronization and data exchange. Next, code transformations and performance analysis of parallel programs are covered. The course proceeds with a discussion of tools and techniques for developing parallel programs in shared address spaces. This section is based on two workshops that cover the OpenMP programming model. Next, the course discusses the development of parallel programs for distributed address spaces. Here, two workshops cover the Message Passing Interface (MPI). Finally, we discuss how to program GPU accelerators. This part consists of two workshops that describe the CUDA (Compute Unified Device Architecture) programming environment.

The lectures are complemented with a set of laboratory sessions in which participants explore the topics introduced in the lectures. During the lab sessions, participants parallelize sample programs over a variety of parallel architectures and use performance analysis tools to detect and remove bottlenecks in the parallel implementations of these programs. The lab sessions are done in teams of two. At the end of each session a joint report has to be submitted. There is no strict deadline for the report, but we strongly recommend submitting it before the beginning of the next lab session.

Throughout the course, several assignments are proposed that provide bonus points. These assignments consist of reading papers and submitting solutions for proposed exercises. They are not mandatory, but the bonus points are added to the score of the written exam, provided that the exam score reaches a minimum threshold (this will be discussed in the first lecture).

Changes made since the last occasion

No major changes are planned for this edition of the course.

 

Learning objectives and syllabus

Learning objectives:

Knowledge and Understanding 

  • List the different types of parallel computer architectures, programming models and paradigms, as well as different schemes for synchronization and communication. 
  • List the typical steps to parallelize a sequential algorithm 
  • List different methodologies for the analysis of parallel program systems 

Competence and skills 

  • Apply performance analysis methodologies to determine the bottlenecks in the execution of a parallel program 
  • Predict the upper limit to the performance of a parallel program 
  • Cooperate in diverse group compositions with team members of different skills, cultural and educational backgrounds, genders and nationalities
  • Make and defend ethical judgements in general, and in particular within the area of High-Performance Computing systems

Judgment and approach 

  • Given a particular software, specify what performance bottlenecks are limiting the efficiency of parallel code and select appropriate strategies to overcome these bottlenecks 
  • Design energy-aware parallelization strategies based on a specific algorithm's structure and computing system organization 
  • Argue which performance analysis methods are important given a specific context

 

Link to the syllabus on Studieportalen.

https://www.chalmers.se/en/education/your-studies/find-course-and-programme-syllabi/course-syllabus/DAT400/?acYear=2025%2F2026

https://www.gu.se/en/study-in-gothenburg/exchange-student/courses/dit431

Assessment

The exam (4.5c)

The final exam is in written form and accounts for 4.5 credits. Bonus points collected via course assignments can increase the score of the final exam. Bonus points will only be added if your exam score reaches 18 points (out of 60). The pass score for the final exam is 24 out of 60 points.
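
The bonus rule can be illustrated with a small sketch. The 18-point threshold and the 24-point pass score come from the text above. Everything else is an assumption made for illustration only: that bonus points are added one for one to the exam score, that the sum is not capped, and that the pass check is applied after the bonus is added. The function names are hypothetical, and the exact rules are those announced in the first lecture.

```python
# Illustrative sketch only; not an official grading tool.
BONUS_THRESHOLD = 18  # minimum exam score for bonus points to count (from the syllabus)
PASS_SCORE = 24       # pass score for the final exam, out of 60 (from the syllabus)


def final_exam_score(exam_points: float, bonus_points: float) -> float:
    """Return the exam score after applying the bonus rule.

    Assumption: bonus points are added one for one and the sum is not capped.
    """
    if exam_points >= BONUS_THRESHOLD:
        return exam_points + bonus_points
    # Below the threshold, bonus points are ignored.
    return exam_points


def passes_exam(exam_points: float, bonus_points: float) -> bool:
    """Assumption: the pass check is applied to the score including the bonus."""
    return final_exam_score(exam_points, bonus_points) >= PASS_SCORE


if __name__ == "__main__":
    for exam, bonus in [(17, 10), (18, 6), (20, 3)]:
        score = final_exam_score(exam, bonus)
        print(f"exam={exam}, bonus={bonus}: score={score}, pass={passes_exam(exam, bonus)}")
```

Under these assumptions, an exam score of 17 with 10 bonus points stays at 17, while an exam score of 18 with 6 bonus points becomes 24.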

The labs (3.0c)

Successful completion of all the labs accounts for 3.0 credits. 

The final grade is the same as the exam grade. To pass the course as a whole, both components must be passed.
