Course syllabus

Course-PM

DAT470 / DIT066 Computational techniques for large-scale data lp4 VT25 (7.5 hp)

The course is offered by the Department of Computer Science and Engineering.

Contact details

Course purpose

The advent of big data has led to the development of new programming paradigms, in particular for parallel systems that allow computation on big data using redundant clusters of commodity computers. This course provides an introduction to different programming paradigms, e.g. MapReduce and its extensions, which facilitate computations with terabytes of data. It also demonstrates that, for specific tasks, algorithms and data structures can provide highly efficient alternatives.

Schedule

TimeEdit

Course literature

Literature will be given during the course.

Course design

The course consists of the following activities:

  • Lectures (two lectures a week, attendance not mandatory)
  • Lab sessions (three opportunities a week, attendance not mandatory)
  • Mandatory assignments (seven in total, done in pairs, submitted through Canvas)
  • Quizzes (optional, earns bonus points for the exam)

To pass the course, you must

  • Pass the mandatory assignments
  • Pass the exam

Lab sessions consist of independent work on the assignments with the TAs there to help you.

You are expected to read the assigned reading before each lecture; the content will not be repeated at lectures.

Assignments require you to implement and evaluate algorithms in a practical setting on the teaching cluster, using Python.

Assignments are published weekly, and you have 2–4 weeks to work on each one. This means that several assignments run in parallel. Assignments submitted by the deadline are graded within a week of the deadline.

Resubmissions are possible until 2025-06-08. Resubmissions are graded during or after the June re-exams.

Changes made since the last occasion

The assignment grading scale has changed for GU students (for Chalmers students, assignments are unfortunately still graded pass/fail because the course plans are out of sync).

Learning objectives and syllabus

Learning objectives:

After completion of the course, the student should be able to:

Knowledge and understanding
  • discuss important technological aspects when designing and implementing analysis solutions for large-scale data,
  • explain differences between parallel programming models,
  • describe data structures and algorithms for big data and discuss their utility
 
Skills and abilities
  • implement applications for transforming and analyzing large-scale data with different parallel software frameworks,
  • use algorithms and data structures for computations with large-scale data
 
Judgement ability and approach
  • suggest appropriate computational infrastructures and methodological approaches for analysis tasks and discuss their advantages and drawbacks,
  • discuss advantages and drawbacks of different strategies of parallelization,
  • decide between algorithmic and parallelization-based approaches for accelerating computational workloads
 

 

Link to the syllabus on Studieportalen.

https://www.chalmers.se/en/education/your-studies/find-course-and-programme-syllabi/course-syllabus/DAT470 / DIT066/?acYear=2025/2026

Examination form

Assignments are submitted on Canvas as a PDF together with the Python code. Assignment grading differs between GU and Chalmers students: GU students are graded on the scale U, 3, 4, 5, whereas Chalmers students are graded pass/fail.

The written exam is graded on the ordinary scale U, 3, 4, 5.

It is possible to obtain up to 3 bonus points for the exam by answering lecture-specific quizzes on Canvas.

The course grade is a weighted average of the assignment and exam grades (GU) or the exam grade (Chalmers).
