Course syllabus

Below you can find the course-PM for TIF345 / FYM345 "Advanced simulation and machine learning". The main resources for this course, including lecture notes, material for lab assignments as well as additional worked-out examples, are maintained in this gitlab repository. To get access to this repo please create a gitLAB (not to be confused with gitHUB) account and send your user name to the course examiner (erhart@chalmers.se).

During the computer labs you can also use the jupyterhub that we set up for this course. Based on prior experience, the server capacity is, however, limited and performance will suffer during periods of high load. To handle such situations you can of course execute your (Python) scripts on any other available resource, including the workstations in the computer labs. In the latter case, you will have to install the required Python libraries locally yourself, e.g., via pip install <package>.

Course-PM

Contact details

Examiner: Paul Erhart (erhart@chalmers.se)
Lecturers: Andreas Ekström (andreas.ekstrom@chalmers.se), Arkady Gonoskov (arkady.gonoskov@physics.gu.se), Bernhard Mehlig (bernhard.mehlig@physics.gu.se)
TAs: Shahnawaz Ahmed, Eric Lindgren, Jakub Fojt, Viktor Martvall

Course purpose

The course covers a selection of machine learning algorithms and statistical methods for simulating physical systems. The course is based on a set of projects, which are accompanied by lectures, hand-ins and hands-on computer exercises. During the course, students will be exposed to advanced scientific research problems, with the aim of reproducing state-of-the-art scientific results. Students will use the Python programming language and relevant open-source libraries, and will learn to develop and structure computer codes for carrying out scientific and statistical data analyses.

Schedule

You can find the schedule for this course on TimeEdit.

Course literature

This course aims to provide an introduction to a quickly evolving research field at the intersection between the natural sciences, in particular physics, and computer/data science. As a result, we have found it difficult to find a single book that can serve as a satisfactory backbone for this course. We have therefore compiled general lecture notes that you can access both online and in PDF format. This is an evolving document that we continue to update and plan to grow over the years. In addition, we provide worked-out examples ("demos") in the form of IPython notebooks via the course gitlab repository.

The part on neural networks, which is taught by Bernhard Mehlig, will be based on his book "Machine Learning with Neural Networks: An Introduction for Scientists and Engineers", which is available, e.g., via bokus.

In order to access any of these resources you need to have a gitlab account and we must have added you as a member to the gitlab group for this course. (That is, being registered in Canvas is insufficient for this purpose!)

Course design

The course involves 14 lectures and 7 lab sessions in one study period (lp2). Both lectures and computer labs will be run on campus. Please consult the schedule in TimeEdit (see link above) for rooms and times.

The course consists of three parts, each of which enters equally into the examination.

Course content
Topic | Lectures and labs | Lecturer | Examination (weight)
Foundations: Linear models, Gaussian processes | Lectures 1-4, Labs 1-2 | Andreas Ekström | 1 hand-in (10 pts), 1 project (25 pts)
Applications: Linear models, Gaussian processes, Optimization | Lectures 5-8, Labs 3-5 | Paul Erhart | 2 projects (17.5 pts each)
Neural networks | Lectures 9-11 | Bernhard Mehlig | -
Other ML techniques | Lectures 12-14, Labs 6-7 | Arkady Gonoskov | 1 project (30 pts)

Pertinent course resources (lecture notes, hand-in assignments, worked out examples) are made available via this gitlab repository. To access this resource you need to set up a gitlab account and send your user name to the course examiner (erhart@chalmers.se). This will also give you access to the jupyterhub that we set up for this course.

During the course you will work on hand-ins as well as several projects. To carry out these tasks you will have to write and run Python code. To this end, we have set up a dedicated Jupyterhub server, which is preconfigured. To log in to the Jupyterhub server you need your gitlab credentials.

Every participant should work on the hand-in individually, while for the projects we expect you to work in pairs. We encourage you to find partners yourself. If for one reason or another that is challenging, please contact the course examiner (erhart@chalmers.se) and we will find a solution.

Examination form

The examination of this course is based on the hand-ins and the projects that you carry out during this course (see table above). Each task is awarded between 10 and 35 points for a total of 100 points. If you miss the deadline but submit within a week (≤7 days after the deadline), your point total for that assignment will be scaled by a factor of 0.8. If you submit even later, your point total for that assignment will be scaled by a factor of 0.6. After you have received the first graded version of your hand-in or project report, you have one opportunity for revision. Any additional points gained through the revision will be scaled by a factor of 0.6.

For example, if you submit your assignment five days late and you receive 15 out of 20 points, your score will be 0.8×15=12 points. If you decide to hand in a revision and you are awarded 2 out of the remaining 5 points, your point total for this assignment will be 0.8×15+0.6×2=13.2 points.
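
The scoring rule can also be sketched in a few lines of Python. This is only an illustrative sketch and not an official grading tool: the function name assignment_score and its parameters are hypothetical. It assumes, following the example above, that points gained in a revision are scaled by 0.6 only and not additionally by the late factor.

```python
def assignment_score(first_points, days_late=0, revision_points=0.0):
    """Point total for one assignment (illustrative sketch, hypothetical names).

    first_points    -- points awarded for the first graded version
    days_late       -- days after the deadline (0 or less means on time)
    revision_points -- additional points awarded for the one allowed revision
    """
    if days_late <= 0:
        late_factor = 1.0   # submitted on time
    elif days_late <= 7:
        late_factor = 0.8   # at most one week late
    else:
        late_factor = 0.6   # more than one week late

    # Assumption: revision points are scaled by 0.6 regardless of lateness,
    # as in the worked example (0.8 x 15 + 0.6 x 2).
    revision_factor = 0.6
    return late_factor * first_points + revision_factor * revision_points


# Worked example from the text: five days late, 15 points, then 2 revision points.
example_score = assignment_score(first_points=15, days_late=5, revision_points=2)
print(f"{example_score:.1f}")  # prints 13.2
```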

Your final grade will be assigned according to the following table:

Grading scheme
Points | Chalmers grade | GU grade
80-100 | 5 | VG
70-79 | 4 | G
40-69 | 3 | G
0-39 | U | IG
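
As a quick way to check which grade a given point total corresponds to, the table above can be written as a small lookup. This is a sketch under an assumption: the table lists whole-number ranges only, so for fractional totals (for example 79.5) the code treats the lower bound of each range as a threshold. That interpretation, and the function name final_grade, are not part of the official rules.

```python
def final_grade(total_points):
    """Map a point total to (Chalmers grade, GU grade); illustrative sketch.

    Assumption: the lower bound of each range in the table acts as a
    threshold, so a fractional total such as 79.5 falls in the 70-79 row.
    """
    if not 0 <= total_points <= 100:
        raise ValueError("total_points must be between 0 and 100")
    if total_points >= 80:
        return ("5", "VG")
    if total_points >= 70:
        return ("4", "G")
    if total_points >= 40:
        return ("3", "G")
    return ("U", "IG")


chalmers, gu = final_grade(72.5)
print(chalmers, gu)  # prints: 4 G
```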

Information for each task will be published via the gitlab repository of this course. Completed assignments need to be submitted via Canvas. N.B.: There is no final exam for this course. Please use this LaTeX template for writing your reports.

Course content

Advanced simulations in the physical sciences can benefit from ML methods in multiple ways:

  • Uncertainty quantification via Bayesian inference
  • Representation of mathematical models via ML models, e.g., neural networks and Gaussian processes
  • Parametrization and selection of ML models via regression techniques

with the following subtopics

  • Dimensionality reduction and descriptors for physical systems
  • Bayesian inference and model selection
  • Generalized linear models including Gaussian processes
  • Advanced regression and regularization techniques
  • Neural networks

All of these aspects will be introduced and examined in the context of modelling in the physical sciences.

Learning objectives

  • critically examine the description of systems in the physical sciences by different mathematical models
  • rationalize the numerical representation of such models at multiple levels of sophistication
  • employ statistical inference and machine learning (ML) methods to evaluate and compare models
  • explain, using appropriate terminology, methods from ML and statistical inference
  • analyze data and write code in scientific and ethical fashion

Course representatives

  • Isak Brundin isakbrundin@hotmail.com
  • David Hambraeus davham@student.chalmers.se
  • Björn Krook Willén bjorn.krook.willen@gmail.com
  • Oskar More Arvidsson oskar.more.arvidsson@live.com
  • Alicia Rey Alonso aliciare@chalmers.se
