Course syllabus

Note: All information provided on this page is preliminary until the course starts.

Course-PM

DAT570 / DIT762  Continuous optimization in data science lp2 HT23 (7.5 hp)

The course is offered by the Department of Computer Science and Engineering.

Contact details

For questions, please contact the teacher and examiner, Ashkan Panahi: 

ashkan.panahi@chalmers.se

 

Student Representatives

Amanda Levinsson   amanda.levinsson@telia.com  

Axel Qvarnström      axel.qvarnstrom@gmail.com 

Frida Sjögren             fridajohannalinnea@gmail.com 

 

Course purpose

Behind the success of modern data science approaches lies the power of continuous optimization algorithms. In this course, we discuss various aspects of these algorithms and the main methods for designing and improving them in data science.

The course discusses the main aspects of optimization problems and algorithms in data science, such as convergence and statistical learning theory. We present different features of optimization problems, such as convexity, smoothness, and constraints, and show how they lead to different design strategies. We also discuss challenges in data science through real-world examples and present the main algorithmic ideas for addressing them.

 

Schedule

There will be two lecture sessions and one exercise/consultation session every week. Please check TimeEdit for more details:

TimeEdit

 

Course literature

No mandatory literature.  Reference literature, further reading, and other non-mandatory texts are given on the page of each module.

Course design

In the course, we will cover both theoretical subjects and practical issues involved in data science. Students are only expected to have basic math and programming skills. The course is divided into 7 weeks, each devoted to a particular subject. There will be two lecture sessions each week (typically Tuesdays and Thursdays at 13:15) and an additional third session, typically on Fridays, for consultation and review of the assignments, as well as special topics on request (see below). The list of subjects is as follows:

Week 1: Continuous optimization and statistics, linear programming, and convex analysis

Week 2: Convex vs. nonconvex optimization, Lagrangian duality, and introduction to optimization algorithms

Week 3: Constrained optimization algorithms: projected gradient, primal-dual algorithms, and interior point methods

Week 4: Non-smooth optimization algorithms: sub-gradient descent, splitting, and proximal operators

Week 5: Accelerated algorithms: heavy-ball method, Nesterov acceleration, and FISTA

Week 6: Stochastic optimization algorithms: stochastic gradient descent, stochastic proximal-projection, stochastic variance reduction, and stochastic averaging

Week 7: Distributed optimization algorithms: distributed splitting (ADMM), gradient tracking, and directed graphs

 

Assignments:

There will be one assignment per week, covering a topic in data science related to the subject of the week. Each assignment typically includes a few written questions (for example, deriving a step in an algorithm or discussing a concept from the classroom) and programming projects. A preliminary list of topics for the assignments is as follows:

 

Assignment 1: Maximum Likelihood in data science: Classification and Regression

Assignment 2: EM algorithm in data science: Clustering and Autoencoders

Assignment 3: Projection operators and constrained linear regression

Assignment 4: Sparsity-based methods in data science

Assignment 5: Momentum method in Neural Networks: A simple transformer

Assignment 6: Optimal transport with stochastic optimization

Assignment 7: A simple Federated learning scheme 

 

General Instructions for Assignments:

The assignments are done in groups of 2 students. Find a partner and register your group in the "People" tab. You only need to register your group once, during module 1. Read the instruction pdf file of each assignment carefully. There can be errors or typos in them, so if you are unsure, contact the teacher as soon as you can. Specific instructions for submission are given on the assignment pages, but generally you need to submit a pdf file with your solutions to the questions and the reports of your coding projects, as well as your code, preferably as a Python notebook. It is your responsibility to make sure that the answers are clear and the code is commented well enough that they can be easily graded.

Only one person needs to submit on behalf of both students, but the front page of your submission must include the names of both students. Each assignment is due on Thursday of the week following the corresponding module (THIS RULE HAS CHANGED: from module 2 onward, the deadlines are on Mondays). This means the following schedule:

Module 1: November 10, 23:59

Module 2: November 20, 23:59

Module 3: November 27, 23:59

Module 4: December 4, 23:59

Module 5: December 11, 23:59

Module 6: December 18, 23:59

Module 7: December 25, 23:59

Each assignment will be available as soon as possible, but definitely before the first session of the corresponding module (on Tuesdays).  

Python is the preferred language for the assignments, but alternative arrangements can be made on request.

 

Grading

Each assignment is graded out of 100 and the same grade will be given to both members, except if there is a complaint by one of the members about the contribution of the other one. The details of the grading (points) are given in the instruction pdf. In order to pass the assignments you need to receive at least 40 points. If the grade is below 40, the teacher may ask for resubmission with instructions for improvement and new deadlines.

 

Late Submission:

You can request an extension of one extra week, but in that case your grade will be multiplied by 0.8 (i.e. the maximum will be 80). Under exceptional conditions, the full grade (i.e. out of 100) can be granted with the extension.
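As an illustration of the arithmetic above, the following is a minimal Python sketch of how an extension affects the points of one assignment. The function and constant names are hypothetical, and the sketch assumes that the 40-point pass threshold is applied to the points after the 0.8 multiplier; the syllabus does not state this explicitly, so check with the teacher if it matters for your case.

```python
PASS_THRESHOLD = 40   # minimum points needed to pass an assignment
LATE_FACTOR = 0.8     # multiplier applied when a one-week extension is used


def effective_points(raw_points, extension_used=False, full_grade_granted=False):
    """Return the points counted for one assignment (hypothetical helper).

    raw_points is the grade out of 100 before any late adjustment.
    full_grade_granted models the exceptional case where the extension
    does not reduce the grade.
    """
    if not 0 <= raw_points <= 100:
        raise ValueError("raw_points must be between 0 and 100")
    if extension_used and not full_grade_granted:
        return raw_points * LATE_FACTOR
    return raw_points


def passes_assignment(points):
    """Assumption: the pass threshold is checked on the adjusted points."""
    return points >= PASS_THRESHOLD


# Example: a submission worth 70 points handed in with an extension.
late_points = effective_points(70, extension_used=True)
print(late_points, passes_assignment(late_points))
```

Under this reading, a late submission needs at least 50 raw points to reach the 40-point threshold.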

 

Sessions on Fridays:

The third session of each week (usually on Fridays) has multiple purposes. You can attend this session for consultation about the assignment that you are currently working on. After consultation, we will discuss the assignment that was due the day before. The answers will not be revealed (due to the possibility of late submissions), but we may discuss other aspects of the assignment. Finally, you are encouraged to suggest "special topics" that interest you. In the last part of the sessions, we will discuss these topics on request. There will be a thread in the discussions where you can suggest a topic and the module it would suit best.

 

Examination form

There will not be any final exam. To pass the course, you need to pass every assignment (i.e. receive at least 40 points in every assignment). The reported grade is as follows:

Sufficient (3/G/40)
Good (4/G/60)
Very Good (5/VG/80)
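The pass rule and the grade bands above can be sketched in Python as follows. This is only an illustration under stated assumptions: the numbers 40, 60 and 80 are read as lower bounds of each band, and the function names are hypothetical. The syllabus does not say how the seven assignment grades are combined into one course score, so `grade_band` simply takes a single score as input and leaves that combination open.

```python
PASS_THRESHOLD = 40  # every assignment must reach at least this many points

# (lower bound, label) pairs, read from the list above; highest band first.
GRADE_BANDS = [
    (80, "Very Good (5/VG)"),
    (60, "Good (4/G)"),
    (40, "Sufficient (3/G)"),
]


def course_passed(assignment_points):
    """The course is passed only if every assignment reaches the threshold."""
    return all(p >= PASS_THRESHOLD for p in assignment_points)


def grade_band(score):
    """Map a single score out of 100 to a band (assumed lower bounds)."""
    for lower_bound, label in GRADE_BANDS:
        if score >= lower_bound:
            return label
    return "Not passed"


# Example with seven hypothetical assignment grades.
points = [85, 72, 64, 90, 41, 77, 58]
print(course_passed(points))
print(grade_band(72))
```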

 

Note: The grades shown on Canvas might initially be different due to e.g. automatic normalization.

Re-examination: If you do not pass the course, there will be re-exams in written format.

 

Learning objectives and syllabus

Learning objectives:

Knowledge and understanding:

  • describe different types of optimization problems, such as continuous/discrete/mixed, convex/nonconvex
  • explain what types of data science problems can be addressed by optimization problems
  • explain the main principles of different optimization algorithms and their global/local convergence
  • account for the computational complexity of optimization algorithms in data science, as well as their performance

Skills and abilities:

  • implement various optimization algorithms as computer programs
  • apply and adapt optimization algorithms to data science problems, such as machine learning and rule-based approaches
  • find approximate solutions to computationally hard problems
  • formulate various problems in data science as mathematical optimization problems

Judgement ability and approach:

  • reason about what type of information or features of the data could be useful in selecting optimization algorithms
  • select the appropriate evaluation methodology including performance and convergence analysis

 

Link to the syllabus on Studieportalen.

Study plan
