Course syllabus
Course PM
This page contains the program of the course: lectures, exercise sessions and computer labs. Other information, such as learning outcomes, teachers, literature and examination, is in a separate course PM.
List of topics and potential exam questions: 2020-21_Topics.pdf
Minutes of the midcourse meeting: Minutes of the mid-course meeting2020-21.pdf
Program
The schedule of the course is in TimeEdit. However, all lectures will consist of pre-recorded videos that will be posted in the table below in due time, so you can watch them whenever it is most convenient for you. The lecturer will strive to upload material one day before the official lecture times found in TimeEdit, so that you can use part of the "lecture time" to ask questions instead; read more below.
There will, however, be a single computer lab at 15.15-17.00 on 5 November, which will take place live on Zoom. This lab is not structured around presentations of concepts: you will work by yourself, and during the lab you can book Zoom chats with the Teaching Assistants for questions and help (link to be added in the timetable below). The lab is useful for getting started with the R/RStudio software. If you already have some experience with R, you probably won't need it.
Note that attendance is mandatory for some activities, namely the Friday "mini-analyses". See the course PM for details and for what to do if you are unable to attend.
There will generally be two Zoom meetings each week, of at most 30 minutes each, where we can all meet. These will take place during "lecture" time, at 9.30-10.00 on Tuesdays and 14.00-14.30 on Thursdays, always at https://chalmers.zoom.us/my/picchini:
- You will be able to ask questions related to the online learning material.
- You may ask me to review or clarify material (if you email me questions beforehand I can prepare, but this is not necessary).
- You may comment on or discuss course content or course organization.
- We can discuss whatever else we want to discuss in the whole group.
- You can also email me at picchini@chalmers.se, or use the "Discussion" forum here on Canvas.
Finally, I will have "office hours", one hour each Wednesday at 9.00-10.00 on Zoom (except the first week), where students can contact me individually at https://chalmers.zoom.us/my/picchini. The office hours work as follows: you enter the "waiting room" of my Zoom office, and I admit one student at a time from that waiting room, on a first-come, first-served basis. If there is a line, each student should limit their question time to about 10 minutes. I will also try to answer questions posed here in Canvas when I see them.
Lectures (the plan below is based on last year; deviations may occur).
Note on the video lectures: you can either (i) download them as MP4 files, by clicking on the file names that appear above the preview, or (ii) stream them. I have noticed that Canvas compresses the streamed video in a way that occasionally makes some text slightly blurry. A downloaded version should be of higher quality.
week | Topics | video lectures | slides/notes | code and data files | MiniAnalysis |
---|---|---|---|---|---|
45 | Tuesday: general intro to the course and mention of several topics: bias in linear regression; least squares; parameter interpretation in simple linear regression; the 5 basic assumptions. | (L1_4FIX fixes the last 10 minutes of L1_4) | lab0.pdf (you must go through this file before the Thursday lab); Jörnsten's notes; slides_1.pdf. Check "lecture 1" in Jörnsten's notes. | Lecture1.R | |
45 | Thursday: derivation of the least squares estimates; proof of unbiasedness; variance of the estimators; start of residual-based diagnostics. | | slides_2.pdf; formula sheet for properties of expectation, variance etc. Check "lecture 2" in Jörnsten's notes. | Demo1-2020.R, Lecture2.R, sleeptab.dat | no minianalysis presentations this week |
45 | Thursday computer lab at 15.15: this lab is not structured around presentations of concepts. We are there to help if you have questions regarding the exercises. Book your questions here: https://docs.google.com/spreadsheets/d/1FtR3TOk_ti1EKpu_JOyMucatGy-vxNToEul4xf4Lknc/edit#gid=0 | | lab1.pdf | repair1.txt | |
46 | Tuesday: leverage values; deletion-based diagnostics; MSE; unbiasedness of yhat and variance of yhat. | | slides_3.pdf. Check "lecture 3" in Jörnsten's notes. | Demo1.R (updated!), Lecture2.R (updated!) | |
46 | Thursday: expectation and variance of residuals; standardised residuals; unbiasedness of MSE; proof of the variability decomposition (SSEr, SStor, SSRegr); Rsquared; t-test construction. Optional exercises for self-study from Rawlings' book: exercise 1.1, exercise 1.4, relevant bits of exercise 1.9, ex. 1.10, ex. 1.16. | | slides_4.pdf. Check "lecture 4" in Jörnsten's notes. | Lecture3.R, Demo3.R | Friday (typically 1 hour only): present work for MiniAnalysis1.pdf, bikesharing.csv, TV.dat |
47 | Tuesday: confidence intervals for parameters and for E(Y0). Prediction intervals for Ypred0. Simpson's paradox and notation for multiple linear regression. | | slides_5.pdf. Check "lecture 5" in Jörnsten's notes. | Lecture5.R, Demo4.R | no minianalysis presentation this week |
47 | Thursday: properties of the estimators in multiple regression and sampling distributions. Confidence intervals, t-test and categorical covariates (not everything); problems with p-values and large datasets. Topics also found in chapter 9 ("Class variables") in Rawlings et al., up to sect. 9.3. Optional exercises for self-study from Rawlings' book: exercise 3.5 (parts a and d), ex. 3.10, ex. 3.11, ex. 3.12 (we never do regression without intercept... but if you are interested...), ex. 3.13. | | slides_6.pdf. Check "lecture 10" in Jörnsten's notes. | multipleregression.R (updated!), cola.dat, auction.R, auction.dat | no minianalysis presentation this week |
48 | Tuesday: models with categorical and numerical covariates. Multicollinearity and VIF. Partial F test. | | slides_7.pdf. Check "lecture 6" in Jörnsten's notes. | Lecture6.R, SA.dat, multicollinear.R, global+partial_Ftests.R | |
48 | Thursday: greedy variable selection: backward search. Global F test. Bias/variance tradeoff and the pMSE. Training and testing (not everything: until slide 37). | | slides_8.pdf (until slide #37). Check "lecture 7" in Jörnsten's notes. | Lecture7.R | Friday (typically 1 hour only): present minianalysis 2, MiniAnalysis2-2020.pdf |
49 | Tuesday: (from slides_8.pdf) pMSE with regsubsets. Then (slides_9.pdf) interactions, adj-Rsquared, AIC. | | (finished slides_8.pdf) slides_9.pdf; regsubsets-categorical-covar.pdf (optional: AkaikeEasyIntro.pdf). See also "lecture 11" in Jörnsten's notes. | regsubsets-categorical.R (updated!) | |
49 | Thursday: BIC; K-fold CV; LOOCV; hat-matrix; residuals (intro). | | slides_10.pdf (updated). See also "lecture 8" in Jörnsten's notes, but notice there are typos there, as pointed out in the slides. | Lecture8.R, rsquaredAICstep.R, regsubsets-categorical-CV.R | Friday (typically 1 hour only): present minianalysis 3, Mini3-2020.pdf. PROJECT: project20.pdf (new!), medinsur.csv, bicyclist_counts.csv (new!) |
50 | Tuesday: standardised residuals, studentised residuals; Cook's distance, DFBETAs; intro to GLMs and the exponential family; Newton-Raphson; Poisson regression. | | slides_10.pdf (updated and completed); slides_11.pdf. See also "lecture 14" in Jörnsten's notes. (Additional support: Agresti's book, chapters 4 and 14.4.) | leverage-residuals-cook.R, f6data.txt, poisson_nb.R, f10.txt, f10b.txt | |
50 | Thursday: more on Poisson regression; confidence intervals for GLMs; asymptotic properties of the MLE; CI for predictions; Wald test; deviance; likelihood ratio test; Poisson + offset term, up to slide 36. | | slides_12.pdf | poissregr-awards.R, poisson_sim.csv | |
51 | Completed Poisson + offset. Negative binomial regression. Quick tour through GLM diagnostics. | | slides_13.pdf | shipdamage.R | |
Computer labs and software
Software: We will use the statistical package R to analyze data, through the RStudio interface. You will need to install both on your computer; see the instructions.
No previous knowledge of R is required, but you are encouraged to attend the lab on Thursday 5 November. No further computer labs will be given.
Some useful resources:
- R: there are lots of (too many!) resources online for learning R. Here is just one possible starting point, from the University of Copenhagen: http://r.sund.ku.dk/index.html
If you are familiar with MATLAB or Python, the following may be useful:
- a MATLAB/R cheat sheet is at http://mathesaurus.sourceforge.net/octave-r.html
- a MATLAB/Python/R cheat sheet is at http://mathesaurus.sourceforge.net/matlab-python-xref.pdf
If you already have a copy of R installed on your computer, please check that its version is >= 3.6.0. If it is older, install a more recent one.
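If you are unsure which version you have, the check can be done from the R console. The lines below are a minimal sketch using base R's `getRversion()`; the variable name `meets_minimum` is only illustrative and not part of the course material.

```r
# Compare the installed R version against the minimum required (3.6.0).
# getRversion() is part of base R; comparing it with a version string works directly.
meets_minimum <- getRversion() >= "3.6.0"

if (meets_minimum) {
  message("R ", getRversion(), " is recent enough for this course.")
} else {
  message("R ", getRversion(), " is older than 3.6.0: please install a more recent release.")
}
```

Typing `R.version.string` at the console also prints the installed version, if you prefer to compare by eye.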