MVE190 / MSG500 Linear statistical models
Course PM
This page contains the program of the course: lectures, exercise sessions and computer labs. Other information, such as learning outcomes, teachers, literature and examination, is in a separate course PM.
List of topics and potential exam questions: 2021-22_Topics and hence potential exam questions.pdf
Midcourse meeting notes: Minutes of the mid-course meeting 2021.pdf
Program
The schedule of the course is in TimeEdit. The course will take place entirely in-person in University rooms, NOT via zoom (with the exception of the compulsory mini-analyses presentations which can be attended in person or via zoom).
The course will be given in a hybrid fashion, that is, both in person in University rooms and live-streamed on zoom. Links are given below. Lectures will be recorded and the recordings posted on this page; however, students should not count on these recordings as a reliable substitute for in-person attendance: the lecturer will use the blackboard in addition to slides but may not use the directional cameras in the rooms, so some blackboard writing will likely not be easily visible. In-person attendance is therefore recommended.
We will have a single non-compulsory computer lab at 15.15-17.00 on 4 November, which will NOT be via zoom. This lab is not structured around presentations of concepts. It is useful for getting started with the R/RStudio software if you have no previous experience. If you already have some experience with R you probably won't need it.
Note that attendance is mandatory, either in the room or via zoom, for some activities: the Friday "mini-analyses". See the course PM for details and for what to do if you are unable to attend.
Lectures (the plan below is based on last year; deviations may occur).
Note regarding video lectures: you can (i) download them as MP4 files by hovering over the video preview, right-clicking and choosing "Save as"; or (ii) stream them. I have noticed that Canvas compresses the video in a way that occasionally makes some text slightly blurry. A downloaded version should be of higher quality.
week | Topics | zoom links and video lectures | slides/notes | code and data files | MiniAnalysis |
---|---|---|---|---|---|
44 | Tuesday: General introduction to the course and overview of several topics: bias in linear regression; least squares; parameter interpretation in simple linear regression; the 5 basic assumptions. | | lab0.pdf (you must go through this file before the Thursday lab). slides_1.pdf Also, check "lecture 1" in Jörnsten's notes. | Lecture1.R | |
44 | Thursday: derivation of least squares estimates; proof of unbiasedness; variance of the estimators. | | formula sheet for properties of expectation, variance etc. slides_2.pdf slides_2_annotated.pdf Check "lecture 2" in Jörnsten's notes (linked in the previous lecture). | Demo1-2021.R Lecture2.R sleeptab.dat | no minianalysis presentations this week. |
44 | Thursday computer lab at 15.15-17.00 in MVF24 and MVF25: this lab is not structured around presentations of concepts. We are there to help if you have questions regarding the exercises. | NO ZOOM (this is in-person) | lab1.pdf | repair1.txt | |
45 | Tuesday: some residual-based diagnostics; Box-Cox transformations; leverage values; deletion-based diagnostics; MSE. | | slides_3.pdf slides_3_annotated.pdf For Box-Cox transformations, see section 12.4 in Rawlings' book. Check "lecture 3" in Jörnsten's notes. | Demo1-2021.R (updated) boxcox-type_transforms.R | |
45 | Thursday: unbiasedness of yhat; expectation and variance of residuals; standardised residuals; unbiasedness of MSE; proof of the variability decomposition (SSEr, SStor, SSRegr); Rsquared; t-test construction. Optional exercises for self-study from Rawlings' book: exercise 1.1, exercise 1.4, relevant bits of exercise 1.9, ex. 1.10, ex. 1.16. | | slides_4.pdf slides_4_annotated.pdf Check "lecture 4" in Jörnsten's notes. | Demo3.R Lecture3.R | Friday 12 Nov*: present work for minianalysis1 (*typically 1 hour only). MiniAnalysis1-2021.pdf AirBnB_NYCity_2019.csv TV.dat zoom (in-room presence is recommended) |
46 | Tuesday: confidence intervals for parameters and for E(Y0); prediction intervals for Ypred0; Simpson's paradox and notation for multiple linear regression. | | slides_5.pdf slides_5_annotated.pdf (see also the relevant bits in lectures 4-5 of Jörnsten's notes) | Demo4.R Advertising.csv advertising.R Lecture4.R (in this file there are some bits we didn't do yet, such as the F-test) | no minianalysis presentation this week |
46 | Thursday: properties of the estimators in multiple regression and sampling distributions; confidence intervals, t-test and categorical covariates (not everything); problems with p-values and large datasets. Topics also found in chapter 9 ("Class variables") in Rawlings et al. up to sect. 9.3. Optional exercises for self-study from Rawlings' book: exercise 3.5 (parts a and d), ex. 3.10, ex. 3.11, ex. 3.12 (we never do regression without intercept... but if you are interested...), ex. 3.13. | | slides_6.pdf slides_6_annotated.pdf Check "lecture 10" in Jörnsten's notes. | multipleregression.R auction.R auction.dat | no minianalysis presentation this week |
47 | Tue: models with categorical and numerical covariates; multicollinearity and VIF; partial F test. | | slides_7.pdf slides_7_annotated.pdf Check "lecture 6" in Jörnsten's notes. | auction.R (updated!) Lecture6.R SA.dat multicollinear.R | |
47 | Thu: greedy variable selection: backward search; global F test; bias/variance trade-off and the pMSE; training and testing. | | slides_8.pdf slides_8_annotated.pdf Check "lecture 7" in Jörnsten's notes. | Lecture7.R | Friday 26 Nov*: present minianalysis 2 (*typically 1 hour only). MiniAnalysis2-2021.pdf https://chalmers.zoom.us/j/63885730848 |
48 | Tue: (from slides_8.pdf) pMSE with regsubsets. Then (slides_9.pdf) Mallows' Cp, interactions, adj-Rsquared. | The above is only the first 45 minutes, sorry! :( I am going to upload a recording from last year too, to compensate... Below is a chunk from last year. And the last chunk from 2020. | slides_9.pdf slides_9_annotated.pdf (optional, if you are interested) AkaikeEasyIntro.pdf See also "lecture 11" in Jörnsten's notes. | regsubsets-olsrr-categorical.R regsubsets-categorical-covar.pdf | |
48 | Thu: Kullback-Leibler; AIC, BIC; K-fold CV; LOOCV. | | slides_10.pdf slides_annotated_10.pdf | Lecture8.R rsquaredAICstep.R regsubsets-categorical-CV.R | Friday 3 Dec*: present minianalysis 3. MiniAnalysis3-2021.pdf regsubsets-categorical-covar.pdf regsubsets-olsrr-categorical.R PROJECT (NEW): project21.pdf apprentice.txt |
49 | Tue: hat-matrix; residuals; standardised residuals, studentised residuals; Cook's distance, DFBETAs; intro to GLMs. | | slides_11.pdf See also "lecture 14" in Jörnsten's notes. (Additional support: Agresti's book, chapters 4 and 14.4) | leverage-residuals-cook.R f6data.txt | |
49 | Thu: the exponential family; Newton-Raphson; Poisson regression; confidence intervals for GLMs; asymptotic properties of the MLE. | | slides_12.pdf slides_12_annotated.pdf | poisson_nb.R f10.txt f10b.txt poissregr-awards.R poisson_sim.csv | |
50 | Tue: CI for predictions; Wald test; deviance; likelihood ratio test; Poisson + offset term. | | slides_13_annotated.pdf slides_13.pdf | shipdamage.R (we also rediscussed poisson_nb.R and poissregr-awards.R, previously uploaded) | |
50 | Thu: (there WILL be a lecture) Negative binomial regression, also with offset. Quick tour through GLM diagnostics. | | slides_14.pdf | poisson_nb.R (updated from line 172 onward with a comparison of residuals for Poisson vs NB regression) | |
Computer labs and software
Software: We will use the statistical package R to analyze data, through the RStudio interface. You will need to install both on your computer; see the instructions.
No previous knowledge of R is required, but you are encouraged to attend the lab on Thursday 4 November. No further computer labs will be given.
Some useful resources:
- R: there are lots of (too many!) resources online for learning R; here is just one possible starting point, from the University of Copenhagen: http://r.sund.ku.dk/index.html
If you are familiar with MATLAB or Python, the following may be useful:
- a MATLAB/R cheat sheet is at http://mathesaurus.sourceforge.net/octave-r.html
- a MATLAB/Python/R cheat sheet: http://mathesaurus.sourceforge.net/matlab-python-xref.pdf