Course syllabus

This page contains the program of the course: lectures, exercise sessions and computer labs. Other information, such as learning outcomes, teachers, literature and examination, is in a separate course PM. Lectures are in-person only (no Zoom, no video-recorded lectures). For the compulsory mini-analysis presentations, you may either connect via Zoom or be in the room (see below).

Program

The schedule of the course is in TimeEdit.

Lectures will be "in-person" in University rooms and will not be recorded.

We will have a single in-person, non-compulsory computer lab at 15.15-17.00 on 6 November. Bring your own laptop. The lab does NOT include presentations of concepts. It is meant to get you started with the R/RStudio software if you have no previous experience; if you already have some experience with R, you probably won't need it.

Note that attendance is mandatory, either in the room or via Zoom, for some activities: the Friday "mini-analyses". See the course PM for details and for what to do if you are unable to attend.

Lectures (the plan below is based on last year's; deviations may occur).

Week | Topics | Slides/notes | Code and data files | Mini-analysis
45

Tuesday: General introduction to the course and overview of several topics: bias in linear regression; least squares; parameter interpretation in simple linear regression.

lab0.pdf

No mini-analysis this week

45

Thursday: Least squares estimates and their relation to correlation; interpretation of coefficients with transformed variables; the five basic assumptions and residual plots.

No mini-analysis this week

45

Thursday 6 November, computer lab in MVF24 and MVF25: BRING YOUR LAPTOP with R/RStudio installed. The lab does not include presentations of concepts; we will be there to help those who are new to R work through some basic exercises.

No mini-analysis this week

46

Tuesday: Proof of the unbiasedness of the OLS parameter estimators; variance of the estimators; some residual-based diagnostics; Box-Cox transformations; leverage values.

46

Thursday: Deletion-based diagnostics; MSE and proof of the unbiasedness of MSE; unbiasedness of yhat; expectation and variance of residuals; variability decomposition (SSEr, SStot, SSRegr); R-squared (coefficient of determination); t-test construction.

Optional exercises for self-study from the book by Rawlings et al.: exercises 1.1, 1.4, the relevant parts of 1.9, 1.10 and 1.16.

Friday 14 Nov*: present work for mini-analysis 1.

47

Tuesday: Completing t-tests; p-values; confidence intervals for parameters and for E(Y0); prediction intervals for Ypred0.

No mini-analysis this week

47

Thursday: Simpson's paradox and notation for multiple linear regression; properties of the estimators in multiple regression and sampling distributions; t-tests and categorical covariates (not everything). Topics also found in chapter 9 ("Class variables") of Rawlings et al., up to section 9.3.

Optional exercises for self-study from the book by Rawlings et al.: exercises 3.5 (parts a and d), 3.10, 3.11, 3.12 (we never do regression without an intercept, but if you are interested...) and 3.13.

No mini-analysis this week

48

Tue: Models with categorical and numerical covariates.

48

Thu: Problems with p-values and large datasets; multicollinearity; variance inflation factor; partial F-test.

Friday 28 Nov at 13.15*: present mini-analysis 2

(This event is typically 60-70 minutes long.)

49

Tue: Greedy variable selection: backward search; bias/variance tradeoff and the pMSE; training and testing; pMSE with regsubsets.

No mini-analysis this week

49

Thu: (Reprise of the end of slides 9); more on pMSE with categorical covariates; interactions.

No mini-analysis this week

50

Tue: Adjusted R-squared; Kullback-Leibler divergence; AIC; BIC; K-fold CV.

50

Thu: LOOCV; hat matrix; residuals; standardised residuals; studentised residuals; Cook's distance; DFBETAs; introduction to GLMs.

Friday 12 Dec at 13.15*: present mini-analysis 3

51

Tue: The exponential family; Newton-Raphson; Poisson regression; confidence intervals for GLMs; asymptotic properties of the MLE; CI for predictions; Wald test; deviance; likelihood ratio test.

No mini-analysis this week

51

Thu: Poisson regression with an offset term; negative binomial regression, also with an offset; quick tour of GLM diagnostics (diagnostics can be skipped for the exam).

No mini-analysis this week

Computer lab and software

You are encouraged to attend the lab on Thursday 6 November to experiment with some basic analyses. No further computer labs will be given.

Software: We will use the statistical package R to analyze data, through the RStudio interface. You will need to install both on your computer; see the page Installing R and Rstudio.
No previous knowledge of R is required.

Some useful resources:

If you are familiar with MATLAB or Python, the following may be useful:

 
