MACHINE LEARNING FOR ECONOMICS

Single-discipline course
Course code: 
110025-ENG

Course details

For students enrolled in the first year in academic year: 
2018/2019
Course (Italian name): 
MACHINE LEARNING FOR ECONOMICS
Course (English name): 
MACHINE LEARNING FOR ECONOMICS
Type of educational activity: 
Core (characterizing) educational activity
Course type: 
Compulsory
Scientific-disciplinary sector: 
STATISTICS (SECS-S/01)
Year of study: 
2
Academic year offered: 
2019/2020
Credits: 
6
Course coordinator: 
Shared courses (mutuazioni)

Further course information

Delivery mode: 
Conventional teaching
Language: 
English
Term: 
Second semester
Mandatory attendance: 
No
Lecture hours: 
48
Individual study hours: 
102
Area: 
Statistics and mathematics
Teaching materials: 
Prerequisites

A good knowledge of statistics (probability, inferential statistics, regression models) is required.
Previous exposure to a programming language, such as R or Python, is useful.

Educational goals

The course aims to provide knowledge of cutting-edge statistical tools for modeling and understanding large, complex data sets. These methods automatically detect patterns in data (i.e. they “learn” from data); the analyst can then use the uncovered patterns to make accurate predictions and decisions under uncertainty.

By the end of the course, the student will be able to:

a) choose and apply the appropriate statistical tool, in the class of statistical learning methods, for the analysis of different types of complex data coming from real-world problems;

b) use the open-source statistical software R (freely available for download at http://www.r-project.org) for statistical analysis, modeling and prediction;

c) interpret the results from a decision-making perspective.

Course content

- Introduction to machine learning: supervised versus unsupervised learning, the bias-variance trade-off.
- Regression: review of simple, multiple and logistic regression, non-linear regression models, ridge and lasso regression.
- Resampling methods: cross-validation and bootstrap.
- Classification: regression trees, bagging, random forests, boosting.
- Unsupervised learning: principal components analysis and clustering.
- Elements of spatial data analysis: areal data, spatial autocorrelation, models for areal data, disease mapping.
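
As an illustration of the resampling topic listed above, the following is a minimal sketch of k-fold cross-validation used to estimate the test mean squared error of a simple linear regression. It is written in Python with the standard library only, as a stand-in for the R code used in the labs. The function names (`k_fold_indices`, `fit_line`, `cv_mse`) and the simulated data are hypothetical and are not taken from the course materials.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k folds of nearly equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def cv_mse(xs, ys, k=5, seed=0):
    """k-fold cross-validation estimate of the test mean squared error."""
    folds = k_fold_indices(len(xs), k, seed)
    fold_mses = []
    for held_out in folds:
        held = set(held_out)
        # Fit on every observation outside the held-out fold.
        train = [i for i in range(len(xs)) if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Evaluate squared prediction error on the held-out fold only.
        errs = [(ys[i] - (a + b * xs[i])) ** 2 for i in held_out]
        fold_mses.append(sum(errs) / len(errs))
    # Average the k fold-level errors.
    return sum(fold_mses) / k


# Simulated data (an assumption for illustration): y = 2 + 3x + Gaussian noise.
rng = random.Random(1)
xs = [i / 10 for i in range(50)]
ys = [2 + 3 * x + rng.gauss(0, 1) for x in xs]
print(round(cv_mse(xs, ys, k=5), 3))
```

Each observation is held out exactly once, so every prediction error is computed on data the fitted line has not seen; this is what makes the averaged error an estimate of test error rather than training error.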

Textbooks and reading lists

The official course book is: 
James G., Witten D., Hastie T., Tibshirani R. (2013). An introduction to statistical learning with applications in R. Springer. 



More information about the book at the following links:

- https://www.springer.com/us/book/9781461471370
- https://www-bcf.usc.edu/~gareth/ISL/


Documentation for the R software is freely available at the following link: https://www.r-project.org/other-docs.html

Teaching methods

The course consists of class lectures and R lab sessions. The calendar of lectures and labs will be published on the e-learning platform before the course begins; labs will take place within the hours scheduled for the course (roughly one-third of classroom time).

Assessment and Evaluation

The exam consists of:

- a written test including open-ended and closed-ended (test-style) questions, covering theoretical topics or short applications of the methods studied;

- exercises to be solved using the R software, to assess the student's ability to analyse data and interpret software output.


The theoretical and practical parts are each worth approximately 50% of the total score.

Further information

Attending class lectures and R labs is strongly recommended.