Data Mining and Statistical Learning, 15 ECTS Credits
Course category: Course within Master's Programme in Statistics, Data Analysis and Knowledge Discovery
Course code: 732A20
The course lays the foundation for professional work and research in which large amounts of data are explored, modified, modelled and assessed to uncover previously unknown patterns and trends.

Having completed the course, the student should be able to:
- utilize powerful statistical software to explore large and complex data sets, derive data-based predictors and classifiers, and assess the performance of such tools,
- use knowledge about powerful techniques for data-based prediction and classification,
- display a good understanding of the major principles for statistical learning from data,
- demonstrate insightful assessment of the quality of given data sets and the information content on which predictions and classifications can be based.
The course content comprises practical as well as theoretical elements, for example:
- computer exercises,
- basic concepts in statistical learning, in particular supervised learning,
- model selection strategies involving the use of training sets, validation sets, and test sets,
- decision trees and linear classification methods, such as discriminant analysis,
- classification and prediction based on neural networks, support vector machines, and generalized additive models, including logistic regression,
- ridge regression, spline smoothers and roughness penalty techniques,
- ensemble methods, including bagging and boosting.
Computer exercises in which the students have access to supervision provide practical experience of data analysis. The teaching comprises lectures, seminars, and computer exercises. The lectures are devoted to presentations of theories, concepts, and methods. The seminars comprise student presentations and discussions of assignments.
Language of instruction: English.
Assignments encompassing computer-based data analysis. One final written examination.

Students failing an exam covering either the entire course or part of the course two times are entitled to have a new examiner appointed for the reexamination.

Students who have passed an examination may not retake it in order to improve their grades.

Students entering the course should have passed at least one course in basic statistics and be familiar with linear statistical models, in particular simple and multiple regression. Also, it is a prerequisite that the students have passed courses in calculus and linear algebra.
Documented knowledge of English equivalent to "Engelska B", i.e. English as native language or an internationally recognized test, e.g. TOEFL (minimum scores: paper-based 550 + TWE score 4.0, computer-based 213, internet-based 79), IELTS Academic (minimum score: overall band 6.0 and no band under 5.0), or equivalent.
The course is graded according to the ECTS grading scale A-F.
Course certificate is issued by the Faculty Board on request. The Department provides a special form which should be submitted to the Student Affairs Division.
The course literature is decided upon by the department in question.
Planning and implementation of a course must take the wording of the syllabus as their starting point. The course evaluation included in each course must therefore address the question of how well the course agrees with the syllabus.

The course is carried out in such a way that both men's and women's experience and knowledge are made visible and developed.
Department responsible for the course or equivalent: MAI - Department of Mathematics
Registrar No: 1330/06-41
Course code: 732A20
Exam codes: see Local Computer System
Subject/Subject area: Statistics (Statistik) - STA
Level: A1X
Education level: Advanced level
Subject area code: STA
Field of education: SA
The syllabus was approved by the Board of Faculty of Arts and Science on 2006-12-18.