Overview
An introduction to the statistical learning techniques commonly used to analyse high-dimensional (or multivariate) data. Topics include penalised regression, classification trees, clustering, dimension reduction, bagging, stacking, boosting, random forests and ensemble learning.
The aim of this paper is to introduce students to many of the statistical learning techniques now used to analyse high-dimensional data. Students will learn the underlying rationale for each method and gain practice in applying it.
About this paper
Paper title | Modelling High Dimensional Data |
---|---|
Subject | Statistics |
EFTS | 0.15 |
Points | 18 points |
Teaching period | Semester 2 (On campus) |
Domestic Tuition Fees (NZD) | $981.75 |
International Tuition Fees | Tuition fees for international students are listed elsewhere on this website. |
- Prerequisite
- One of (ECON 210 or FINC 203 or STAT 210 or STAT 241) and STAT 260
- Restriction
- STAT 242, STAT 342, STAT 425
- Schedule C
- Arts and Music, Science
- Contact
- Teaching staff
Dr Matthew Parry
- Paper Structure
The main topics of this paper are:
- Penalised regression
- Classification trees
- Clustering
- Dimension reduction
- Bagging, stacking and boosting
- Random forests
- Ensemble learning
- Textbooks
Textbooks are not required for this paper.
- Graduate Attributes Emphasised
- Lifelong learning, Communication, Critical thinking, Research.
View more information about Otago's graduate attributes.
- Learning Outcomes
On completion of this paper, students will be able to:
- Describe the issues that arise when analysing high-dimensional data
- Use a range of statistical learning techniques in R to analyse real, high-dimensional data
- Determine the most appropriate technique for a given research objective
- Interpret the results of the analysis, including any assumptions or limitations
- Write a clear and succinct report on an analysis of high-dimensional data for a collaborator or potential client
Overview
An introduction to the statistical learning techniques commonly used to analyse high-dimensional (or multivariate) data. Topics include penalised regression, classification trees, clustering, dimension reduction, bagging, stacking, boosting, random forests and ensemble learning.
The aim of this paper is to introduce students to many of the statistical learning techniques now used to analyse high-dimensional data. Students will learn the underlying rationale for each method and gain practice in applying it.
About this paper
Paper title | Modelling High Dimensional Data |
---|---|
Subject | Statistics |
EFTS | 0.15 |
Points | 18 points |
Teaching period | Semester 2 (On campus) |
Domestic Tuition Fees | Tuition Fees for 2025 have not yet been set |
International Tuition Fees | Tuition fees for international students are listed elsewhere on this website. |
- Prerequisite
- One of (ECON 210 or FINC 203 or STAT 210 or STAT 241) and STAT 260
- Restriction
- STAT 242, STAT 342, STAT 425
- Schedule C
- Arts and Music, Science
- Contact
- Teaching staff
- Paper Structure
The main topics of this paper are:
- Principal component analysis
- Exploratory factor analysis
- Clustering methods
- Dimensionality reduction
- Classification methods
- Tree-based methods
- Penalised regression
- Multiple testing
- Textbooks
Textbooks are not required for this paper.
- Graduate Attributes Emphasised
- Lifelong learning, Communication, Critical thinking, Research.
View more information about Otago's graduate attributes.
- Learning Outcomes
On completion of this paper, students will be able to:
- Describe the issues that arise when analysing high-dimensional data
- Use a range of statistical learning techniques in R to analyse real, high-dimensional data
- Determine the most appropriate technique for a given research objective
- Interpret the results of the analysis, including any assumptions or limitations
- Write a clear and succinct report on an analysis of high-dimensional data for a collaborator or potential client