    Statistical learning techniques commonly used to analyse high-dimensional (or multivariate) data: principal component analysis, clustering, dimensionality reduction, classification methods, tree-based methods (classification trees, random forests), penalised regression, and ensemble learning (bagging, stacking, boosting).

    The aim of this paper is to introduce students to many of the statistical learning techniques that are now used to analyse high-dimensional data. Students will learn the underlying rationale for each method and gain practice in using it on real data in R.

    About this paper

    Paper title: Statistical Learning
    Subject: Statistics
    EFTS: 0.1667
    Points: 20
    Teaching period: Semester 2 (On campus)
    Domestic tuition fees (NZD): $1,240.75
    International tuition fees: Tuition fees for international students are listed elsewhere on this website.
    Prerequisite: STAT 401 or (STAT 260 and STAT 270 and STAT 310), or equivalent (contact department for further information)
    STAT 312

    Teaching staff

    Dr Matthew Parry

    Paper Structure

    Main topics:

    • Principal component analysis.
    • Exploratory factor analysis.
    • Clustering methods.
    • Dimensionality reduction.
    • Classification methods.
    • Tree-based methods.
    • Penalised regression.
    • Multiple testing.

    Textbooks are not required for this paper.

    Graduate Attributes Emphasised

    Interdisciplinary perspective, Lifelong learning, Scholarship, Communication, Critical thinking, Research.

    Learning Outcomes

    Students who successfully complete the paper will be able to:

    • Describe the issues that arise when analysing high-dimensional data.
    • Use a range of statistical learning techniques to analyse real, high-dimensional data using R.
    • Determine which is the most appropriate technique for a given research objective.
    • Interpret the results of the analysis, including any assumptions or limitations.
    • Write a clear and succinct report on an analysis of high-dimensional data for a collaborator or potential client.
    • Give an oral presentation of their results.


    Semester 2

    Teaching method
    This paper is taught On Campus

    Stream  Days      Times        Weeks
    A1      Monday    12:00-12:50  29-35, 37-42
            Thursday  12:00-12:50  29-35, 37-42
            Friday    13:00-13:50  29-35, 37-42


    Stream  Days      Times        Weeks
    A1      Tuesday   14:00-14:50  29-35, 37-42
            Friday    15:00-15:50  29-35, 37-42