COMP120 Practical Data Science

An introduction to the techniques used to prepare, integrate, manage and visualise complex data using modern software environments. Essential to students needing to manage data in science, business, or humanities.

Data science skills are in increasing demand in both industry and academia. Being able to manipulate and model data effectively and safely will be a key strategic advantage for future employment. COMP 120 introduces the fundamental concepts of data science to students through practical use of the industry-standard software environment R. You will learn how to program in R, how to manage and manipulate data effectively in R, and be introduced to the “round trip” of data science (Import, Tidy, Transform, Visualise, Model, and Communicate).

Upon completion of COMP 120, you will be well equipped to carry out your own data acquisition and management in R, and well prepared for other papers that use R for analysis and modelling.

Paper title Practical Data Science
Paper code COMP120
Subject Information Science
EFTS 0.1500
Points 18 points
Teaching period(s) First Semester, Second Semester
Domestic Tuition Fees (NZD) $1,059.15
International Tuition Fees (NZD) $4,627.65

Schedule C Arts and Music, Commerce, Science

Contact

Grant Dick, Department of Information Science

Teaching staff

Grant Dick, Department of Information Science

Paper Structure

This paper covers the following key themes:

  1. Introduction to programming in R and RStudio
  2. Importing and tidying data in R (“Data Wrangling”)
  3. Plotting and visualising data in R
  4. Data aggregation and summarisation in R
  5. Semi-structured data manipulation using Web Scraping as an example
  6. Building models in R

Note that the modelling aspect in this paper introduces the framework through which models are built in R, and is not intended as a substitute for other modelling papers.

Teaching Arrangements

2 x 1-hour lectures per week

1 x 2-hour lab per week

Textbooks

Wickham and Grolemund, R for Data Science, O’Reilly, 2016 (available online)

Graduate Attributes Emphasised
Communication, Information literacy, Research.
Learning Outcomes

Upon completion of COMP 120, students should be able to:

  1. automate data manipulation tasks using a contemporary software package;
  2. develop basic scripts to perform data management tasks;
  3. use relevant software to clean, manage and integrate data;
  4. create visualisations from data sources using appropriate software; and
  5. manage and share data projects using version control systems and repositories.

Timetable

First Semester

Location
Dunedin
Teaching method
This paper is taught On Campus
Learning management system
Blackboard

Computer Lab

Attend one stream from:

Stream  Day      Time         Weeks
A1      Tuesday  14:00-15:50  9-16, 18-22
A2      Friday   09:00-10:50  9-15, 18-22

Lecture

Attend:

Stream  Day      Time         Weeks
A1      Monday   14:00-14:50  9-16, 18-22
A1      Tuesday  09:00-09:50  9-16, 18-22

Second Semester

Location
Dunedin
Teaching method
This paper is taught On Campus
Learning management system
Blackboard

Computer Lab

Attend one stream from:

Stream  Day       Time         Weeks
A1      Thursday  13:00-14:50  28-34, 36-41
A2      Thursday  11:00-12:50  28-34, 36-41

Lecture

Attend:

Stream  Day      Time         Weeks
A1      Monday   14:00-14:50  28-34, 36-41
A1      Tuesday  09:00-09:50  28-34, 36-41