COMP120 Practical Data Science

An introduction to the techniques used to prepare, integrate, manage and visualise complex data using modern software environments. Essential to students needing to manage data in science, business, or humanities.

Data science skills are in increasing demand in both industry and academia. Being able to manipulate and model data effectively and safely will be a key advantage for future employment. COMP120 introduces the fundamental concepts of data science through practical use of the industry-standard software environment R. You will learn how to program in R and how to manage and manipulate data effectively, and you will be exposed to the "round trip" of data science (Import, Tidy, Transform, Visualise, Model, and Communicate).

Upon completion of COMP120, you will be well equipped to embark on your own data acquisition and management in R, and well prepared for other papers that use R for analysis and modelling.

Paper title Practical Data Science
Paper code COMP120
Subject Computer and Information Science
EFTS 0.15
Points 18 points
Teaching period(s) Semester 1 (On campus), Semester 2 (On campus)
Domestic Tuition Fees (NZD) $1,141.35
International Tuition Fees Tuition Fees for international students are elsewhere on this website.

Schedule C
Arts and Music, Commerce, Science
Contact

Associate Professor Tony Savarimuthu
Department of Information Science
tony.savarimuthu@otago.ac.nz

Teaching staff

Associate Professor Tony Savarimuthu
Department of Information Science

Paper Structure

This paper covers the following key themes:

  1. Introduction to programming in R and RStudio
  2. Importing and tidying data in R ("Data Wrangling")
  3. Plotting and visualising data in R
  4. Data aggregation and summarisation in R
  5. Semi-structured data manipulation using Web Scraping as an example
  6. Building models in R

Note that the modelling aspect in this paper introduces the framework through which models are built in R, and is not intended as a substitute for other modelling papers.

Teaching Arrangements

2 x one-hour lectures per week

1 x two-hour lab per week

Textbooks

Wickham and Grolemund, R for Data Science, O’Reilly, 2016 (available online)

Graduate Attributes Emphasised
Communication, Information literacy, Research.
Learning Outcomes

Upon completion of COMP120, students should be able to:

  1. Automate data manipulation tasks using a contemporary software package
  2. Develop basic scripts to perform data management tasks
  3. Use relevant software to clean, manage and integrate data
  4. Create visualisations from data sources using appropriate software
  5. Manage and share data projects using version control systems and repositories

Timetable

Semester 1

Location
Dunedin
Teaching method
This paper is taught On Campus
Learning management system
Blackboard

Computer Lab

Stream  Days       Times        Weeks
Attend one stream from
A1      Tuesday    14:00-15:50  9-14, 16, 18-22
A2      Tuesday    16:00-17:50  9-14, 16, 18-22
AND one stream from
B1      Wednesday  11:00-12:50  17
B2      Friday     11:00-12:50  17

Lecture

Stream  Days     Times        Weeks
Attend
A1      Monday   14:00-14:50  9-14, 16-22
        Tuesday  09:00-09:50  9-14, 16, 18-22

Semester 2

Location
Dunedin
Teaching method
This paper is taught On Campus
Learning management system
Blackboard

Computer Lab

Stream  Days    Times        Weeks
Attend one stream from
A1      Friday  10:00-11:50  29-34, 36-41
A2      Friday  12:00-13:50  29-34, 36-41

Lecture

Stream  Days     Times        Weeks
Attend
A1      Monday   14:00-14:50  28-34, 36-41
        Tuesday  09:00-09:50  28-34, 36-41