Data in R - An introduction to data manipulation and cleaning in R

Level: Introductory, No Experience Required

Keywords: R, Data Manipulation, Data Cleaning, Analytics

Note: Please see Prerequisite Section Below


View Session Slides - Opens new Tab

Download Session Files - Downloads .zip file


Session Summary:

It is often said that approximately 80-90% of a statistician's, data scientist's, or analyst's time is spent cleaning, manipulating, or exploring the data they are given. During this interactive practical, we will look at both the reasons for these processes and how to carry out basic cleaning, manipulation, selection, and transformation of data, in both standardized and conditional situations. This session is aimed at complete beginners and at those wanting to refresh their data handling skills in R.

In particular we will focus on:

  • Identifying and removing missing values.
  • Selecting variables and observations.
  • Transforming and manipulating data, in line with psychometric procedures.
  • Generating basic summary statistics for reporting descriptive details.
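
As a rough illustration of these four topics, here is a minimal sketch using dplyr from the tidyverse. It is not taken from the session files: the `survey` data, its column names, the 1-5 response scale, and the cut-off of 8 are all hypothetical, chosen only to make the example self-contained.

```r
library(dplyr)

# Hypothetical survey data (an assumption for illustration):
# two items scored 1-5, with one missing response on item1
survey <- data.frame(
  id    = 1:5,
  group = c("a", "a", "b", "b", "b"),
  item1 = c(4, 5, NA, 2, 3),
  item2 = c(1, 2, 4, 5, 3)
)

# Identify missing values: count the NAs in each column
colSums(is.na(survey))

cleaned <- survey %>%
  filter(!is.na(item1)) %>%              # remove observations with a missing item1
  select(id, group, item1, item2) %>%    # keep only the variables of interest
  mutate(
    item2_rev = 6 - item2,               # reverse-score a 1-5 item
    total     = item1 + item2_rev,       # simple scale total
    band      = if_else(total >= 8, "high", "low")  # conditional recode (cut-off is arbitrary)
  )

# Basic summary statistics per group, for descriptive reporting
summary_table <- cleaned %>%
  group_by(group) %>%
  summarise(n = n(), mean_total = mean(total), sd_total = sd(total))

summary_table
```

Reverse-scoring is used here as one common example of a psychometric transformation; the session may cover different procedures.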

Session Objectives:

  • Understand the importance of data manipulation and cleaning for good data analytic practices.
  • Understand how to clean, manipulate, select, and transform data in both standardized and conditional situations.

Transferable Skills:

  • Data Manipulation
  • Data Cleaning
  • Basic R Programming, using the tidyverse

Prerequisite Knowledge:

No programming experience is required. However, some awareness is required, as provided through the Training Tracker courses “Analytical Training: Awareness of Coding Tools” and “Awareness in Data Visualisation”, and associated courses through the learning academy.

Prerequisite Content:

Access to R and RStudio (R’s graphical user interface) or RStudio Cloud (free, online), and the provided .zip file.