Intermediate R
The Course
It has been said that 80% of data analysis is spent on cleaning and preparing the data. In this course we introduce two relatively new additions to the R ecosystem: the dplyr and ggplot2 packages. In combination these provide a powerful toolkit that makes manipulating and visualising data easy and intuitive.
Course material and exercises are available to view as rendered HTML at https://github.com/mrccsc/r-intermediate.
All material is available to download under the GPL v2 license.
For information on other courses run by our team see our GitHub IO page.
The Team
This course was created and conducted as part of the shared-bioinformatics-training consortium.
For more information on the team see our GitHub IO page.
This course is free for MRC CSC and Imperial staff and students. If you would like to attend a future course contact thomas.carroll@imperial.ac.uk.
Setting up.
Install R.
R can be installed from the R-project website.
R 3.1.0 or higher is required for this course.
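The version requirement above can be checked from the R console. This is a minimal sketch; the variable name `required` is only illustrative.

```r
# Minimum R version stated for this course.
required <- "3.1.0"

# getRversion() returns the running R version in a form that can be
# compared directly against a version string.
if (getRversion() < required) {
  stop("R ", required, " or higher is required; found ",
       as.character(getRversion()))
}
message("R ", as.character(getRversion()), " meets the course requirement")
```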
Install RStudio.
RStudio can be installed from the RStudio website.
Install required packages.
Option 1 - (For your own personal computers)
Having downloaded R and RStudio, some additional packages are required (rmarkdown, dplyr, ggplot2, etc.).
To install these:
- First launch RStudio
- Install the packages in the R console using install.packages()
options(repos = c("CRAN" = "http://cran.ma.imperial.ac.uk"))
install.packages(c("tidyr","ggplot2","dplyr","stringr","lubridate","mangoTraining","readr"))
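After installation it can be useful to confirm that every package is actually available. The following is a small sketch rather than an official course step. It uses the package names from the install command, and `pkgs`, `installed` and `missing_pkgs` are illustrative variable names.

```r
# Packages named in the install command for this course.
pkgs <- c("tidyr", "ggplot2", "dplyr", "stringr",
          "lubridate", "mangoTraining", "readr")

# requireNamespace() returns FALSE instead of raising an error
# when a package is not installed.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Still to install: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All course packages are installed")
}
```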
Download the material
The material can be downloaded as a zip archive:
wget https://github.com/mrccsc/r-intermediate/archive/master.zip
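Where wget is not available (for example on many Windows machines), the same archive can be fetched from within R. This is a sketch under assumptions: `download_course` and its `dest_dir` argument are hypothetical names, not part of the course material, and the call needs network access.

```r
# Hypothetical helper: download the course zip and unpack it.
download_course <- function(dest_dir = ".") {
  url <- "https://github.com/mrccsc/r-intermediate/archive/master.zip"
  zip_path <- file.path(dest_dir, "r-intermediate-master.zip")

  # mode = "wb" keeps the binary archive intact on Windows.
  download.file(url, destfile = zip_path, mode = "wb")

  # unzip() returns the paths of the extracted files.
  unzip(zip_path, exdir = dest_dir)
}

# Example call (requires network access):
# files <- download_course()
```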
The R Sessions
Introduction
This section introduces us to the course and its motivation.
Link to HTML page - Session 1
Introduction to dplyr
In this session we introduce the basics of data manipulation using tidyr and dplyr.
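As a taste of what this session covers, here is a short sketch using R's built-in mtcars data set rather than the course data. It shows four common dplyr verbs and one tidyr reshape; the column name `wt_kg` and the variable names are illustrative. `gather()` is used because it is present in all tidyr releases; newer releases also offer `pivot_longer()`.

```r
library(dplyr)
library(tidyr)

# filter(): keep rows matching a condition (cars with more than 4 cylinders).
cars <- filter(mtcars, cyl > 4)

# select(): keep only the columns of interest.
cars <- select(cars, mpg, cyl, wt)

# mutate(): add a derived column (wt is recorded in units of 1000 lbs).
cars <- mutate(cars, wt_kg = wt * 453.592)

# arrange(): sort rows, here by descending fuel economy.
cars <- arrange(cars, desc(mpg))

# gather(): reshape from wide to long, stacking mpg and wt into
# a measure/value pair of columns.
long <- gather(cars, key = "measure", value = "value", mpg, wt)

head(cars)
head(long)
```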
Writing analysis workflows in R
In this session we look at "piping" data using the magrittr package.
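The idea behind piping can be sketched with a toy calculation: the `%>%` operator from magrittr passes the result on its left as the first argument of the function on its right, so a nested call can be read left to right. The variable names here are illustrative.

```r
library(magrittr)

x <- c(1, 4, 9, 16)

# Nested form: read from the inside out.
nested <- round(mean(sqrt(x)), 1)

# Piped form: the same steps, read in the order they are applied.
piped <- x %>%
  sqrt() %>%
  mean() %>%
  round(1)

identical(nested, piped)
```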
Link to HTML page - Session 3
Summarising and Combining data
In this session we look at grouping and summarising tools using dplyr.
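A minimal sketch of grouping and summarising, again using the built-in mtcars data set as a stand-in for the course data. The output column names `n_cars` and `mean_mpg` are illustrative.

```r
library(dplyr)

# Group the cars by number of cylinders, then collapse each group
# to one row holding a count and an average.
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(n_cars = n(),
            mean_mpg = mean(mpg))

by_cyl
```

The result has one row per value of `cyl`, which is the general pattern: `group_by()` defines the groups and `summarise()` reduces each one to a single row.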
Plotting data with ggplot2
In the final session we look at using ggplot2 to visualise tidy data.
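For orientation, a basic ggplot2 plot might look like the sketch below. It uses the built-in mtcars data set, not the course data, and the axis labels are illustrative.

```r
library(ggplot2)

# Map columns to aesthetics with aes(), then add a layer of points.
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (1000 lbs)",
       y = "Miles per gallon",
       colour = "Cylinders")

# Printing the object draws the plot.
print(p)
```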