Laptop Setup Instructions
Instructions for setting up your laptop can be found here: Laptop Setup Instructions
Difference Between R and RStudio
RStudio doesn’t know where libraries are installed when they were not installed through the RStudio package manager. To tell RStudio the location, you can define the path in a startup file. Create a file called .Renviron and put this line inside it:
R_LIBS=<R Library Path of other installed packages>
This was the problem when students installed packages in RStudio at the command line using the R command install.packages().
Or you could use the package manager to install libraries.
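To check which library folders an R session actually searches, you can inspect `.libPaths()`. The sketch below is only an illustration: the folder `extra_lib` is a hypothetical stand-in (created in a temporary directory) for the library path you would put in `R_LIBS`.

```r
# Show the library folders the current R session searches for packages.
print(.libPaths())

# Show the value of R_LIBS, e.g. as set in .Renviron ("" if it is not set).
print(Sys.getenv("R_LIBS"))

# Hypothetical extra library folder; replace with your real library path.
extra_lib <- file.path(tempdir(), "extra_lib")
dir.create(extra_lib, showWarnings = FALSE)

# Add the folder to the search path for this session only.
# (Setting R_LIBS in .Renviron does the same at every startup.)
.libPaths(c(extra_lib, .libPaths()))

# The new folder should now be the first entry.
print(.libPaths()[1])
```

If a package was installed into such a folder, `library()` should find it once the folder appears in `.libPaths()`.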
Syntax highlighting
Syntax highlighting of scripts in the R editor does not seem to work under Windows. If you want highlighted syntax, use RStudio instead. Or (seriously), get a Mac.
Pre-Workshop Tutorials
1) R Preparation tutorials: You need to be familiar with the material covered in the Introduction to R tutorial, below. The tutorial should be very accessible even if you have never used R before.
Day 1
Welcome
*Faculty: Michelle Brazas*
Data Sets for Workshop Modules 1 – 5
Table_S3.csv
GvHD.txt
hiv.raw.data.24h.txt
logcho_237_4class.txt
GSE26922.dat
TsneRef.dat
Module 1: Exploratory Data Analysis
*Faculty: Boris Steipe*
Lecture:
Stats2015_Module1.pdf
Stats2015_Module1.ppt
Stats2015_Module1.mp4
Scripts:
Resources:
Helpful links
- The R help mailing list
- Rseek: the specialized search engine for R topics
- R questions on stackoverflow
- The Comprehensive R Archive Network CRAN
- The CRAN task-view collection
- Bioconductor task views
Module 2: Regression Analysis
*Faculty: Boris Steipe*
Lecture:
Stats2015_Module2.pdf
Stats2015_Module2.ppt
Stats2015_Module2.mp4
Scripts:
Links:
- Maximal Information Coefficient
- Homepage for data exploration with the MIC measure
- CRAN: package minerva (R wrapper for a fast MINE implementation)
Module 3: Dimension Reduction
*Faculty: Boris Steipe*
Lecture:
Stats2015_Module3.pdf
Stats2015_Module3.ppt
Stats2015_Module3.mp4
Scripts:
2015_EDA_Module_3_DimensionReduction.R
Integrated Assignment
*Faculty: Catalina Anghel and David Shih*
Part 1:
Assignment Part 1:
Stats2015_IntegratedAssignment_Part1.pdf
Questions in R - Part 1:
Stats2015_IntegratedAssignment_Part1_Questions.R
Answer Key - Part 1:
Stats2015_IntegratedAssignment_Part1_AnswerKey.R
Data Set:
R package: CCLE_0.1.1.tar.gz (RStudio users on any platform), CCLE_0.1.1.zip (non-RStudio users on Windows) (Note: Please right-click and select “Save link as…” or “Save target as…”)
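A downloaded package file can also be installed from the R console. The following is a minimal sketch, assuming the file sits in the current working directory and that the .tar.gz file is a source package; if the file is not found, it only prints a message.

```r
# Name of the downloaded package file (assumed to be in the working directory).
pkg_file <- "CCLE_0.1.1.tar.gz"

if (file.exists(pkg_file)) {
  # repos = NULL tells R to install from the local file instead of a repository.
  install.packages(pkg_file, repos = NULL, type = "source")
} else {
  message("File not found: ", pkg_file, " (check your working directory with getwd())")
}
```

For the Windows .zip file, the same call with the .zip file name and `type = "win.binary"` should apply.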
For your reference:
Preprocessing scripts: CCLE_preprocess.zip
Stats2015_IntegratedAssignment_Heatmap.R
Day 2
Module 4: Clustering Analysis
*Faculty: Boris Steipe*
Lecture:
Stats2015_Module4.pdf
Stats2015_Module4.ppt
Stats2015_Module4.mp4
Scripts:
Links:
- Comparison of Clustering Methods
- R-“task view”: Cluster Analysis (and Finite Mixture Models)
Dataset:
If you load the data from one of these files:
- Gset.RData: run
load("gset.RData")
on the command line. (Check whether ‘gset’ is actually lower case in the folder. You might need a capital letter at the start.)
- Platf.RData: run
load("platf.RData")
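Because the file name on disk may differ from the one you type only in capitalization, one option is to look the file up case-insensitively before loading it. This is a minimal sketch: the helper `load_ignoring_case()` is a hypothetical name introduced here, not part of the workshop scripts, and the demo uses a small stand-in object in a temporary folder.

```r
# Hypothetical helper: load an .RData file whose name matches 'name'
# regardless of upper/lower case. Returns the names of the loaded objects.
load_ignoring_case <- function(name, dir = ".", envir = parent.frame()) {
  files <- list.files(dir)
  hits <- files[tolower(files) == tolower(name)]
  if (length(hits) == 0) {
    stop("No file matching '", name, "' found in ", dir)
  }
  # load() restores the saved objects into 'envir' under their saved names.
  load(file.path(dir, hits[1]), envir = envir)
}

# Demo with a stand-in object (not the workshop data).
demo_dir <- tempfile("rdata_demo")
dir.create(demo_dir)
gset <- c(a = 1, b = 2)
save(gset, file = file.path(demo_dir, "Gset.RData"))
rm(gset)

# The file is found although the requested name starts with a lower-case 'g'.
loaded <- load_ignoring_case("gset.RData", dir = demo_dir)
print(loaded)
```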
R object file: GSE26922.rds
Read with:
gset <- readRDS("GSE26922.rds")
# do not run the following line:
gset <- gset[[idx]]
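The two file types behave differently: load() restores objects under the names they were saved with, while readRDS() returns a single object that you assign to a name of your choice. A small self-contained sketch, using a stand-in data frame (not the workshop data) and temporary files:

```r
# Stand-in object; the workshop file holds the GSE26922 data instead.
x <- data.frame(id = 1:3, value = c(0.5, 1.5, 2.5))

rds_file   <- tempfile(fileext = ".rds")
rdata_file <- tempfile(fileext = ".RData")

saveRDS(x, rds_file)        # stores one object, without its name
save(x, file = rdata_file)  # stores the object together with its name

y <- readRDS(rds_file)      # you choose the name when reading
rm(x)
load(rdata_file)            # recreates 'x' under its original name

# Compare the two restored copies.
print(all.equal(x, y))
```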
Module 5: Hypothesis Testing for EDA
*Faculty: Boris Steipe*
Lecture:
Stats2015_Module5.pdf
Stats2015_Module5.ppt
Stats2015_Module5.mp4
Script:
2015_EDA_Module_5_HypothesisTesting_Corrected.R
Links:
Integrated Assignment - Part 2
*Faculty: Catalina Anghel and David Shih*
Assignment Part 2:
Stats2015_IntegratedAssignment_Part2.pdf
Questions in R - Part 2:
Stats2015_IntegratedAssignment_Part2_Questions.R
Answer Key - Part 2:
Stats2015_IntegratedAssignment_Part2_AnswerKey.R
Other Readings
Other (more advanced) resources:
Manuals:
More detailed introduction to R. Not a basic tutorial, this is for people who really want to know more about R.
http://cran.r-project.org/doc/manuals/R-intro.html
Books:
1) “Introductory Statistics with R” by Peter Dalgaard. It is not required for this workshop but if you are interested in buying a good book and/or want to know more, you might want to consider getting a copy. The UofT library has an online version.
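2) “Bioinformatics and Computational Biology Solutions Using R and Bioconductor” (in the series Statistics for Biology and Health), edited by Robert Gentleman, Vincent Carey, Wolfgang Huber, Rafael Irizarry and Sandrine Dudoit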
3) Building Bioinformatics Solutions with Perl, R and MySQL by Conrad Bessant, Ian Shadforth and Darren Oakley
Post-workshop Readings
- Another good paper from Gentleman: Statistical_applications_in_genetics_and_molecular_biology_2005_Gentleman
- A tutorial article in PLoS Computational Biology: A Quick Guide to Teaching R Programming to Computational Biology Students