Laptop Setup Instructions
Instructions for setting up your laptop can be found here: Laptop Setup Instructions
Difference Between R and R Studio
RStudio does not know where libraries are installed when they were not installed through the RStudio package manager. To tell RStudio the location, define the path in a startup file: create a file called .Renviron containing the line
R_LIBS=<R library path of other installed packages>
This was the problem students ran into when they installed packages in RStudio at the command line using the R command install.packages().
Alternatively, use the package manager to install libraries.
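The .Renviron entry can also be written by a small script. The Python sketch below is only an illustration: the function name set_r_libs and both library paths are hypothetical, and the demo writes to a throwaway file in a temporary directory rather than to your real .Renviron.

```python
import tempfile
from pathlib import Path


def set_r_libs(renviron: Path, lib_path: str) -> str:
    """Add or update the R_LIBS entry in an .Renviron file; return the line written."""
    entry = f"R_LIBS={lib_path}"
    lines = renviron.read_text().splitlines() if renviron.exists() else []
    # Drop any earlier R_LIBS definition so the file holds exactly one.
    kept = [ln for ln in lines if not ln.strip().startswith("R_LIBS=")]
    kept.append(entry)
    renviron.write_text("\n".join(kept) + "\n")
    return entry


# Demo against a throwaway file so nothing in your home directory changes.
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / ".Renviron"
    set_r_libs(demo_file, "/old/library")          # hypothetical earlier setting
    set_r_libs(demo_file, "/path/to/R/library")    # hypothetical new setting
    demo_contents = demo_file.read_text()

print(demo_contents)
```

To use it for real, pass the path of your own .Renviron file and the directory that actually holds the separately installed packages, then restart RStudio so the file is read again.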
Syntax highlighting
Syntax highlighting of scripts in the R editor does not seem to work under Windows. If you want highlighted syntax, use RStudio instead.
Pre-Workshop Tutorials
1) R Preparation tutorials: You are expected to have completed the following tutorials in R beforehand. The tutorials should be very accessible even if you have never used R before.
2) Cytoscape 3.x Preparation tutorials: Complete the introductory tutorial to Cytoscape 3.x: http://opentutorials.cgl.ucsf.edu/index.php/Portal:Cytoscape3
- Introduction to Cytoscape3 - User Interface
- Introduction to Cytoscape3 - Welcome Screen
- Introduction to Cytoscape 3.1 - Networks, Data, Styles, Layouts and App Manager
3) UNIX Preparation tutorials: Please complete tutorials #1-3 on UNIX at http://www.ee.surrey.ac.uk/Teaching/Unix/
Pre-Workshop Readings
Database resources of the National Center for Biotechnology Information
COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer
Integrative genomic profiling of human prostate cancer
Predicting the functional impact of protein mutations: application to cancer genomics
Cancer genome sequencing study design
Using cloud computing infrastructure with CloudBioLinux, CloudMan, and Galaxy
The UCSC Genome Browser database: extensions and updates 2013
Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration
Feature-based classifiers for somatic mutation detection in tumour–normal paired sequencing data
Expression Data Analysis with Reactome
Logging into the Amazon cloud
Instructions can be found here.
- These instructions will ONLY be relevant in class, as the Cloud will not be accessible from home in advance of the class.
Day 1
Module 1: Introduction to cancer genomics
*Faculty: John McPherson*
Lecture: BiCG_2015_Module1.pdf
Module 2: Databases and Visualization Tools
*Faculty: Francis Ouellette*
Lecture:
BiCG_2015_Module2.pdf
BiCG_2015_Module2.ppt
BiCG_2015_Module2.mp4
Toy Data Sets:
Chromosome 21: 19,000,000-20,000,000
HCC1143.normal.21.19M-20M.bam.bai
Other Resources on IGV:
Links:
* ICGC
* DCC portal on ICGC
* Docs for ICGC
* Integrative Genomics Viewer
* UCSC Genome Browser
* Cancer Genome Workbench
* cBioPortal for Cancer Genomics
* Savant Genome Browser
R Review Session
*Faculty: Sorana and Fouad*
Lecture:
Lab Practical:
Links:
* R Studio
Day 2
Module 3: Alignment and Genome rearrangements
*Faculty: Jared Simpson*
Lecture:
BiCG_2015_Module3.pdf
BiCG_2015_Module3.ppt
BiCG_2015_Module3.mp4
Lab Practical:
Installation Instructions Module 3
BiCG_2015_Module3_Lab1.txt
BiCG_2015_Module3_Lab2.txt
Bonus: You can view your results (the BAM and BAM.BAI files) in IGV by using the URL for each file on your Cloud instance. A web server runs on the Amazon cloud for each instance. In a browser such as Firefox, enter your server name (cbw#.dyndns.info) and all files under your workspace will be listed. Find your BAM and BAM.BAI files, right-click the BAM file and choose ‘Copy Link Location’. Start IGV, choose ‘Load from URL’ from the File menu, and paste the location you just copied; the BAM file you generated will then be displayed in IGV. Narrow the view to chromosome 15 or 17, where the breakpoints were identified.
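To make the URL step concrete, the Python sketch below assembles the address of a BAM file and of its index. It rests on several assumptions not stated in the text: that the # in cbw#.dyndns.info stands for a number, that the server answers plain http, and that the index sits beside the BAM file with an added .bai suffix. The function name bam_urls and the workspace path are hypothetical.

```python
from urllib.parse import quote


def bam_urls(instance: int, workspace_path: str) -> tuple[str, str]:
    """Return (BAM URL, index URL) for a file served from a workshop instance."""
    # Host pattern taken from the text: cbw#.dyndns.info
    host = f"cbw{instance}.dyndns.info"
    # Percent-encode unusual characters but keep the slashes in the path.
    bam = f"http://{host}/{quote(workspace_path.lstrip('/'))}"
    # Assumption: the index is published next to the BAM as <name>.bam.bai
    return bam, bam + ".bai"


# Hypothetical instance number and file location.
bam_url, bai_url = bam_urls(7, "Module3/results/sample.bam")
print(bam_url)
print(bai_url)
```

The first printed address is the one to paste into the ‘Load from URL’ dialog; the second is only shown so you can confirm in the browser listing that the index file is present as well.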
Links:
- What does my SAM flag mean? https://broadinstitute.github.io/picard/explain-flags.html
* Tools for Mapping High-throughput Sequencing Data Paper
* SAM/BAM file specifications
* samtools
* Picard
* bwa
* GASV
* BreakDancer
Extras:
example sam header
sam flags explained
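A SAM flag is a sum of individual bits, so any value can be decoded by testing each bit in turn. The Python sketch below assumes the bit meanings defined in the SAM/BAM file specification; the function name explain_flag is hypothetical and the descriptions are paraphrased rather than quoted.

```python
# Bit values follow the FLAG field definition in the SAM specification.
SAM_FLAGS = [
    (0x1, "read paired"),
    (0x2, "read mapped in proper pair"),
    (0x4, "read unmapped"),
    (0x8, "mate unmapped"),
    (0x10, "read on reverse strand"),
    (0x20, "mate on reverse strand"),
    (0x40, "first in pair"),
    (0x80, "second in pair"),
    (0x100, "not primary alignment"),
    (0x200, "fails platform/vendor quality checks"),
    (0x400, "PCR or optical duplicate"),
    (0x800, "supplementary alignment"),
]


def explain_flag(flag: int) -> list[str]:
    """List the properties whose bits are set in a SAM flag value."""
    return [name for bit, name in SAM_FLAGS if flag & bit]


# 99 = 64 + 32 + 2 + 1
print(explain_flag(99))
# 147 = 128 + 16 + 2 + 1
print(explain_flag(147))
```

Flags 99 and 147 are a common pairing for the two reads of a properly aligned pair: the first read on the forward strand and its mate on the reverse strand.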
Module 4: Gene Fusion Discovery
*Faculty: Andrew McPherson*
Lecture:
BiCG_2015_Module4.pdf
BiCG_2015_Module4.ppt
BiCG_2015_Module4.mp4
Lab Practical:
Lab 1:
Module 4 Prediction Lab Run
(For reference, the install instructions for all data and tools can be found here.)
Lab 2:
Lab 3:
Papers and Background Material:
Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks
ENCODE RNA-seq Standards
Links:
* BioStar
* SeqAnswers
* Integrative Genomics Viewer (IGV)
* FASTQ format
* SAM/BAM format
* Illumina iGenomes
* SamTools
* Picard
* FastQC
* SAMStat
* Bowtie
* Bowtie2
* TopHat/TopHat2
* Cufflinks/Cuffdiff
* CummeRbund
Day 3
Module 5: Copy Number Alterations
*Faculty: Sohrab Shah*
Lecture:
BiCG_2015_Module5.pdf
BiCG_2015_Module5.ppt
BiCG_2015_Module5.mp4
Lab Practical:
- Data Preparation for Copy Number Lab
- Software Installation for Copy Number Lab
- CNA Data Analysis Package
Data for Lab:
Plots for Lab:
- OncoSNP
- TITAN
Links:
* PennCNV-Affy: In-depth guide into pre-processing of Affymetrix 6.0 microarrays for OncoSNP
* OncoSNP
* Titan
* SnpEff/SnpSift
Module 6: Somatic Mutations
*Faculty: Sohrab Shah*
Lecture:
BiCG_2015_Module6.pdf
BiCG_2015_Module6.ppt
BiCG_2015_Module6.mp4
Lab Practical:
- Somatic Mutations Lab
- Software Installation for Somatic Mutations Lab
- Pre-processing Bams
- SNV Data Analysis Package
Links:
* Strelka
* MutationSeq
Day 4
Module 7: Gene Expression Profiling
*Faculty: Paul Boutros*
Lecture:
BiCG_2015_Module7.pdf
BiCG_2015_Module7.ppt
BiCG_2015_Module7.mp4
Lab Practical:
Data Sets:
- For R 3.0 / Bioconductor 2.13, Mac/Linux: hgu95av2hsentrezgcdf_18.0.0.tar.gz
- For R 3.0 / Bioconductor 2.13, PC: hgu95av2hsentrezgcdf_18.0.0.zip
- For R 3.1 / Bioconductor 3.0, Mac/Linux: hgu95av2hsentrezgcdf_19.0.0.tar.gz
- For R 3.1 / Bioconductor 3.0, PC: hgu95av2hsentrezgcdf_19.0.0.zip
When saving any of these files, change the first letter of the file name to a lower-case h.
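The choice among the four data set files depends only on the R version and the platform, which the Python sketch below spells out. The version and suffix tables come from the list above; the function names cdf_filename and fix_leading_case are hypothetical, and the renaming helper assumes the only problem with a saved file is an upper-case first letter.

```python
# R release series -> package version, as given in the list of data sets.
CDF_VERSION = {"3.0": "18.0.0", "3.1": "19.0.0"}
# Platform -> archive type, as given in the list of data sets.
ARCHIVE_SUFFIX = {"mac": ".tar.gz", "linux": ".tar.gz", "pc": ".zip"}


def cdf_filename(r_version: str, platform: str) -> str:
    """Pick the data set file name for an R version such as '3.1.2'."""
    series = ".".join(r_version.split(".")[:2])   # '3.1.2' -> '3.1'
    key = platform.strip().lower()
    if series not in CDF_VERSION or key not in ARCHIVE_SUFFIX:
        raise ValueError(f"no file listed for R {r_version} on {platform}")
    return f"hgu95av2hsentrezgcdf_{CDF_VERSION[series]}{ARCHIVE_SUFFIX[key]}"


def fix_leading_case(saved_name: str) -> str:
    """Lower-case the first character of a saved file name."""
    return saved_name[:1].lower() + saved_name[1:]


print(cdf_filename("3.1.2", "PC"))
print(fix_leading_case("Hgu95av2hsentrezgcdf_18.0.0.tar.gz"))
```

Any other R version or platform raises an error rather than guessing, since the page lists files for these four combinations only.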
Links:
Module 8: Variants to Pathways
Part I: Annotation of somatic coding variants and Part II: From Gene Lists to Pathways
*Faculty: Daniele Merico*
Lecture:
CBW_BiCG_2015_Module8_Part1_and_PartII.pdf
CBW_BiCG_2015_Module8_Part1_and_PartII.ppt
CBW_BiCG_2015_Module8_Part1_and_PartII.mp4
Part I Lab Practical: script (Annovar version March 2015)
Data Set: input (VCF):
Data Set: output (Annovar text table)
Lab Practical: extra info
Part II Lab Practical: protocol
Data Sets: Gene Lists Data Set Genelist GBM
Data Sets: Enrichment Results (g:Profiler) from Gene Lists
Data Sets: Enrichment Map (Cytoscape) from Enrichment Results
Day 5
Part III: Network Analysis using Reactome FI
*Faculty: Lincoln Stein and Robin Haw*
Lecture:
BiCG_2015_Module8_Part3.pdf
BiCG_2015_Module8_Part3.ppt
Lab Practical:
BiCG_2015_Module8_Part3_LabSlides.pdf
BiCG_2015_Module8_Part3_LabExercise.pdf
BiCG_2015_Module8_Part3_LabAnswers.pdf
Reactome User Guide
ReactomeFI User Guide
Data Sets:
Data Set Genelist KIRC
OVCA_TCGA_Clinical.txt
Papers:
Integrated genomic analyses of ovarian carcinoma
Clustering Algorithms: Newman Clustering and Hotnet
Reactome Website: NAR paper; Website guide
Nature Methods and Perspectives Paper
Links:
Pathway and Interaction databases
- GO
- KEGG
- Biocarta
- Reactome Curated human pathways
- NCI/PID
- Pathway Commons Aggregates pathways from multiple sources
- iRefWeb/iRefIndex Protein interactions
- >300 more
Tools for finding/converting gene identifiers and gene attributes
Cytoscape
Useful plugins:
- VistaClara - makes it easy to visualize gene expression data on networks
- Agilent Literature Search - extracts interactions from PubMed abstracts
- clusterMaker - provides multiple ways to cluster gene expression and networks
- BiNGO - provides over-representation analysis using Gene Ontology in Cytoscape - you can select genes in your network or provide a list of genes and see the enrichment results visually mapped to the Gene Ontology
- commandTool, coreCommands - used to control Cytoscape by a series of commands. E.g. automate the process: open network, layout network, save network as PDF. These plugins require Cytoscape 2.7
- jActiveModules - requires gene expression data over multiple samples (>3). Finds regions of a network where genes are active (e.g. differentially expressed) across multiple samples.
- EnrichmentMap
- ReactomeFI
- Many more
Special Guest Speaker: Dr. John Bartlett, Director of Transformative Pathology, Ontario Institute for Cancer Research
Guest Lecturer Biography - Dr. John Bartlett, Director Transformative Pathology, OICR
Guest Lecture Slides - These will be posted after the slides containing private data have been removed.
Module 9: Integration of Clinical Data
*Faculty: Anna Lapuk*
Lecture:
BiCG_2015_Module9.pdf
Lab Practical:
BiCG_2015_Module9_Lab.R
Taylor et al. Paper - Integrative genomic profiling of human prostate cancer PMC3198787
Data Sets:
Module 9 Data Files.zip
Papers:
Cox Regression Survival Paper.pdf
PMID17157792.pdf
PMID17157792 Supplementary Data
Tools with installation instructions on our Amazon server
Data Sets from Entire Workshops
* Module3 data
* data set for Module4,5,6
Results from Instructor’s Instance on Amazon
* Module3 result
* Module4 result
* Module5 result
* Module6 result
* Module8 part I result
Launching CBW AMI
Steps to launch CBW public AMI
AMI ID: ami-b9a253d2
AMI Name: CBW workshops 2015