
Bioinformatics for Cancer Genomics 2016

Workshop pages for students



Course Schedule

Schedule for May 30 to June 3, 2016

Workshop Q/A Forum

Post your workshop questions here!

Workshop Survey

We appreciate your feedback on your experience at the workshop. Please complete our survey at the end of the workshop.

Laptop Setup Instructions

Instructions to set up your laptop can be found here.

Difference Between R and RStudio


RStudio does not know where libraries are installed if they were not installed through the RStudio package manager. To tell RStudio the location, define the path in a startup file: create a file called .Renviron containing the following line:

R_LIBS=<R Library Path of other installed packages>

This was the problem students ran into when they installed packages in RStudio at the command line using the R command install.packages().

Alternatively, use the RStudio package manager to install libraries.
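The .Renviron setup described above can be scripted from a terminal. The following is a minimal sketch, not the workshop's official procedure: the function name set_r_libs and the example library path are illustrative assumptions. To find the real path, run .libPaths() in the R session where the packages were installed.

```shell
# Sketch: add an R_LIBS entry to a .Renviron file.
# set_r_libs is a hypothetical helper name; both arguments are supplied by you.
set_r_libs() {
    renviron_file="$1"   # path to the .Renviron file
    lib_dir="$2"         # R library path of the other installed packages

    # Append the R_LIBS line only if the file does not already define R_LIBS,
    # so running the helper twice does not create duplicate entries.
    if ! grep -q '^R_LIBS=' "$renviron_file" 2>/dev/null; then
        printf 'R_LIBS=%s\n' "$lib_dir" >> "$renviron_file"
    fi
}

# Example call (uncomment and replace the illustrative library path):
# set_r_libs "$HOME/.Renviron" "$HOME/R/library"
```

Restart RStudio afterwards so that the startup file is read again.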

Syntax highlighting


Syntax highlighting of scripts in the R editor does not seem to work under Windows. If you want highlighted syntax, use RStudio instead.

Pre-Workshop Tutorials

1) R Preparation tutorials: You are expected to have completed the following tutorials in R beforehand. The tutorials should be very accessible even if you have never used R before.

2) Cytoscape 3.x Preparation tutorials: Complete the introductory tutorials for Cytoscape 3.x:

  • Introduction to Cytoscape3 - User Interface
  • Introduction to Cytoscape3 - Welcome Screen
  • Introduction to Cytoscape 3.1 - Networks, Data, Styles, Layouts and App Manager

3) UNIX Preparation tutorials: Please complete tutorials #1-3 in the UNIX Tutorial for Beginners.

Pre-workshop Readings

Before coming to the workshop, read these.

Database resources of the National Center for Biotechnology Information

COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer

Integrative genomic profiling of human prostate cancer

Predicting the functional impact of protein mutations: application to cancer genomics

Cancer genome sequencing study design

Using cloud computing infrastructure with CloudBioLinux, CloudMan, and Galaxy

The UCSC Genome Browser database: extensions and updates 2013

Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration

Feature-based classifiers for somatic mutation detection in tumour–normal paired sequencing data

Expression Data Analysis with Reactome

Logging into the Amazon Cloud

Instructions can be found here.

  • We have set up 30 instances on the Amazon cloud, one for each student. To log in to your instance, you will need a security certificate. If you plan on using Linux or Mac OS X, please download this certificate. If you plan on using Windows (with PuTTY and WinSCP), please download this certificate instead.
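For Linux and Mac OS X users, logging in with a downloaded certificate typically looks like the sketch below. It is only an illustration: the helper names, the certificate file name, the user name and the instance address are all placeholders, not values from the workshop, so take the real ones from the login instructions.

```shell
# Sketch only: every file name, user name and address here is a placeholder.

# Restrict the certificate so that only you can read it; ssh refuses
# private keys that other users are able to read.
protect_key() {
    chmod 600 "$1"
}

# Print (rather than run) the login command so it can be checked first.
print_login_command() {
    key_file="$1"
    remote_user="$2"
    instance_address="$3"
    printf 'ssh -i %s %s@%s\n' "$key_file" "$remote_user" "$instance_address"
}

# Example with placeholder values:
# protect_key "$HOME/Downloads/workshop_cert.pem"
# print_login_command "$HOME/Downloads/workshop_cert.pem" "REMOTE_USER" "INSTANCE_ADDRESS"
```

Printing the command first makes it easy to confirm the certificate path and address before connecting.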

Class Photo

Link to Download Class Photo

YouTube Playlist for Recorded Lectures

Recorded Lectures’ Playlist


Day 1

Welcome

Ann Meyer


Module 1: Introduction to Cancer Genomics

Trevor Pugh

Lecture

Recorded Lecture


Module 2.1: Databases and Visualization Tools

Michelle Brazas and Florence Cavalli

Lecture

Recorded Lecture

Lab practical for ICGC

Lab practical for IGV


Module 2.2: Logging into the Cloud

Francis Ouellette

Lecture


Optional R Review Session

Florence Cavalli

Lecture

R commands


Day 2

Module 3: Mapping and Genome Rearrangement

Jared Simpson

Lecture

Recorded Lecture

Lab practicals: Part 1 - Mapping and Part 2 - Rearrangements


Module 4: Gene Fusion Discovery

Andrew McPherson

Lecture and Lab

Recorded Lecture

Lab practical:

Papers and Background Material:


Day 3

Module 5: Copy Number Alterations

Sohrab Shah and Fong Chun Chan

Lecture

Lab practical

  • Lab Module
    • These are the instructions for the lab practical.
  • Data Analysis Package
    • Contains the various files, including an R Markdown file, that will be used for further exploration and analysis of copy number alterations.
    • This package is already on the server. You can also download it to your own computer and perform the analyses locally.
  • Software Installation
    • This page contains information on how to install the different software used in the lab practical.
  • Data Preparation
    • This page contains information on how the data was prepared for use in the lab practical.

Data for Lab Practical

Plots for Lab Practical

These plots are provided for convenience. They can be generated by following the lab practical.


Module 6: Somatic Mutations

Sohrab Shah and Fong Chun Chan

Lecture

Recorded Lecture

Lab practical

  • Lab Module
    • These are the instructions for the lab practical.
  • Data Analysis Package
    • Contains the various files, including an R Markdown file, that will be used for further exploration and analysis of somatic mutation data.
    • This package is already on the server. You can also download it to your own computer and perform the analyses locally.
  • Data Preparation
    • This page contains information on how the data was prepared for use in the lab practical.
  • Pre-processing BAMs
    • This page contains information on how to pre-process BAM files (e.g. filtering) for downstream analyses.


Day 4

Module 7: Gene Expression Profiling

Fouad Yousif

Lecture

Recorded Lecture

Lab practical with answers


Module 8: Variants to Networks

Part 1: How to annotate variants and prioritize potentially relevant ones

Robin Haw

Lecture

Recorded Lecture

Lab practical

Data Set Input - VCF

Data Set Output - Annovar text table

Lab Practical extra info

Annovar


Part 2: From genes to pathways

Juri Reimand

Lecture

Lab practical protocol

Data Sets Gene Lists:

Data Set Genelist GBM

Data Set Genelist KIRC

Data Sets Enrichment Results (g:Profiler) from Gene Lists:

gProfiler Results GBM

gProfiler Results KIRC

gProfiler hsapiens

Data Sets Enrichment Map (Cytoscape) from Enrichment Results:

EM cys

Enrichmentmap

Enrichment Map


Day 5

Part 3: Network Analysis using Reactome

Robin Haw

Lecture

Recorded Lecture

Lab practical and Answers

Data Sets:

OVCA_TCGA_Clinical.txt

OVCA_TCGA_GeneList.txt

OVCA_TCGA_MAF.txt

Papers:

Integrated genomic analyses of ovarian carcinoma

Clustering Algorithms: Newman Clustering and Hotnet

Reactome Website: NAR paper; Website guide

Nature Methods and Perspectives Paper

Supplementary Materials

Tools for finding/converting gene identifiers and gene attributes

Cytoscape

Useful plugins:

  • VistaClara - makes it easy to visualize gene expression data on networks
  • Agilent Literature Search - extracts interactions from PubMed abstracts
  • clusterMaker - provides multiple ways to cluster gene expression and networks
  • BiNGO - provides over-representation analysis using Gene Ontology in Cytoscape. You can select genes in your network or provide a list of genes and see the enrichment results visually mapped to the Gene Ontology.
  • commandTool, coreCommands - used to control Cytoscape through a series of commands, e.g. to automate a process: open network, lay out network, save network as PDF. These plugins require Cytoscape 2.7.
  • jActiveModules - requires gene expression data over multiple samples (>3). Finds regions of a network where genes are active (e.g. differentially expressed) across multiple samples.
  • EnrichmentMap
  • ReactomeFI
  • Many more


Module 9: Integration of Clinical Data

Anna Goldenberg and Lauren Erdman

Lecture

Recorded Lecture

Updated Lab

RData

R Commands

Tools:

Predict tool

Papers:

Gene expression profiling reveals molecularly and clinically distinct subtypes of glioblastoma multiforme

Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1

Integrative clustering of multiple genomic data types using a joint latent variable model with application to breast and lung cancer subtype analysis

Similarity network fusion for aggregating data types on a genomic scale


Data for the Workshop

Tool Installation

Instructions for installing the tools used in the workshops can be found here.

Data Sets

Results from Instructor’s Instance on Amazon
