ISB-CGC Notebooks

What’s a notebook?

Notebooks provide an interface to an interactive analysis environment. They mix code (usually R or Python), descriptive explanations, and visualizations, and are often used to demonstrate an analysis step by step. The notebooks below are tutorials for several frequently run analyses. You can run them in JupyterLab, RStudio, or Google Colaboratory.
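To make the "mix of code and descriptive explanations" concrete, here is a minimal sketch of how a Jupyter notebook file is laid out on disk. It assumes the standard Jupyter notebook JSON format (version 4) and uses only the Python standard library. The helper names (`markdown_cell`, `code_cell`, `make_notebook`) and the cell contents are hypothetical, chosen for illustration, and are not taken from the ISB-CGC notebooks.

```python
import json


def markdown_cell(text):
    """Build a descriptive (Markdown) cell. Hypothetical helper."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source):
    """Build an unexecuted code cell. Hypothetical helper."""
    return {
        "cell_type": "code",
        "execution_count": None,  # None means the cell has not been run yet
        "metadata": {},
        "outputs": [],            # outputs (text, plots) are stored here after a run
        "source": source,
    }


def make_notebook(cells):
    """Wrap cells in the top-level notebook structure (format version 4)."""
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


# Illustrative content only: one explanation cell followed by one code cell.
notebook = make_notebook([
    markdown_cell("# Example analysis\nThis cell explains the next step."),
    code_cell("total = sum([1, 2, 3])\nprint(total)"),
])

# A notebook is plain JSON, so it can be serialized and saved as a .ipynb file.
serialized = json.dumps(notebook, indent=1)
cell_types = [cell["cell_type"] for cell in notebook["cells"]]
```

Writing `serialized` to a file with an `.ipynb` extension should give a file that notebook environments such as JupyterLab or Colaboratory can open, with the explanation and the code shown as separate cells.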

I’m a novice, how do I…

How do I get started fast? Python R
How do I find GDC file locations? Python R
How do I plot a BigQuery result? Python R
How do I plot a heatmap using data in BigQuery? Python R
How do I work with cloud storage? Python  
How do I create cohorts of patients? Python R
How do I use PyPika or dbplyr to build a query? Python R
How do I create a complex cohort? Python R
How do I join multiple tables? Python  
How do I get started working with the COSMIC datasets? Python  
How do I convert a .bam file to a .fastq file with samtools? Python  
How do I find a tool using the GA4GH Tool Repository Service (TRS)? Python  
How do I run a tool using a workflow execution service (WES)? Python  
How do I use the ISB-CGC APIs? Python R

I’m an advanced user, how do I…

How do I make a BigQuery table from an NCBI GEO data set? Python  
How do I quickly compare cohorts with survival analysis and feature comparison? Python R
How do I run an ANOVA with BigQuery?* Python R
How do I score gene sets in BigQuery?* Python R
How do I correlate gene expression and copy number variation? Python  
How do I compute gene-gene expression correlation using BigQuery? Python  
How do I create randomized subsets of patients using BigQuery? Python R
How do I convert a 10X scRNA-seq .bam file to a .fastq file with dsub? Python  
How do I quantify 10X scRNA-seq gene expression with Kallisto and BUStools? Python  
How do I do Nearest Centroid Classification using BigQuery? Python R

*Notebook inspired by a Query of the Month blog post.


Have feedback or corrections? Please email us at feedback@isb-cgc.org.