ISB-CGC Notebooks

What’s a notebook?

Notebooks provide an interface to an interactive analysis environment. They mix code (usually R or Python), descriptive explanations, and visualizations, and are often used to demonstrate an analysis step by step. The notebooks below are tutorials for several frequently run analyses. You can run them in JupyterLab, RStudio, or Google Colaboratory.

I’m a novice, how do I…

Get started fast? Python R
Find GDC file locations? Python R
Plot a BigQuery result? Python R
Plot a heatmap using data in BigQuery? Python R
Work with cloud storage? Python  
Create cohorts of patients? Python R
Use PyPika or dbplyr to build a query? Python R
Create a complex cohort? Python R
Join multiple tables? Python  
Get started working with the COSMIC datasets? Python  
Convert a .bam file to a .fastq file with samtools? Python  
Find a GA4GH Tool Repository Service (TRS) tool? Python  
Run Workflow Execution Service (WES) tools? Python  
Use the ISB-CGC APIs? Python R

I’m an advanced user, how do I…

Make a BigQuery table from an NCBI GEO data set? Python  
Compare cohorts with survival analysis and feature comparison? Python R
Run an ANOVA with BigQuery?* Python R
Score gene sets in BigQuery?* Python R
Correlate gene expression and copy number variation? Python  
Compute gene-gene expression correlation using BigQuery? Python  
Create randomized subsets of patients using BigQuery? Python R
Convert a 10X scRNA-seq .bam file to .fastq with dsub? Python  
Quantify 10X scRNA-seq gene expression with Kallisto and BUStools? Python  
Compute Nearest Centroid Classification using BigQuery? Python R
Analyze data in the COSMIC Cancer Gene Census dataset? Python  
Use a BigQuery user-defined function to perform k-means clustering? Python  
Explore CPTAC protein abundances? Python  

*Notebook inspired by a Query of the Month blog post


Have feedback or corrections? Please email us at feedback@isb-cgc.org.