Teaching Sessions and Workshops
NIH Library Session - October 14th, 2021
We offered a half-day online bioinformatics workshop in collaboration with the NIH Library on October 14th, 2021. The workshop included a two-hour interactive data science and bioinformatics component that used the R statistical language and Google Cloud (BigQuery) to explore NCI genomic and proteomic (TCGA) datasets. The outline of the interactive component below links to the Jupyter notebooks used during the training. These notebooks can be run in Google Colab or in other Jupyter environments.
Exploration of BigQuery Datasets
Select age of cases in a project
Characterize expression of a single gene in normal and tumor samples
Find annotation information for a gene
Find and plot all mutations in and around a gene
Join mutation data with survival data
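As a rough illustration of the first exploration step (selecting the age of cases in a project), the query could be sketched as follows. This is a minimal Python sketch rather than the workshop's R notebook code. The table `isb-cgc.TCGA_bioclin_v0.Clinical` and the columns `case_barcode`, `age_at_diagnosis`, and `project_short_name` are assumptions not stated in this outline, and the function names are hypothetical.

```python
import re

# Assumed table and column names (not given in the outline above); adjust them
# to the schema of the dataset actually being queried.
DEFAULT_TABLE = "isb-cgc.TCGA_bioclin_v0.Clinical"

# Table names cannot be passed as query parameters, so check the identifier
# before interpolating it into the SQL text.
_TABLE_RE = re.compile(r"^[A-Za-z0-9_-]+\.[A-Za-z0-9_]+\.[A-Za-z0-9_]+$")


def build_age_query(table: str = DEFAULT_TABLE) -> str:
    """Return SQL selecting the age at diagnosis for every case in one project.

    The project name is left as the named parameter @project so that it is
    supplied separately from the SQL text.
    """
    if not _TABLE_RE.match(table):
        raise ValueError(f"unexpected table identifier: {table!r}")
    return (
        "SELECT case_barcode, age_at_diagnosis\n"
        f"FROM `{table}`\n"
        "WHERE project_short_name = @project\n"
        "  AND age_at_diagnosis IS NOT NULL\n"
        "ORDER BY age_at_diagnosis"
    )


def run_age_query(project: str, table: str = DEFAULT_TABLE):
    """Run the query and return a DataFrame.

    Requires the google-cloud-bigquery and pandas packages plus Google Cloud
    credentials; the import is lazy so build_age_query works without them.
    """
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("project", "STRING", project)
        ]
    )
    return client.query(build_age_query(table), job_config=job_config).to_dataframe()
```

With credentials configured, a call such as `run_age_query("TCGA-BRCA")` would return one row per case, assuming that project name exists in the table. The remaining exploration steps follow the same pattern with different tables and joins.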
Introduction to BigQuery Machine Learning
Train a classifier that uses expression profiles to predict tumor status
Evaluate the classifier
Use the classifier to predict tumor status of future samples
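The three machine learning steps above map onto three BigQuery ML statements: `CREATE MODEL`, `ML.EVALUATE`, and `ML.PREDICT`. The Python sketch below only assembles the SQL text. The choice of logistic regression, the label column `tumor_status`, and every model and table name are illustrative assumptions; the outline does not say which model type or tables the workshop notebooks used.

```python
def bqml_statements(model: str, train_table: str, test_table: str,
                    new_table: str, label: str = "tumor_status") -> dict:
    """Build the SQL for training, evaluating, and applying a classifier.

    All identifiers are hypothetical placeholders. The training table is
    assumed to hold only expression features plus the label column, because
    SELECT * treats every other column as a feature.
    """
    return {
        # Step 1: train a classifier on labeled expression profiles.
        "train": (
            f"CREATE OR REPLACE MODEL `{model}`\n"
            f"OPTIONS (model_type = 'LOGISTIC_REG', "
            f"input_label_cols = ['{label}']) AS\n"
            f"SELECT * FROM `{train_table}`"
        ),
        # Step 2: evaluate the classifier on held-out labeled samples.
        "evaluate": (
            f"SELECT * FROM ML.EVALUATE(MODEL `{model}`, TABLE `{test_table}`)"
        ),
        # Step 3: predict tumor status for new, unlabeled samples.
        "predict": (
            f"SELECT * FROM ML.PREDICT(MODEL `{model}`, TABLE `{new_table}`)"
        ),
    }


# Example with placeholder names; each statement can be pasted into the
# BigQuery console or submitted through a client library.
statements = bqml_statements(
    model="my-project.demo.tumor_classifier",
    train_table="my-project.demo.expression_train",
    test_table="my-project.demo.expression_test",
    new_table="my-project.demo.expression_new",
)
for step, sql in statements.items():
    print(f"-- {step}\n{sql}\n")
```

Keeping the evaluation table separate from the training table matters here: evaluating on the training samples would overstate how well the classifier predicts tumor status for future samples.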