DIY Workshop

These materials were originally created for in-person workshops, and have been modified and updated to create a “Do It Yourself” workshop that you should be able to work through on your own. If you run into problems, please email us.

Step #1: Setting up Your Local Environment

Your Google Identity

You may already have a Google identity – your institutional email may be a Google identity (if your institution uses Google Apps), or you may have a personal Gmail address. One way to check whether your email address is a Google-managed identity is to go to the password assistance page, select “I don’t know my password” and enter your email address. If you get a response like “Please contact your domain IT administrator”, then your email address is not a Google identity.

If you don’t have a Google identity, it only takes a minute to create one.

Installing the Google Cloud SDK

The Google Cloud SDK is an essential toolbox for anyone working with the Google Cloud Platform. The Cloud SDK is easy to install and runs on Linux, Mac OS X, and Windows. It includes all of the command line tools, local emulators, and libraries that you will need. There are three key command line interfaces (CLIs) that you’ll want to become comfortable using:

  • gcloud enables seamless local authentication and powerful command line access to many cloud resources
  • gsutil lets you access Google Cloud Storage (GCS) from the command line
  • bq provides access to BigQuery from the command line
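After installing the SDK, you may want to confirm that all three tools are actually on your PATH. The following is a minimal sketch in Python, using only the standard library; the helper name find_cloud_sdk_tools is our own and is not part of the Cloud SDK.

```python
import shutil

# The three CLIs named above; each should be on PATH after installing the Cloud SDK.
CLOUD_SDK_TOOLS = ("gcloud", "gsutil", "bq")


def find_cloud_sdk_tools(tools=CLOUD_SDK_TOOLS):
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in find_cloud_sdk_tools().items():
        print(f"{tool}: {path if path else 'not found on PATH'}")
```

If any tool is reported as not found, re-running the SDK installer or opening a new terminal window (so that PATH changes take effect) is usually the first thing to try.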

Once you have the Cloud SDK installed, you can find out what your current/default Project ID is by running gcloud config list from the command line. To initialize your default configuration, run gcloud init and follow the instructions.
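If you would rather check your default Project ID from a script, here is a hedged sketch in Python. It assumes that gcloud config list prints INI-style sections (such as [core]) to standard output; the function names and the sample text are made up for illustration.

```python
import configparser
import shutil
import subprocess


def parse_gcloud_config(text):
    """Parse INI-style text, as printed by `gcloud config list`, into nested dicts."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


def current_project():
    """Return the default Project ID, or None if gcloud is missing or no project is set."""
    gcloud = shutil.which("gcloud")
    if gcloud is None:
        return None
    result = subprocess.run(
        [gcloud, "config", "list"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return None
    return parse_gcloud_config(result.stdout).get("core", {}).get("project")


# Made-up sample in the shape we assume `gcloud config list` prints.
SAMPLE = """\
[core]
account = someone@example.com
project = my-sample-project
"""

if __name__ == "__main__":
    print(parse_gcloud_config(SAMPLE)["core"]["project"])  # prints my-sample-project
    print(current_project())
```

For a quick one-off check at the command line, gcloud config get-value project prints just the Project ID.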

Updates to the SDK are published every week or two, so you will frequently see a message that says:

Updates are available for some Cloud SDK components.  To install them, please run: $ gcloud components update.

When you see this message, simply run gcloud components update at your convenience, and follow the instructions.

Installing Chrome

If you do not already use the Chrome browser, we strongly suggest that you install Google Chrome on your laptop or desktop. Although the ISB-CGC web-app should work on any modern browser, it is optimized for the Chrome browser.

Installing R and RStudio

If you want to be able to run R scripts locally, you will want to install R as well as the interactive environment RStudio. You can follow these tips to get started.

Step #2: Setting up Your Google Cloud Platform (GCP) Project

Creating / Obtaining your GCP Project

In order to make use of all of the data, tools, and functionality described in this workshop, you will also need your own GCP project.

We’d like to encourage you to take advantage of the free trial offered by Google. If you have already used this one-time offer (or there is some other reason you cannot use it) please see the information here about requesting an ISB-CGC provided (and funded) project. (We’ll also be happy to do that for you after you use the $300 Google credit / free trial.)

Google Cloud Platform Console

The Google Cloud Platform Console (which we will refer to from now on simply as the Console) is your web-based interface to your GCP Project. From the Console, you can check the overall status of your project, create and delete Cloud Storage buckets, upload and download files, spin up and shut down VMs, add members to your project, etc. No setup or installation is required.

  • sign into your Chrome (or other) browser using your Google identity (the one associated with the GCP project that you created yourself or that we set up for you)
  • go to the Google Cloud Platform Console
    • you should automatically be signed in to your own GCP project;
    • in the top blue bar, towards the right, you may be able to select between two or more projects;
    • in the GCP Console, if you click on Home you will see your current Project ID on the Dashboard
    • this Quick Tour of the Google Cloud Console will help you learn the basics that you are most likely to need

NOTE: If you’re just getting started working in the Google Cloud, you will probably only have one project. Over time, however, you may find that it is useful to create additional projects for any of a variety of reasons. You may have different grants or contracts that need to be charged for specific research activities, or you may have different groups of collaborators that you are working with, or you may be working with different sets of controlled-access data. All of these are good reasons to set up multiple, separate GCP projects. When you do so, however, you will need to learn to pay attention to which project is your “current” project. Any costs that you may incur will always be charged to your current project. The types of actions that incur costs include uploading data to a storage bucket, spinning up a VM, running a BigQuery query, etc.

  • If you are using the Console, you will see the Project Name in the blue bar at the top of the page, and the browser URL should include <project-id>.
  • At the command line, you can use the gcloud tool to verify your current configuration (as described above).
  • Finally, if you are using the BigQuery Web UI, the URL should also include <project-id>.
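When you work with more than one project, you can change the current project at the command line with gcloud config set project. The sketch below, in Python, builds that command after a basic format check. It assumes the standard Project ID rules (6 to 30 lowercase letters, digits, or hyphens, starting with a letter and not ending with a hyphen) and does not cover older domain-scoped IDs. The function names and the sample ID are hypothetical.

```python
import re

# Project IDs are 6-30 characters: lowercase letters, digits, and hyphens,
# starting with a letter and not ending with a hyphen.
PROJECT_ID_PATTERN = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def is_valid_project_id(project_id):
    """Return True if project_id matches the assumed Project ID format."""
    return PROJECT_ID_PATTERN.fullmatch(project_id) is not None


def set_project_command(project_id):
    """Build the argument list for `gcloud config set project <project-id>`."""
    if not is_valid_project_id(project_id):
        raise ValueError(f"not a well-formed Project ID: {project_id!r}")
    return ["gcloud", "config", "set", "project", project_id]


if __name__ == "__main__":
    # "my-sample-project" is a placeholder; substitute your own Project ID.
    print(" ".join(set_project_command("my-sample-project")))
    # prints: gcloud config set project my-sample-project
```

Checking the format first catches a common mistake, which is passing the human-readable Project Name instead of the Project ID.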

Enabling Required Google APIs

To make use of all of the functionality described in these tutorials (including running the example code available on GitHub), you will need to have certain APIs enabled for your GCP project. Specifically, you will need the following to be enabled (some may already be enabled by default):

  • Google Compute Engine
  • Google Genomics
  • Google BigQuery
  • Google Cloud Logging
  • Google Cloud Pub/Sub

This tutorial will walk you through the steps involved in enabling new APIs for your project.
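As an alternative to the Console, APIs can also be enabled from the command line with gcloud services enable. The Python sketch below only prints the command by default (a dry run). The service identifiers in the mapping are our assumption of how the APIs listed above are named; you can confirm them with gcloud services list --available before running anything. The function names are hypothetical.

```python
import subprocess

# ASSUMED mapping from the API names listed above to gcloud service identifiers.
REQUIRED_SERVICES = {
    "Google Compute Engine": "compute.googleapis.com",
    "Google Genomics": "genomics.googleapis.com",
    "Google BigQuery": "bigquery.googleapis.com",
    "Google Cloud Logging": "logging.googleapis.com",
    "Google Cloud Pub/Sub": "pubsub.googleapis.com",
}


def enable_services_command(services, project_id=None):
    """Build the argument list for `gcloud services enable ...`."""
    command = ["gcloud", "services", "enable", *services]
    if project_id:
        # Without --project, gcloud acts on your current/default project.
        command.append(f"--project={project_id}")
    return command


def enable_required_services(project_id=None, dry_run=True):
    """Print the command (dry run) or actually run it when dry_run is False."""
    command = enable_services_command(REQUIRED_SERVICES.values(), project_id)
    if dry_run:
        print(" ".join(command))
        return None
    return subprocess.run(command, check=True)


if __name__ == "__main__":
    enable_required_services()  # dry run: only prints the command
```

Keeping the dry run as the default lets you review the exact command, and the project it will act on, before any change is made.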

ISB Cancer Genomics Cloud (ISB-CGC)

  • ISB-CGC Web App & API Endpoints

Other Topics

DREAM Challenge: Somatic Mutation Challenge – RNA

Google Genomics

Have feedback or corrections? You can file an issue here or email us.