ISB-CGC APIs

About

There are two Application Programming Interfaces (APIs) for interacting with ISB-CGC hosted data. The first, recommended for interacting with BigQuery tables, is the Google SDK. ISB-CGC also provides a Swagger API, intended for command-line access to our web app and to data generated through the web app.

The ISB-CGC APIs can also be used via Python and R. We have tutorial notebooks available in our Community Notebook Repository.

Some example use-cases that the ISB-CGC API is intended to address are:

  • Obtaining detailed metadata about a particular patient or sample
  • Creating (or retrieving a previously saved) cohort of patients and samples
  • Retrieving a cohort’s file manifest using the cohort ID or specific filters
  • Registering, refreshing, and unregistering a specified Google Cloud Project

Note that every API call that accesses user-generated data requires identity credentials.

Authorization

Some of the APIs, such as programs, samples, and cases, can be accessed without authorization. APIs that use information saved in a user's account, such as the cohorts and gcp APIs, require account authorization.

In order to access the APIs that require ISB-CGC authorization, you will need to generate a credentials file on your local machine or on your VM. To load your credentials into your command line interface:

  1. Clone the ISB-CGC scripts git repository to your local machine
  2. Run the isb_auth.py script, either from the command line or within Python
  3. If you are running the ISB-CGC APIs on a VM, upload the file generated by the above process
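Once a credentials file exists, each authorized request needs to carry the token from it. The following is a minimal sketch of that step, not the ISB-CGC implementation. It assumes the file is JSON and stores the token under an `access_token` key; both the default path and the key name are hypothetical, so check the file that isb_auth.py actually writes before relying on them.

```python
import json
from pathlib import Path

# Assumption: this path and the "access_token" key are hypothetical placeholders.
# Inspect the file generated by isb_auth.py to find the real location and field.
DEFAULT_CREDENTIALS_PATH = Path.home() / ".isb_credentials"


def load_auth_header(path=DEFAULT_CREDENTIALS_PATH, token_key="access_token"):
    """Read a JSON credentials file and return an HTTP Authorization header."""
    credentials = json.loads(Path(path).read_text())
    token = credentials.get(token_key)
    if not token:
        raise KeyError(f"no '{token_key}' field found in {path}")
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be merged into the headers of any request to an API that requires authorization.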

ISB-CGC API v4.0 UI Demo

The ISB-CGC API v4.0 UI can be used to see details about the syntax for each call, and also provides an interface to test requests.

To generate a subset of ISB-CGC hosted data with your desired characteristics, we provide tools for building cohorts of patients. In addition to using the BigQuery command line, users may create and share cohorts in the ISB-CGC web app and then access them through the Swagger UI API. (TCGA samples are easily subset using the 16-character barcode, e.g. TCGA-B9-7268-01A, while patients are identified by the 12-character prefix of the sample barcode, in this case TCGA-B9-7268. Other datasets, such as CCLE, may use other naming conventions.)
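The relationship between TCGA sample and patient barcodes can be sketched in a few lines of Python. The function name is illustrative, and the check applies only to the 16-character TCGA convention described above; other datasets such as CCLE would need their own rules.

```python
SAMPLE_BARCODE_LENGTH = 16  # e.g. TCGA-B9-7268-01A
CASE_BARCODE_LENGTH = 12    # e.g. TCGA-B9-7268


def case_barcode_from_sample(sample_barcode):
    """Return the 12-character patient (case) prefix of a TCGA sample barcode."""
    is_tcga = sample_barcode.startswith("TCGA-")
    if not is_tcga or len(sample_barcode) != SAMPLE_BARCODE_LENGTH:
        raise ValueError(f"not a 16-character TCGA sample barcode: {sample_barcode!r}")
    return sample_barcode[:CASE_BARCODE_LENGTH]


print(case_barcode_from_sample("TCGA-B9-7268-01A"))  # TCGA-B9-7268
```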

Make a Request

As mentioned above, some API calls require authentication; these are marked with a small lock symbol. To authenticate, use the ‘Authorize’ button at the top right of the page. For a quick demonstration of API call syntax, try the POST /samples request, which has the following syntax:

{
  "barcodes": [
    <barcode 1>,
    <barcode 2>,
    ...,
    <barcode n>
  ]
}
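Outside the Swagger UI, the same request body can be assembled programmatically. The sketch below builds the POST request with Python's standard library but does not send it. The base URL is an assumption included for illustration; confirm the real endpoint on the ISB-CGC API v4.0 UI page.

```python
import json
import urllib.request

# Assumption: illustrative base URL; verify it against the API v4.0 UI page.
API_BASE_URL = "https://api-dot-isb-cgc.appspot.com/v4"


def build_samples_request(barcodes, base_url=API_BASE_URL):
    """Build (but do not send) a POST /samples request for the given barcodes."""
    body = json.dumps({"barcodes": list(barcodes)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/samples",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_samples_request(["TCGA-B9-7268-01A"])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually submit the request: urllib.request.urlopen(request)
```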

The value in the Request Body field can be edited by selecting ‘Try it out’. One can change the default sample names / barcodes or simply leave the default ones. The request can be run by selecting ‘Execute’.

Request Response

Swagger UI submits the request and shows the curl code that was submitted. The ‘Response body’ section will display the response to the request. The expected format of the response for the above request is shown below:

{
  "data": [
    {
      "samples": [
        {
          "data_details": [
            {
              <key 1>: <value 1>,
              <key 2>: <value 2>,
              ...,
              <key n>: <value n>
            }
          ],
          "biospecimen_data": {
            <key 1>: <value 1>,
            <key 2>: <value 2>,
            ...,
            <key n>: <value n>
          },
          "sample_barcode": "string",
          "case_barcode": "string"
        }
      ]
    }
  ],
  "code": 0,
  "barcodes_not_found": [
    "string"
  ],
  "total_found": 0,
  "notes": "string"
}
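A response in this shape is straightforward to walk once it has been parsed from JSON. The following sketch collects the barcodes that were found and those that were not. The example dictionary is hand-written to match the documented format; it is not real API output, and the function name is illustrative.

```python
def summarize_samples_response(response):
    """Collect found and missing sample barcodes from a parsed /samples response."""
    found = [
        sample["sample_barcode"]
        for entry in response.get("data", [])
        for sample in entry.get("samples", [])
    ]
    return {
        "found": found,
        "not_found": response.get("barcodes_not_found", []),
        "total_found": response.get("total_found", 0),
    }


# Hand-written example following the documented structure (not real output).
example_response = {
    "data": [
        {
            "samples": [
                {
                    "data_details": [],
                    "biospecimen_data": {},
                    "sample_barcode": "TCGA-B9-7268-01A",
                    "case_barcode": "TCGA-B9-7268",
                }
            ]
        }
    ],
    "code": 200,
    "barcodes_not_found": [],
    "total_found": 1,
    "notes": "",
}

print(summarize_samples_response(example_response))
```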

The JSON-formatted response can be downloaded by selecting the ‘Download’ button. We provide API calls for specific samples, cases, files, cohorts, and users. The syntax for all of these is available on the ISB-CGC API v4.0 UI webpage. For any questions or feedback on the API, please do not hesitate to contact us at feedback@isb-cgc.org.

Nuances when using the APIs

  • Any special characters in an input field will cause the request to fail (for example, a stray space in the input box).
  • Please make sure to delete all fields not being used.
  • Case barcode centric requests only pull file paths specific to case entries.
  • Sample centric requests pull file paths specific to sample entries.
  • Cohorts made in CloudSQL (web app) will differ in sample counts from cohorts made with BigQuery tables (APIs). Samples which correspond to pathology slide images are available in the CloudSQL tables but not currently in the BigQuery tables.
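The first two points above can be guarded against before a request is submitted. This is a minimal sketch, assuming the request body is a flat dictionary of strings and lists; it only trims whitespace and drops empty fields, and does not check for other special characters. The function name and the `name` field in the usage line are hypothetical.

```python
def clean_request_body(body):
    """Strip surrounding whitespace from strings and drop unused (empty) fields."""
    cleaned = {}
    for key, value in body.items():
        if isinstance(value, str):
            value = value.strip()
        elif isinstance(value, list):
            # Trim each string entry, then discard entries that end up empty.
            value = [v.strip() if isinstance(v, str) else v for v in value]
            value = [v for v in value if v not in ("", None)]
        if value in ("", None, [], {}):
            continue  # unused field: leave it out of the request entirely
        cleaned[key] = value
    return cleaned


print(clean_request_body({"barcodes": [" TCGA-B9-7268-01A ", ""], "name": ""}))
```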
