OHSU Data Set
About the Oregon Health & Science University
The Oregon Health & Science University (OHSU) program contains data generated from chronic neutrophilic leukemia (CNL), atypical chronic myeloid leukemia (aCML), and unclassified myelodysplastic syndrome/myeloproliferative neoplasms (MDS/MPN-U), a group of rare, heterogeneous myeloid disorders.
About the Oregon Health & Science University Data
The data set consists of whole-exome and RNA sequencing of a cohort of over 100 cases of these rare hematologic malignancies. It presents the most complete survey of the genomic landscape of these diseases to date. The Project ID in the GDC Data Portal is OHSU-CNL.
For more information on the OHSU data, please refer to these sites:
Accessing the Oregon Health & Science University Data on the Cloud
In addition to accessing the files on the GDC Data Portal, you can access them from the GDC Google Cloud Storage Bucket, which means that you don’t need to download them to perform analysis. ISB-CGC stores the cloud file locations in tables in the isb-cgc-bq.GDC_case_file_metadata data set in BigQuery.
- To access these metadata files, go to the Google BigQuery console.
- Perform SQL queries to find the OHSU files. Here is an example:
SELECT active.*, file_gdc_url
FROM `isb-cgc-bq.GDC_case_file_metadata.fileData_active_current` AS active,
     `isb-cgc-bq.GDC_case_file_metadata.GDCfileID_to_GCSurl_current` AS GCSurl
WHERE program_name = 'OHSU'
  AND active.file_gdc_id = GCSurl.file_gdc_id
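The same query can also be run outside the console. The following is a minimal sketch, assuming the google-cloud-bigquery Python client library is installed and that you have a Google Cloud project of your own to bill the query to. The function names and the billing_project argument are hypothetical and are not part of the ISB-CGC documentation. The program name is passed as a query parameter rather than pasted into the SQL text.

```python
from typing import Any, Dict, List

# Table names taken from the example query above.
FILE_TABLE = "isb-cgc-bq.GDC_case_file_metadata.fileData_active_current"
URL_TABLE = "isb-cgc-bq.GDC_case_file_metadata.GDCfileID_to_GCSurl_current"


def build_file_query() -> str:
    """Return the file-location query, with the program name as the named parameter @program."""
    return (
        "SELECT active.*, file_gdc_url\n"
        f"FROM `{FILE_TABLE}` AS active,\n"
        f"     `{URL_TABLE}` AS GCSurl\n"
        "WHERE program_name = @program\n"
        "  AND active.file_gdc_id = GCSurl.file_gdc_id"
    )


def fetch_file_locations(billing_project: str, program: str = "OHSU") -> List[Dict[str, Any]]:
    """Run the query and return one dictionary per file row.

    billing_project is an assumption: the ID of your own Google Cloud
    project, which is charged for the query.
    """
    # Imported lazily so the query builder can be used without the client library.
    from google.cloud import bigquery

    client = bigquery.Client(project=billing_project)
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("program", "STRING", program)
        ]
    )
    rows = client.query(build_file_query(), job_config=job_config).result()
    return [dict(row.items()) for row in rows]
```

Calling fetch_file_locations("my-billing-project"), where my-billing-project is a placeholder, should return the file metadata rows together with the file_gdc_url column from the example query.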
Accessing the OHSU Data in Google BigQuery
ISB-CGC stores OHSU data, such as clinical data, in Google BigQuery tables. Information about these tables can be found using the ISB-CGC BigQuery Table Search with OHSU selected for the PROGRAM filter. To learn more about this tool, see the ISB-CGC BigQuery Table Search documentation.
The OHSU tables are in project isb-cgc-bq. To learn more about how to view and query tables in the Google BigQuery console, see the ISB-CGC BigQuery Tables documentation.
- Data set isb-cgc-bq.OHSU contains the latest tables for each data type.
- Data set isb-cgc-bq.OHSU_versioned contains previously released tables, as well as the most current table.
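To see which tables each data set currently holds, the tables can be listed programmatically. This is a minimal sketch, assuming the google-cloud-bigquery Python client library and a billing project of your own. The helper names ohsu_dataset and list_ohsu_tables are hypothetical and introduced only for illustration.

```python
from typing import List

DATA_PROJECT = "isb-cgc-bq"  # project that hosts the OHSU tables


def ohsu_dataset(versioned: bool = False) -> str:
    """Return the fully qualified data set ID for the latest or versioned tables."""
    name = "OHSU_versioned" if versioned else "OHSU"
    return f"{DATA_PROJECT}.{name}"


def list_ohsu_tables(billing_project: str, versioned: bool = False) -> List[str]:
    """List the table IDs in the chosen OHSU data set, sorted by name.

    billing_project is an assumption: the ID of your own Google Cloud project.
    """
    # Imported lazily so ohsu_dataset() works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client(project=billing_project)
    return sorted(t.table_id for t in client.list_tables(ohsu_dataset(versioned)))
```

Passing versioned=True lists the tables in isb-cgc-bq.OHSU_versioned instead of isb-cgc-bq.OHSU.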