TCGA Radiology and Pathology Image Data Set

The TCGA images from The Cancer Imaging Archive (TCIA) as well as the pathology and diagnostic images previously available from the Cancer Digital Slide Archive (CDSA) are available in open-access Google Cloud Storage (GCS) buckets and can be explored through the ISB-CGC Web App.

Metadata for these files can be found in ISB-CGC Google BigQuery tables. Information about these tables can be found using the ISB-CGC BigQuery Table Search with TCGA selected for filter PROGRAM and FILE METADATA selected for filter CATEGORY. To learn more about this tool, see the ISB-CGC BigQuery Table Search documentation.

Radiology Images

Over 1.4 million radiology image files in DICOM format, grouped together into over 20,000 ZIP files are available in a GCS bucket called gs://isb-tcia-open/. Each ZIP file may contain hundreds of images or just a single image.

The BigQuery metadata table, isb-cgc-bq.TCGA.radiology_images_tcia_current contains the full URLs to these ZIP files, e.g.:

gs://isb-tcia-open/images/TCGA-GBM/TCGA-06-5413/TCIA.image.1.3.6.1.4.1.14519.5.2.1.4591.4001.275342915307453440215680715165.zip

The metadata table also includes the patient identifier in TCGA “barcode” format, e.g. TCGA-06-5413 (which is also part of the GCS URL). Other information available in the table includes the body part examined, image modality, patient age, etc.

Pathology Images

Note

All tissue slide images from the TCGA program are currently unavailable for viewing. (Diagnostic images will display.)

Over 30,000 TCGA tissue slide images in SVS format, are also available in GCS, in the open-access bucket gs://gdc-tcga-phs000178-open/.

These files were uploaded from the GDC legacy archive.

The BigQuery metadata table, isb-cgc-bq.TCGA.slide_images_gdc_current contains the full URLs to these SVS files, e.g.:

gs://gdc-tcga-phs000178-open/9c4b1b5c-b5cf-48f6-bf41-047ceb8c883c/TCGA-CR-7365-01A-01-TS1.811bb2b7-66e3-4694-891b-10b436ec300d.svs

as well as image metadata and the TCGA case and sample “barcode” which can be used to join this table with other TCGA clinical, biospecimen and molecular data tables.


Have feedback or corrections? Please email us at feedback@isb-cgc.org.