Running a CWL pipeline on a public BAM file from ISB-CGC

This workflow gathers the GC content of a BAM file (or a list of BAM files) into a text file.

Requirements:

  • Docker

  • Gcsfuse

  • CWLtool (CWL)

  • A public BAM file from ISB-CGC at the address: gs://gdc-ccle-open/692a845c-7957-41f2-b679-5434c69ba25b/G27328.Calu-6.1.bam

To install Docker and CWL, see our VM Workflow Tools Installation Cheatsheet for instructions. To set up gcsfuse in order to get access to the BAM file, please visit Running Workflow with GCSFUSE.

Note

The requirements above are crucial to running this workflow. Please make sure you have them installed properly prior to running this workflow.
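
A quick way to confirm that the command-line tools are on your PATH before continuing is sketched below. The helper name check_tools is hypothetical (it is not part of this tutorial's files), and the sketch assumes the executables are named docker, gcsfuse and cwltool. It only checks that the commands exist, not that they are configured correctly.

```shell
#!/bin/sh
# check_tools: hypothetical helper, not part of the tutorial repository.
# Reports which of the given commands are missing from PATH.
# Returns 0 if all are present, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "MISSING: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed executable names for the three tools this tutorial relies on.
check_tools docker gcsfuse cwltool || echo "Install the missing tools before continuing."
```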

Download this tutorial:

$ sudo add-apt-repository universe
$ sudo apt update
$ sudo apt install subversion

#cloning this tutorial
$ svn checkout https://github.com/isb-cgc/RunningWorkflows-on-the-GoogleCloud/trunk/CWL-GCgather

Running CWLtool

You should now have a CWL-GCgather directory with 6 files inside: 1 main workflow file, “CWL-GCgather.cwl”; 4 *tools.cwl files; and 1 input file, “scatter_gather_pipeline.yml”. Next, change the path in scatter_gather_pipeline.yml to the one you created in the Running Workflow with GCSFUSE tutorial.

#go into the folder
$ cd CWL-GCgather
$ nano scatter_gather_pipeline.yml

At the top of the file you will see this:

filein:
  - {class: File, path: /opt/testGcsfuse/G27328.Calu-6.1.bam}

Replace “/opt/testGcsfuse/G27328.Calu-6.1.bam” with your new path from the gcsfuse tutorial, for example: “/home/thinh_vo/testGcsfuse/G27328.Calu-6.1.bam”. Save the change. The script is now ready to run with CWLtool, using this command:

$ cwltool CWL-GCgather.cwl scatter_gather_pipeline.yml
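
If you would rather script the path edit than make it by hand in nano, something like the following could work. This is a minimal sketch: set_bam_path is a hypothetical helper, and it assumes the input file contains a “path: ….bam” entry in the form shown above and that the new path contains no “|” or “&” characters (both are special to the sed expression used here).

```shell
#!/bin/sh
# set_bam_path: hypothetical helper, not part of the tutorial repository.
# Usage: set_bam_path <yml-file> <new-bam-path>
# Assumes the BAM path sits on a line of the form "path: <something>.bam"
# and that <new-bam-path> contains no '|' or '&' characters.
set_bam_path() {
    yml_file=$1
    new_path=$2
    # Replace everything from "path: " up to ".bam" with the new path,
    # writing to a temporary file first so a failed sed leaves the
    # original untouched.
    sed "s|path: [^}]*\.bam|path: ${new_path}|" "$yml_file" > "${yml_file}.tmp" &&
        mv "${yml_file}.tmp" "$yml_file"
}

# Example (uncomment and adjust the path to your own gcsfuse mount):
# set_bam_path scatter_gather_pipeline.yml /home/thinh_vo/testGcsfuse/G27328.Calu-6.1.bam
```

Run the helper from inside the CWL-GCgather folder, then check the result with cat scatter_gather_pipeline.yml before starting cwltool.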

If you receive this error: “docker: Got permission denied while trying to connect to the Docker daemon socket at unix”

Try:

$ sudo groupadd docker
$ sudo usermod -aG docker ${USER}
Then close and reopen the VM, and run the script again.
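
To see whether the group change has taken effect in your current session, a check along these lines may help. It is only a sketch: in_group is a hypothetical helper. It relies on id -nG, which when called without a user name lists the groups of the current session, so it keeps reporting the old membership until you have logged back in.

```shell
#!/bin/sh
# in_group: hypothetical helper, not part of the tutorial repository.
# Returns 0 if the current session belongs to the given group, 1 otherwise.
in_group() {
    group=$1
    # id -nG with no user argument prints the group names of the current
    # session, separated by spaces.
    for g in $(id -nG); do
        [ "$g" = "$group" ] && return 0
    done
    return 1
}

if in_group docker; then
    echo "This session is in the docker group."
else
    echo "This session is not in the docker group yet; log out and back in."
fi
```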

Note

This BAM file is quite large, so the workflow may take about 15 to 20 minutes to run.

Once CWLtool has finished, the result will be in the same folder, in a file called “final_output.txt”.
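
A small sketch for confirming that the output file was written and peeking at its first lines follows. The helper preview_result is hypothetical, and the sketch makes no assumption about the layout of final_output.txt beyond it being plain text.

```shell
#!/bin/sh
# preview_result: hypothetical helper, not part of the tutorial repository.
# Usage: preview_result <file> [number-of-lines, default 10]
# Prints the first lines of the file, or returns 1 if it is missing or empty.
preview_result() {
    result_file=$1
    lines=${2:-10}
    if [ ! -s "$result_file" ]; then
        echo "missing or empty: $result_file" >&2
        return 1
    fi
    head -n "$lines" "$result_file"
}

# Run from inside the CWL-GCgather folder after cwltool has finished.
preview_result final_output.txt || true
```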

To see the result of this workflow, you can check it here

