
GetPileupSummaries fails if intersection of scatter intervals and variants is empty

myourshaw University of Colorado Member ✭✭
edited July 22 in Ask the GATK team

Using GATK 4 with this code from gatk4-somatic-snvs-indels/mutect2.wdl in task M2:

        if [[ ! -z "${variants_for_contamination}" ]]; then
            gatk --java-options "-Xmx${command_mem}m" GetPileupSummaries -R ${ref_fasta} -I ${tumor_bam} ${"--interval-set-rule INTERSECTION -L " + intervals} \
                -V ${variants_for_contamination} -L ${variants_for_contamination} -O tumor-pileups.table

            if [[ ! -z "${normal_bam}" ]]; then
                gatk --java-options "-Xmx${command_mem}m" GetPileupSummaries -R ${ref_fasta} -I ${normal_bam} ${"--interval-set-rule INTERSECTION -L " + intervals} \
                    -V ${variants_for_contamination} -L ${variants_for_contamination} -O normal-pileups.table
            fi
        fi

I get an error when the intersection of the scatter interval list with the variant "intervals" is empty:

              A USER ERROR has occurred: Argument -L, --interval-set-rule has a bad value:
              [
                /gpfs/share/cmoco_sys_dev/nfs/storage/cromwell/cromwell-executions/somatic_snvs_indels/3e4da46a-2937-4abf-9ee5-1167c1ae80fc/call-Mutect2/shard-0/Mutect2/a33ec807-f6bb-451b-96d2-0c32fd7032b8/call-M2/shard-0/inputs/1309905945/0000-scattered.interval_list,
                /gpfs/share/cmoco_sys_dev/nfs/storage/cromwell/cromwell-executions/somatic_snvs_indels/3e4da46a-2937-4abf-9ee5-1167c1ae80fc/call-Mutect2/shard-0/Mutect2/a33ec807-f6bb-451b-96d2-0c32fd7032b8/call-M2/shard-0/inputs/290972759/small_exac_common_3.hg38.vcf.gz
              ],INTERSECTION. The specified intervals had an empty intersection,

The empty intersection is a result of running the workflow with a small gene panel. Because the task already pre-creates empty cross-contamination outputs, GetPileupSummaries should exit gracefully with no output rather than raising an error and causing the entire workflow to abort.
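
To make the failing condition concrete, here is a minimal sketch of a pre-check that tests whether any site in the variants file falls inside the shard's interval list. The function name `has_overlap` is hypothetical and not part of GATK or mutect2.wdl. It assumes a tab-separated Picard-style interval_list (1-based, inclusive coordinates) and a plain or gzip/bgzip-compressed VCF, and it uses a simple linear scan that is only reasonable for small interval lists.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GATK or mutect2.wdl): returns 0 if at least
# one VCF record lies inside an interval of the interval_list, 1 otherwise.
has_overlap() {
    local interval_list="$1" vcf="$2"
    # gzip -cdf decompresses gzip/bgzip input and copies plain text unchanged.
    gzip -cdf "$vcf" | awk -F'\t' '
        # First input: the interval_list. Skip "@" header lines and store
        # each interval as contig, start, end (1-based, inclusive).
        NR == FNR {
            if ($0 !~ /^@/) { n++; chr[n] = $1; s[n] = $2 + 0; e[n] = $3 + 0 }
            next
        }
        # Second input (stdin): the VCF. Once a hit is found, skip the rest.
        found { next }
        /^#/  { next }
        {
            pos = $2 + 0
            for (i = 1; i <= n; i++)
                if ($1 == chr[i] && pos >= s[i] && pos <= e[i]) { found = 1; break }
        }
        END { exit(found ? 0 : 1) }
    ' "$interval_list" -
}
```

In the task above, the two GetPileupSummaries calls could then be guarded with something like `if has_overlap "${intervals}" "${variants_for_contamination}"; then ... fi`, assuming `intervals` is actually set for the shard. This is an illustration of the idea, not a tested change to the workflow.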

Other than eliminating the cross-contamination step, is there a workaround for this issue?

Answers

  • bshifaw Member, Broadie, Moderator admin

    Would you mind providing the stderr and script files generated by Cromwell?

  • myourshaw University of Colorado Member ✭✭

    Here are the requested stderr and script files (with ".txt" added to allow uploading).
    The files were in /gpfs/share/cmoco_sys_dev/nfs/storage/cromwell/cromwell-executions/somatic_snvs_indels/b5f46735-a134-4916-a30b-30737c31db79/call-Mutect2/shard-0/Mutect2/5ae26157-1921-45c7-bd26-e579aad2c89f/call-M2/shard-0/execution

    It is indeed the case that the intersection of the intervals and vcf is empty.

  • bshifaw Member, Broadie, Moderator admin

    Are you sure you want to continue running the CrossContamination step? It seems to be unreliable for small gene panels, and it is recommended to run without it here.

    I wonder why the interval file is coming up empty. Would you mind attaching the input and output interval files produced by the SplitInterval task? What are you using for the scattercount variable, and what happens when it is reduced?

    You could try removing set -e; this may prevent the failing task from stopping the workflow.
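
As a narrower alternative to removing set -e for the whole script, a single command can be allowed to fail. The sketch below assumes the task script runs under `set -e`. The wrapper name `run_tolerant` is hypothetical, and `false` stands in for a failing GetPileupSummaries call.

```shell
#!/usr/bin/env bash
set -e   # fail-fast setting, assumed to match the task script

# Hypothetical helper: run a command; if it fails, log a warning to stderr
# and return 0 so that set -e does not abort the rest of the script.
run_tolerant() {
    if "$@"; then
        return 0
    else
        local status=$?
        echo "WARNING: '$*' exited with status ${status}; continuing" >&2
        return 0
    fi
}

run_tolerant false        # stand-in for a failing command
echo "still running"      # reached even though the command above failed
```

In the task, the wrapper would go in front of the existing command line, for example `run_tolerant gatk --java-options ... GetPileupSummaries ...`. Note that this swallows every failure of the wrapped command, not only the empty-intersection error, so real problems would show up only as warnings in stderr.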
