GetPileupSummaries fails if intersection of scatter intervals and variants is empty

myourshaw, University of Colorado, Member ✭✭
edited July 22 in Ask the GATK team

Using GATK 4 with this code from gatk4-somatic-snvs-indels/mutect2.wdl in task M2:

        if [[ ! -z "${variants_for_contamination}" ]]; then
            gatk --java-options "-Xmx${command_mem}m" GetPileupSummaries -R ${ref_fasta} -I ${tumor_bam} ${"--interval-set-rule INTERSECTION -L " + intervals} \
                -V ${variants_for_contamination} -L ${variants_for_contamination} -O tumor-pileups.table

            if [[ ! -z "${normal_bam}" ]]; then
                gatk --java-options "-Xmx${command_mem}m" GetPileupSummaries -R ${ref_fasta} -I ${normal_bam} ${"--interval-set-rule INTERSECTION -L " + intervals} \
                    -V ${variants_for_contamination} -L ${variants_for_contamination} -O normal-pileups.table
            fi
        fi

I get an error when the intersection of the scatter interval list with the variant "intervals" is empty:

              A USER ERROR has occurred: Argument -L, --interval-set-rule has a bad value:
              [
                /gpfs/share/cmoco_sys_dev/nfs/storage/cromwell/cromwell-executions/somatic_snvs_indels/3e4da46a-2937-4abf-9ee5-1167c1ae80fc/call-Mutect2/shard-0/Mutect2/a33ec807-f6bb-451b-96d2-0c32fd7032b8/call-M2/shard-0/inputs/1309905945/0000-scattered.interval_list,
                /gpfs/share/cmoco_sys_dev/nfs/storage/cromwell/cromwell-executions/somatic_snvs_indels/3e4da46a-2937-4abf-9ee5-1167c1ae80fc/call-Mutect2/shard-0/Mutect2/a33ec807-f6bb-451b-96d2-0c32fd7032b8/call-M2/shard-0/inputs/290972759/small_exac_common_3.hg38.vcf.gz
              ],INTERSECTION. The specified intervals had an empty intersection,

The empty intersection is a result of running the workflow with a small gene panel. Because the task already pre-creates empty cross-contamination outputs, GetPileupSummaries should exit gracefully with no output rather than raising an error and causing the entire workflow to abort.

Other than eliminating the cross-contamination step, is there a workaround for this issue?

Answers

  • bshifaw (Member, Broadie, Moderator, admin)

    Would you mind providing the stderr and script files generated by Cromwell?

  • myourshaw, University of Colorado, Member ✭✭

    Here are the requested stderr and script files (with ".txt" added to allow uploading).
    The files were in /gpfs/share/cmoco_sys_dev/nfs/storage/cromwell/cromwell-executions/somatic_snvs_indels/b5f46735-a134-4916-a30b-30737c31db79/call-Mutect2/shard-0/Mutect2/5ae26157-1921-45c7-bd26-e579aad2c89f/call-M2/shard-0/execution

    It is indeed the case that the intersection of the intervals and vcf is empty.

  • bshifaw (Member, Broadie, Moderator, admin)

    Are you sure you want to keep running the CrossContamination step? It seems to be unreliable for small gene panels, and running without it is recommended here.

    I wonder why the interval file is coming up empty. Would you mind attaching the input and output interval files produced by the SplitInterval task? What are you using for the scattercount variable, and what happens when it's reduced?

    You could try removing set -e; this may prevent the failing task from stopping the workflow.
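
A narrower variant of the set -e suggestion is to keep set -e for the rest of the script and guard only the optional GetPileupSummaries calls. The following is a minimal sketch, not the workflow's actual code: run_optional is a hypothetical helper name, and it assumes the empty pileup tables that the task pre-creates are acceptable to downstream steps.

```shell
#!/usr/bin/env bash
set -e

# Hypothetical helper (not part of GATK or mutect2.wdl): run a command and
# report a failure as a warning instead of letting set -e abort the script.
run_optional() {
    if "$@"; then
        return 0
    else
        local status=$?
        echo "WARNING: optional step failed (exit ${status}): $*" >&2
        return 0
    fi
}

# In the M2 task, the guarded call would wrap the existing command, e.g.:
#   run_optional gatk --java-options "-Xmx${command_mem}m" GetPileupSummaries ...
# Here a command that always fails stands in for it.
run_optional false
echo "script continues after the failed optional step"
```

Note that this swallows every failure of the guarded command, not only the empty-intersection case, so the warning on stderr should be checked when contamination results are actually needed.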
