HC not calling variants at the edges after clipping probes

Hi,
I am facing a strange issue with GATK 3.4. I have a set of PE fastq files. I first ran the variant calling pipeline using the following steps:
bwa mem --> sam to bam --> mark duplicates using picard --> RealignerTargetCreator GATK 3.4 --> IndelRealigner --> HaplotypeCaller --> GenotypeGVCFs
Then I saw that the probe sequences used to design the region of interest (it's a custom-designed panel) overlapped with some variants of interest. As a result, the genotypes of some of these variants were 0/1 (because of the probe sequences at those positions) instead of the true 1/1 (I am using a gold standard sample).
So I clipped the first 27 bases of each read (the probes range from 22 to 31 nt, with 27 being the most common length) and reran the same pipeline. Now some of the variants are not called, even though all reads carry them.
I am attaching a snapshot of the BAM alignment with the variant in orange in the center. The top track is the first case (before clipping the probes): the upper block of reads contains the probes, which match the reference and lack the variant, while the lower block contains the inserts that carry the variant, hence the 0/1 call.
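For readers following along, the clipping step described above can be sketched roughly as below. This is only an illustration, not the tool actually used in this thread: the function name `clip_five_prime` and the example read are hypothetical, and records are assumed to be simple (header, sequence, qualities) tuples.

```python
from typing import Iterable, Iterator, Tuple

# Assumed record layout: (header, sequence, base qualities).
FastqRecord = Tuple[str, str, str]


def clip_five_prime(records: Iterable[FastqRecord],
                    n_bases: int = 27) -> Iterator[FastqRecord]:
    """Hard-clip the first n_bases from every read and its quality string."""
    for header, seq, qual in records:
        yield header, seq[n_bases:], qual[n_bases:]


# Hypothetical 30-base read: a 27-base "probe" followed by a 3-base insert.
example = [("@read1", "A" * 27 + "CGT", "I" * 30)]
clipped = list(clip_five_prime(example))
print(clipped)
```

Note that with a fixed clip of 27 bases, a 31 nt probe leaves 4 probe bases on the read, and a 22 nt probe costs 5 bases of real insert sequence.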
https://us.v-cdn.net/5019796/uploads/editor/dv/fagssh9xfzxb.png
The lower track is after clipping the probe sequences, so only the inserts carrying the variant are left. But the variant is still not called in the VCF file.
I read in another thread that this happens because the tool needs 50bp on either side to do proper reassembly. Is that correct? If so, how do I get around it? Should I not use the IndelRealignment? Any other suggestion to solve this issue?
Thanks a bunch in advance!!
Answers
Hi @nitinCelmatix,
did you take a look at the realignment that HaplotypeCaller actually does? You can follow the instructions described in this thread to do so: https://software.broadinstitute.org/gatk/documentation/article.php?id=5484
Can you please post an IGV-screenshot of the bamout for your site of interest? This would help a lot to track down your problem.
In addition, you can try the tricks which are described in this article: https://software.broadinstitute.org/gatk/documentation/article?id=1235
Greetings EADG
Thanks, EADG, I appreciate the quick response.
Yes, you are right - realignment from HC is way different for the sites where the variants are not being called. Here is a bamout snapshot of the site where the variants are called (top track original bam file and lower track HC's bamout after realignment):
https://us.v-cdn.net/5019796/uploads/editor/wt/ogtzgvmeizga.png
And here is a site where the variant was not called:
https://us.v-cdn.net/5019796/uploads/editor/f3/e2dajw7kb9w6.png
Makes sense why these variants were not called. But then my question is why are these alignments in the second case so different? It seems to me that the reads should stay wherever they are because they align there with near-perfect CIGAR. Is there a way to be able to call those variants?
Also, interestingly, the site in the second case has a depth of 253 in the GVCF file:
chr2 43638185 . C <NON_REF> . . END=43638185 GT:DP:GQ:MIN_DP:PL 0/0:253:0:253:0,0,0
Please suggest! Thanks a bunch!
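As an aside, the GVCF record quoted above can be unpacked field by field. The snippet below is a minimal sketch with a hypothetical helper name (`parse_sample_fields`); it assumes a single-sample record with the standard ten VCF columns and splits on any whitespace, since the record is shown here with spaces rather than tabs.

```python
def parse_sample_fields(vcf_line: str) -> dict:
    """Map FORMAT keys to the sample's values for a single-sample VCF line."""
    cols = vcf_line.split()
    format_keys = cols[8].split(":")    # column 9: FORMAT
    sample_values = cols[9].split(":")  # column 10: the sample
    return dict(zip(format_keys, sample_values))


record = ("chr2 43638185 . C <NON_REF> . . END=43638185 "
          "GT:DP:GQ:MIN_DP:PL 0/0:253:0:253:0,0,0")

fields = parse_sample_fields(record)
depth = int(fields["DP"])
genotype_quality = int(fields["GQ"])
likelihoods = [int(x) for x in fields["PL"].split(",")]

print(depth, genotype_quality, likelihoods)
```

The depth is high (253), but GQ is 0 and the PL values are all 0, so no genotype is favored over the others at this site and the 0/0 shown in the GT field carries no confidence.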
In other words, is it possible to turn off the realignment by HC during variant calling? Thanks!
Hi @nitinCelmatix ,
no, it is not possible to disable the realignment. It is one of its key features.
Did you create the bamout with --disableOptimizations and -forceActive?
Can you try rerunning with GATK 3.8 and the -newQual flag?
Greetings EADG
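A bamout run with the flags mentioned above might be assembled roughly like this. It is a sketch under assumptions: the file names, the jar name, the helper `build_bamout_command`, and the 100 bp window around chr2:43638185 are all placeholders, and `-bamout` is the option described in the bamout article linked earlier. The command is only printed here, not executed.

```python
import shlex


def build_bamout_command(reference, input_bam, interval, bamout, out_vcf,
                         gatk_jar="GenomeAnalysisTK.jar"):
    """Build a GATK 3 HaplotypeCaller command that also writes a bamout."""
    return [
        "java", "-jar", gatk_jar,
        "-T", "HaplotypeCaller",
        "-R", reference,
        "-I", input_bam,
        "-L", interval,            # restrict the run to the site of interest
        "-o", out_vcf,
        "-bamout", bamout,         # reads as realigned by HaplotypeCaller
        "-forceActive",            # flag named in the post above
        "--disableOptimizations",  # flag named in the post above
    ]


# All paths and the interval below are hypothetical placeholders.
cmd = build_bamout_command(
    reference="reference.fasta",
    input_bam="clipped.realigned.bam",
    interval="chr2:43638085-43638285",
    bamout="site.bamout.bam",
    out_vcf="site.vcf",
)
print(shlex.join(cmd))
```

Keeping the interval small keeps the bamout quick to generate and easy to load in IGV next to the original BAM.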
@nitinCelmatix
Hi,
I think your issue is related to a known issue. You can read more about it in this thread.
-Sheila