
Het calls from HaplotypeCaller with no ALT reads

kevyin Posts: 10 Member
edited January 22 in Ask the GATK team

Hi

We are using HaplotypeCaller, Version=2.8-1-g932cd3a.

In the raw VCF we are seeing instances of:

GT:AD:DP:GQ:PL 0/1:22,0:22:53:53,0,2067

This is a HET call (0/1), but there are no ALT reads (AD is 22,0).
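To make the fields explicit, here is a minimal Python sketch that pairs the FORMAT keys with the sample values from the record above. The helper name `parse_sample` is hypothetical and only handles the five keys shown here. Per the VCF specification, PL lists the phred-scaled likelihoods for 0/0, 0/1 and 1/1 in that order, so this call has PL 53 for hom-ref and 0 for het even though AD reports no ALT reads.

```python
def parse_sample(format_str, sample_str):
    """Pair FORMAT keys with sample values for a GT:AD:DP:GQ:PL record."""
    fields = dict(zip(format_str.split(":"), sample_str.split(":")))
    return {
        "GT": fields["GT"],
        "AD": [int(x) for x in fields["AD"].split(",")],  # reads per allele: REF, ALT
        "DP": int(fields["DP"]),
        "GQ": int(fields["GQ"]),
        "PL": [int(x) for x in fields["PL"].split(",")],  # order: 0/0, 0/1, 1/1
    }


# The sample field quoted in this post.
call = parse_sample("GT:AD:DP:GQ:PL", "0/1:22,0:22:53:53,0,2067")
ref_reads, alt_reads = call["AD"]
pl_hom_ref, pl_het, pl_hom_alt = call["PL"]
print(call["GT"], "REF reads:", ref_reads, "ALT reads:", alt_reads)
print("PL hom-ref / het / hom-alt:", pl_hom_ref, pl_het, pl_hom_alt)
```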

Some questions to clarify:

a) Is this VCF format valid?

b) Is this intended, or is it a bug?

c) Is there an existing process that would filter these entries?

i) We've tried SelectVariants with --ExcludeNonVariants, but the entries are still there.

ii) Would downstream steps such as variant recalibration or filtering catch these?

More technical details:

This was run using Queue.

I've attached the VCF; I'm assuming the command-line information in the VCF header is sufficient?

Regions of interest (I've been grepping for ",0:"):

chr2 212543723

chr3 10088443
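Grepping for ",0:" also matches hom-ref calls such as 0/0:22,0, so a stricter check may be useful. The following is a rough Python sketch, not a GATK tool: the function name `het_calls_without_alt_reads` is hypothetical, and the demo record is made up for illustration (it is not copied from the attached VCF). It assumes uncompressed, tab-delimited VCF lines with GT and AD in the FORMAT column, and reports heterozygous genotypes in which a called non-reference allele has an AD of zero.

```python
def het_calls_without_alt_reads(vcf_lines):
    """Return (chrom, pos, sample_index) for het calls whose called ALT allele has AD == 0."""
    hits = []
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip header lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 10:
            continue  # no genotype columns
        keys = cols[8].split(":")
        if "GT" not in keys or "AD" not in keys:
            continue
        for idx, sample in enumerate(cols[9:]):
            values = dict(zip(keys, sample.split(":")))
            alleles = values.get("GT", ".").replace("|", "/").split("/")
            ad = values.get("AD", ".")
            if "." in alleles or "." in ad:
                continue  # missing genotype or depths
            if len(set(alleles)) < 2:
                continue  # homozygous call
            depths = [int(x) for x in ad.split(",")]
            called_alts = [int(a) for a in alleles if a != "0"]
            if any(a < len(depths) and depths[a] == 0 for a in called_alts):
                hits.append((cols[0], int(cols[1]), idx))
    return hits


# Made-up records for illustration only.
demo = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1\n",
    "chrN\t100\t.\tA\tG\t30\t.\t.\tGT:AD:DP:GQ:PL\t0/1:22,0:22:53:53,0,2067\n",
    "chrN\t200\t.\tC\tT\t30\t.\t.\tGT:AD:DP:GQ:PL\t0/0:22,0:22:60:0,60,900\n",
    "chrN\t300\t.\tG\tA\t30\t.\t.\tGT:AD:DP:GQ:PL\t0/1:12,10:22:99:300,0,350\n",
]
hits = het_calls_without_alt_reads(demo)
print(hits)
```

To run it on a real file, pass an open file handle instead of the `demo` list, for example `het_calls_without_alt_reads(open("calls.vcf"))`, where `calls.vcf` is a placeholder name.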

If this is a bug, let me know if you would like me to provide some BAMs.

Thanks

Attachment: project.RESEARCH_Mark_DNACapture.hc.snps.indels.vcf.tar.gz (6M)

Answers

  • Geraldine_VdAuwera Posts: 5,858 Administrator, GATK Developer

    Hi @kevyin,

    Have you looked at the pileup of reads at those positions? It is possible that some reads supporting the allele are present but not counted in the genotype field due to filtering or soft-clipping. You should look at the site in IGV, with the option to show soft-clipped reads activated (this is very important).

    If there is still no evidence for the calls, then we would need a snippet of the BAM file that reproduces the error in order to debug this locally. But first we want to see a screenshot of the site showing that there is no supporting data.

    Geraldine Van der Auwera, PhD
