HaplotypeCaller - non-variant block records in gVCF

fiapinto, Portugal, Member
edited February 2016 in Ask the GATK team


I have generated a gVCF for an exome (with non-variant block records) from a BAM file belonging to the 1000Genomes data.
I am using GATK tools version 3.5-0-g36282e4 and I have run the HaplotypeCaller as follows:

time java -jar $gatk_dir/GenomeAnalysisTK.jar \
-T HaplotypeCaller \
-R $reference \
-I $bamfile \
-ploidy 2 \
-stand_call_conf 20 \
-stand_emit_conf 10 \
-o output.g.vcf.gz

For my analysis, I need to be able to tell from this gVCF whether each position is a no-call, homozygous reference, a variant site, or was not targeted in the exome sequencing.

However, I cannot do this with the gVCF file I obtained, because it contains only variant site records and non-variant block records in which the GT tag is always "0/0".

So I have a few questions regarding the non-variant block records:

  1. Why does the output file not contain any no-call ("./.") records?

  2. Shouldn't regions where there are no reads have the tag GT equal to "./." instead of "0/0"?

  3. How can regions without reads (not targeted) be distinguished from regions with reads that were not called?

  4. When looking at the BAM file with IGV, non-variant blocks displayed in the gVCF contain regions with reads. What is the explanation for this behaviour?

Thank you for your attention,


Best Answer


  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @fiapinto @Redmar_van_den_Berg

    @Redmar_van_den_Berg Thanks for your fantastic answer! You are right about everything :smile:

    Sofia, I just want to add that if you run GenotypeGVCFs on your GVCFs, you will find that the sites that have no reads or have GQ = 0 change to no-calls.
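To see in advance which reference blocks meet those criteria, the gVCF can be inspected directly. The following is only a rough sketch, not part of GATK: `classify_gvcf` and its labels (`variant`, `no_coverage`, `low_confidence`, `hom_ref`) are hypothetical names, and it assumes a single-sample gVCF whose FORMAT column carries DP and GQ for reference blocks and whose blocks have `<NON_REF>` as the only ALT allele.

```shell
# Hypothetical helper (not a GATK tool): label each gVCF record read from stdin.
# Assumption: single-sample gVCF; reference blocks have ALT = <NON_REF>
# and carry DP and GQ in the FORMAT/sample columns.
classify_gvcf() {
  awk -F'\t' '
    /^#/ { next }                        # skip header lines
    {
      n = split($9, keys, ":")           # FORMAT keys, e.g. GT:DP:GQ:MIN_DP:PL
      split($10, vals, ":")              # matching sample values
      dp = ""; gq = ""
      for (i = 1; i <= n; i++) {
        if (keys[i] == "DP") dp = vals[i]
        if (keys[i] == "GQ") gq = vals[i]
      }
      stop = $2                          # single-position record by default
      if (match($8, /(^|;)END=[0-9]+/)) {
        s = substr($8, RSTART, RLENGTH)
        sub(/.*END=/, "", s)
        stop = s                         # block end taken from INFO END=
      }
      if ($5 != "<NON_REF>")                         label = "variant"
      else if (dp == "" || dp == "." || dp == "0")   label = "no_coverage"
      else if (gq == "0")                            label = "low_confidence"
      else                                           label = "hom_ref"
      printf "%s\t%s\t%s\t%s\n", $1, $2, stop, label
    }'
}
```

For example, `zcat output.g.vcf.gz | classify_gvcf` would print one tab-separated line per record (chromosome, start, end, label). Blocks labelled `no_coverage` or `low_confidence` are the ones expected to become no-calls after GenotypeGVCFs, under the assumptions above. Telling untargeted regions apart from targeted regions without reads would additionally need the capture target intervals, which this sketch does not use.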

