'dot' value of DP in vcf file

I am trying to interpret a DP value of '.' in the VCF record below. The first sample is called heterozygous and has AD values, but its DP is '.'. Could you advise what this means?

4 191003240 . G T 182.87 . AC=1;AF=0.250;AN=4;BaseQRankSum=-1.422e+00;DP=66;FS=0.000;GQ_MEAN=50.00;GQ_STDDEV=70.71;MLEAC=2;MLEAF=0.500;MQ=39.14;MQ0=0;MQRankSum=-5.730e-01;NCC=0;QD=20.32;ReadPosRankSum=-2.480e-01;SOR=1.609 GT:AD:DP:GQ:PL 0/1:3,6:.:99:209,0,100 0/0:45,0:45:0:0,0,1135
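To make the question concrete, here is a minimal Python sketch that splits the record above into per-sample FORMAT fields and flags samples whose DP is '.'. The function names (`parse_samples`, `dp_is_missing`, `ad_total`) are made up for illustration, and the code assumes the whitespace-separated form shown in this post (a real VCF is tab-delimited). The AD total is only a rough point of comparison, since AD and DP are not guaranteed to count the same reads.

```python
# The record from the post, split across lines only for readability.
RECORD = (
    "4 191003240 . G T 182.87 . "
    "AC=1;AF=0.250;AN=4;BaseQRankSum=-1.422e+00;DP=66;FS=0.000;"
    "GQ_MEAN=50.00;GQ_STDDEV=70.71;MLEAC=2;MLEAF=0.500;MQ=39.14;MQ0=0;"
    "MQRankSum=-5.730e-01;NCC=0;QD=20.32;ReadPosRankSum=-2.480e-01;SOR=1.609 "
    "GT:AD:DP:GQ:PL 0/1:3,6:.:99:209,0,100 0/0:45,0:45:0:0,0,1135"
)


def parse_samples(line):
    """Return one dict per sample, mapping FORMAT keys to raw string values."""
    fields = line.split()          # assumption: whitespace-separated, as pasted
    keys = fields[8].split(":")    # column 9 is FORMAT
    return [dict(zip(keys, sample.split(":"))) for sample in fields[9:]]


def dp_is_missing(sample):
    """True when the sample-level DP is absent or the missing value '.'."""
    return sample.get("DP", ".") == "."


def ad_total(sample):
    """Sum of the per-allele AD counts, or None when AD itself is missing."""
    ad = sample.get("AD", ".")
    if ad == ".":
        return None
    return sum(int(count) for count in ad.split(",") if count != ".")


if __name__ == "__main__":
    for index, sample in enumerate(parse_samples(RECORD), start=1):
        print(index, sample["GT"], "DP missing:", dp_is_missing(sample),
              "AD total:", ad_total(sample))
```

Run on this record, the sketch reports the first sample (0/1) as having a missing DP while its AD values are still present.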

Answers

  • Sheila, Broad Institute (Member, Broadie)

    @dayzcool
    Hi,

    We have heard reports of this issue. Can you confirm you are using the latest version of GATK? Can you also post the exact commands (HaplotypeCaller, CombineGVCFs, GenotypeGVCFs) you used to get this result?

    Thanks,
    Sheila

  • dayzcool (Member)

    This is output from version 3.2-2. Is it a known issue in that version?

  • SteveL, Barcelona (Member)

    I hope this is similar enough to the original poster's issue.

    Using GATK 3.3, from HaplotypeCaller through VQSR, I have a few positions that look like the following:

    6 32487427 . A G ... 0/1:0,0:.:57:0|1:32487420_T_C:120,0,57

    Does this mean that this variant is ALWAYS assumed to be in phase with the 32487420 position, and thus, even though we have no depth and no allele support, we can still reliably call a heterozygote at this position (assuming we are happy with a GQ of 57)?

    As far as I can tell, these only occur at positions where a haplotype is described, and I don't see any positions like those described by the original poster among my ~100 exome samples.

  • Sheila, Broad Institute (Member, Broadie)

    @dayzcool
    Hi,

    It is an issue that has been brought up, but I'm not sure a fix has been made. Can you post the commands you used? Also, can you post the IGV screenshots for the original bam and bamout files? I'm wondering if this is simply an issue of AD values being reported before reassembly.

    Thanks,
    Sheila

  • Sheila, Broad Institute (Member, Broadie)

    @SteveL
    Hi,

    This is a known issue of representation around indels. Can you also please post the IGV screenshot of the bam file and bamout file for that region (including the phased position)?

    Thanks,
    Sheila

  • dayzcool (Member)

    Thanks for your help! In the picture, the upper track is the BAM file used as input to HaplotypeCaller, and the lower track is the bamout. Note that I configured the bamout to include all haplotypes, not just the called ones.

    HaplotypeCaller (GATK 3.2-2) command:
    'java' '-cp' 'org.broadinstitute.gatk.engine.CommandLineGATK' '-T' 'HaplotypeCaller' '-I' 'sample.bam' '-rf' 'BadCigar' '-L' 'homo.sapiens.GRCh37.chrs.intervals' '-R' 'Homo_sapiens_assembly19.fasta' '-dt' 'BY_SAMPLE' '-dcov' '750' '-variant_index_type' 'LINEAR' '-variant_index_parameter' '128000' '-o' 'sample.vcf' '-D' 'dbsnp_138.b37.vcf' '-A' 'MappingQualityRankSumTest' '-A' 'ReadPosRankSumTest' '-A' 'RMSMappingQuality' '-ERC' 'GVCF' '-stand_call_conf' '20' '-stand_emit_conf' '10' '-mbq' '20' '-pairHMM' 'VECTOR_LOGLESS_CACHING'

    GenotypeGVCFs (GATK 3.3-0) command:
    java -jar GenomeAnalysisTK.jar -R Homo_sapiens_assembly19.fasta -T GenotypeGVCFs --variant sample1.vcf.gz --variant sample2.vcf.gz -o twosamples.genotyped.vcf

  • Sheila, Broad Institute (Member, Broadie)

    @dayzcool
    Hi,

    We do not recommend using downsampling arguments in HaplotypeCaller. Can you please try running again without -dcov and -dt? Also, please try the latest version, available here: https://www.broadinstitute.org/gatk/download/

    -Sheila
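For anyone who builds the HaplotypeCaller command as an argument list, as in the quoted command earlier in this thread, the suggestion above can be sketched in Python as follows. The helper name `strip_downsampling` is hypothetical, the list is shortened to a few of the arguments from that command, and the sketch assumes each of the two flags is followed by exactly one value, as it is there.

```python
# Flags named in the reply above; each is assumed to take exactly one value.
DOWNSAMPLING_FLAGS = {"-dcov", "-dt"}


def strip_downsampling(args):
    """Return a copy of args without the downsampling flags and their values."""
    cleaned = []
    skip_next = False
    for arg in args:
        if skip_next:
            # This element is the value of a flag we just dropped.
            skip_next = False
            continue
        if arg in DOWNSAMPLING_FLAGS:
            skip_next = True
            continue
        cleaned.append(arg)
    return cleaned


# Shortened excerpt of the argument list posted in this thread.
ORIGINAL_ARGS = [
    "-T", "HaplotypeCaller",
    "-I", "sample.bam",
    "-R", "Homo_sapiens_assembly19.fasta",
    "-dt", "BY_SAMPLE",
    "-dcov", "750",
    "-ERC", "GVCF",
    "-o", "sample.vcf",
]

if __name__ == "__main__":
    print(" ".join(strip_downsampling(ORIGINAL_ARGS)))
```

All other arguments are passed through unchanged and in their original order.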

  • dayzcool (Member)

    Thank you for your prompt help! I will try without those arguments.
    I also have several related questions:
    1) Why do you advise against using downsampling arguments?
    2) How does downsampling affect DP/AD values?
    3) I read in the documentation that the site-level DP (INFO field) is the filtered depth. Is that true for HaplotypeCaller? It appears to be the same value as the sample-level DP.
    4) How should I interpret alt alleles with no supporting reads (AD = 0) in a single-sample variant call? Does it mean the alternate allele is called but no reads are assigned to it?

    Some examples are copied below:
    17 39254142 rs375280428 A G,ACAGCAGCTGGAGATGCAGCATCTGGGGCGG,AGCAGCTGGAGATGCAGCATCTGGGGCGG,<NON_REF> 311.73 PASS DB;DP=19;MLEAC=0,1,1,0;MLEAF=0.00,0.500,0.500,0.00;MQ=45.36;MQ0=0 GT:AD:GQ:PL:SB 2/3:0,0,11,0,0:4:775,568,606,116,4,67,382,396,0,370,502,485,6,347,465:0,0,0,0
    1 83974 . AAAAG A,<NON_REF> 0 QUAL DP=11;MLEAC=0,0;MLEAF=0.00,0.00;MQ=39.57;MQ0=0 GT:AD:GQ:PL:SB 0/0:0,0,0:0:0,0,0,0,0,1:0,0,0,0
    4 191003250 . A C,<NON_REF> 0 QUAL DP=14;MLEAC=0,0;MLEAF=0.00,0.00;MQ=38.91;MQ0=0 GT:AD:GQ:PL:SB 0/0:5,0,0:15:0,15,225,15,225,225:1,4,0,0
    1 66337 . TATTATATAATATAATATATATTATATAAATATAATATATAA T,<NON_REF> 0 QUAL DP=39;MLEAC=0,0;MLEAF=0.00,0.00;MQ=39.69;MQ0=0 GT:AD:GQ:PL:SB 0/0:25,0,0:71:0,71,1022,76,1026,1030:19,6,0,0
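Question 4 can be checked mechanically. The Python sketch below is an illustration only: it takes the FORMAT string and sample column from the first three example records and lists the allele indices that appear in GT but have an AD count of zero. The function name `called_alleles_without_reads` is hypothetical, and the code assumes AD holds one integer per allele (REF first), as it does in these records.

```python
def called_alleles_without_reads(format_keys, sample):
    """Return the allele indices present in GT whose AD count is zero."""
    values = dict(zip(format_keys.split(":"), sample.split(":")))
    depths = [int(count) for count in values["AD"].split(",")]
    # GT may use '/' (unphased) or '|' (phased); '.' marks a no-call allele.
    genotype = values["GT"].replace("|", "/").split("/")
    called = sorted({int(allele) for allele in genotype if allele != "."})
    return [allele for allele in called if depths[allele] == 0]


FORMAT = "GT:AD:GQ:PL:SB"

# Sample columns copied from the example records above.
SAMPLE_CHR17 = ("2/3:0,0,11,0,0:4:775,568,606,116,4,67,382,396,0,"
                "370,502,485,6,347,465:0,0,0,0")
SAMPLE_CHR1 = "0/0:0,0,0:0:0,0,0,0,0,1:0,0,0,0"
SAMPLE_CHR4 = "0/0:5,0,0:15:0,15,225,15,225,225:1,4,0,0"

if __name__ == "__main__":
    for sample in (SAMPLE_CHR17, SAMPLE_CHR1, SAMPLE_CHR4):
        print(sample.split(":")[0], called_alleles_without_reads(FORMAT, sample))
```

For the chromosome 17 record, the 2/3 genotype includes allele 3, which has an AD of 0, while allele 2 carries all 11 counted reads.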
