Missing reference allele in GVCF file after running HaplotypeCaller
I used HaplotypeCaller in GVCF mode to generate a single-sample GVCF, but when I checked the output VCF file, the reference allele does not appear to be showing up:
22 1 . N <NON_REF> . . END=16050022 GT:DP:GQ:MIN_DP:PL 0/0:0:0:0:0,0,0
22 16050023 . C <NON_REF> . . END=16050023 GT:DP:GQ:MIN_DP:PL 0/0:1:3:1:0,3,37
22 16050024 . A <NON_REF> . . END=16050026 GT:DP:GQ:MIN_DP:PL 0/0:2:6:2:0,6,73
22 16050027 . A <NON_REF> . . END=16050035 GT:DP:GQ:MIN_DP:PL 0/0:3:9:3:0,9,110
22 16050036 . A C,<NON_REF> 26.80 . BaseQRankSum=-0.736;ClippingRankSum=-0.736;DP=3;MLEAC=1,0;MLEAF=0.500,0.00;MQ=27.00;MQ0=0;MQRankSum=-0.736;ReadPosRankSum=0.736 GT:AD:DP:GQ:PL:SB 0/1:1,2,0:3:23:55,0,23,58,29,86:1,0,2,0
22 16050037 . G <NON_REF> . . END=16050037 GT:DP:GQ:MIN_DP:PL 0/0:3:9:3:0,9,109
22 16050038 . A <NON_REF> . . END=16050039 GT:DP:GQ:MIN_DP:PL 0/0:4:12:4:0,12,153
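To make it easier to see which value lands in which column, here is a minimal Python sketch that splits records like the ones above into the standard VCF columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT, then the sample columns). The helper name `parse_vcf_record` is made up for illustration. The sketch assumes the records are whitespace-delimited as pasted here, whereas an actual VCF file is tab-delimited.

```python
# Fixed VCF columns as defined by the VCF specification; sample columns follow FORMAT.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT"]


def parse_vcf_record(line):
    """Split one VCF/GVCF data line into a dict keyed by column name.

    Hypothetical helper for illustration only. It splits on any whitespace,
    which works for the pasted records because none of their fields contain
    spaces; a real VCF file should be split on tabs instead.
    """
    fields = line.split()
    if len(fields) < len(VCF_COLUMNS):
        raise ValueError("expected at least %d columns, got %d"
                         % (len(VCF_COLUMNS), len(fields)))
    record = dict(zip(VCF_COLUMNS, fields))
    # Everything after FORMAT is per-sample genotype data.
    record["SAMPLES"] = fields[len(VCF_COLUMNS):]
    return record


# Two of the records from the output above, copied verbatim.
records = [
    "22 16050023 . C <NON_REF> . . END=16050023 GT:DP:GQ:MIN_DP:PL 0/0:1:3:1:0,3,37",
    "22 16050036 . A C,<NON_REF> 26.80 . BaseQRankSum=-0.736;ClippingRankSum=-0.736;"
    "DP=3;MLEAC=1,0;MLEAF=0.500,0.00;MQ=27.00;MQ0=0;MQRankSum=-0.736;"
    "ReadPosRankSum=0.736 GT:AD:DP:GQ:PL:SB 0/1:1,2,0:3:23:55,0,23,58,29,86:1,0,2,0",
]

for line in records:
    rec = parse_vcf_record(line)
    print(rec["CHROM"], rec["POS"], "REF=" + rec["REF"], "ALT=" + rec["ALT"])
```

Printing REF and ALT side by side for each record shows what the file holds in each of those two columns.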
I am not sure where to start troubleshooting this, since none of the steps before HaplotypeCaller produced any obvious errors.
The basic command that I used was:
java -Xmx4g -jar GenomeAnalysisTK.jar -T HaplotypeCaller -R hs37d5.fa -I recal_1.bam -o raw_1.vcf -L 22 --emitRefConfidence GVCF --variant_index_type LINEAR --variant_index_parameter 128000
Have you encountered this problem before? Where should I start troubleshooting?
Thanks very much in advance,