Two validated variants missed by HaplotypeCaller using MIP data (amplicon-like data)
We are using MIP (amplicon-like) data to analyze variants in certain genes. However, in two independent samples, two validated variants were missed by HaplotypeCaller. Do you have any idea why these variants were not called?
I used the latest version of GATK (3.6), and the two commands we ran are:
--filter_mismatching_base_and_quals -R hs_ref_GRCh37.p5_all_contigs.fa -I sample1.sorted.bam -T HaplotypeCaller --emitRefConfidence GVCF -L targets.bed --dbsnp dbsnp_137.hg19.vcf -rf BadCigar -stand_call_conf 30.0 -stand_emit_conf 30.0 -nct 1 -o sample1_haplotypecaller.g.vcf
--filter_mismatching_base_and_quals -R hs_ref_GRCh37.p5_all_contigs.fa -I sample2.sorted.bam -T HaplotypeCaller --emitRefConfidence GVCF -L targets.bed --dbsnp dbsnp_137.hg19.vcf -rf BadCigar -stand_call_conf 30.0 -stand_emit_conf 30.0 -nct 1 -o sample2_haplotypecaller.g.vcf
Attached you will find two pictures of the BAM files used. The mapping quality of the variant reads looks similar to that of the reference reads (~60), as does the base Phred quality (~36). I have also tried many other settings and arguments, for example lowering the minimum Phred-scaled confidence thresholds at which variants are called (-stand_call_conf) and emitted (-stand_emit_conf). None of these changes led to the variants being called. However, if I use a smaller target region, I am able to call the variant located on chr8.
The GVCF output was:
chr14 31355353 . C <NON_REF> . . END=31355353 GT:DP:GQ:MIN_DP:PL 0/0:987:0:987:0,0,11170
chr8 117861187 . G <NON_REF> . . END=117861187 GT:DP:GQ:MIN_DP:PL 0/0:1253:0:1253:0,0,20903
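As a reading aid for the two records above, here is a minimal Python sketch that unpacks the sample fields and recomputes the genotype quality from the PL values. The helper names (`parse_sample`, `gq_from_pl`) are made up for this illustration and are not part of GATK. The sketch assumes the usual convention that GQ is the gap between the two smallest PL values, capped at 99.

```python
FORMAT = "GT:DP:GQ:MIN_DP:PL"

# Sample columns copied from the two GVCF records in this post.
RECORDS = {
    "chr14:31355353": "0/0:987:0:987:0,0,11170",
    "chr8:117861187": "0/0:1253:0:1253:0,0,20903",
}


def parse_sample(format_field, sample_field):
    """Pair each FORMAT key with the corresponding sample value."""
    return dict(zip(format_field.split(":"), sample_field.split(":")))


def gq_from_pl(pl_field, cap=99):
    """Gap between the two smallest PLs, capped (assumed GQ convention)."""
    pls = sorted(int(value) for value in pl_field.split(","))
    return min(pls[1] - pls[0], cap)


summary = {}
for site, sample in RECORDS.items():
    fields = parse_sample(FORMAT, sample)
    summary[site] = {
        "GT": fields["GT"],
        "DP": int(fields["DP"]),
        "GQ_reported": int(fields["GQ"]),
        "GQ_from_PL": gq_from_pl(fields["PL"]),
    }
    print(site, summary[site])
```

In both records the first two PL values (0/0 and 0/1) are 0, so the recomputed GQ matches the reported GQ of 0. Read this way, the caller scores the hom-ref and het genotypes as equally likely at these sites despite depths of 987 and 1253, and only the hom-alt genotype is strongly disfavored.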
Thank you very much in advance!