This site is now read-only. You can find our new documentation site and support forum for posting questions here.
Be sure to read our welcome blog!
Why is this variant not called?
I've been playing around with RNA-seq data and came across what seems to be a high quality single nucleotide variant but that isn't detected by HaplotypeCaller.
I followed the Calling variants in RNAseq method, except for the alignment which was carried out by tophat2 (and looks good). Then I used MarkDuplicates, ReorderSam, SplitNCigarReads, BaseRecalibrator, PrintReads and finally HaplotypeCaller.
When I look at the reads that come out of the PrintReads steps with IGV, I can see a lot of high quality reads with the variant sequence (see attachment).
However the HaplotypeCaller doesn't report this site as variant in the VCF output. I first thought it was filtered but setting stand_call_conf and stand_emit_conf to 0.0, and adding EMIT_ALL_SITES doesn't make a difference on this site (although I can see lots of low quality sites appear as expected).
java -jar GenomeAnalysisTK.jar -T HaplotypeCaller -I Cet1_2.OS9_rg.bam -dontUseSoftClippedBases -stand_call_conf 0.0 -stand_emit_conf 0.0 -o Cet1_2.OS9_all_sites.vcf -R Homo_sapiens.GRCh38.dna.primary_assembly.fa -out_mode EMIT_ALL_SITES
This site is called in a replicate experiment (same cell line, different sample prep) and is assigned a very high quality (QD = 25.34, DP=1298). The two samples look very similar and I can't understand what's going on and why the variant isn't called?
Any idea on how to tackle this issue would be welcome.