The current GATK version is 3.6-0

#### Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

# Heterozygous position called as Homozygous

Member Posts: 4
edited June 2013

Dear developers,
I have looked into the forum for similar questions but I couldn't find any. I have several cases in which I get homozygous calls in positions with ~50% of reads calling the mutation (or less), please find here an example of a position validated by Sanger (as het) in which I have a high coverage (~400 reads in total) here is the results with UnifiedGenotyper:

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  L1


can you help me in this case? I am really puzzled.
I have run it with v2.2, 2.4 and 2.5 and I always had the same genotype call (the excerpt here is from the 2.5, I have downloaded it just to check it was not caused by a bug already fixed). It's not a downsampling issue since I have high coverage samples (HaloPlex) and used higher dcov than the default.

Post edited by Geraldine_VdAuwera on
Tagged:

Hi there,

Can you please post the command line you used for calling, and if possible a screenshot of the pileup at that position?

Geraldine Van der Auwera, PhD

• Member Posts: 4

Hi Geraldine,
please find attached the screenshot. The majority of reads spanning the variation are from the same amplicon, but you can see at the top also a few reads from a different amplicon and also nearly half of them carry the mutation. Some of the reads at the bottom have been trimmed, probably for low quality of the ends (we trim to remove adapter sequence and Q<20 for HaloPlex data).
Here is the command:

gatk=/opt/software/ngs/lib/GenomeAnalysisTK-2.5-2-gf57256b
reffa=/home/ngsworkspace/references/ucsc.hg19.fa
dbvcf=/home/ngsworkspace/references/gatk/current/dbsnp_137.hg19.vcf
recalbam=/home/users/ngs/exomeAnalysis/Project_Exome4/Aligned/S136.bam
rawsnps=L1_snp.vcf
snpmets=L1_snp.metrics
java -Xmx47g -jar ${gatk}/GenomeAnalysisTK.jar -T UnifiedGenotyper -I$recalbam \
-R $reffa -o$rawsnps -metrics $snpmets -glm SNP -ploidy 2 -D$dbvcf -mbq 30 \
-stand_call_conf 50 -L chr4:998,049-998,172 -nct 16 -dcov 5000
• Dev Posts: 71 mod

Hi there - so, this is a hypothesis of what can be happening (I could be wrong though). Your VCF record has an extremely high value in your BaseQRankSum - this means that reads carrying the ALT allele have a significantly higher quality score than the reads carrying REF at that position (I can't offer a plausible explanation of why this might be the case, as normally we see the opposite). This might be an artifact of your upstream processing but if you can confirm this at least it should point you to where the problem might be.

• Member Posts: 4

Hi delangel,
thank you, you pointed at the right direction: I have rerun the analysis on the realigned bam (before Base Quality Score Recalibration), to check if the base quality unbalance could have been introduced after that step, and the result is an heterozygous call:
chr4 998101 . C T 6853.77 . AC=1;AF=0.500;AN=2;BaseQRankSum=2.273;DP=411;Dels=0.00;FS=0.000;HaplotypeScore=11.6256;MLEAC=1;MLEAF=0.500;MQ=59.94;MQ0=0;MQRankSum=-1.416;QD=16.68;ReadPosRankSum=-0.859 GT:AD:DP:GQ:PL 0/1:203,208:411:99:6882,0,6484
We will extract the qualities of the bases in that position to check their values before and after BQSR, just to check how big the difference is. Could it be that, being the target quite small (it is ~ 500 kb, including ~300 genes) and the reads generated by amplicons (so they are inherently not independent) the BQSR may not be advisable if some basic assumptions in the BQSR model are violated?
Please let me know what you think or if you have any suggestion in changing some parameter that may apply in this kind of data.

In any case, I will be happy to collaborate in the generation of specific guidelines for targeted and the new whole exome HaloPlex sequencing data, since we are working quite a lot with them in the last few months and there are a series of caveats to take into account that may be helpful for the community.

Best,

Margherita