Different variant calls using the intervals options

laporte · Montreal · Posts: 6 · Member

Hi,

Using the UnifiedGenotyper, I get different results when I use the --intervals option. In particular, at the edge of a region (at 20 bp), some samples that had borderline calls change genotype when I restrict the run to the interval. The read counts in the AD FORMAT field stay the same, but the PL values change, which changes the genotype call. Does the --intervals option remove reads that span the edge of the region? What else could produce different PL values?

Without the intervals option:

miSeq003.merge.sort.UnifiedGenotyper.vcf:DNAJC13.1K_flanks.60npl 43638 . A G 7028.01 . AC=45;AF=0.079;AN=572;BaseQRankSum=-257.764;DP=295778;Dels=0.00;FS=3200.000;HaplotypeScore=278.6356;InbreedingCoeff=-0.0850;MLEAC=43;MLEAF=0.075;MQ=58.16;MQ0=0;MQRankSum=-274.890;QD=0.17;ReadPosRankSum=-276.375 GT:AD:ADS:DP:GQ:PL 0/1:679,92:340,339,91,1:734:17:17,0,14068 ......

With the intervals option:

miSeq003.merge.sort.UnifiedGenotyper.intervals.vcf:DNAJC13.1K_flanks.60npl 43638 . A G 6459.50 . AC=40;AF=0.070;AN=572;BaseQRankSum=-257.671;DP=295778;Dels=0.00;FS=3200.000;HaplotypeScore=278.6356;InbreedingCoeff=-0.0754;MLEAC=39;MLEAF=0.068;MQ=58.16;MQ0=0;MQRankSum=-274.744;QD=0.18;ReadPosRankSum=-276.325 GT:AD:ADS:DP:GQ:PL 0/0:679,92:340,339,91,1:734:2:0,2,14073 ......
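For readers comparing the two records, the sketch below shows how a small shift in PL is enough to flip a borderline call. It assumes the usual VCF convention that PL values are Phred-scaled genotype likelihoods normalized so the most likely genotype has PL 0, and that GQ is the gap to the second-lowest PL, capped at 99. The function name `call_from_pl` is illustrative only and is not part of GATK.

```python
def call_from_pl(pl, labels=("0/0", "0/1", "1/1")):
    """Return (genotype, GQ) implied by a biallelic PL triple.

    Assumes PL is ordered as 0/0, 0/1, 1/1 and that GQ is the
    difference between the two smallest PL values, capped at 99.
    """
    # Rank genotype indices from most likely (lowest PL) to least likely.
    ranked = sorted(range(len(pl)), key=lambda i: pl[i])
    best, runner_up = ranked[0], ranked[1]
    gq = min(pl[runner_up] - pl[best], 99)
    return labels[best], gq


# PL values copied from the two records in the question.
without_intervals = [17, 0, 14068]
with_intervals = [0, 2, 14073]

print(call_from_pl(without_intervals))  # ('0/1', 17)
print(call_from_pl(with_intervals))     # ('0/0', 2)
```

Both results match the GT and GQ fields in the records: the 0/0 and 0/1 likelihoods are close in both runs, so a change of under 20 Phred units is enough to switch the call.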

Answers

  • laporte · Montreal · Posts: 6 · Member

    Thanks Geraldine. Here is my command line:

    java -Xmx24g -jar /RQexec/dionnela/soft/packages/GATK/dist/GenomeAnalysisTK.jar -T UnifiedGenotyper --input_file ${PROJECT}.merge.sort.bam -R ../DNAJC13.1K_flanks.60npl.fasta -nt 12 -o $VCF --metrics_file ${PROJECT}.merge.sort.UnifiedGenotyper.metrics --genotype_likelihoods_model BOTH --downsample_to_coverage 10000

    As you can see, I use --downsample_to_coverage 10000, and the coverage at this locus is below that threshold. Is there another parameter or variable that could produce a downsampling effect?

  • laporte · Montreal · Posts: 6 · Member

    Okay, got it. Thanks for the help.