No 0/0 calls: I only see 0/1, 1/1 and 1/2 in GATK SNP calls

demis001 (USA, Member)
edited February 2017 in Ask the GATK team

Hi All,

I don't see any "0/0" calls; the final "*_final_snp.vcf" file has only "0/1", "1/1" and "1/2". I also loaded the two samples I ran into IGV and saw a few instances where one sample was called 0/1 and the second sample was called 1/1, when both are supposed to be called 0/1: both samples have C and the reference has T at that locus. Any idea?

java -jar $HOME/bin/exome/GenomeAnalysisTK.jar --version
3.6-0-g89b7209

Commands executed for a single sample:

java -Djava.io.tmpdir=$TEMP -jar -Xmx100g $HOME/bin/exome/GenomeAnalysisTK.jar -T PrintReads -R ref -I CJM1_realigned_reads_R.bam -BQSR CJM1_recal_data.table -o CJM1_recal_reads.bam -nct 27

java -Djava.io.tmpdir=$TEMP -jar -Xmx100g $HOME/bin/exome/GenomeAnalysisTK.jar -T HaplotypeCaller -R $GENOME -I CJM1_recal_reads.bam -o CJM1_raw_variants_recal.vcf -nct 27

java -Djava.io.tmpdir=$TEMP -jar -Xmx100g $HOME/bin/exome/GenomeAnalysisTK.jar -T SelectVariants -R $GENOME -V CJM1_raw_variants_recal.vcf -selectType SNP -o CJM1_raw_snps_recal.vcf

java -Djava.io.tmpdir=$TEMP -jar -Xmx100g $HOME/bin/exome/GenomeAnalysisTK.jar -T VariantFiltration -R $GENOME -V CJM1_raw_snps_recal.vcf --filterExpression 'QD < 2.0 || FS > 60.0 || MQ < 40.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0 || SOR > 4.0' --filterName "basic_snp_filter" -o CJM1_filtered_snps_final.vcf

Example lines:

chr1 265086 . A G 159.77 . AC=1;AF=0.500;AN=2;BaseQRankSum=0.870;ClippingRankSum=0.000;DP=28;ExcessHet=3.0103;FS=0.000;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.000;QD=5.71;ReadPosRankSum=0.000;SOR=0.582 GT:AD:DP:GQ:PL 0/1:23,5:28:99:188,0,939

chr1 7620221 . A G 63.28 . AC=2;AF=1.00;AN=2;DP=4;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=48.71;QD=21.09;SOR=1.179 GT:AD:DP:GQ:PL 1/1:0,3:3:9:91,9,0

chr10 18622015 . G A,T 570.77 . AC=1,1;AF=0.500,0.500;AN=2;BaseQRankSum=-1.146;ClippingRankSum=0.000;DP=17;ExcessHet=3.0103;FS=3.274;MLEAC=1,1;MLEAF=0.500,0.500;MQ=61.88;MQRankSum=0.617;QD=33.57;ReadPosRankSum=-1.436;SOR=0.595 GT:AD:DP:GQ:PL 1/2:1,12,4:17:99:599,107,156,441,0,517
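
One quick way to confirm which genotypes a file actually contains is to tally the GT values directly. The following is a minimal sketch, not part of the original pipeline: `tally_genotypes` is a hypothetical helper name, and it assumes a single-sample, tab-separated VCF in which GT is the first FORMAT key.

```shell
#!/bin/sh
# tally_genotypes VCF_FILE
# Hypothetical helper: counts the GT values found in the first sample
# column (column 10) of a tab-separated VCF, skipping header lines.
tally_genotypes() {
    grep -v '^#' "$1" | awk -F'\t' '
        { split($10, fields, ":"); count[fields[1]]++ }   # GT is the first FORMAT key
        END { for (gt in count) print gt, count[gt] }
    ' | sort
}
```

Calling `tally_genotypes CJM1_filtered_snps_final.vcf` would print one line per genotype with its count; a file produced by the commands above would be expected to list no 0/0 row at all.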

Regards,
Dereje

Answers

  • firadazer (United Kingdom, Member)
    edited February 2017

    There are no "0/0" calls because 0/0 means the sample is the same as the reference, and such sites are not written to the VCF. Otherwise, there would be tons of "0/0"s in your VCF file.

    Even if both samples have Cs in their reads and the reference has T at the locus, the genotype is diploid, so reads like "CCCCTTTT" would be called 0/1 and "CCCCCCCC" would be called 1/1. I guess "CCCCCCCT" would also be called 1/1, because the lone "T" is probably a sequencing error and "CCCCCCCT" should basically be "CCCCCCCC". It depends on how GATK weighs the evidence for each genotype.
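
The reasoning above can be illustrated with a toy calculation. This sketch is not GATK's actual algorithm: it assumes a simple binomial model with a fixed per-base error rate (0.01 by default, an arbitrary choice), and `call_genotype` is a hypothetical helper that picks whichever of 0/0, 0/1, or 1/1 best explains the read counts.

```shell
#!/bin/sh
# call_genotype REF_COUNT ALT_COUNT [ERROR_RATE]
# Toy diploid genotype caller (illustrative only, not GATK's model).
# REF_COUNT = reads supporting the reference base (T in the example),
# ALT_COUNT = reads supporting the alternate base (C in the example).
call_genotype() {
    awk -v nref="$1" -v nalt="$2" -v err="${3:-0.01}" 'BEGIN {
        # Log-likelihood of the observed reads under each genotype.
        ll["0/0"] = nref * log(1 - err) + nalt * log(err)
        ll["0/1"] = (nref + nalt) * log(0.5)
        ll["1/1"] = nref * log(err) + nalt * log(1 - err)

        # Report the genotype with the highest likelihood.
        best = "0/0"
        if (ll["0/1"] > ll[best]) best = "0/1"
        if (ll["1/1"] > ll[best]) best = "1/1"
        print best
    }'
}
```

Under these assumptions, `call_genotype 4 4` (CCCCTTTT) prints 0/1, `call_genotype 0 8` (CCCCCCCC) prints 1/1, and `call_genotype 1 7` (CCCCCCCT) also prints 1/1, because a single sequencing error is more likely than seeing only one reference read out of eight at a true heterozygous site.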

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @demis001
    Hi Dereje,

    If you want the hom-ref sites as well, you will need to run HaplotypeCaller with -ERC GVCF and then run GenotypeGVCFs with -allSites.

    -Sheila
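
The two-step workflow suggested above might look like the following sketch in GATK 3.x syntax. The function name `call_all_sites`, the output file names, and the `GATK_JAR` variable are assumptions made for illustration; the input BAM naming follows the question. Check the flags against the documentation for your GATK version before relying on it.

```shell
#!/bin/sh
# Sketch only: GATK 3.x command-line syntax. Output names are placeholders.
GATK_JAR="${GATK_JAR:-$HOME/bin/exome/GenomeAnalysisTK.jar}"

# call_all_sites SAMPLE REFERENCE
# Step 1 writes a per-sample GVCF that records reference confidence.
# Step 2 genotypes the GVCF and keeps non-variant (e.g. 0/0) sites too.
call_all_sites() {
    sample="$1"
    ref="$2"

    java -jar "$GATK_JAR" -T HaplotypeCaller \
        -R "$ref" -I "${sample}_recal_reads.bam" \
        -ERC GVCF -o "${sample}.g.vcf" || return 1

    java -jar "$GATK_JAR" -T GenotypeGVCFs \
        -R "$ref" -V "${sample}.g.vcf" \
        -allSites -o "${sample}_all_sites.vcf"
}
```

For the sample in the question this would be invoked as `call_all_sites CJM1 "$GENOME"`. Note that an all-sites VCF covers every callable position, so it is much larger than the variant-only file.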
