SelectVariants modifies VCF entries, keeping only the base calls intact.
I am using GATK v3.5. I used SelectVariants as shown below to remove 11 samples from a VCF file:
java -jar GenomeAnalysisTK.jar -T SelectVariants -R reference.fasta -V all_samples.vcf -xl_sn sample90 -xl_sn sample91 -xl_sn sample92 -xl_sn samples93 -xl_sn sample94 -xl_sn sample95 -xl_sn sample96 -xl_sn sample97 -xl_sn sample98 -xl_sn sample99 -xl_sn sample100 -o subset_samples.vcf
However, when I compare the SNPs between the original VCF and the subset VCF, the 0/0, 0/1, and 1/1 genotype calls remain the same, but the AD, DP, GQ, and PL values change to the point of nonsense. For example, a sample with an AD of 0,45 is called 0/1 (heterozygous). That is the correct call in the original file, where the AD is 53,24, but based on an AD of 0,45 it should be 1/1. As long as the base calls (genotypes) themselves are correct, this shouldn’t cause any downstream errors, but I can’t be sure this is the case. Has anyone else seen this behavior?
Original VCF:
KB222897.1 10810 . C T 113425.71 . AC=102;AF=0.359;AN=284;BaseQRankSum=0.698;ClippingRankSum=0.029;DP=8522;ExcessHet=87.0598;FS=0.000;InbreedingCoeff=-0.4686;MLEAC=102;MLEAF=0.359;MQ=41.97;MQRankSum=-1.540e-01;QD=18.08;ReadPosRankSum=0.132;SOR=0.682 GT:AD:DP:GQ:PL 0/1:53,24:77:99:705,0,1819 0/1:39,16:55:99:470,0,1231 0/1:29,21:50:99:589,0,973
Subset VCF:
KB222897.1 10810 SKB222897.1_10810 C T . PASS AC=96;AF=0.366;AN=262;BaseQRankSum=0.698;ClippingRankSum=0.029;DP=8027;ExcessHet=87.0598;FS=0.000;InbreedingCoeff=-0.4686;MQ=41.97;MQRankSum=-1.540e-01;QD=18.08;ReadPosRankSum=0.132;SOR=0.682;DP=6516 GT:AD:DP:GQ:PL 0/1:0,45:45:99:255,135,0 0/1:0,48:48:99:255,144,0 0/1:0,44:44:99:255,132,0
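One way to quantify the mismatch described above is to check, per sample, whether the GT agrees with the genotype that has the lowest PL. The following is only a minimal Python sketch, not part of GATK. It assumes a biallelic, diploid site, where the VCF specification orders PL as 0/0, 0/1, 1/1. The helper names `parse_sample` and `gt_matches_pl` are hypothetical. The sample strings are copied from the first sample of each of the two records shown above.

```python
def parse_sample(format_str, sample_str):
    """Map FORMAT keys (e.g. GT:AD:DP:GQ:PL) to one sample's values."""
    return dict(zip(format_str.split(":"), sample_str.split(":")))


# PL ordering for a biallelic diploid site (assumption: no multi-allelic sites).
BIALLELIC_GENOTYPES = ["0/0", "0/1", "1/1"]


def gt_matches_pl(fields):
    """Return True if GT is the genotype with the lowest (most likely) PL."""
    gt = fields["GT"].replace("|", "/")  # treat phased and unphased alike
    pls = [int(x) for x in fields["PL"].split(",")]
    best = BIALLELIC_GENOTYPES[pls.index(min(pls))]
    return gt == best


fmt = "GT:AD:DP:GQ:PL"
original = parse_sample(fmt, "0/1:53,24:77:99:705,0,1819")
subset = parse_sample(fmt, "0/1:0,45:45:99:255,135,0")

print(gt_matches_pl(original))  # True: lowest PL is at 0/1, matching GT
print(gt_matches_pl(subset))    # False: lowest PL is at 1/1, but GT is 0/1
```

Running a check like this over every sample in both files would show whether the inconsistency affects all records or only some of them.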