Weird behaviour of SelectVariants

Dear all,
I need to split a multi-sample VCF file into single-sample files. I used SelectVariants with:
--num_threads 4 -R /data/resources/chr_hg19.fa -T SelectVariants --variant /datos/samples.vcf -o SID158.vcf -sn SID158
to select, in this case, the information for sample SID158. But I get weird results:
1) Allele depths (AD) are changed incorrectly, for example from 0,8 to 0,0 (1/1:0,8:8:24:319,24,0 --> 1/1:0,0:8:24:319,24,0)
I can't find any reference to SelectVariants changing allele depths, and I also don't understand why these values should change. Could you explain, or give me a link where I can learn about this?
a) Before SelectVariants:
chr7 35293972 . A G 286.14 PASS AC=40;AF=0.870;AN=46;DP=182;Dels=0.00;HRun=0;MQ0=0;set=variant-variant2-variant5-variant6-variant8-variant11-variant14-variant15-variant18-variant23-variant25-variant26-variant28-variant30-variant31-variant33-variant34-variant35-variant44-variant45-variant46-variant49-variant51 GT:AD:DP:GQ:PL 1/1:0,8:8:24:319,24,0 1/1:0,8:8:24:321,24,0 ./. ./. 1/1:0,9:9:27:343,27,0 1/1:0,2:2:6:73,6,0 ./. 1/1:0,2:2:6:80,6,0 ./. ./.
b) After SelectVariants:
chr7 35293972 . A G 286.14 PASS AC=2;AF=1.00;AN=2;DP=8;Dels=0.00;HRun=0;MQ0=0;set=variant-variant2-variant5-variant6-variant8-variant11-variant14-variant15-variant18-variant23-variant25-variant26-variant28-variant30-variant31-variant33-variant34-variant35-variant44-variant45-variant46-variant49-variant51 GT:AD:DP:GQ:PL 1/1:0,0:8:24:319,24,0
2) As above, the allele depth seems to be recalculated to a wrong value, and this time the variant is selected twice:
0/1:3684,3369:7054:99:103887,0,114881 --> 0/1:0,7:7054:99:103887,0,114881
After chr2 179417938 . G A 103886.87 PASS AB=0.522;AC=1;AF=0.500;AN=2;BaseQRankSum=4.370;DP=7063;Dels=0.00;FS=5.537;HRun=1;HaplotypeScore=173.4470;MQ=59.48;MQ0=0;MQRankSum=0.468;QD=14.71;ReadPosRankSum=0.312;SB=-52336.28;set=variant5 GT:AD:DP:GQ:PL ./. ./. ./. ./. 0/1:3684,3369:7054:99:103887,0,114881 ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./.
Before:
chr2 179417938 . G A 103886.87 PASS AB=0.522;AC=1;AF=0.500;AN=2;BaseQRankSum=4.370;DP=7054;Dels=0.00;FS=5.537;HRun=1;HaplotypeScore=173.4470;MQ=59.48;MQ0=0;MQRankSum=0.468;QD=14.71;ReadPosRankSum=0.312;SB=-52336.28;set=variant5 GT:AD:DP:GQ:PL 0/1:0,7:7054:99:103887,0,114881
chr2 179417938 . G A 103886.87 PASS AB=0.522;AC=1;AF=0.500;AN=2;BaseQRankSum=4.370;DP=7054;Dels=0.00;FS=5.537;HRun=1;HaplotypeScore=173.4470;MQ=59.48;MQ0=0;MQRankSum=0.468;QD=14.71;ReadPosRankSum=0.312;SB=-52336.28;set=variant5 GT:AD:DP:GQ:PL 0/1:0,7:7054:99:103887,0,114881
3) For substitutions, the sum of the allele depths does not equal the DP value, for example: 1/1:0,3:4:9:103,9,0
Thanks a lot
David.
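The discrepancy described in point 1 can be checked field by field. The following is a minimal Python sketch, not part of GATK: `parse_sample` and `changed_fields` are hypothetical helper names, the records are abbreviated copies of the chr7 lines above (INFO replaced by `.`), and it assumes SID158 is the first sample column, which the post does not state.

```python
def parse_sample(record_line, sample_index=0):
    """Return {FORMAT key: value} for one sample column of a VCF data line."""
    # Assumption: no column contains spaces, so whitespace splitting is safe.
    cols = record_line.split()
    keys = cols[8].split(":")                    # FORMAT column, e.g. GT:AD:DP:GQ:PL
    values = cols[9 + sample_index].split(":")   # sample columns start at index 9
    return dict(zip(keys, values))


def changed_fields(before_line, before_index, after_line, after_index=0):
    """Map each FORMAT key whose value differs to a (before, after) pair."""
    before = parse_sample(before_line, before_index)
    after = parse_sample(after_line, after_index)
    return {k: (before[k], after.get(k))
            for k in before if before[k] != after.get(k)}


# Abbreviated records from the post; sample position is an assumption.
before = ("chr7 35293972 . A G 286.14 PASS . GT:AD:DP:GQ:PL "
          "1/1:0,8:8:24:319,24,0 1/1:0,8:8:24:321,24,0")
after = "chr7 35293972 . A G 286.14 PASS . GT:AD:DP:GQ:PL 1/1:0,0:8:24:319,24,0"

print(changed_fields(before, 0, after))
```

For these two records only the AD field is reported as changed; GT, DP, GQ and PL are identical.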
Answers
Hi David, that is strange. Can you tell me what version of GATK you are using?
Hi Geraldine, I'm using GATK 2.2-15.
Hi David,
It seems this bug is coming from the multithreaded mode. Same for CombineVariants.
Try running without the -nt option. Does it reproduce?
Hi David, @igcocole seems to be onto something -- can you do as he suggests (run without -nt) and let us know what happens?
Hi there,
As noted here, we're unable to reproduce this behavior; in our hands these tools work perfectly fine in multithreaded mode. It looks like it might be an issue with your filesystem not handling the multithreading operations properly. We're looking at ways to test for this problem in order to be able to issue a warning to users when this happens, but we unfortunately don't foresee being able to fix it. At this point all we can say is that if you're experiencing this problem you should run the tools without -nt.
If anyone else experiences these issues, let us know. If we get more cases we may be able to find out what they have in common and pinpoint the precipitating conditions.
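For anyone wanting to check whether their own output is affected, one option is to run the same command with and without -nt and compare the two output files. The sketch below is only an illustration in Python, not a GATK feature: `data_records` and `compare_runs` are hypothetical names, and the example records are made-up abbreviations based on the chr2 site from the question.

```python
from collections import Counter


def data_records(lines):
    """Count the non-header, non-empty lines of a VCF, ignoring trailing whitespace."""
    return Counter(line.rstrip() for line in lines
                   if line.strip() and not line.startswith("#"))


def compare_runs(single_thread_lines, multi_thread_lines):
    """Return the records that appear in only one of the two outputs.

    Counter subtraction keeps multiplicity, so a record written twice in
    one run and once in the other is reported once as an extra copy.
    """
    single = data_records(single_thread_lines)
    multi = data_records(multi_thread_lines)
    return {
        "only_single": sorted((single - multi).elements()),
        "only_multi": sorted((multi - single).elements()),
    }


# Illustrative data only: header plus abbreviated records.
single = ["##fileformat=VCFv4.1",
          "chr2\t179417938\t.\tG\tA\t.\tPASS\t.\tGT:AD\t0/1:3684,3369"]
multi = ["##fileformat=VCFv4.1",
         "chr2\t179417938\t.\tG\tA\t.\tPASS\t.\tGT:AD\t0/1:0,7",
         "chr2\t179417938\t.\tG\tA\t.\tPASS\t.\tGT:AD\t0/1:0,7"]

result = compare_runs(single, multi)
for label, records in result.items():
    print(label, len(records))
```

With real files, the line lists could come from `open(path)` on each output. Two empty lists would suggest the runs agree; header lines are skipped because they often carry the command line, which differs between the runs by design.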
Hi igcocole and Geraldine, thank you so much for the information. I will try what you suggest when I have a minute and send you all the information I can. My apologies for not answering sooner. I expect to send you the information soon. Thank you again.