SelectVariants: selecting on DP for each sample
I ran the following command (just to check):
java -jar 3.4.0/GenomeAnalysisTK.jar -R ref.fasta -T SelectVariants --variant file.vcf -o filt_DP.vcf -select "DP > 2000.0"
and I noticed that the variants are selected based on the total DP across all samples (29 in my case). How can I select variants based on the individual DP of each sample? Ideally, I would like every sample to have DP >= 7.
Also, how is the total DP calculated? I checked, and it is not the sum of the DP values of all samples.
Here is one example record from my file:
chr1 1650845 rs1059831 G A 25434.21 PASS AC=29;AF=0.500;AN=58;BaseQRankSum=-6.170e-01;ClippingRankSum=-3.620e-01;DB;DP=2317;FS=0.000;InbreedingCoeff=-1.0000;MLEAC=29;MLEAF=0.500;MQ=60.00;MQRankSum=0.054;POSITIVE_TRAIN_SITE;QD=11.07;ReadPosRankSum=1.06;SOR=0.672;VQSLOD=33.43;culprit=MQ GT:AD:DP:GQ:PL 0/1:36,42:78:99:1033,0,851 0/1:59,43:102:99:982,0,2194 0/1:42,55:97:99:1347,0,1036 0/1:43,49:92:99:1123,0,1634 0/1:12,21:33:99:520,0,332 0/1:43,38:81:99:947,0,971 0/1:69,61:130:99:1447,0,1688 0/1:37,28:65:99:601,0,942 0/1:32,7:39:90:90,0,799 0/1:45,32:77:99:651,0,1722 0/1:41,22:63:99:541,0,1016 0/1:47,34:81:99:732,0,1138 0/1:40,46:86:99:1109,0,996 0/1:30,25:55:99:572,0,746 0/1:63,44:107:99:900,0,2450 0/1:54,31:85:99:705,0,1991 0/1:40,57:97:99:1372,0,935 0/1:37,36:73:99:810,0,909 0/1:41,30:71:99:687,0,1626 0/1:54,69:123:99:1556,0,1129 0/1:40,34:74:99:778,0,1006 0/1:34,29:63:99:715,0,820 0/1:27,50:77:99:1133,0,605 0/1:46,47:93:99:1205,0,1150 0/1:34,27:61:99:593,0,789 0/1:36,31:67:99:725,0,834 0/1:22,27:49:99:651,0,532 0/1:49,42:91:99:896,0,1163 0/1:40,48:88:99:1114,0,941
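To make the question concrete, here is a minimal Python sketch of the per-sample check I have in mind, done outside GATK. It is not a GATK feature. The function names (`sample_depths`, `info_depth`, `all_samples_pass`) are hypothetical, and the demo record is the one above shortened to its first three samples and a few INFO keys, so it is an illustration only.

```python
def sample_depths(vcf_line):
    """Return the per-sample DP values (FORMAT field) from one VCF data line."""
    # VCF is tab-delimited; split() also copes with the space-separated
    # copy pasted into this post.
    fields = vcf_line.split()
    format_keys = fields[8].split(":")
    dp_index = format_keys.index("DP")
    depths = []
    for sample in fields[9:]:
        values = sample.split(":")
        # Assumption: a missing or truncated DP entry counts as depth 0.
        raw = values[dp_index] if dp_index < len(values) else "."
        depths.append(int(raw) if raw.isdigit() else 0)
    return depths


def info_depth(vcf_line):
    """Return the site-level DP from the INFO column, or None if absent."""
    for entry in vcf_line.split()[7].split(";"):
        if entry.startswith("DP="):
            return int(entry[3:])
    return None


def all_samples_pass(vcf_line, min_dp=7):
    """True only if every sample in the record has DP >= min_dp."""
    return all(dp >= min_dp for dp in sample_depths(vcf_line))


# Shortened copy of the record above: first three samples only.
RECORD = (
    "chr1 1650845 rs1059831 G A 25434.21 PASS "
    "AC=29;AF=0.500;AN=58;DP=2317 GT:AD:DP:GQ:PL "
    "0/1:36,42:78:99:1033,0,851 "
    "0/1:59,43:102:99:982,0,2194 "
    "0/1:42,55:97:99:1347,0,1036"
)

print(sample_depths(RECORD))        # [78, 102, 97]
print(all_samples_pass(RECORD, 7))  # True
print(info_depth(RECORD))           # 2317
```

Applied to the full 29-sample record, the per-sample DP values add up to 2298, while the INFO field reports DP=2317. That gap would be consistent with the two fields being computed from differently filtered sets of reads, but that is an assumption on my part, not something the record itself shows.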