SelectVariants by DP >= 30

rcholic (Denver) | Posts: 68 | Member

Starting from the VCF output of GATK, I used SelectVariants to select variants with the following command:

java -Xmx8g -jar $CLASSPATH/GenomeAnalysisTK.jar \
-T SelectVariants \
-R GATK_ref/hg19.fasta \
-nt 5 \
-V ../GATK/VQSR/parallel_batch/Indels/exome.indels.filtered.vcf \
--excludeNonVariants \
-o ../GATK/VQSR/parallel_batch/Indels/exome.indels.filtered.selected.vcf \
-selectType INDEL \
-select "DP > 30.0"

In the output file exome.indels.filtered.selected.vcf, however, I find some variants have DP < 30, for example:

1/1:0,3:3:12:113,12,0

The third colon-separated field (3) is the DP. Does this mean SelectVariants did not work on my VCF?

Answers

  • Geraldine_VdAuwera | Posts: 8,023 | Administrator, GATK Dev admin

    Hi @rcholic,

    I'm not entirely sure this will do it, but can you try running again, this time writing the DP cutoff value as an integer (30) instead of a float (30.0)? Internally, depth should be represented as an integer, and I know JEXL is finicky about types...

    Geraldine Van der Auwera, PhD

  • rcholic (Denver) | Posts: 68 | Member

    Thanks for the reply, Geraldine. With the integer 30 in -select, the output still contains genotypes like "1/1:0,1:1:3:34,3,0". I guess it does not work for me, so I will have to use awk to filter it.

  • pdexheimer | Posts: 475 | Member, GATK Dev, DSDE Dev mod

    I think it's filtering on the DP INFO field (i.e., the depth across the entire cohort) rather than the DP FORMAT field, which is sample-specific.
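
To illustrate the INFO versus FORMAT distinction, here is a minimal Python sketch of filtering on the sample-level DP instead. It is not part of GATK. The function names `sample_depths` and `filter_by_sample_dp` are hypothetical, and it assumes an uncompressed, tab-delimited VCF whose FORMAT column (the ninth) names the per-sample keys, e.g. GT:AD:DP:GQ:PL. Keeping a record only when every sample passes the cutoff is also an assumption; change `all` to `any` if one passing sample should be enough.

```python
def sample_depths(vcf_line):
    """Return the FORMAT-level DP of each sample on one VCF data line.

    Samples with a missing or non-numeric DP are reported as None.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")   # column 9 names the per-sample keys
    samples = fields[9:]                 # columns 10+ hold one entry per sample
    if "DP" not in format_keys:
        return [None] * len(samples)
    dp_index = format_keys.index("DP")
    depths = []
    for sample in samples:
        values = sample.split(":")
        # Trailing FORMAT values may be dropped, so guard the index.
        raw = values[dp_index] if dp_index < len(values) else "."
        depths.append(int(raw) if raw.isdigit() else None)
    return depths


def filter_by_sample_dp(lines, min_dp=30):
    """Yield header lines unchanged, plus data lines where every sample has DP > min_dp."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        depths = sample_depths(line)
        # Assumption: require all samples to pass; missing DP counts as failing.
        if depths and all(d is not None and d > min_dp for d in depths):
            yield line
```

With a data line whose sample column is 1/1:0,3:3:12:113,12,0 under the FORMAT GT:AD:DP:GQ:PL, `sample_depths` reads the sample DP as 3, so `filter_by_sample_dp` drops the record even if its INFO field reports a cohort-wide DP above 30. To apply it to a file, pass an open file handle as `lines` and write the yielded lines to the output.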
