
SelectVariants Error

Amanda North Carolina Member

Hello,

I am trying to use SelectVariants to filter by allele frequency (AF). I have tried several ways of writing the expression, with and without spaces, following the notation on the SelectVariants info page, but the selection keeps failing with an error. Can you please tell me what needs to be done to filter by AF correctly?

gatk -T SelectVariants -R formated60.fa -V output_raw.vcf -o Filtered_test3.vcf -selectType SNP -select "AF > 0.2" -select "AF < 0.8"

Log -
INFO 14:51:39,023 HelpFormatter - Program Args: -T SelectVariants -R formated60.fa -V output_raw.vcf -o Filtered_test3.vcf -selectType SNP -select AF>0.2 -select AF<0.8
INFO 14:51:39,030 HelpFormatter - Executing as [email protected] on Linux 4.4.0-36-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_05-b13.
INFO 14:51:39,031 HelpFormatter - Date/Time: 2016/11/23 14:51:38
INFO 14:51:39,032 HelpFormatter - ---------------------------------------------
INFO 14:51:39,033 HelpFormatter - ---------------------------------------------
INFO 14:51:39,074 GenomeAnalysisEngine - Strictness is SILENT
INFO 14:56:48,750 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO 15:02:29,776 GenomeAnalysisEngine - Preparing for traversal
INFO 15:02:29,821 GenomeAnalysisEngine - Done preparing for traversal
INFO 15:02:29,823 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 15:02:29,824 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 15:02:29,825 ProgressMeter - Location | sites | elapsed | sites | completed | runtime | runtime
INFO 15:02:59,834 ProgressMeter - Starting 0.0 30.0 s 49.6 w 100.0% 30.0 s 0.0 s

##### ERROR --------------------------------------------------------------------
ERROR A USER ERROR has occurred (version 3.6-0-g89b7209):
ERROR
ERROR This means that one or more arguments or inputs in your command are incorrect.
ERROR The error message below tells you what is the problem.
ERROR
ERROR If the problem is an invalid argument, please check the online documentation guide
ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
ERROR
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions https://www.broadinstitute.org/gatk
ERROR
ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
ERROR
ERROR MESSAGE: Invalid JEXL expression detected for select-1 with message ![0,8]: 'AF < 0.8;' < error

Thank you,
Amanda

Best Answer

Answers

  • shlee Cambridge Member, Broadie ✭✭✭✭✭

    Hi @Amanda,

    You can use VariantFiltration to filter by AF. Here's an example command that filters based on each of the conditions. The || functions as a logical or.

    java -jar $GATK -T VariantFiltration -R ref/human_g1k_b37_20.fasta \
            -V sandbox/platinum_NA12878_SNP.vcf.gz \
            --filterExpression "QD < 2.0 || FS > 60.0 || MQ < 40.0 || \
            MQRankSum < -12.5 || ReadPosRankSum < -8.0 || \
            SOR > 3.0 || QUAL < 30" \
            --filterName "basic_filters" \
            -o sandbox/platinum_NA12878_SNP_basic_filters.vcf.gz
    

    So I think the --filterExpression you want would be "AF < 0.2 || AF > 0.8". This flags those variants with AF less than 0.2 or greater than 0.8 by writing the filter name to their FILTER field; the records themselves stay in the output VCF.

  • Amanda North Carolina Member

    Hi @shlee

    Thank you. I tried running that, but I am still getting errors.

    gatk -T VariantFiltration -R formated60.fa -V output_raw.vcf -o Filtered_test5_AF.vcf -selectType SNP --filterExpression "AF < 0.2 || AF > 0.8" --filterName "Allele_Freq_Filter"

    ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
    ERROR
    ERROR MESSAGE: Invalid argument value '<' at position 12.
    ERROR Invalid argument value '0.2' at position 13.
    ERROR Invalid argument value '||' at position 14.
    ERROR Invalid argument value 'AF' at position 15.
    ERROR Invalid argument value '>' at position 16.
    ERROR Invalid argument value '0.8' at position 17.
    ERROR -----------------------------------------------------------------------------------------
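The positions in this error line up with the expression having been split on spaces. Counting arguments from 0, `--filterExpression` is argument 10, the first `AF` (argument 11) is taken as its value, and `<`, `0.2`, `||`, `AF`, `>`, `0.8` are arguments 12 to 17, exactly the ones reported. That pattern usually means the quotes were lost before the arguments reached GATK, for example in a wrapper script that forwards `$*` instead of `"$@"`. The thread does not show what the `gatk` command is, so this is only a guess; `count_args`, `bad_wrapper`, and `good_wrapper` below are hypothetical names used purely to illustrate the effect.

```shell
# Hypothetical helper: report how many arguments a command would receive.
count_args() { echo "$#"; }

# A wrapper that forwards unquoted $* re-splits every argument on whitespace.
bad_wrapper() { count_args $*; }

# A wrapper that forwards "$@" keeps each original argument intact.
good_wrapper() { count_args "$@"; }

bad_count=$(bad_wrapper --filterExpression "AF < 0.2 || AF > 0.8")
good_count=$(good_wrapper --filterExpression "AF < 0.2 || AF > 0.8")

echo "unquoted \$*: $bad_count arguments"
echo "quoted \"\$@\": $good_count arguments"
```

If the `gatk` command here is such a wrapper, calling `java -jar` on the GATK jar directly, as in the example earlier in the thread, would sidestep the problem.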
  • shlee Cambridge Member, Broadie ✭✭✭✭✭
    edited November 2016

    @Amanda, VariantFiltration does not take the -selectType argument. Here's the tool documentation link again: https://software.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_gatk_tools_walkers_filters_VariantFiltration.php. You can select for SNPs in a separate command using SelectVariants. If the amended filtration command still does not work for you, can you post an example record from your VCF? Thanks.
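The two-step approach described in this reply might look like the following sketch. It reuses only flags that already appear in this thread (`-T`, `-R`, `-V`, `-o`, `-selectType`, `--filterExpression`, `--filterName`). The file names `snps_only.vcf` and `snps_AF_filtered.vcf` are made up, and the `run` helper is a hypothetical dry-run function that prints each argument in brackets instead of executing anything, so the commands can be checked without GATK installed.

```shell
# Hypothetical dry-run helper: prints each argument in brackets, runs nothing.
run() { printf '[%s] ' "$@"; printf '\n'; }

# Step 1: keep only SNPs with SelectVariants.
step1=$(run gatk -T SelectVariants -R formated60.fa -V output_raw.vcf \
    -selectType SNP -o snps_only.vcf)

# Step 2: flag records by allele frequency with VariantFiltration.
# The expression is a single quoted argument, and -selectType is not used here.
step2=$(run gatk -T VariantFiltration -R formated60.fa -V snps_only.vcf \
    --filterExpression "AF < 0.2 || AF > 0.8" \
    --filterName "Allele_Freq_Filter" -o snps_AF_filtered.vcf)

echo "$step1"
echo "$step2"
```

In the printed output the whole expression should appear inside one pair of brackets. Once it does, removing the `run` prefix (and the `$( )` capture) gives the real commands.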

  • Sheila Broad Institute Member, Broadie, Moderator admin

    @Amanda
    Hi,

    I can confirm @Kurt is correct. I just tested this out. The issue is that the AF field in multiallelic sites will have more than one value. Have a look under "Using VariantContext to access annotations in multiallelic sites" in this article.

    -Sheila
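To see why a plain `AF < 0.8` comparison is ambiguous at multiallelic sites, it helps to look at the AF values directly. The sketch below is not GATK code. It runs `awk` over two invented VCF-style records (the positions, alleles, and AF values are made up for illustration) and reports how many AF values each record carries: one for a biallelic site, and a comma-separated list with one value per ALT allele for a multiallelic site.

```shell
# Two invented records: CHROM POS ID REF ALT QUAL FILTER INFO (tab-separated).
records=$(printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
    chr1 100 . A G 50 . 'AC=1;AF=0.5;AN=2' \
    chr1 200 . C T,G 60 . 'AC=1,1;AF=0.25,0.25;AN=4')

# For each record, pull AF out of INFO and count its comma-separated values.
af_report=$(printf '%s\n' "$records" | awk -F'\t' '
{
    n = split($8, info, ";")
    for (i = 1; i <= n; i++) {
        if (info[i] ~ /^AF=/) {
            value = substr(info[i], 4)
            count = split(value, afs, ",")
            print $2, "ALT=" $5, "AF=" value, "values=" count
        }
    }
}')

echo "$af_report"
```

The record at position 200 reports `values=2`, so a single numeric comparison against `AF` has no single number to compare with, which is consistent with the JEXL error shown in the original post.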
