Heads up:
We’re moving the GATK website, docs and forum to a new platform. Read the full story and breakdown of key changes on this blog.
Notice:
If you happen to see a question you know the answer to, please do chime in and help your fellow community members. We encourage our forum members to be more involved, jump in and help out your fellow researchers with their questions. GATK forum is a community forum and helping each other with using GATK tools and research is the cornerstone of our success as a genomics research community. We appreciate your help!
Test-drive the GATK tools and Best Practices pipelines on Terra
Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.
Enumerated data type arguments

I am working on a Queue script that uses the SelectVariants walker. Two of the arguments I am trying to set take an enumerated type: restrictAllelesTo and selectTypeToInclude. I have tried passing these as strings, but I get Java type mismatch errors. What is the simplest way to pass these parameters to the SelectVariants walker in the QScript?
Best Answer
pdexheimer ✭✭✭✭
If you're passing this into the SelectVariants walker, a scala enumeration is somewhere between "the hard way" and "won't work" (the scala/java interactions are a little quirky on the edges).
I don't see anything useful in the example scripts currently distributed, but there was an example of this in the DataProcessingPipeline. It would look something like this (uncompiled and untested):
    import org.broadinstitute.sting.gatk.walkers.variantutils.SelectVariants.NumberAlleleRestriction

    class Blah extends QScript {
      qscript =>

      @Argument(doc="Select only variants of a particular allelicity.", shortName="restrictAllelesTo", required=false)
      var rAT : String = "BIALLELIC"

      def getAlleleRestriction : NumberAlleleRestriction = {
        if (rAT == "BIALLELIC") NumberAlleleRestriction.BIALLELIC
        else if (rAT == "MULTIALLELIC") NumberAlleleRestriction.MULTIALLELIC
        else NumberAlleleRestriction.ALL
      }

      var selectVar = new SelectVariants
      selectVar.alleleRestriction = getAlleleRestriction
      ...
    }
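For readers following along, the string-to-enum mapping in the answer above can also be written as a Scala `match`. This is only a minimal sketch: `AlleleRestriction` and `AlleleRestrictionParser` are hypothetical stand-ins (not GATK classes) for `SelectVariants.NumberAlleleRestriction`, used so the snippet compiles without the GATK jar on the classpath.

```scala
// Hypothetical stand-in for SelectVariants.NumberAlleleRestriction,
// so that this sketch compiles without any GATK dependency.
sealed trait AlleleRestriction
object AlleleRestriction {
  case object ALL extends AlleleRestriction
  case object BIALLELIC extends AlleleRestriction
  case object MULTIALLELIC extends AlleleRestriction
}

object AlleleRestrictionParser {
  // Same mapping as the if/else chain: any unrecognised name falls back to ALL.
  def parse(name: String): AlleleRestriction = name match {
    case "BIALLELIC"    => AlleleRestriction.BIALLELIC
    case "MULTIALLELIC" => AlleleRestriction.MULTIALLELIC
    case _              => AlleleRestriction.ALL
  }
}
```

In a real QScript the `match` would presumably return the actual `NumberAlleleRestriction` constants instead of the stand-in objects.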
Answers
Hi there,
Have a look at the presentations from our recent Queue workshop; I think we give some examples of how to pass the different types in there:
http://www.broadinstitute.org/gatk/guide/events?id=3391
Maybe it's my lack of understanding of Scala, but I still can't pass an enumerated data type. The workshop shows this as an example:
    @Argument(doc="gender identity") // some object defined elsewhere
    var gender: Gender = Gender.UNKNOWN
My code uses this:
    @Argument(doc="Select only variants of a particular allelicity.", shortName="restrictAllelesTo", required=false)
    var rAT: restrictAllelesTo = restrictAllelesTo.BIALLELIC
This still errors with "not found: type restrictAllelesTo".
I've even tried defining the object itself:
    object restrictAllelesTo extends Enumeration {
      type restrictAllelesTo = Value
      val BIALLELIC, ALLELIC, BOTH = Value
    }
    import restrictAllelesTo._
However, this then errors with "Can't process command-line arguments of type: scala.Enumeration$Value"
I believe the correct way to define your object should look like this (from a real object defined in the Queue codebase):
@pdexheimer's suggestion should do the trick. Also, I haven't tested it at the moment, but Java enums may be parseable by Sting (including Queue) as @Argument variable types, eliminating the need for string-to-enum mapping. FYI: walkers exclusively use Java enums, including the NumberAlleleRestriction shown here.
Ah, I had it backwards - I remembered that one of the enumerated types didn't work properly, and thought it was Java. But I was thinking of the 11th question in the Scala FAQ, which says that Scala enums can't be used as arguments.
Thanks a bunch guys, I think I'll be able to get it working from here.
Hi,
I have a similar problem, but I want to select for SNPs.
Although org.broadinstitute.variant.variantcontext.VariantContext.Type is accessible, there doesn't seem to be a type SNP.
Running the following SelectVariants walker:
    val selectVars = new SelectVariants with UnifiedGenotyperArguments
    selectVars.variant = "Variations.vcf"
    selectVars.select = Seq("QD < 2.0 || FS > 60.0 || MQ < 40.0 || HaplotypeScore > 13.0 || MappingQualityRankSum < -12.5 || ReadPosRankSum < -8.0")
    selectVars.selectTypeToInclude = List[org.broadinstitute.variant.variantcontext.VariantContext.Type.SNP]
    selectVars.restrictAllelesTo = org.broadinstitute.sting.gatk.walkers.variantutils.SelectVariants.NumberAlleleRestriction.BIALLELIC
    selectVars.excludeNonVariants = true
    selectVars.out = "biallelic_true_SNPS.qual.filtered.vcf"
    add(selectVars)
gives me the error message:
INFO 17:58:31,089 QScriptManager - Compiling 1 QScript
ERROR 17:58:34,029 QScriptManager - SelectVariants.scala:66: type SNP is not a member of object org.broadinstitute.variant.variantcontext.VariantContext.Type
ERROR 17:58:34,037 QScriptManager - selectVars.selectTypeToInclude = List[org.broadinstitute.variant.variantcontext.VariantContext.Type.SNP]
ERROR 17:58:34,038 QScriptManager - ^
ERROR 17:58:34,137 QScriptManager - one error found
ERROR ------------------------------------------------------------------------------------------
ERROR stack trace
org.broadinstitute.sting.queue.QException: Compile of ./SelectVariants.scala failed with 1 error
at org.broadinstitute.sting.queue.QScriptManager.loadScripts(QScriptManager.scala:71)
at org.broadinstitute.sting.queue.QCommandLine.org$broadinstitute$sting$queue$QCommandLine$$qScriptPluginManager(QCommandLine.scala:95)
at org.broadinstitute.sting.queue.QCommandLine.getArgumentSources(QCommandLine.scala:227)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:202)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
at org.broadinstitute.sting.queue.QCommandLine$.main(QCommandLine.scala:62)
at org.broadinstitute.sting.queue.QCommandLine.main(QCommandLine.scala)
Other than NumberAlleleRestriction, I don't see a way to set the List<VariantContext.Type> directly.
Help much appreciated,
David
Here is a snippet from my code that I used to get it working. I wanted to be able to change the type selection on the fly, so I had to define a function to convert the list of input strings into a list of "Type".
Hopefully this helps.
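The kind of conversion described in this post (a list of strings turned into a list of enum values) might look roughly like the sketch below. It is not the poster's code: `EnumListParser` is a hypothetical helper, and the JDK enum `java.time.DayOfWeek` is used purely as a stand-in for `VariantContext.Type` so the snippet runs without any GATK jars. It relies only on the `valueOf` method that every Java enum provides.

```scala
import java.time.DayOfWeek // JDK enum, stand-in for VariantContext.Type
import java.util.Locale

object EnumListParser {
  // Convert each name to its enum constant. valueOf throws an
  // IllegalArgumentException for an unknown name, so bad input fails early.
  def parseAll(names: Seq[String]): List[DayOfWeek] =
    names.map(n => DayOfWeek.valueOf(n.trim.toUpperCase(Locale.ROOT))).toList
}
```

With the real type, the same shape would presumably apply by swapping `DayOfWeek` for the enum that `selectTypeToInclude` expects.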
@DavidRies -
SNP is definitely one of the members of VariantContext.Type, so I think the error you're getting from the Scala compiler is misleading. Is it possible that you're not even seeing the org.broadinstitute.variant packages? They've moved out of the main GATK repository and into a separate "variant" jar, which I believe is hosted and built by the Picard team. It's possible - especially if you built Queue yourself - that you somehow lost that dependency.
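Another possible reading of the compiler message, independent of the classpath question: in Scala, square brackets after `List` hold a type parameter, while parentheses hold the values. Writing `List[...Type.SNP]` therefore asks the compiler for a type named `SNP`, which would produce a "type SNP is not a member of object ..." error even when the enum constant exists. The sketch below illustrates the difference with the JDK enum `java.time.DayOfWeek` as a stand-in for `VariantContext.Type`; `BracketsVsParens` is a hypothetical name.

```scala
import java.time.DayOfWeek // JDK enum, stand-in for VariantContext.Type

object BracketsVsParens {
  // Square brackets carry a TYPE: this only names the element type,
  // and the empty parentheses build an empty list.
  val empty: List[DayOfWeek] = List[DayOfWeek]()

  // Parentheses carry VALUES: this builds a one-element list.
  val oneDay: List[DayOfWeek] = List(DayOfWeek.MONDAY)

  // List[DayOfWeek.MONDAY] would not compile, because MONDAY is a value
  // (an enum constant), not a type.
}
```

Under that reading, something like `List(Type.SNP)` with parentheses would be the form to try for `selectTypeToInclude`.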
@ynnus -
That did the trick. I suppose my mistake was that I tried to assign the type directly to selectVars.selectTypeToInclude, rather than first creating the empty list, adding the types, and then assigning it. I now do it directly like this:
    var LtypeSelect2: List[Type] = Nil
    LtypeSelect2 :+ Type.INDEL
Thanks for your help!
David
I don't know how to edit my previous post, so:
It has to be
LtypeSelect2 :+= Type.INDEL
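The reason for this correction is that Scala's `List` is immutable: `:+` returns a new list and leaves the original untouched, whereas `:+=` on a `var` rebinds the variable to the new list. A minimal sketch, using plain strings in place of `Type` values (`AppendDemo` is a hypothetical name):

```scala
object AppendDemo {
  // `:+` builds a NEW list; `base` itself stays empty.
  val base: List[String] = Nil
  val appended: List[String] = base :+ "INDEL"

  // `xs :+= v` is shorthand for `xs = xs :+ v`, so it needs a var
  // that has already been declared and initialised.
  def build(): List[String] = {
    var xs: List[String] = Nil
    xs :+= "INDEL"
    xs :+= "SNP"
    xs
  }
}
```

Note that the declaration and the append are separate statements; `:+=` cannot be part of the `var` declaration itself.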
Hi,
I am trying to use Queue for variant calling followed by hard filtering, so I am trying to select the variants by INDEL and SNP in two steps.
Below is my Scala script:
    import org.broadinstitute.gatk.queue.QScript
    import org.broadinstitute.gatk.queue.extensions.gatk._

    class GATKBAMtoHCvcf extends QScript
    {
      qscript =>
    }
But I am getting this error:
[[email protected] GATKBAM_HC]$ java -jar /opt/Queue-3.5/Queue.jar -S GATKBAM_HC.scala -jobRunner ParallelShell -I ../NA12878D_GATK.bam --indelFilterNames "My_Indel_Filter" --indelFilterExp "QD < 2.0 ||FS > 200.0 || ReadPosRankSum < -20.0" --snpFilterNames "My_SNP_Filter" --snpFilterExp "QD < 2.0 || FS > 60.0 || MQ < 40.0 || HaplotypeScore > 13.0 || MappingQualityRankSum < -12.5 || ReadPosRankSum < -8.0"
INFO 17:41:27,016 QScriptManager - Compiling 1 QScript
ERROR 17:41:27,276 QScriptManager - GATKBAM_HC.scala:37: '=' expected but ';' found.
ERROR 17:41:27,286 QScriptManager - var filterMode1 :+= Type.SNP
ERROR 17:41:27,287 QScriptManager - ^
ERROR 17:41:27,288 QScriptManager - GATKBAM_HC.scala:39: '=' expected but ';' found.
ERROR 17:41:27,296 QScriptManager - @Argument(doc="A optional list of SNPfilter names.", shortName="snpFilterName", required=false)
ERROR 17:41:27,296 QScriptManager - ^
ERROR 17:41:27,303 QScriptManager - four errors found
ERROR ------------------------------------------------------------------------------------------
ERROR stack trace
org.broadinstitute.gatk.queue.QException: Compile of GATKBAM_HC.scala failed with 4 errors
at org.broadinstitute.gatk.queue.QScriptManager.loadScripts(QScriptManager.scala:79)
at org.broadinstitute.gatk.queue.QCommandLine.org$broadinstitute$gatk$queue$QCommandLine$$qScriptPluginManager$lzycompute(QCommandLine.scala:94)
at org.broadinstitute.gatk.queue.QCommandLine.org$broadinstitute$gatk$queue$QCommandLine$$qScriptPluginManager(QCommandLine.scala:92)
at org.broadinstitute.gatk.queue.QCommandLine.getArgumentSources(QCommandLine.scala:229)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:205)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
at org.broadinstitute.gatk.queue.QCommandLine$.main(QCommandLine.scala:61)
at org.broadinstitute.gatk.queue.QCommandLine.main(QCommandLine.scala)
ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.5-0-g36282e4):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Compile of GATKBAM_HC.scala failed with 4 errors
ERROR ------------------------------------------------------------------------------------------
INFO 17:41:27,417 QCommandLine - Shutting down jobs. Please wait...
Is there a specific format for passing the INDEL/SNP value to SelectVariants?
I tried various suggestions found by searching on Google and other GATK-related blogs, but couldn't get it working.
I think I am having trouble importing the correct type object.
Is there any documentation on what to import for each tool or GATK walker?
I couldn't find any working information.
In most of the solutions provided, import statements similar to these are suggested:
    import org.broadinstitute.variant.variantcontext.VariantContext.Type
    import org.broadinstitute.gatk.tools.variant.variantcontext.VariantContext.Type.SNP
but it looks like they do not work with the current Queue version.
If you still run into problems (you say it's not working?) please post the error message you get.
Hi, I am still getting an error.
This is what my script looks like:
import org.broadinstitute.gatk.queue.QScript
import org.broadinstitute.gatk.queue.extensions.gatk._
import org.broadinstitute.variant.variantcontext.VariantContext.Type
import org.broadinstitute.gatk.tools.variant.variantcontext.VariantContext.Type.SNP
I am getting the error:
[[email protected] Test_5GATKBAM_HC]$ java -jar /opt/Queue-3.5/Queue.jar -S GATKBAM_HC.scala -jobRunner ParallelShell -I ../NA12878D_GATK.bam --indelFilterNames "My_Indel_Filter" --indelFilterExp "QD < 2.0 ||FS > 200.0 || ReadPosRankSum < -20.0" --snpFilterNames "My_SNP_Filter" --snpFilterExp "QD < 2.0 || FS > 60.0 || MQ < 40.0 || HaplotypeScore > 13.0 || MappingQualityRankSum < -12.5 || ReadPosRankSum < -8.0"
INFO 09:59:34,971 QScriptManager - Compiling 1 QScript
ERROR 09:59:35,193 QScriptManager - GATKBAM_HC.scala:3: object variant is not a member of package org.broadinstitute
ERROR 09:59:35,203 QScriptManager - import org.broadinstitute.variant.variantcontext.VariantContext.Type
ERROR 09:59:35,204 QScriptManager - ^
ERROR 09:59:35,207 QScriptManager - GATKBAM_HC.scala:4: object variant is not a member of package org.broadinstitute.gatk.tools
ERROR 09:59:35,213 QScriptManager - import org.broadinstitute.gatk.tools.variant.variantcontext.VariantContext.Type.SNP
ERROR 09:59:35,214 QScriptManager - ^
ERROR 09:59:56,339 QScriptManager - GATKBAM_HC.scala:92: not found: value INDEL
ERROR 09:59:56,340 QScriptManager - selectUnfilteredINDEL.selectType = INDEL
ERROR 09:59:56,341 QScriptManager - ^
ERROR 09:59:56,622 QScriptManager - GATKBAM_HC.scala:138: not found: value SNP
ERROR 09:59:56,623 QScriptManager - selectUnfilteredSNP.selectType = SNP
ERROR 09:59:56,623 QScriptManager - ^
ERROR 09:59:56,674 QScriptManager - 8 errors found
ERROR ------------------------------------------------------------------------------------------
ERROR stack trace
org.broadinstitute.gatk.queue.QException: Compile of GATKBAM_HC.scala failed with 8 errors
at org.broadinstitute.gatk.queue.QScriptManager.loadScripts(QScriptManager.scala:79)
at org.broadinstitute.gatk.queue.QCommandLine.org$broadinstitute$gatk$queue$QCommandLine$$qScriptPluginManager$lzycompute(QCommandLine.scala:94)
at org.broadinstitute.gatk.queue.QCommandLine.org$broadinstitute$gatk$queue$QCommandLine$$qScriptPluginManager(QCommandLine.scala:92)
at org.broadinstitute.gatk.queue.QCommandLine.getArgumentSources(QCommandLine.scala:229)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:205)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
at org.broadinstitute.gatk.queue.QCommandLine$.main(QCommandLine.scala:61)
at org.broadinstitute.gatk.queue.QCommandLine.main(QCommandLine.scala)
ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.5-0-g36282e4):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Compile of GATKBAM_HC.scala failed with 8 errors
ERROR ------------------------------------------------------------------------------------------
INFO 09:59:56,745 QCommandLine - Shutting down jobs. Please wait...
I am not sure what to import here, so if there is some specific information or a guide, that would be great.
Thanks,
Amit