# How to use UnifiedGenotyper --annotation option

Member Posts: 27
edited January 2013

I am doing human exome sequencing with hg19 as a reference, and I want UnifiedGenotyper to give me whatever annotations are available and I will worry later about which ones are useful and which are not.

I am confused about the behavior of the --annotation option in UnifiedGenotyper. Its default value is listed as [], which seems to imply that unless we explicitly list the annotations we want, we get none at all. Is that correct? To get a list of available annotations, we are directed to the VariantAnnotator --list option, but it appears that it is not possible to just run:

    java -Xmx2g -jar GenomeAnalysisTK.jar \
        -R ref.fasta \
        -T VariantAnnotator \
        --list

in order to get a list of annotations. Instead, one not only needs to include a --variant flag, but the VCF file it points to also has to be well-formatted, and so on. Otherwise you get errors like this:

    ##### ERROR MESSAGE: Argument with name '--variant' (-V) is missing.

or this:

    ##### ERROR MESSAGE: Invalid command line: No tribble type was provided on the command line and the type of the file could not be determined dynamically. Please add an explicit type tag :NAME listing the correct type from among the supported types:


So, that having failed, is anyone able to just provide me with a list of possible arguments to the UnifiedGenotyper --annotation option?


Regarding -dcov: that's right. Downsampling is done by the engine, so the -dcov argument is documented with the other core engine command-line arguments rather than with UnifiedGenotyper itself.
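
Since -dcov is an engine-level argument, it can in principle be appended to any walker's command line. A minimal sketch, assuming placeholder file names (sample.bam, calls.vcf) and an arbitrary illustrative coverage cap of 250; the command is only printed so it can be inspected before anything is run:

```shell
# Dry run: assemble a UnifiedGenotyper command with engine-level downsampling.
# File names and the coverage value are placeholders, not recommendations.
DCOV=250
DCOV_CMD="java -Xmx2g -jar GenomeAnalysisTK.jar -R ref.fasta -T UnifiedGenotyper -I sample.bam -o calls.vcf -dcov $DCOV"
echo "$DCOV_CMD"
```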

Re: your annotation question, UnifiedGenotyper (UG) applies a set of annotations by default. This core set is called the "standard annotations" and is used by default by all tools (unless otherwise specified by the 'exclude' argument).

Currently, the standard annotations are the following:

• BaseQualityRankSumTest
• ChromosomeCounts
• DepthOfCoverage
• DepthPerAlleleBySample
• FisherStrand
• HaplotypeScore
• InbreedingCoeff
• MappingQualityRankSumTest
• MappingQualityZero
• QualByDepth
• RMSMappingQuality
• SpanningDeletions
• TandemRepeatAnnotator

For the record, the HaplotypeCaller uses the same standard set, except that it excludes the last two in the list (SpanningDeletions and TandemRepeatAnnotator) for technical reasons.

We will add this information to the documentation in the near future.

Geraldine Van der Auwera, PhD
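
To make the --annotation usage concrete, here is a hedged sketch of requesting annotations by name, assuming -A (the short form of --annotation) is repeated once per annotation. The three names are picked from the standard list above purely for illustration, and the file names are placeholders. The command is printed rather than executed:

```shell
# Dry run: build one -A argument per requested annotation.
# Annotation names come from the standard list; file names are placeholders.
ANNOTATIONS="FisherStrand QualByDepth RMSMappingQuality"
A_ARGS=""
for name in $ANNOTATIONS; do
    A_ARGS="$A_ARGS -A $name"
done
ANNOT_CMD="java -Xmx2g -jar GenomeAnalysisTK.jar -R ref.fasta -T UnifiedGenotyper -I sample.bam -o calls.vcf$A_ARGS"
echo "$ANNOT_CMD"
```

Keeping the names in a single variable makes it easy to extend the request once the full list of annotations has been consulted.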

And the full list of annotations is available in the technical documentation at this link:

It is silly that VariantAnnotator refuses to list options without a fully valid command line -- we'll try to fix that in the next release.

Geraldine Van der Auwera, PhD
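
Until that is fixed, the workaround implied by the original question is to give VariantAnnotator a small but well-formed VCF so that the command line validates. The sketch below is untested against GATK itself: dummy.vcf is a hypothetical file name, and it is an assumption that a header-only VCF is enough to satisfy --variant. It writes the file and prints the command to try:

```shell
# Write a minimal, header-only VCF (placeholder name: dummy.vcf).
# Assumption: a header-only file is accepted as a valid --variant input.
printf '##fileformat=VCFv4.1\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' > dummy.vcf

# Dry run: the same command as in the question, plus the --variant argument.
LIST_CMD="java -Xmx2g -jar GenomeAnalysisTK.jar -R ref.fasta -T VariantAnnotator --variant dummy.vcf --list"
echo "$LIST_CMD"
```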

• Member Posts: 27

PS. the UnifiedGenotyper documentation uses -dcov in one of the examples at top but this argument is never introduced or documented below.

• Member Posts: 27

I see, -dcov is in the docs for GATK walkers more generally, here.
