The current GATK version is 3.8-0
Examples: Monday, today, last week, Mar 26, 3/26/04

#### Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

You can opt in to receive email notifications, for example when your questions get answered or when there are new announcements, by following the instructions given here.

#### ☞ Got a problem?

1. Search using the upper-right search box, e.g. using the error message.
3. Include tool and Java versions.
4. Tell us whether you are following GATK Best Practices.
5. Include relevant details, e.g. platform, DNA- or RNA-Seq, WES (+capture kit) or WGS (PCR-free or PCR+), paired- or single-end, read length, expected average coverage, somatic data, etc.
6. For tool errors, include the error stacktrace as well as the exact command.
7. For format issues, include the result of running ValidateSamFile for BAMs or ValidateVariants for VCFs.
8. For weird results, include an illustrative example, e.g. attach IGV screenshots according to Article#5484.
9. For a seeming variant that is uncalled, include results of following Article#1235.

#### ☞ Formatting tip!

Wrap blocks of code, error messages and BAM/VCF snippets--especially content with hashes (#)--with lines with three backticks ( ` ) each to make a code block as demonstrated here.

GATK version 4.beta.3 (i.e. the third beta release) is out. See the GATK4 beta page for download and details.

# weird multisample output

LondonMember

Hi, I have run GATK Unified genotyper with multisample and I am getting unexpected output. (Hard filters used)

Here are examples of the output format I get in the vcf for a sample: 0/1:43,4:PASS:24:24,0,4368, ./.:.:PASS or 0/0:2,0:2:DPFilter:6:0,6,47 as well as 0/0:12,0:37:0,37,830. The latter is the format I would expect. What does this mean? Also I get the filter PASS even when all but two samples have ./.:.:PASS and the other two have 1/1:0,1:1:DPFilter;GQFilter:3:38,3,0 and 1/1:0,1:1:DPFilter;GQFilter:3:38,3,0.

Also I ran HalpotypeCaller multisample and get similar output all though more of the 0/0:12,0:37:0,37,830 format....

Tagged:

Nothing to worry about. The samples that are non-reference have either PASS or the name of the filter they failed; the samples that can't be called because there is no coverage have no-call fields (./. etc); and the last are called reference so there is no filtering annotation.

• LondonMember

Thanks for your feedback, but I don't understand why the final Filter category is PASS if across all samples there is not enough information to call at that position or where there is they do not individually pass filter. Is this a bug? Also I don't understand why where there is not enough info to call a variant i.e. ./. why they are given the PASS category e.g. ./.:.:PASS

• LondonMember

Also we get 0/0:3,0:PASS:9:0,9,96 - i.e. reference but still get the filter info....

Two points:

1. There's a difference between site-based filtering and sample genotype-based filtering.
2. Check your JEXL expressions -- some expressions evaluate as true when an annotation is missing.
• LondonMember

Could you elaborate on 1? I am not sure how this would justify a passed call in the example I gave.

Here is my JEXL expression for this data - can you see an issue??

$Java7 -Xmx30g -jar$GATKs/GenomeAnalysisTK.jar \
-R $Ref \ -T VariantFiltration \ --downsample_to_coverage$dcovg \
--downsampling_type BY_SAMPLE \
-o $MyOutDir/$TargetOut/$FamName.UnifiedGenotyper.snp.filtered.vcf \ --variant$MyOutDir/$TargetOut/$FamName.UnifiedGenotyper.snp.vcf \
--mask $MyOutDir/$TargetOut/$FamName.UnifiedGenotyper.indel.vcf \ --maskName InDel \ --filterExpression "QD < 2.0" \ --filterExpression "MQ < 40.0" \ --filterExpression "FS > 60.0" \ --filterExpression "MQRankSum < -12.5" \ --filterExpression "ReadPosRankSum < -8.0" \ --filterName QDFilter \ --filterName MQFilter \ --filterName FSFilter \ --filterName MQRankSumFilter \ --filterName ReadPosFilter\ --genotypeFilterExpression "DP < 4" \ --genotypeFilterName "DPFilter" \ --genotypeFilterExpression "GQ < 5.0" \ --genotypeFilterName "GQFilter" • Cambridge, MAMember, Administrator, Broadie I was responding to this statement: Also I get the filter PASS even when all but two samples have ./.:.:PASS and the other two have 1/1:0,1:1:DPFilter;GQFilter:3:38,3,0 and 1/1:0,1:1:DPFilter;GQFilter:3:38,3,0. and: I don't understand why the final Filter category is PASS if across all samples there is not enough information to call at that position or where there is they do not individually pass filter. The main problem here is that you don't specify which level of filtering you mean in each case. I could be wrong in my interpretation but you seem to be conflating the two levels of filtering. It is quite possible to have a variant call pass a certain filter (at the site level) even if one or more samples fail some filters for this site. The two levels of filtering are independent. • LondonMember Ah, ok. Let me try and explain better. So in multisample output the overall Filter reports PASS, BUT all but two samples for this variant are reported as ./.:.:PASS and the other two have 1/1:0,1:1:DPFilter;GQFilter:3:38,3,0 and 1/1:0,1:1:DPFilter;GQFilter:3:38,3,0. • LondonMember I guess the question is two fold - why is there a PASS annotation for individual samples were there is not enough information to call (i.e. ./.:.:PASS) and why would the overall filter say PASS when it seems there is nothing to report. • Cambridge, MAMember, Administrator, Broadie I see, thanks for clarifying your questions. why is there a PASS annotation for individual samples were there is not enough information to call (i.e. ./.:.:PASS) This is because the JEXL evaluation is myopic, in the sense that it only looks at the annotation you tell it to. This is completely independent of the calling mechanism. And unfortunately it's extremely literal-minded, so if the annotation is absent (because no call was made) the JEXL evaluator only sees that the filtering condition is not met (e.g. "." is not lower than 4) and therefore gives the sample a pass. This is documented as a gotcha of the JEXL evaluation system. The workaround is to add a check ("does the annotation exist?") before performing the evaluation, which acts sort of like a dead man switch. why would the overall filter say PASS when it seems there is nothing to report Same problem as above; the missing annotations lead the JEXL to pass the site because there is nothing there to make it fail. Remember that failing genotype-level filter has no bearing on the site-level filters; and you are not applying any filters at the site level that would fail that site. Does that make sense? • LondonMember Yes it does, thank you - how/where in the pipeline would I build in the "does annotation exist"? Also why are some sample genotype descriptors annotated with PASS or the filter they failed by and not others? • LondonMember Having thought more about it, I know the answer to the latter question. For multi sample, it looks like it is best to just use genotypexpression filters, right? • Cambridge, MAMember, Administrator, Broadie Hi @ledwards, The basic filtering (choosing which variants are likely to be real) is typically done on the site-level annotations, not at the genotype level. The idea is that you first ask the question: "Is there convincing evidence that there is variation at this position in one or more samples in the cohort?". Then for each sample, you ask: "Given that there is variation here in the population, does this particular sample carry the variant?" • LondonMember My question still remains though, why do some sites have the genotype filter applied and others seem not to. see example output below (chr loaction missing) created from this command... Java7 -Xmx30g -jar$GATKs/GenomeAnalysisTK.jar \
-R $Ref \ -T VariantFiltration \ --downsample_to_coverage$dcovg \
--downsampling_type BY_SAMPLE \
-o $MyOutDir/$TargetOut/$FamName.UnifiedGenotyper.snp.filtered.vcf \ --variant$MyOutDir/$TargetOut/$FamName.UnifiedGenotyper.snp.vcf \
--mask $MyOutDir/$TargetOut/\$FamName.UnifiedGenotyper.indel.vcf \
--filterExpression "QD < 2.0" \
--filterExpression "MQ < 40.0" \
--filterExpression "FS > 60.0" \
--filterExpression "MQRankSum < -12.5" \
--filterName QDFilter \
--filterName MQFilter \
--filterName FSFilter \
--filterName MQRankSumFilter \