
# SelectVariants: MQRankSum or ReadPosRankSum

Hi,

I wouldn't classify this as a bug, but thought I would put it up here anyway.

I'm creating a VCF file from exome targeted sequencing, using UnifiedGenotyper to call SNVs on just one sample, then using SelectVariants with hard cutoffs instead of VariantFiltration. This is just an intermediate QC file (something we can use for a rough Ti/Tv estimate, concordance to GWAS arrays, etc.) to determine whether this sample is good enough as is to go into the pool of samples that we run UnifiedGenotyper/HaplotypeCaller on together and then VQSR.

So when using the expression

-select 'MQRankSum > -12.5'

any non-reference homozygote that has no reads containing the reference allele logically won't have this annotation calculated, and since the annotation is missing, the record is removed. Nothing major. I plan on redoing it using the VariantFiltration walker instead and then just using SelectVariants to pull out the PASS records, but I thought I would put this up in any case. I was trying to think of a way to use a more complex expression to say "if GT is 0/1, then test MQRankSum", but couldn't wrap my head around the case where GT is 1/1 and MQRankSum is present and greater than -12.5.
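The behaviour described above (keep a record when the annotation is missing, otherwise apply the cutoff) can be sketched outside GATK. This is a minimal Python illustration, not GATK code and not a JEXL expression: the function names `parse_info` and `passes_rank_sum_filters` are hypothetical, the INFO strings are made-up examples, and only the two thresholds are taken from the `-select` expressions in this post.

```python
def parse_info(info_field):
    """Parse a VCF INFO column into a dict; flag-style keys map to True."""
    info = {}
    for entry in info_field.split(";"):
        if not entry or entry == ".":
            continue
        key, sep, value = entry.partition("=")
        info[key] = value if sep else True
    return info


# Lower bounds copied from the -select expressions in this post.
LOWER_BOUNDS = {"MQRankSum": -12.5, "ReadPosRankSum": -8.0}


def passes_rank_sum_filters(info_field, bounds=LOWER_BOUNDS):
    """Return True if every annotation is either absent or above its bound."""
    info = parse_info(info_field)
    for key, minimum in bounds.items():
        if key not in info:
            # Annotation was not calculated (e.g. a 1/1 call): do not penalise.
            continue
        if float(info[key]) <= minimum:
            return False
    return True


# Made-up INFO strings, for illustration only.
het_call = "DP=40;MQ=58.2;MQRankSum=-1.3;ReadPosRankSum=0.4"
hom_var_call = "DP=35;MQ=59.0"  # no rank-sum annotations present
failing_call = "DP=40;MQRankSum=-13.1;ReadPosRankSum=0.4"

for label, info in [("het", het_call), ("hom-var", hom_var_call), ("failing", failing_call)]:
    print(label, passes_rank_sum_filters(info))
```

The point of the sketch is the explicit `key not in info` branch: a missing annotation is treated as "nothing to test" rather than as a failed comparison, which is the distinction the `-select` expression does not make.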

I was using GenomeAnalysisTK-2.2-4-g4a174fb at the time and my command line is below.

    java -jar $GATK/GenomeAnalysisTK.jar \
    -T SelectVariants \
    -R $REF_GENOME \
    --variant $CORE_PATH/$OUTPATH/temp/$SM_TAG".QC.raw.OnBait.vcf" \
    -select 'QD > 2.0' \
    -select 'MQ > 30.0' \
    -select 'FS < 40.0' \
    -select 'HaplotypeScore < 13.0' \
    -select 'MQRankSum > -12.5' \
    -select 'ReadPosRankSum > -8.0' \
    -select 'DP > 8.0' \
    -o $CORE_PATH/$OUTPATH/SNV/QC/Aggregate/filtered_on_bait/$SM_TAG".QC.OnBait.vcf"
