Notice:
If you happen to see a question you know the answer to, please do chime in and help your fellow community members. We encourage our fourm members to be more involved, jump in and help out your fellow researchers with their questions. GATK forum is a community forum and helping each other with using GATK tools and research is the cornerstone of our success as a genomics research community.We appreciate your help!

Test-drive the GATK tools and Best Practices pipelines on Terra


Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.
Attention:
We will be out of the office on October 14, 2019, due to the U.S. holiday. We will return to monitoring the forum on October 15.

SelectVariants: MQRankSum or ReadPosRankSum

KurtKurt ✭✭✭Member ✭✭✭

Hi,

I wouldn't classify this as a bug, but thought I would just put this up here in anyways.

I'm creating a vcf file from exome targeted sequencing, using unified genotyper to just call SNVs on one sample, then using SelectVariants instead of VariantFiltration using hard-cutoffs. This is just a intermediate QC file (something that we can use to do a rough TiTv estimate, concordance to GWAS arrays, etc) in order to make the determination as to whether or not this sample was good enough as is to go into the pool of samples to run unified genotyper/haplotyper caller on together and then VQSR.

So when using the expression;

-select 'MQRankSum > -12.5'

any non-reference homozygote that has no reads containing an alternate allele logically won't have this annotation calculated and since this is missing then this record is removed. Nothing major. I plan on redoing it using the VariantFiltration walker instead and the just use the SelectVariants to pull out PASS records, but I though I would put this up in any case. I was trying to think of a way to use a more complex expression to say if GT was 0/1 then MQRankSum, but couldn't wrap around my head for the case were GT was 1/1 and MQRankSum was present and greater than -12.5.

I was using GenomeAnalysisTK-2.2-4-g4a174fb at the time and my command line is below.

java -jar $GATK/GenomeAnalysisTK.jar \
-T SelectVariants \
-R $REF_GENOME \
--variant $CORE_PATH/$OUTPATH/temp/$SM_TAG".QC.raw.OnBait.vcf" \
-select 'QD > 2.0' \
-select 'MQ > 30.0' \
-select 'FS < 40.0' \
-select 'HaplotypeScore < 13.0' \
-select 'MQRankSum > -12.5' \
-select 'ReadPosRankSum > -8.0' \
-select 'DP > 8.0' \
-o $CORE_PATH/$OUTPATH/SNV/QC/Aggregate/filtered_on_bait/$SM_TAG".QC.OnBait.vcf"

Tagged:

Comments

Sign In or Register to comment.