ambiguous explanation of -invMv for SelectVariants
On the SelectVariants doc page (here), it is written that the following command:
java -jar GenomeAnalysisTK.jar \
    -T SelectVariants \
    -R reference.fasta \
    -V input.vcf \
    -ped family.ped \
    -mv -mvq 50 -invMv \
    -o violations.vcf
corresponds to: "Generating a VCF of all the variants that are not mendelian violations. The optional argument '-mvq' together with '-invMv' restricts the selection to sites that have a QUAL score of 50 or less".
Further down on the same page, the options are described as follows:
-mv means "Output mendelian violation sites only";
-invMv means "Output non-mendelian violation sites only";
-mvq means "Minimum GQ score for each trio member to accept a site as a violation".
I hence conclude the following:
-mv alone means "generate a VCF of only the variants that are mendelian violations";
-invMv alone means "generate a VCF of only the variants that are not mendelian violations";
-mv -invMv is not unambiguously defined: does -invMv take precedence over -mv?
-mv -mvq 50 means "generate a VCF of only the variants that have QUAL > 50 and that are mendelian violations";
-invMv -mvq 50 means "generate a VCF of only the variants that have QUAL > 50 and that are not mendelian violations";
-mv -mvq 50 -invMv (as in the example above) is ambiguous.
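To make the five cases concrete, here is a small shell sketch that only prints (it does not run) the command line for each combination listed above. The helper name selectvariants_cmd and the output file names are made up for illustration; the other arguments are copied from the doc example. It says nothing about how GATK actually treats each combination, which is exactly what I am asking.

```shell
# Hypothetical helper: prints the SelectVariants command line for a given
# output file and set of mendelian-violation flags. Nothing is executed.
selectvariants_cmd() {
  out="$1"; shift
  echo "java -jar GenomeAnalysisTK.jar -T SelectVariants -R reference.fasta -V input.vcf -ped family.ped $* -o $out"
}

# One line per combination discussed above (output names are placeholders).
selectvariants_cmd mv_only.vcf  -mv
selectvariants_cmd inv_only.vcf -invMv
selectvariants_cmd mv_q50.vcf   -mv -mvq 50
selectvariants_cmd inv_q50.vcf  -invMv -mvq 50
selectvariants_cmd both_q50.vcf -mv -mvq 50 -invMv   # the doc example's flag set
```

Comparing the record counts of the five resulting VCFs on the same input would presumably show which flag wins in the last case, but I would rather have the intended behaviour confirmed.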
Am I right? If not, can you explain to me the meanings of these various options?