Drop filtered variants VariantFiltration

Hi,

I am using the GATK VariantFiltration tool to hard-filter variants, and it works fine. However, the total number of variants is the same before and after filtering: variants that pass the filters are simply marked "PASS", and the rest stay in the file. I searched the documentation and the forum for a way to drop the variants that do not meet the filtering criteria, but couldn't find one. Could someone suggest how to do this?

Answers

  • mehar Member ✭✭

    As a follow-up to my previous question: DP gives the combined depth for the REF and ALT alleles. Is it possible to filter on the read depth of the ALT allele only?

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    To your first question, use SelectVariants with the --excludeFiltered argument (see doc).

    To the second, you can use the AD field (depth per allele) for that purpose.
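    To make the first point concrete, here is a minimal sketch of what dropping filtered records does to a plain-text VCF. It is not GATK code and not a replacement for SelectVariants; the function name `drop_filtered` is hypothetical, and treating "." (no filter applied) the same as PASS is an assumption.

```shell
# Sketch only: mimics the effect of excluding filtered records on a plain-text VCF.
# Keeps header lines (starting with "#") and records whose FILTER column
# (7th tab-separated field) is PASS or "." (assumed to mean "not filtered").
drop_filtered() {
  awk -F'\t' '/^#/ || $7 == "PASS" || $7 == "."' "$1"
}
```

    Calling `drop_filtered filtered.vcf > passing.vcf` (both file names are placeholders) would write only the header and the unfiltered records.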

  • mehar Member ✭✭

    Hi,

    I posted a question before this one and got a message along the lines of "it will be posted once it is reviewed/accepted". I am not sure whether it was actually posted, as this is the first time I have seen that kind of message. Could someone from the GATK team let me know?

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    Sorry @mehar, it seems your question was eaten by our overzealous spam filter. I've verified your account so it won't happen again, but you'll need to post the question again. Sorry for the inconvenience.

  • mehar Member ✭✭

    Thanks for answering! It was a long, detailed question about VariantFiltration and SelectVariants. Anyway, my problem is solved, so I don't need to post it again.

  • mehar Member ✭✭

    Hi again,

    I have checked the documentation about using JEXL to evaluate arrays.

     java -Xmx4g -jar GenomeAnalysisTK.jar -T SelectVariants -R b37/human_g1k_v37.fasta --variant my.vcf -select 'vc.getGenotype("NA12878").getAD().0 > 10'
    

    In the above command, "NA12878" is the sample name. Is it possible to leave the sample name out? With a multi-sample VCF it would be hard to list every sample name in this format. Also, when the filter does not depend on a particular sample, it would be more flexible to omit the name, just as with other annotations where we simply write something like QD > 2 ...
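    As a rough sketch of what a sample-independent ALT-depth filter could look like outside GATK, the function below keeps a record when any sample has an ALT depth above a threshold. It is an illustration, not a JEXL or GATK feature: the name `select_by_alt_depth` is hypothetical, it assumes a plain-text VCF with AD written as REF,ALT1[,ALT2...], it checks only the first ALT allele, and it drops records that have no AD field.

```shell
# Sketch only: keep records where at least one sample has ALT-allele depth
# above a threshold, without naming any sample.
# Usage: select_by_alt_depth <plain-text VCF> <minimum depth>
# Assumes FORMAT is the 9th tab-separated field and samples start at the 10th.
select_by_alt_depth() {
  awk -F'\t' -v min="$2" '
    /^#/ { print; next }                      # pass header lines through
    {
      # locate the position of AD within the FORMAT column
      n = split($9, fmt, ":"); ad = 0
      for (i = 1; i <= n; i++) if (fmt[i] == "AD") ad = i
      if (ad == 0) next                       # no AD field: drop the record
      for (s = 10; s <= NF; s++) {            # loop over every sample column
        split($s, gt, ":")
        m = split(gt[ad], depth, ",")         # depth[1] = REF, depth[2] = first ALT
        if (m >= 2 && depth[2] + 0 > min + 0) { print; next }
      }
    }' "$1"
}
```

    For example, `select_by_alt_depth my.vcf 10` would print the header plus every record in which some sample has more than 10 reads supporting the first ALT allele. Requiring the threshold in every sample instead of any sample would need a different loop.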
