GATK: how to filter by minor allele frequency?
I am reading a research paper that uses GATK to call and filter variants.
The methods section reads:
"In addition to the default filters in GATK, variants were further filtered for genotype minimum quality of 30, minimum quality over depth of 5, minimum strand bias -0.10 and maximum fraction of reads with mapping quality of zero at 10%. Annotated variants were subsequently filtered to exclude the variants greater or equal to 1% of minor allele frequency based on dbSNP135 and the 1000 genome project and the NHLBI Exome Variant server (EVS). "
I want to make sure I understand how the authors did the filtering. Below is my guess; could someone confirm whether it is right?
    java -Xmx2g -jar GenomeAnalysisTK.jar \
        -R ref.fasta \
        -T VariantFiltration \
        --filterExpression "GQ >= 30" \
        --filterExpression "DP >= 5" \
        --filterExpression "SB >= -2" \
        --filterExpression "MQ0 <= 0.1"
The next step would be to annotate the variants, but I don't know how to "exclude the variants greater or equal to 1% of minor allele frequency based on dbSNP135 and the 1000 genome project".
What is minor allele frequency (MAF), and how do you exclude variants based on it?
Is MAF taken from the "AF" field in VCF files? Should I use GATK's SelectVariants to do something like this?
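To make my current understanding concrete, here is a minimal Python sketch of what I think the MAF step means. It assumes a biallelic site, where the MAF is simply the frequency of the less common of the two alleles, and it assumes the allele frequency comes from a population annotation (dbSNP, 1000 Genomes, or EVS) rather than from my own samples. The function names and the way the frequency is passed in are hypothetical; this is not GATK code.

```python
def minor_allele_frequency(alt_af: float) -> float:
    """Return the MAF at a biallelic site, given the alternate allele frequency.

    Assumption: only two alleles exist, so the reference allele
    frequency is 1 - alt_af and the MAF is the smaller of the two.
    """
    if not 0.0 <= alt_af <= 1.0:
        raise ValueError(f"allele frequency must be in [0, 1], got {alt_af}")
    return min(alt_af, 1.0 - alt_af)


def is_rare(alt_af: float, max_maf: float = 0.01) -> bool:
    """Keep a variant only if its MAF is strictly below the cutoff.

    The paper excludes variants with MAF >= 1%, so a variant at
    exactly 1% is dropped.
    """
    return minor_allele_frequency(alt_af) < max_maf


# Hypothetical population frequencies for three annotated variants.
for af in (0.005, 0.01, 0.25):
    print(af, "keep" if is_rare(af) else "exclude")
```

If this reading is right, the open question is only where the population frequency should come from and which GATK tool applies the cutoff.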
Thanks for your help.