GATK: filter by minor allele frequency?
I am reading a research paper that uses GATK for variant calling and filtration.
The methods section reads:
"In addition to the default filters in GATK, variants were further filtered for genotype minimum quality of 30, minimum quality over depth of 5, minimum strand bias -0.10 and maximum fraction of reads with mapping quality of zero at 10%. Annotated variants were subsequently filtered to exclude the variants greater or equal to 1% of minor allele frequency based on dbSNP135 and the 1000 genome project and the NHLBI Exome Variant server (EVS). "
I want to make sure I understand how the authors did the filtration. Below is my guess, and I would appreciate help confirming it:
java -Xmx2g -jar GenomeAnalysisTK.jar \
    -R ref.fasta \
    -T VariantFiltration \
    --filterExpression "GQ >= 30" \
    --filterExpression "DP >= 5" \
    --filterExpression "SB >= -2" \
    --filterExpression "MQ0 <= 0.1"
After annotating the variants, I don't know how to "exclude the variants greater or equal to 1% of minor allele frequency based on dbSNP135 and the 1000 genome".
What is minor allele frequency (MAF), and how do you exclude variants based on it?
Is MAF taken from the "AF" field in VCF files? Should I use GATK's SelectVariants to do something like this?
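To make the question concrete, here is a rough Python sketch of what I imagine the MAF step might look like. It is only my guess, not necessarily what the authors did. It assumes the population allele frequency sits in the VCF INFO column under a key called AF (an assumption, since annotation sources may use other keys), it treats MAF as min(AF, 1 - AF), and it keeps variants that carry no frequency annotation at all. The function names are made up for illustration.

```python
def minor_allele_frequency(af):
    """Frequency of the less common allele at a biallelic site."""
    return min(af, 1.0 - af)


def parse_info(info_field):
    """Parse a VCF INFO column ('K1=V1;FLAG;K2=V2') into a dict.

    Flag entries without a value are stored as True.
    """
    info = {}
    for entry in info_field.split(";"):
        if not entry or entry == ".":
            continue
        key, sep, value = entry.partition("=")
        info[key] = value if sep else True
    return info


def keep_rare(vcf_line, af_key="AF", max_maf=0.01):
    """Return True if a VCF data line should be kept as a rare variant.

    A variant is dropped when any listed allele has MAF >= max_maf.
    Variants lacking the annotation are kept (assumption: they are
    absent from the population database and therefore rare).
    """
    columns = vcf_line.rstrip("\n").split("\t")
    info = parse_info(columns[7])  # INFO is the 8th VCF column
    value = info.get(af_key)
    if value is None or value is True:
        return True
    # Multi-allelic sites list one frequency per ALT allele, comma-separated.
    mafs = [minor_allele_frequency(float(v))
            for v in value.split(",") if v != "."]
    return all(maf < max_maf for maf in mafs)
```

With this, a line whose INFO contains AF=0.25 would be dropped and one with AF=0.004 would be kept. Is that the right idea, and is there a GATK-native way to do the same thing?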
Thanks for your help.