Bug Bulletin: The recent 3.2 release fixes many issues. If you run into a problem, please try the latest version before posting a bug report, as your problem may already have been solved.

VariantFiltration for non-human data

sagipolanisagipolani Posts: 41Member
edited October 2012 in Ask the GATK team

Hi all,

I'm currently analysing non-human mammalian whole genome data (>30x). No previous variants databases are available.

I'm currently in the VariantFiltration step. I came around the following command which is used for human data, and I'm wondering if it will be good for non-human data:

java -Xmx10g -jar GenomeAnalysisTK.jar \
-R [reference.fasta] \
-T VariantFiltration \
--variant [input.recalibrated.vcf] \
-o [recalibrated.filtered.vcf] \
--clusterWindowSize 10 \
--filterExpression "MQ0 >= 4 && ((MQ0 / (1.0 * DP)) > 0.1)" \
--filterName "HARD_TO_VALIDATE" \
--filterExpression "DP < 5 " \
--filterName "LowCoverage" \
--filterExpression "QUAL < 30.0 " \
--filterName "VeryLowQual" \
--filterExpression "QUAL > 30.0 && QUAL < 50.0 " \
--filterName "LowQual" \
--filterExpression "QD < 1.5 " \
--filterName "LowQD" \
--filterExpression "SB > -10.0 " \
--filterName "StrandBias"

I would appreciate your thoughts on this matter.

Thank you very much!

Sagi

Post edited by Geraldine_VdAuwera on

Best Answer

Answers

Sign In or Register to comment.