About the variant filtration process
I have read the tutorial about "(howto) Apply hard filters to a call set" at
Before variant filtration, step 1 is to extract the SNPs from the call set.
The sample code in the tutorial was:
java -jar GenomeAnalysisTK.jar \
-T SelectVariants \
-R reference.fa \
-V raw_variants.vcf \
-L 20 \
-selectType SNP \
Why is the -L 20 option used here? I know it is the interval option, but why is the value explicitly set to 20?
Secondly, when applying hard filters, parameters like "ClusterSize" and "ClusterWindowSize" are not part of the Best Practices, yet some people use them to remove false-positive variants. I know this question is a bit off-topic, but in your opinion, how important are these parameters for variant filtration of Illumina sequencing data?
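To make the second question concrete, here is a minimal Python sketch of what I understand cluster-based filtering to mean: a SNP is flagged when it is one of `cluster_size` SNPs that fall within `window` bases of each other. The function name, the example positions, and the exact boundary rule (first and last SNP at most `window` bases apart) are my own assumptions for illustration. This is not GATK's actual implementation.

```python
def flag_snp_clusters(positions, cluster_size=3, window=10):
    """Return the sorted SNP positions that belong to at least one cluster.

    Hypothetical rule (an assumption, not GATK's code): a cluster is any run
    of `cluster_size` consecutive SNPs whose first and last positions are at
    most `window` bases apart.
    """
    pos = sorted(positions)
    flagged = set()
    # Slide over every run of `cluster_size` consecutive SNPs.
    for i in range(len(pos) - cluster_size + 1):
        group = pos[i:i + cluster_size]
        if group[-1] - group[0] <= window:
            flagged.update(group)
    return sorted(flagged)


# Made-up positions on a single contig, for illustration only.
snps = [100, 104, 107, 500, 900, 903]
clustered = flag_snp_clusters(snps, cluster_size=3, window=10)
print(clustered)
```

With these made-up positions, only the three SNPs at 100, 104 and 107 sit close enough together to be flagged. The pair at 900 and 903 is not flagged because a cluster here needs three SNPs.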