SelectVariants produces empty files
I have 8 genome sequencing samples, each from a different condition. The goal is to identify variants in each sample. I followed the GATK Best Practices workflow for variant calling (https://software.broadinstitute.org/gatk/best-practices/workflow?id=11145).
For variant calling I tried several combinations of tools and GATK versions:
- HaplotypeCaller -> GenotypeCaller -> SelectVariants
- GenotypeCaller -> HaplotypeCaller -> SelectVariants
- GenotypeCaller -> HaplotypeCaller -> SamSort -> SelectVariants
- GenotypeCaller -> HaplotypeCaller -> SamSort -> SelectVariants(Discovery option)
- GATK 3.4 and GATK 3.8
- HaplotypeCaller -> GenotypeCaller -> VCFTools
There are no error messages. It looks like SelectVariants goes through the whole file but produces empty output.
When the output is not empty, it is heavily reduced: the file goes from 300 GB (VCF file from HaplotypeCaller) to 2 GB (VCF file from SelectVariants). In this case, one sample ends up with only a small number of SNVs, which is a problem for downstream analysis.
I am not sure whether there is some parameter that should be included for genome data.
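To make "empty" and "limited" more concrete, here is a minimal sketch in Python of how the output could be checked: it counts header lines, variant records, and non-reference genotype calls per sample in an uncompressed VCF. The function name `summarize_vcf`, the sample names `S1`/`S2`, and the tiny in-memory VCF are made up for illustration; this is not part of GATK.

```python
import io


def summarize_vcf(handle):
    """Return (header_lines, variant_records, non_ref_calls_per_sample)."""
    n_header = 0
    n_records = 0
    samples = []
    non_ref = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("##"):
            n_header += 1
            continue
        if line.startswith("#CHROM"):
            n_header += 1
            # Sample names start at the 10th column of the #CHROM line.
            samples = line.split("\t")[9:]
            non_ref = {s: 0 for s in samples}
            continue
        fields = line.split("\t")
        n_records += 1
        fmt = fields[8].split(":") if len(fields) > 8 else []
        if "GT" not in fmt:
            continue
        gt_idx = fmt.index("GT")
        for sample, call in zip(samples, fields[9:]):
            gt = call.split(":")[gt_idx]
            alleles = gt.replace("|", "/").split("/")
            # Count a call as non-reference if any allele is not 0 or missing.
            if any(a not in ("0", ".") for a in alleles):
                non_ref[sample] += 1
    return n_header, n_records, non_ref


# Hypothetical two-sample VCF, used only to illustrate the function.
EXAMPLE_VCF = "\n".join([
    "##fileformat=VCFv4.2",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER",
               "INFO", "FORMAT", "S1", "S2"]),
    "\t".join(["chr1", "100", ".", "A", "G", "50", "PASS", ".", "GT", "0/1", "0/0"]),
    "\t".join(["chr1", "200", ".", "C", "T", "60", "PASS", ".", "GT", "./.", "1/1"]),
]) + "\n"

header_lines, records, per_sample = summarize_vcf(io.StringIO(EXAMPLE_VCF))
print(header_lines, records, per_sample)
```

For a real file, the same function could be called on an open text handle (for a gzipped VCF, `gzip.open(path, "rt")`). A file with header lines but zero records would mean that the selection step filtered everything out, rather than that the tool failed to write output.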