Using GATK UnifiedGenotyper on a pooled sample
I have sequencing data from a pool of 80 individuals.
The individuals were not indexed, so I have just one FASTQ file and one BAM file, and I cannot separate the reads by sample.
I am trying to use GATK UnifiedGenotyper v3.3 with the "-ploidy" (--sample_ploidy) option.
I have a few questions.
- command :
java -Xmx100g -jar /ruby/Tools/GATK/GenomeAnalysisTK-3.3/GenomeAnalysisTK.jar -T UnifiedGenotyper -R ./Ref.fasta -I 1.bam -o 1_unified.vcf --sample_ploidy 160 -minIndelFrac 0.05 --genotype_likelihoods_model BOTH -pnrm EXACT_GENERAL_PLOIDY -nct 4 -nt 10
--sample_ploidy = 160 = 80 (pooled individuals) * 2 (diploid)
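To spell out the arithmetic behind the ploidy value, here is a minimal shell sketch. The variable names are my own and purely illustrative; only the numbers come from my setup. It also prints how many alternate-allele counts are possible at a biallelic site with this ploidy, which I assume is part of why the run is so heavy.

```shell
#!/bin/sh
# Hypothetical variable names; the values are the ones from this post.
POOL_SIZE=80               # individuals in the pool
PLOIDY_PER_INDIVIDUAL=2    # diploid organism

# Total number of chromosome copies in the pooled sample.
SAMPLE_PLOIDY=$((POOL_SIZE * PLOIDY_PER_INDIVIDUAL))

# At a biallelic site the alternate allele can be present on
# 0, 1, ..., SAMPLE_PLOIDY copies, so there are SAMPLE_PLOIDY + 1 cases.
BIALLELIC_ALT_COUNTS=$((SAMPLE_PLOIDY + 1))

echo "--sample_ploidy $SAMPLE_PLOIDY"
echo "possible alt-allele counts at a biallelic site: $BIALLELIC_ALT_COUNTS"
```

With 80 diploid individuals this prints a ploidy of 160 and 161 possible alternate-allele counts per biallelic site.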
BAM file size = 3.4 GB
Server spec:
CPU cores = 40
Memory = 256 GB
I started the job 4 days ago.
Progress is at 0.3%, and the estimated remaining time is 176.9 weeks.
That is far too slow to be practical.
How long should GATK UnifiedGenotyper normally take on data like this?
Are there any recommendations for speeding up this job?