GenotypeGVCFs with draft quality reference genome
I am using the GATK pipeline to call variants by aligning reads to a draft-quality reference genome that consists of ~367,000 scaffolds. I split the scaffolds into 50 intervals and successfully (and pretty quickly) generated GVCFs for 25 individuals using the -L option. However, I am having the worst of times with GenotypeGVCFs. After running for nearly 2 days on the first interval list, GenotypeGVCFs had not output a file at all. Based on another post in the forum, I removed the scaffolds that are NOT in the interval from the GVCF header, and that sped up the process slightly: after about 18 hours I have a combined VCF file that contains only the header. I'm not sure how much longer the process will take, as the progress meter doesn't seem to make any sense.
Are there any known ways to optimize this process?
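For anyone who wants to try the same header trimming, here is a minimal Python sketch of the idea. It is not the exact script I used. The function names are hypothetical, and it assumes the interval list is either Picard-style (tab-separated, with `@` header lines) or one `contig` / `contig:start-stop` entry per line. It only drops `##contig` header lines for scaffolds outside the interval and leaves everything else untouched.

```python
import re

# Matches VCF header lines such as: ##contig=<ID=scaffold_1,length=500>
CONTIG_RE = re.compile(r"^##contig=<ID=([^,>]+)")
# Matches GATK-style intervals such as: scaffold_7:1-200
RANGE_RE = re.compile(r"^(.*):\d+(?:-\d+)?$")


def read_interval_contigs(lines):
    """Return the set of contig names named in an interval list.

    Assumption: lines starting with '@' are Picard-style header lines,
    and the contig name is the first tab-separated field.
    """
    contigs = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("@"):
            continue
        field = line.split("\t", 1)[0]
        match = RANGE_RE.match(field)
        contigs.add(match.group(1) if match else field)
    return contigs


def trim_contig_header(vcf_lines, keep):
    """Yield VCF lines, skipping ##contig lines whose ID is not in `keep`."""
    for line in vcf_lines:
        match = CONTIG_RE.match(line)
        if match and match.group(1) not in keep:
            continue
        yield line
```

The trimmed text would still need to be written back out, and for `.g.vcf.gz` inputs presumably recompressed and reindexed before GATK will read it; that part is left out here.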
Currently using the following command:
java -Djava.io.tmpdir=/data/lwwvd/genoGVCF.tmp -XX:ParallelGCThreads=4 -Xmx15g -jar /usr/local/bin/GenomeAnalysisTK-3.4-46/GenomeAnalysisTK.jar -nt 16 -T GenotypeGVCFs -R ../ref_genomes/bbu_ref_UMD_CASPUR_WB_2.0.fa -L interval_lists/bbub.refctgs.49.interval_list -V ./1095/1095.49.g.vcf.gz -V ./189/189.49.g.vcf.gz -V ./190/190.49.g.vcf.gz -V ./196/196.49.g.vcf.gz -V ./246/246.49.g.vcf.gz -V ./337/337.49.g.vcf.gz -V ./581/581.49.g.vcf.gz -V ./583/583.49.g.vcf.gz -V ./662/662.49.g.vcf.gz -V ./701/701.49.g.vcf.gz -V ./850/850.49.g.vcf.gz -V ./92764/92764.49.g.vcf.gz -V ./92765/92765.49.g.vcf.gz -V ./92766/92766.49.g.vcf.gz -V ./92767/92767.49.g.vcf.gz -V ./92768/92768.49.g.vcf.gz -V ./92769/92769.49.g.vcf.gz -V ./92770/92770.49.g.vcf.gz -V ./92771/92771.49.g.vcf.gz -V ./92774/92774.49.g.vcf.gz -V ./92775/92775.49.g.vcf.gz -V ./92776/92776.49.g.vcf.gz -V ./92777/92777.49.g.vcf.gz -V ./92778/92778.49.g.vcf.gz -V ./92795/92795.49.g.vcf.gz -o BBUB.combined.49.vcf
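The -V arguments in the command above all follow the pattern `./<sample>/<sample>.<interval>.g.vcf.gz`, so the list can be generated per interval rather than typed out. A small sketch, assuming that directory layout holds for every sample and interval; `build_variant_args` is a hypothetical helper name, not part of GATK.

```python
def build_variant_args(sample_ids, interval_index):
    """Build the repeated '-V <path>' arguments for one interval.

    Assumes each GVCF lives at ./<sample>/<sample>.<interval>.g.vcf.gz,
    as in the command shown above.
    """
    args = []
    for sample_id in sample_ids:
        args += ["-V", f"./{sample_id}/{sample_id}.{interval_index}.g.vcf.gz"]
    return args


# Example: the first three samples from the command, for interval 49.
variant_args = build_variant_args(["1095", "189", "190"], 49)
print(" ".join(variant_args))
```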