Why does GenotypeGVCFs run so slowly on my data?

My code:
SAMPLE=" -V d_19_HC.gvcf -V d_F14_HC.gvcf -V d_F8_HC.gvcf -V d_M12_HC.gvcf -V d_M5_HC.gvcf -V d_X4_HC.gvcf -V d_X9_HC.gvcf "

java -Xmx8g -jar $GATK -T GenotypeGVCFs -R $GENOME -nt 32 $SAMPLE -o gvcf_vcf_test1.vcf -L CI01000059 >gvcf_vcf_test1.log 2>&1

My log file contains records like the following:

WARN 01:12:58,923 FSLockWithShared$LockAcquisitionTask - WARNING: Unable to lock file /lustre/home/xqxia/yxchen/HC_gvcf/gvcf_vcf/d_19_HC.gvcf.idx because an IOException occurred with message: Function not implemented.
INFO 01:12:58,925 RMDTrackBuilder - Could not acquire a shared lock on index file /lustre/home/xqxia/yxchen/HC_gvcf/gvcf_vcf/d_19_HC.gvcf.idx, falling back to using an in-memory index for this GATK run.
WARN 02:46:50,988 FSLockWithShared$LockAcquisitionTask - WARNING: Unable to lock file /lustre/home/xqxia/yxchen/HC_gvcf/gvcf_vcf/d_F14_HC.gvcf.idx because an IOException occurred with message: Function not implemented.

I am using GATK version 3.6.

Answers

  • shlee (Cambridge): Member, Administrator, Broadie, Moderator

    Hi @Lei_Xia,

    When I search our forum for your WARN message "FSLockWithShared$LockAcquisitionTask", the top hit is this thread. It describes the --disable_auto_index_creation_and_locking_when_reading_rods parameter, which relates to the WARN you're seeing.
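As a rough sketch of how that parameter could be added to the command from the question (untested against this data, so treat it as an assumption rather than a confirmed fix): the script below rebuilds the original command with the extra flag and, by default, only prints it. The DRY_RUN switch and the placeholder defaults for GATK and GENOME are hypothetical helpers added for illustration.

```shell
#!/bin/sh
# Sketch: the command from the question plus the index-locking flag.
# GATK and GENOME fall back to placeholder names if they are not set.
GATK="${GATK:-GenomeAnalysisTK.jar}"
GENOME="${GENOME:-genome.fa}"
# DRY_RUN is a hypothetical switch: 1 prints the command, anything else runs it.
DRY_RUN="${DRY_RUN:-1}"

# Build the "-V sample.gvcf" arguments for the seven gVCFs in the question.
VARIANTS=""
for s in d_19 d_F14 d_F8 d_M12 d_M5 d_X4 d_X9; do
  VARIANTS="$VARIANTS -V ${s}_HC.gvcf"
done

# Same options as the original command; only the locking flag is new.
CMD="java -Xmx8g -jar $GATK -T GenotypeGVCFs -R $GENOME -nt 32$VARIANTS \
--disable_auto_index_creation_and_locking_when_reading_rods \
-o gvcf_vcf_test1.vcf -L CI01000059"

if [ "$DRY_RUN" = "1" ]; then
  echo "$CMD"
else
  # Unquoted on purpose so the string splits into arguments;
  # this assumes none of the paths contain spaces.
  $CMD > gvcf_vcf_test1.log 2>&1
fi
```

Running it with DRY_RUN=0 would execute the command instead of printing it. Since the flag turns off automatic index creation, it is probably safest to make sure each .gvcf already has its .idx file next to it.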

    Also, with 32 threads sharing an 8 GB heap (-Xmx8g), there may not be enough memory to go around, so memory may spill to disk. Try using fewer threads and see whether the speed changes. If threading is the issue, you can then work out the optimal thread/memory allocation. Alternatively, you could redesign your parallelism around genomic intervals, e.g. per chromosome, as implemented in this workflow for BQSR and, differently, for the HaplotypeCaller step.
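One way the two suggestions above might be combined is sketched here: a lower thread count and one GenotypeGVCFs job per interval. Several names are assumptions made for illustration. THREADS=4 is an arbitrary starting point, not a recommended value. Only CI01000059 comes from the question; contig_B and contig_C are hypothetical placeholders for other contigs in the reference. DRY_RUN is a hypothetical switch that prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: one GenotypeGVCFs job per interval, each with fewer threads.
GATK="${GATK:-GenomeAnalysisTK.jar}"
GENOME="${GENOME:-genome.fa}"
DRY_RUN="${DRY_RUN:-1}"      # hypothetical: 1 prints commands, else runs them
THREADS="${THREADS:-4}"      # arbitrary value below 32; tune by experiment
# CI01000059 is from the question; the other two names are placeholders.
INTERVALS="${INTERVALS:-CI01000059 contig_B contig_C}"

# Build the "-V sample.gvcf" arguments for the seven gVCFs in the question.
VARIANTS=""
for s in d_19 d_F14 d_F8 d_M12 d_M5 d_X4 d_X9; do
  VARIANTS="$VARIANTS -V ${s}_HC.gvcf"
done

JOBS=0
for interval in $INTERVALS; do
  # Each job is restricted to one interval and writes its own VCF.
  CMD="java -Xmx8g -jar $GATK -T GenotypeGVCFs -R $GENOME -nt $THREADS$VARIANTS -L $interval -o genotyped_${interval}.vcf"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$CMD"
  else
    # Runs the jobs one after another; assumes paths contain no spaces.
    $CMD > "genotyped_${interval}.log" 2>&1
  fi
  JOBS=$((JOBS + 1))
done
echo "prepared $JOBS per-interval jobs"
```

As written, the jobs run sequentially. On a cluster, each printed command could instead be submitted as its own scheduler job so the intervals are processed in parallel. The per-interval VCFs would still need to be concatenated afterwards, which this sketch leaves out.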
