VariantRecalibrator file lock error

kschweig, california, Member

Hi,
I'm generating a VCF file with HaplotypeCaller as follows:
java -jar /usr/local/src/GenomeAnalysisTK-2.7-4-g6f46d11/GenomeAnalysisTK.jar \
-T HaplotypeCaller \
-R /data/Genomes/GATK/resources/bwaIndex/ucsc.hg19.fasta \
-I reduced_reads.bam \
--genotyping_mode DISCOVERY \
-stand_emit_conf 10 \
-stand_call_conf 30 \
-o raw_variants.vcf

When I then run the following command, I get an error about a file lock:
java -jar /usr/local/src/GenomeAnalysisTK-2.7-4-g6f46d11/GenomeAnalysisTK.jar \
-T VariantRecalibrator \
-R /data/Genomes/GATK/resources/bwaIndex/ucsc.hg19.fasta \
-input raw_variants.vcf \
-resource:hapmap,known=false,training=true,truth=true,prior=15.0 /data/Genomes/GATK/resources/hapmap_3.3.hg19.vcf \
-resource:omni,known=false,training=true,truth=true,prior=12.0 /data/Genomes/GATK/resources/1000G_omni2.5.hg19.vcf \
-resource:1000G,known=false,training=true,truth=false,prior=10.0 /data/Genomes/GATK/resources/1000G_phase1.snps.high_confidence.hg19.vcf \
-resource:dbsnp,known=true,training=false,truth=false,prior=2.0 /data/Genomes/GATK/resources/dbsnp_137.hg19.vcf \
-an DP \
-an QD \
-an FS \
-an MQRankSum \
-an ReadPosRankSum \
-mode SNP \
-tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 \
-recalFile recalibrate_SNP.recal \
-tranchesFile recalibrate_SNP.tranches \
-rscriptFile recalibrate_SNP_plots.R

INFO 10:12:03,230 HelpFormatter - --------------------------------------------------------------------------------
INFO 10:12:03,232 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.7-4-g6f46d11, Compiled 2013/10/10 17:27:51
INFO 10:12:03,232 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 10:12:03,232 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 10:12:03,235 HelpFormatter - Program Args: -T VariantRecalibrator -R /data/Genomes/GATK/resources/bwaIndex/ucsc.hg19.fasta -input raw_variants_vcftools.recode.vcf -resource:hapmap,known=false,training=true,truth=true,prior=15.0 /data/Genomes/GATK/resources/hapmap_3.3.hg19.vcf -resource:omni,known=false,training=true,truth=true,prior=12.0 /data/Genomes/GATK/resources/1000G_omni2.5.hg19.vcf -resource:1000G,known=false,training=true,truth=false,prior=10.0 /data/Genomes/GATK/resources/1000G_phase1.snps.high_confidence.hg19.vcf -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 /data/Genomes/GATK/resources/dbsnp_137.hg19.vcf -an DP -an QD -an FS -an MQRankSum -an ReadPosRankSum -mode SNP -tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 -recalFile recalibrate_SNP.recal -tranchesFile recalibrate_SNP.tranches -rscriptFile recalibrate_SNP_plots.R
INFO 10:12:03,235 HelpFormatter - Date/Time: 2014/01/09 10:12:03
INFO 10:12:03,235 HelpFormatter - --------------------------------------------------------------------------------
INFO 10:12:03,235 HelpFormatter - --------------------------------------------------------------------------------
INFO 10:12:03,247 ArgumentTypeDescriptor - Dynamically determined type of raw_variants_vcftools.recode.vcf to be VCF
INFO 10:12:03,249 ArgumentTypeDescriptor - Dynamically determined type of /data/Genomes/GATK/resources/hapmap_3.3.hg19.vcf to be VCF
INFO 10:12:03,250 ArgumentTypeDescriptor - Dynamically determined type of /data/Genomes/GATK/resources/1000G_omni2.5.hg19.vcf to be VCF
INFO 10:12:03,251 ArgumentTypeDescriptor - Dynamically determined type of /data/Genomes/GATK/resources/1000G_phase1.snps.high_confidence.hg19.vcf to be VCF
INFO 10:12:03,252 ArgumentTypeDescriptor - Dynamically determined type of /data/Genomes/GATK/resources/dbsnp_137.hg19.vcf to be VCF
INFO 10:12:03,298 GenomeAnalysisEngine - Strictness is SILENT
INFO 10:12:03,373 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO 10:12:34,409 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR A USER ERROR has occurred (version 2.7-4-g6f46d11):
ERROR
ERROR This means that one or more arguments or inputs in your command are incorrect.
ERROR The error message below tells you what is the problem.
ERROR
ERROR If the problem is an invalid argument, please check the online documentation guide
ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
ERROR
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
ERROR
ERROR MESSAGE: Timeout of 30000 milliseconds was reached while trying to acquire a lock on file /mnt/gnas1/Analysis/WEX/GATK/200-013-3_20_13_L7.LB2/raw_variants_vcftools.recode.vcf.idx. Since the GATK uses non-blocking lock acquisition calls that are not supposed to wait, this implies a problem with the file locking support in your operating system.
ERROR ------------------------------------------------------------------------------------------

I read some posts about bad VCF headers, but I validated the file with vcftools:
vcf-validator raw_variants.vcf
This yielded no output, so I assume it's OK.

I then used vcftools to write a new VCF and indexed it with igvtools:
vcftools --vcf raw_variants.vcf --out raw_variants_vcftools --recode
which gave raw_variants_vcftools.recode.vcf, then
igvtools index raw_variants_vcftools.recode.vcf

I reran the GATK command on the new VCF and got the same error.
Any ideas why this is happening? I am running RHEL 6.
Thanks so much,
Karl
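
Since the error message points at file locking on the path that holds the VCF index, one way to narrow this down is to check what kind of filesystem that directory lives on and whether a non-blocking lock can be taken there at all. The sketch below is only a rough probe under stated assumptions: `fs_type` and `probe_lock` are hypothetical helper names, it relies on GNU `stat` and the util-linux `flock` utility being installed, and `flock` may not use the same locking mechanism as the JVM, so a passing probe does not prove that GATK will succeed.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic helpers (not part of GATK).

# Print the filesystem type (for example nfs or ext4) of the directory
# that holds the given file. Assumes GNU stat.
fs_type() {
    stat -f -c %T "$(dirname "$1")"
}

# Try to take an exclusive, non-blocking lock on a scratch file in the
# given directory. Prints "ok" if the lock is acquired immediately and
# "failed" otherwise. Assumes the util-linux flock utility.
probe_lock() {
    local dir="$1"
    local probe="$dir/.lock_probe.$$"
    if flock -n "$probe" -c true; then
        echo ok
    else
        echo failed
    fi
    rm -f "$probe"
}

# Example usage, with the directory taken from the error message:
#   fs_type /mnt/gnas1/Analysis/WEX/GATK/200-013-3_20_13_L7.LB2/raw_variants_vcftools.recode.vcf
#   probe_lock /mnt/gnas1/Analysis/WEX/GATK/200-013-3_20_13_L7.LB2
```

If `fs_type` reports a network filesystem and `probe_lock` hangs or prints "failed", that would be consistent with a locking problem on the mount rather than with the VCF itself.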

Answers

  • kschweig, california, Member

    I found that when I copy the VCF and its index to a different directory that is not remotely mounted, the error goes away. It must be something about how the code interacts with the remotely mounted filesystem. There are no permissions issues, and GATK reads and writes other data on that filesystem. I am using Java 1.7, however, so maybe that is the issue. It's difficult to roll back to Java 1.6 on a RHEL machine, and not easy to make the two coexist either. I will await the next incarnation of the GATK tools for Java 1.7. Thanks much for the help.

  • Geraldine_VdAuwera, Cambridge, MA, Member, Administrator, Broadie admin

    The current GATK versions actually require Java 1.7 (since 2.7 if I remember correctly) so you're fine there. I don't expect that you're going to see any difference with a future version of GATK regarding your remote mount system, as we're not currently doing any infrastructure work that would affect this behavior, so I would encourage you to find a workaround to use the current tools.

  • gabow, Member

    This is too bad. We've had this problem too, and I was hoping there would be a way to set a timeout parameter, given that FSLockWithShared(File file) already has an alternate constructor, FSLockWithShared(File file, int lockAcquisitionTimeout).
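
The workaround reported in this thread (copying the VCF and its index to a directory that is not remotely mounted) could be scripted roughly as follows. This is a minimal sketch, not an official GATK procedure: `stage_local` is a hypothetical helper name, `/tmp/gatk_local` is a placeholder for any local directory, and the `.idx` suffix is taken from the error message above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GATK): copy a VCF, plus its .idx index
# if one exists, into a local directory and print the path of the copy.
stage_local() {
    local vcf="$1"
    local dest="$2"
    mkdir -p "$dest" || return 1
    # -p preserves modification times on the copies
    cp -p "$vcf" "$dest/" || return 1
    if [ -f "$vcf.idx" ]; then
        cp -p "$vcf.idx" "$dest/" || return 1
    fi
    printf '%s/%s\n' "$dest" "$(basename "$vcf")"
}

# Example usage (the destination directory is a placeholder):
#   local_vcf=$(stage_local /mnt/gnas1/Analysis/WEX/GATK/200-013-3_20_13_L7.LB2/raw_variants_vcftools.recode.vcf /tmp/gatk_local)
#   Then pass "$local_vcf" as the -input argument of the VariantRecalibrator command.
```

The output files (`-recalFile`, `-tranchesFile`, `-rscriptFile`) can presumably stay on the remote mount, since the poster reports that GATK reads and writes other data there without trouble.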
