A sudden stop when running GenotypeGVCFs

cougarlj, Hong Kong, Member

My command was:
java -jar /nfs/home/gatk/GenomeAnalysisTK.jar \
-T GenotypeGVCFs \
-R /nfs/home/tool/gatk/bundle/2.8/hg19/ucsc.hg19.fasta \
--variant /home/sample/sample4/AD_1.g.vcf \
--variant /home/sample/sample4/AD_2.g.vcf \
--variant /home/sample/sample4/AD_3.g.vcf \
--variant /home/sample/sample4/AD_4.g.vcf \
-o /home/sample/sample4/AD_raw_variants.vcf

It ran smoothly at first, but then stopped suddenly at chr19. The console output was:
INFO 17:47:12,334 ProgressMeter - chr19:11563684 5.32049E7 7.6 h 8.6 m 85.1% 9.0 h 80.0 m
INFO 17:47:42,335 ProgressMeter - chr19:12766650 5.3285579E7 7.6 h 8.6 m 85.2% 9.0 h 79.8 m
INFO 17:47:47,835 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.NumberFormatException: For input string: "495"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:481)
at java.lang.Integer.parseInt(Integer.java:527)
at org.broadinstitute.gatk.tools.walkers.variantutils.ReferenceConfidenceVariantContextMerger.merge(ReferenceConfidenceVariantContextMerger.java:161)
at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.map(GenotypeGVCFs.java:257)
at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.map(GenotypeGVCFs.java:129)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:267)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:255)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:99)
at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:315)
at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:121)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:106)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.5-0-g36282e4):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: For input string: "495"
ERROR ------------------------------------------------------------------------------------------

jiangli@darwin:~$ ^C

The .g.vcf files were created by CombineGVCFs. Is there something wrong with these gVCF files?

Best Wishes,
River Lee
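
A NumberFormatException whose message shows what looks like a valid integer, such as "495", can occur when the string contains a character that does not display, for example a NUL or other control byte. Whether that is the cause here is not established in this thread. The sketch below is one way to look for such values in a gVCF; `find_suspect_tokens` is a hypothetical helper name, and the choice of separators is an assumption about how the fields are split.

```python
import re

# Separators used between VCF columns and inside INFO/FORMAT/genotype fields.
_SEPARATORS = re.compile(r"[\t:;,=|/]")


def _is_hidden(ch):
    """True for spaces, control characters, DEL, and anything non-ASCII."""
    return ord(ch) <= 32 or ord(ch) >= 127


def find_suspect_tokens(lines):
    """Yield (line_number, repr(token)) for tokens that contain a digit
    together with a character that would not be visible in a log message."""
    for line_number, line in enumerate(lines, start=1):
        if line.startswith("#"):
            continue  # header lines legitimately contain spaces
        for token in _SEPARATORS.split(line.rstrip("\n")):
            has_digit = any(c.isdigit() for c in token)
            if has_digit and any(_is_hidden(c) for c in token):
                yield line_number, repr(token)
```

To run it on a file, the lines could be read with `open(path, encoding="latin-1")` so that every byte decodes to some character; `repr` makes any hidden character visible in the report.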

Answers

  • cougarlj, Hong Kong, Member

    Dear Sheila:

    I tried to use the ValidateVariants tool, but it didn't work. My command was:

    java -jar /nfs/home/jiangli/tool/gatk/GenomeAnalysisTK.jar \
    -T ValidateVariants \
    -R /nfs/home/jiangli/tool/gatk/bundle/2.8/hg19/ucsc.hg19.fasta \
    -V /home/jiangli/sample/sample4/AD_1.g.vcf \
    --validateGVCF \
    --validationTypeToExclude ALL

    The error output was:

    ERROR ------------------------------------------------------------------------------------------
    ERROR A USER ERROR has occurred (version 3.5-0-g36282e4):
    ERROR
    ERROR This means that one or more arguments or inputs in your command are incorrect.
    ERROR The error message below tells you what is the problem.
    ERROR
    ERROR If the problem is an invalid argument, please check the online documentation guide
    ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
    ERROR
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR
    ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
    ERROR
    ERROR MESSAGE: Argument with name 'validateGVCF' isn't defined.
    ERROR ------------------------------------------------------------------------------------------

    Then I downloaded the newest version, GATK 3.6, and ran the same command, but it still didn't work. The error output was:

    Exception in thread "main" java.lang.UnsupportedClassVersionError: org/broadinstitute/gatk/engine/CommandLineGATK : Unsupported major.minor version 52.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:482)

    I'm really confused. I look forward to your reply. :smile:

    Best Wishes,
    River Lee
    :smile:

  • Sheila, Broad Institute, Member, Broadie, Moderator
    Accepted Answer

    @cougarlj
    Hi River Lee,

    Yes, you must use the latest GATK version (3.6) to validate the GVCF. However, 3.6 runs only on Java 1.8, not 1.7. The error from the latest version occurs because you are running it with Java 1.7 instead of 1.8.

    -Sheila
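
The "Unsupported major.minor version 52.0" message refers to the class-file format version: major version 52 is the format written for Java 8 (1.8), and 51 is Java 7 (1.7). As a small sketch of that mapping, the hypothetical helper `required_java_release` below extracts the number from such a message and converts it to a Java release.

```python
import re


def required_java_release(error_message):
    """Return the Java release implied by an 'Unsupported major.minor version'
    message, or None if the message does not contain one."""
    match = re.search(r"Unsupported major\.minor version (\d+)\.\d+", error_message)
    if match is None:
        return None
    major = int(match.group(1))
    # Class-file major version 45 corresponds to Java 1.1 and each later
    # release adds one, so 51 maps to Java 7 and 52 maps to Java 8.
    return major - 44
```

Comparing the result with the release reported by `java -version` on the machine shows whether the installed runtime is too old for the jar.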

  • SteveL, Barcelona, Member

    Hi @Sheila,
    I guess we don't know if this ever got resolved, @cougarlj, but I seem to have encountered a very similar issue.

    I have a cohort of 200 gVCFs that I successfully ran through to VCF using GenotypeGVCFs, with version 3.6 throughout. However, since I have more samples to process and have seen inconsistencies when using CombineGVCFs/GenotypeGVCFs in previous versions, I am now running the same 200 gVCFs through CombineGVCFs and then GenotypeGVCFs (also version 3.6, of course). I understand that the output should be virtually identical.

    However, when I take my combined gVCFs by chromosome, e.g. chr1 with 200 samples, and try to run just GenotypeGVCFs on them, I get the following error, which looks similar to the problem River was having. The run starts fine, so I show only the second half of the output below.

     module load java/1.8.0u31;
     java -Xmx21000m -Djava.io.tmpdir=$TMPDIR -jar /apps/GATK/3.6/GenomeAnalysisTK.jar  -T GenotypeGVCFs  -nt 4 -R hsapiens.hs37d5.fasta --variant CombinedGVCFsOut.22.combined.g.vcf.gz --never_trim_vcf_format_field  -o CombinedGVCFsOut.22.genotyped.vcf.gz
    

    INFO 22:52:08,458 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO 22:52:08,458 ProgressMeter - | processed | time | per 1M | | total | remaining
    INFO 22:52:08,459 ProgressMeter - Location | sites | elapsed | sites | completed | runtime | runtime
    DEBUG 2016-09-14 22:52:08 BlockCompressedOutputStream Using deflater: Deflater
    WARN 22:52:08,551 StrandBiasTest - StrandBiasBySample annotation exists in input VCF header. Attempting to use StrandBiasBySample values to calculate strand bias annotation values. If no sample has the SB genotype annotation, annotation may still fail.
    WARN 22:52:08,553 StrandBiasTest - StrandBiasBySample annotation exists in input VCF header. Attempting to use StrandBiasBySample values to calculate strand bias annotation values. If no sample has the SB genotype annotation, annotation may still fail.
    INFO 22:52:08,553 GenotypeGVCFs - Notice that the -ploidy parameter is ignored in GenotypeGVCFs tool as this is automatically determined by the input variant files
    WARN 22:52:28,151 HaplotypeScore - Annotation will not be calculated, must be called from UnifiedGenotyper, not org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs

    ERROR --
    ERROR stack trace

    java.lang.NumberFormatException: For input string: "."
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:569)
    at java.lang.Integer.parseInt(Integer.java:615)
    at org.broadinstitute.gatk.tools.walkers.annotator.StrandBiasTest.encodeSBBS(StrandBiasTest.java:357)
    at org.broadinstitute.gatk.tools.walkers.annotator.StrandBiasTest.getTableFromSamples(StrandBiasTest.java:189)
    at org.broadinstitute.gatk.tools.walkers.annotator.FisherStrand.calculateAnnotationFromGTfield(FisherStrand.java:112)
    at org.broadinstitute.gatk.tools.walkers.annotator.StrandBiasTest.annotate(StrandBiasTest.java:130)
    at org.broadinstitute.gatk.tools.walkers.annotator.VariantAnnotatorEngine.annotateContext(VariantAnnotatorEngine.java:221)
    at org.broadinstitute.gatk.tools.walkers.annotator.VariantAnnotatorEngine.annotateContext(VariantAnnotatorEngine.java:203)
    at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.regenotypeVC(GenotypeGVCFs.java:306)
    at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.map(GenotypeGVCFs.java:264)
    at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.map(GenotypeGVCFs.java:132)
    at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:267)
    at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:255)
    at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
    at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
    at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144)
    at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
    at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
    at org.broadinstitute.gatk.engine.executive.ShardTraverser.call(ShardTraverser.java:98)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 3.6-0-g89b7209):
    ERROR
    ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
    ERROR If not, please post the error message, with stack trace, to the GATK forum.
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions https://www.broadinstitute.org/gatk
    ERROR
    ERROR MESSAGE: For input string: "."
    ERROR ------------------------------------------------------------------------------------------

    JOB CANCEL : Error in java application

    Therefore, I tried to run ValidateVariants, as you suggested above, on one of the combined GVCF files. It seemed to be working OK, but then it broke very near the end of the process (within the last megabase). Again, I show only the end of the output:

    java -Xmx21000m -Djava.io.tmpdir=$TMPDIR -jar /apps/GATK/3.6/GenomeAnalysisTK.jar -T ValidateVariants -R hsapiens.hs37d5.fasta -V CombinedGVCFsOut.22.combined.g.vcf.gz --validateGVCF

    ...
    INFO 23:32:16,213 ValidateVariants - Reference allele is too long (122) at position 22:51082105; skipping that record. Set --reference_window_stop >= 122

    ERROR --
    ERROR stack trace

    java.lang.NullPointerException
    at org.broadinstitute.gatk.tools.walkers.variantutils.ValidateVariants.onTraversalDone(ValidateVariants.java:255)
    at org.broadinstitute.gatk.tools.walkers.variantutils.ValidateVariants.onTraversalDone(ValidateVariants.java:126)
    at org.broadinstitute.gatk.engine.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
    at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:116)
    at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:311)
    at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:255)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:157)
    at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:108)

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 3.6-0-g89b7209):
    ERROR
    ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
    ERROR If not, please post the error message, with stack trace, to the GATK forum.
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions https://www.broadinstitute.org/gatk
    ERROR
    ERROR MESSAGE: Code exception (see stack trace for error itself)
    ERROR ------------------------------------------------------------------------------------------

    Any help would be greatly appreciated.
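
The first stack trace above fails in StrandBiasTest.encodeSBBS while parsing "." as an integer, and the warning before it says the tool is reading StrandBiasBySample (SB) values from the input. One way to narrow this down is to list the records whose SB genotype field is not made up entirely of integers. The sketch below assumes a standard tab-separated VCF layout with FORMAT in column 9; `find_bad_sb` is a hypothetical helper name. A lone "." is the VCF missing value and may be harmless, so it is skipped unless `include_missing=True`.

```python
import re

_INTEGER = re.compile(r"-?[0-9]+")


def find_bad_sb(lines, include_missing=False):
    """Yield (chrom, pos, sample, value) for SB entries that are not all integers."""
    samples = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("##"):
            continue
        cols = line.split("\t")
        if line.startswith("#CHROM"):
            samples = cols[9:]  # sample names follow the FORMAT column
            continue
        if len(cols) < 10:
            continue  # no genotype columns on this record
        keys = cols[8].split(":")
        if "SB" not in keys:
            continue
        sb_index = keys.index("SB")
        for i, genotype in enumerate(cols[9:]):
            fields = genotype.split(":")
            if sb_index >= len(fields):
                continue  # trailing fields may be omitted for a sample
            value = fields[sb_index]
            if value == "." and not include_missing:
                continue
            if not all(_INTEGER.fullmatch(part) for part in value.split(",")):
                name = samples[i] if i < len(samples) else f"sample_{i}"
                yield cols[0], int(cols[1]), name, value
```

For a bgzipped file such as the one in the command above, the lines could be read with `gzip.open(path, "rt")`. The positions it reports could then be compared between the per-sample gVCFs and the CombineGVCFs output.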

  • Sheila, Broad Institute, Member, Broadie, Moderator

    @SteveL
    Hi,

    Can you confirm you are running the exact same version of GATK throughout your entire analysis? Did you generate the GVCFs using version 3.6?

    Thanks,
    Sheila

  • SteveL, Barcelona, Member

    Hi @Sheila, yes, I can confirm that for all steps, for this chromosome at least, I used version 3.6-0-g89b7209, which I think is the major release version. For a couple of the other chromosomes I had a few Java hash issues and had to use a nightly build, but I have checked my commands for this chromosome and it was not affected.

    Thanks, Steve

  • Sheila, Broad Institute, Member, Broadie, Moderator

    @SteveL
    Hi Steve,

    Hmm. Can you submit a bug report? Instructions are here.

    Thanks,
    Sheila
