GATK 4.beta.2 vs. 4.beta.3 performance

Hi, I've been keeping up with the GATK 4.0 beta releases and I've noticed some performance differences between my runs of version 4.beta.2 and 4.beta.3. That is, 4.beta.3 tools seem to have a longer execution time. The runtimes and tools in question are:

Tool 4.beta.2 4.beta.3
BaseRecalibrator 1m 3s 3m 3s
ApplyBQSR (scattered) 4m 48s 11m 51s
HaplotypeCaller (scattered) 23m 42s 29m 7s
GenotypeGVCFs (scattered) 4m 6s 9m 28s
VariantRecalibrator (for SNPs) 4m 7s 6m 38s
VariantRecalibrator (for INDELs) 2m 7s 4m 8s
ApplyVQSR (for SNPs) 37s 2m 36s
ApplyVQSR (for INDELs) 39s 2m 35s

Note: The scattered tools are not the Spark version, but are just scattered on intervals from a custom BED file. The FASTQs used for these runs are synthetic exome files (just for quick and easy testing purposes). The computation was done in cloud, not on a local machine. The roughly same runtimes were acquired after reruns.

I haven't found anything in the release notes that would suggest these results. Are these runtimes to be expected?

Thank you for your help!

Tagged:

Issue · Github
by Sheila

Issue Number
2440
State
closed
Last Updated
Assignee
Array
Milestone
Array
Closed By
chandrans

Answers

  • SheilaSheila Broad InstituteMember, Broadie, Moderator admin

    @teodora_aleksic
    Hi,

    Let me check with the team and get back to you.

    -Sheila

    Issue · Github
    by Sheila

    Issue Number
    3491
    State
    open
    Last Updated
    Assignee
    Array
    Milestone
    Array
  • SheilaSheila Broad InstituteMember, Broadie, Moderator admin

    @teodora_aleksic
    Hi,

    Thanks for reporting this. I just made an issue ticket which you can keep track of here.

    -Sheila

  • LouisBLouisB Broad InstituteMember, Broadie, Dev

    Hi @teodora_aleksic,

    The numbers you're reporting are very worrying and we'd like to get to the bottom of this. We have an open ticket here https://github.com/broadinstitute/gatk/issues/3491 to track this. On my machine I don't see any major performance difference between beta.2 and beta.3. It seems like it must be something about your data or the way you're running the tools that's making it different than the way I'm running them.

    Would it be possible for you to upload an example command line and dataset to either our ftp server or a public google cloud bucket?

    I want to check though, is it possible that you ran the initial runs on a different machine type than the newer ones, or that something changed in the cloud environment? One thing to check is if you were possibly running on Intel machines originally and on AMD or another brand of machine in the newer runs. We have performance optimizations for intel machines that could produce a change like this. If you rerun with the old version do you get the fast results again?

    Thank you,
    Louis

  • teodora_aleksicteodora_aleksic SerbiaMember

    Hi,

    Sorry for the late response, we went through all our runs again to double check everything.

    We tried some debugging and saved logs that the BaseRecalibrator prints to stdout. What we saw from the logs of the executions that we ran on cloud for 4.beta.2 was the following:

    14:28:42.527 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/opt/gatk-package-4.beta.2-local.jar!/com/intel/gkl/native/libgkl_compression.so
    14:28:42.555 INFO  BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 5
    14:28:42.555 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    14:28:42.555 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : false
    14:28:42.555 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    14:28:42.555 INFO  BaseRecalibrator - Deflater: IntelDeflater
    14:28:42.555 INFO  BaseRecalibrator - Inflater: IntelInflater
    14:28:42.555 INFO  BaseRecalibrator - Initializing engine
    

    And for 4.beta.3:

    14:33:39.416 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/opt/gatk-package-4.beta.3-local.jar!/com/intel/gkl/native/libgkl_compression.so
    14:35:46.843 INFO  BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 5
    14:35:46.843 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    14:35:46.843 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : false
    14:35:46.844 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    14:35:46.844 INFO  BaseRecalibrator - Deflater: IntelDeflater
    14:35:46.844 INFO  BaseRecalibrator - Inflater: IntelInflater
    14:35:46.844 INFO  BaseRecalibrator - GCS max retries/reopens: 20
    14:35:46.844 INFO  BaseRecalibrator - Using google-cloud-java patch 317951be3c2e898e3916a4b1abf5a9c220d84df8
    14:35:46.844 INFO  BaseRecalibrator - Initializing engine
    

    We noticed there being a 2 min delay between the first two entries in the 4.beta.3 log which doesn't exist in the 4.beta.2 log. There aren't any other inconsistencies between the timestamps in the rest of the logs. We went back through all our executions and noticed the 2 min difference between 4.beta.2 and 4.beta.3 for all of the tools that we ran. Just to avoid confusion, for the tools that we scattered manually, the time difference is also 2 min for each scattered instance of the tool. It only appears larger in the table I provided because of the way those instances stack up in the end.

    The results are replicable. We used the exact same sets of files and exact same sets of parameters for both 4.beta.2 and 4.beta.3. Both were executed on the same machines in cloud. The images we used for both versions were built using your broadinstitute/gatk:gatkbase-1.1 image and the already compiled jar files provided in the official releases on your GitHub.

    I hope this information is helpful.

    Thank you,

    Teodora

  • SheilaSheila Broad InstituteMember, Broadie, Moderator admin

    @teodora_aleksic
    Hi Teodora,

    Would you be able to share some data with us? As Louis said, we cannot replicate this with our own data. Instructions for submitting are here.

    Thanks,
    Sheila

  • LouisBLouisB Broad InstituteMember, Broadie, Dev

    @teodora_aleksic Thanks for the additional information. The 2 minute pause is very suspicious. It sounds suspiciously like it's trying some sort of network operation and then waiting for it to time out (which it shouldn't be...). I'm out for a few days but I'll look into that possibility when I get back.

    Are all of your inputs local files? If you can't provide your actual inputs it might be helpful to just provide the command line you're running.

  • teodora_aleksicteodora_aleksic SerbiaMember

    Hi,

    The command lines we used for both 4.beta.2 and 4.beta.3 are:

    BaseRecalibrator:

    java -Xmx2048M -jar gatk-package-4.beta.3-local.jar BaseRecalibrator
    --output table.grp 
    --createOutputBamIndex 
    --knownSites 1000G_phase1.indels.b37.vcf 
    --knownSites Mills_and_1000G_gold_standard.indels.b37.sites.vcf 
    --knownSites dbsnp_137.b37.vcf 
    --input input.bam 
    --reference human_g1k_v37_decoy.fasta 
    --intervals 20
    

    ApplyBQSR:

    java -Xmx2000M -jar gatk-package-4.beta.3-local.jar ApplyBQSR 
    --output output.bam 
    --createOutputBamIndex 
    --bqsr_recal_file table.grp 
    --input input.bam 
    --intervals interval_1.bed
    

    HaplotypeCaller:

    java -Xmx2200M -jar gatk-package-4.beta.3-local.jar HaplotypeCaller 
    --output output.g.vcf 
    --input input.bam 
    --reference human_g1k_v37_decoy.fasta 
    --dbsnp dbsnp_137.b37.vcf 
    --genotyping_mode DISCOVERY 
    --interval_set_rule UNION 
    --standard_min_confidence_threshold_for_calling 30 
    --emitRefConfidence GVCF 
    --intervals interval_1.bed
    

    GenotypeGVCFs:

    java -Xmx2000M -jar gatk-package-4.beta.3-local.jar GenotypeGVCFs 
    --output output.vcf 
    --variant variant.g.vcf 
    --reference human_g1k_v37_decoy.fasta 
    --createOutputVariantIndex
    

    SNP VariantRecalibrator:

    java -Xmx2048M -jar gatk-package-4.beta.3-local.jar VariantRecalibrator 
    --rscript_file file.recal.R 
    --tranches_file file.tranches 
    --output output.recal 
    --use_annotation QD 
    --use_annotation MQRankSum 
    --use_annotation FS 
    --use_annotation DP 
    --use_annotation ReadPosRankSum 
    --use_annotation SOR 
    --variant variant.vcf 
    --resource 1000G,prior=10,truth=true,training=true,known=false:1000G_phase1.snps.high_confidence.b37.vcf 
    --resource omni,prior=12,truth=true,training=true,known=false:1000G_omni2.5.b37.vcf 
    --resource hapmap,prior=15,truth=true,training=true,known=false:hapmap_3.3.b37.vcf 
    --resource dbsnp,prior=2,truth=false,training=false,known=true:dbsnp_137.b37.vcf 
    --mode SNP 
    --reference human_g1k_v37_decoy.fasta 
    --maxGaussians 4
    

    INDEL VariantRecalibrator:

    java -Xmx2048M -jar gatk-package-4.beta.3-local.jar VariantRecalibrator 
    --rscript_file file.recal.R 
    --tranches_file file.tranches 
    --output output.recal 
    --use_annotation DP 
    --use_annotation FS 
    --use_annotation ReadPosRankSum 
    --use_annotation MQRankSum 
    --use_annotation QD 
    --use_annotation SOR 
    --variant variant.vcf 
    --resource dbsnp,prior=2,truth=false,training=false,known=true:dbsnp_137.b37.vcf 
    --resource mills,prior=12,truth=true,training=true,known=false:Mills_and_1000G_gold_standard.indels.b37.sites.vcf 
    --mode INDEL 
    --reference human_g1k_v37_decoy.fasta 
    --maxGaussians 4
    

    SNP ApplyVQSR:

    java -Xmx2048M -jar gatk-package-4.beta.3-local.jar ApplyVQSR 
    --output output.vcf 
    --recal_file file.recal 
    --variant variant.vcf 
    --mode SNP 
    --reference human_g1k_v37_decoy.fasta 
    --tranches_file file.tranches 
    --ts_filter_level 99.5
    

    INDEL ApplyVQSR:

    java -Xmx2048M -jar gatk-package-4.beta.3-local.jar ApplyVQSR 
    --output output.vcf 
    --recal_file file.recal 
    --variant variant.vcf 
    --mode INDEL 
    --reference human_g1k_v37_decoy.fasta 
    --tranches_file file.tranches 
    --ts_filter_level 99
    

    We are running the tools here directly from the JAR file, but we got the same execution times when running from the gatk-launch executor. We tried running with about 4 or 5 different data sets by now and got the same 2 min delay on 4.beta.3 with each data set, so we're pretty confident that's not what causing it. The files are always uploaded to the cloud before we start running the tools.

    Since you said you couldn't replicate the same execution times locally, we also tried running on our local machine and did not get the 2 min delay on 4.beta.3. as we did in the cloud. But, beside the Amazon instances we used for running in the cloud originally, we also tried running on Google instances and got the 2 min delay as we did on the Amazon instances for 4.beta.3.

    I hope this information is useful.

    Thanks,

    Teodora

  • SheilaSheila Broad InstituteMember, Broadie, Moderator admin

    @teodora_aleksic
    Hi Teodora,

    I just let Louis know. He should be in touch soon.

    Thanks,
    Sheila

  • LouisBLouisB Broad InstituteMember, Broadie, Dev

    @teodora_aleksic Thanks for the additional information. I tried running very similar commands with my own data on a google cloud machine that I set up myself and couldn't reproduce the issue. I suspect there is some configuration in your cloud environment that is different from the one I set up, and which is triggering the issue.

    Can you explain how you configure your machine on google cloud? If you can put together a gcloud command that sets up a machine that has this problem in a clean project, then I could debug with that.

    Alternatively, and I understand this may not be possible, if you could give me ssh access to one of your cloud machines then I could do some manual debugging.

    This seems like a serious problem, but until I can reproduce it it's hard to debug.

    If you can ssh into the node and run jstack while it's in that 2 minute pause, the thread dump might point us to the problem.

  • teodora_aleksicteodora_aleksic SerbiaMember

    Hi @LouisB , unfortunately I'm not able to provide you access to the cloud machines, but I hope the thread dumps will be enough . The machines are running Ubuntu 14.04 with Docker 1.12.6. I'll see if I can get you additional information about the machine setups. The thread dumps are:

    Beta 2:

    2017-09-07 10:25:00
    Full thread dump OpenJDK 64-Bit Server VM (25.131-b11 mixed mode):
    
    "Attach Listener" #13 daemon prio=9 os_prio=0 tid=0x00007fab58001000 nid=0x7d waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
       Locked ownable synchronizers:
        - None
    
    "process reaper" #12 daemon prio=10 os_prio=0 tid=0x00007fabadc15800 nid=0x66 waiting on condition [0x00007fab980e1000]
       java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000006d89b3670> (a java.util.concurrent.SynchronousQueue$TransferStack)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
        at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
        at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)
    
       Locked ownable synchronizers:
        - None
    
    "Service Thread" #9 daemon prio=9 os_prio=0 tid=0x00007fabac211800 nid=0x62 runnable [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
       Locked ownable synchronizers:
        - None
    
    "C1 CompilerThread3" #8 daemon prio=9 os_prio=0 tid=0x00007fabac204800 nid=0x61 waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
       Locked ownable synchronizers:
        - None
    
    "C2 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00007fabac200000 nid=0x60 waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
       Locked ownable synchronizers:
        - None
    
    "C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007fabac1fe000 nid=0x5f waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
       Locked ownable synchronizers:
        - None
    
    "C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007fabac1fb000 nid=0x5e waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
       Locked ownable synchronizers:
        - None
    
    "Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007fabac1f9000 nid=0x5d runnable [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
       Locked ownable synchronizers:
        - None
    
    "Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007fabac1d3000 nid=0x5c in Object.wait() [0x00007fab987e8000]
       java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000006d501b190> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
        - locked <0x00000006d501b190> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
    
       Locked ownable synchronizers:
        - None
    
    "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007fabac1ce000 nid=0x5b in Object.wait() [0x00007fab988e9000]
       java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000006d501b360> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:502)
        at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
        - locked <0x00000006d501b360> (a java.lang.ref.Reference$Lock)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
    
       Locked ownable synchronizers:
        - None
    
    "main" #1 prio=5 os_prio=0 tid=0x00007fabac00a000 nid=0x51 runnable [0x00007fabb35c5000]
       java.lang.Thread.State: RUNNABLE
        at org.broadinstitute.hellbender.utils.recalibration.covariates.ContextCovariate.recordValues(ContextCovariate.java:68)
        at org.broadinstitute.hellbender.utils.recalibration.covariates.StandardCovariateList.recordAllValuesInStorage(StandardCovariateList.java:133)
        at org.broadinstitute.hellbender.utils.recalibration.RecalUtils.computeCovariates(RecalUtils.java:546)
        at org.broadinstitute.hellbender.utils.recalibration.RecalUtils.computeCovariates(RecalUtils.java:527)
        at org.broadinstitute.hellbender.utils.recalibration.BaseRecalibrationEngine.processRead(BaseRecalibrationEngine.java:136)
        at org.broadinstitute.hellbender.tools.walkers.bqsr.BaseRecalibrator.apply(BaseRecalibrator.java:180)
        at org.broadinstitute.hellbender.engine.ReadWalker.lambda$traverse$0(ReadWalker.java:96)
        at org.broadinstitute.hellbender.engine.ReadWalker$$Lambda$93/623545006.accept(Unknown Source)
        at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
        at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
        at java.util.Iterator.forEachRemaining(Iterator.java:116)
        at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
        at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
        at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
        at org.broadinstitute.hellbender.engine.ReadWalker.traverse(ReadWalker.java:94)
        at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:838)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:115)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:170)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:189)
        at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:131)
        at org.broadinstitute.hellbender.Main.mainEntry(Main.java:152)
        at org.broadinstitute.hellbender.Main.main(Main.java:230)
    
       Locked ownable synchronizers:
        - None
    
    "VM Thread" os_prio=0 tid=0x00007fabac1c6800 nid=0x5a runnable
    
    "GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007fabac01f000 nid=0x52 runnable
    
    "GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007fabac020800 nid=0x53 runnable
    
    "GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007fabac022800 nid=0x54 runnable
    
    "GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007fabac024000 nid=0x55 runnable
    
    "GC task thread#4 (ParallelGC)" os_prio=0 tid=0x00007fabac026000 nid=0x56 runnable
    
    "GC task thread#5 (ParallelGC)" os_prio=0 tid=0x00007fabac027800 nid=0x57 runnable
    
    "GC task thread#6 (ParallelGC)" os_prio=0 tid=0x00007fabac029800 nid=0x58 runnable
    
    "GC task thread#7 (ParallelGC)" os_prio=0 tid=0x00007fabac02b000 nid=0x59 runnable
    
    "VM Periodic Task Thread" os_prio=0 tid=0x00007fabac214000 nid=0x63 waiting on condition
    
    JNI global references: 516
    

    Beta 3:

    2017-09-07 09:51:31
    Full thread dump OpenJDK 64-Bit Server VM (25.131-b11 mixed mode):
    
    "Attach Listener" #13 daemon prio=9 os_prio=0 tid=0x00007ffb10001000 nid=0x3e waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
       Locked ownable synchronizers:
        - None
    
    "process reaper" #12 daemon prio=10 os_prio=0 tid=0x00007ffb61bb5800 nid=0x27 waiting on condition [0x00007ffb640d9000]
       java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x0000000775afb4b0> (a java.util.concurrent.SynchronousQueue$TransferStack)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
        at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
        at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)
    
       Locked ownable synchronizers:
        - None
    
    "Service Thread" #9 daemon prio=9 os_prio=0 tid=0x00007ffb60211800 nid=0x23 runnable [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
       Locked ownable synchronizers:
        - None
    
    "C1 CompilerThread3" #8 daemon prio=9 os_prio=0 tid=0x00007ffb60204800 nid=0x22 waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
       Locked ownable synchronizers:
        - None
    
    "C2 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00007ffb60200000 nid=0x21 waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
       Locked ownable synchronizers:
        - None
    
    "C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007ffb601fe800 nid=0x20 waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
       Locked ownable synchronizers:
        - None
    
    "C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007ffb601fb800 nid=0x1f waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
       Locked ownable synchronizers:
        - None
    
    "Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007ffb601f9800 nid=0x1e runnable [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
       Locked ownable synchronizers:
        - None
    
    "Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007ffb601d2800 nid=0x1d in Object.wait() [0x00007ffb4a8e7000]
       java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000006d50982a0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
        - locked <0x00000006d50982a0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
    
       Locked ownable synchronizers:
        - None
    
    "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007ffb601ce000 nid=0x1c in Object.wait() [0x00007ffb4a9e8000]
       java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000006d50a0e50> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:502)
        at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
        - locked <0x00000006d50a0e50> (a java.lang.ref.Reference$Lock)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
    
       Locked ownable synchronizers:
        - None
    
    "main" #1 prio=5 os_prio=0 tid=0x00007ffb6000a000 nid=0x12 runnable [0x00007ffb697af000]
       java.lang.Thread.State: RUNNABLE
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        - locked <0x0000000775fbde00> (a java.net.SocksSocketImpl)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at java.net.Socket.connect(Socket.java:538)
        at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
        - locked <0x0000000775fbd170> (a sun.net.www.http.HttpClient)
        at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
        at sun.net.www.http.HttpClient.New(HttpClient.java:339)
        at sun.net.www.http.HttpClient.New(HttpClient.java:357)
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1202)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1138)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1032)
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:966)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1546)
        - locked <0x0000000775fbaca8> (a sun.net.www.protocol.http.HttpURLConnection)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474)
        - locked <0x0000000775fbaca8> (a sun.net.www.protocol.http.HttpURLConnection)
        at com.google.cloud.MetadataConfig.getAttribute(MetadataConfig.java:65)
        at com.google.cloud.MetadataConfig.getProjectId(MetadataConfig.java:41)
        at com.google.cloud.ServiceOptions.getGoogleCloudProjectId(ServiceOptions.java:376)
        at com.google.cloud.ServiceOptions.getDefaultProjectId(ServiceOptions.java:311)
        at com.google.cloud.ServiceOptions.getDefaultProject(ServiceOptions.java:284)
        at com.google.cloud.ServiceOptions.<init>(ServiceOptions.java:238)
        at com.google.cloud.storage.StorageOptions.<init>(StorageOptions.java:82)
        at com.google.cloud.storage.StorageOptions.<init>(StorageOptions.java:30)
        at com.google.cloud.storage.StorageOptions$Builder.build(StorageOptions.java:77)
        at org.broadinstitute.hellbender.utils.gcs.BucketUtils.setGlobalNIODefaultOptions(BucketUtils.java:355)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:152)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:192)
        at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:131)
        at org.broadinstitute.hellbender.Main.mainEntry(Main.java:152)
        at org.broadinstitute.hellbender.Main.main(Main.java:233)
    
       Locked ownable synchronizers:
        - None
    
    "VM Thread" os_prio=0 tid=0x00007ffb601c6800 nid=0x1b runnable
    
    "GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007ffb6001f000 nid=0x13 runnable
    
    "GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007ffb60020800 nid=0x14 runnable
    
    "GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007ffb60022800 nid=0x15 runnable
    
    "GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007ffb60024000 nid=0x16 runnable
    
    "GC task thread#4 (ParallelGC)" os_prio=0 tid=0x00007ffb60026000 nid=0x17 runnable
    
    "GC task thread#5 (ParallelGC)" os_prio=0 tid=0x00007ffb60027800 nid=0x18 runnable
    
    "GC task thread#6 (ParallelGC)" os_prio=0 tid=0x00007ffb60029800 nid=0x19 runnable
    
    "GC task thread#7 (ParallelGC)" os_prio=0 tid=0x00007ffb6002b000 nid=0x1a runnable
    
    "VM Periodic Task Thread" os_prio=0 tid=0x00007ffb60214000 nid=0x24 waiting on condition
    
    JNI global references: 358
    

    Thanks for your help.

  • LouisBLouisB Broad InstituteMember, Broadie, Dev

    @teodora_aleksic Thanks for all the help. It looks like this is being cause by some bad interaction in the code we use to support reading from google cloud. When we initialize it, it tries to look up project metadata for your google cloud project, but in some cases that fails and waits for a long time before it times out. I'm not sure if we'll need a fix in our code or in the google library code yet, but we now have a path to fix this annoying issue.

  • teodora_aleksicteodora_aleksic SerbiaMember

    @LouisB That's great news! Thank you for your help and making this high priority.

Sign In or Register to comment.