Heads up:
We’re moving the GATK website, docs and forum to a new platform. Read the full story and breakdown of key changes on this blog.
kshakir ✭✭
About
- Username
- kshakir
- Joined
- Visits
- 163
- Last Active
- Roles
- Broadie, Dev
- Points
- 153
- Badges
- 14
- Full Name
- Khalid Shakir
Comments
-
Hi @amaro - Can you share the workspace with the failed job with [email protected] and I'll take a look at the logs? Thanks!
-
Maybe as a workaround the command {} block could capture the return code using sh, check for the 143 and then exit with something else: python -c "exit(143)"; EXIT_STATUS=$?; if [ "${EXIT_STATUS}" -eq 143 ]; then exit 0; else ex…
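A minimal sketch of that workaround as a reusable shell function. The name remap_143 is hypothetical, and passing every other status through unchanged is an assumption, since the comment above is cut off before its else branch.

```shell
# Hypothetical helper: run a command and report exit status 143 as success.
remap_143() {
  code=0
  "$@" || code=$?            # capture the status without tripping `set -e`
  if [ "$code" -eq 143 ]; then
    return 0                 # 143 is reported as success
  fi
  return "$code"             # assumption: any other status passes through
}
```

Inside a command block one would wrap the real tool, for example `remap_143 my_tool --some-arg`, where my_tool is a placeholder.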
-
The change in behavior was related to this closed GitHub issue. If you have a similar feature you would like to have (re-)implemented please open a new issue. In your new GitHub issue please fully describe how you would like to use the feature, and …
-
The auto-generated shell environment variable $TMPDIR was too long for your particular job. As a workaround try adding export TMPDIR=/tmp at the top of your task command. Ticket filed in cromwell with more info: https://github.com/broadinstitu…
-
Talked in person. It looks like the exit code 137 was docker SIGKILL'ing jobs. https://github.com/moby/moby/issues/21083 From early testing, docker-for-mac's default of 2gb memory was not enough for some workflow runs. Raising the settings to 8gb …
-
Hi @Redmar_van_den_Berg, Automating submissions to Cromwell is beyond the scope of these particular support forums. Still, in my limited experience, I've seen a couple of cases where custom software tools were built to help submit to Cromwell's REST …
-
Hi @jmichael, The issue you're running into is being tracked here: https://github.com/broadinstitute/cromwell/issues/1499 Please follow and contribute to the discussion there if you'd like. Thanks
-
I'm not sure I 100% follow the proposed workflow, but it's possible one may get 80% of what is described by using GATK4 GCS NIO features as it supports streaming BAM data. That way a workflow would save on localizing input BAM data during each step …
-
Have you had a chance to try out script-epilogue generation of md5's? I'm curious if this works for your issue in Cromwell 30.
-
(Quote) Both output paths and job metadata are stored in the database. In general the path to a bam and the path to an index take the same amount of memory for Cromwell to keep track of temporarily and within the database. NOTE: There are specific ca…
-
The WDL spec for draft-2 is closed, and doesn't explicitly state whether newlines should be stripped during read_tsv. I encourage you to submit a proposal if you'd like for the next specification of WDL to make the behavior explicit such that read_t…
-
The default Local backend does not support limiting resources such as memory and cpu. Once you switch to a different backend you can use runtime-attributes to communicate to a job scheduler what resources are required before a job should be executed…
-
Thanks for the detailed report. I suspect your issue is related to these two lines in the WDL task and then the input JSON respectively: File? refAlt "complete.bwaaln.refAlt": "", The empty string is filling in an explicit va…
-
I believe the issue of ignoring the cromwell folder structure may have already been filed as an enhancement request. If so, please feel free to follow and comment on the situation here: https://github.com/broadinstitute/cromwell/issues/1641 Regardi…
-
As mentioned above the various issues in this post have been migrated over as enhancement requests. Please continue the discussion and follow the statuses on GitHub: * https://github.com/broadinstitute/cromwell/issues/2592 * https://github.com/broa…
-
Cromwell doesn't support this level of functionality for Local backends, but it can work with a job scheduler such as Grid Engine / SLURM / TORQUE / etc. to support resource requirements. WDL allows one to specify runtime attributes per task. One c…
-
Return code 247 doesn't mean anything specific to cromwell and may be specific to your task's command and the environment where the call is running. The first place to check would be the captured stderr and/or the stdout files for your failed call.…
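A rough sketch of that first check, written as a shell function that takes the failed call's execution directory as its argument. The function name is hypothetical, and it assumes the directory holds files named rc, stderr and stdout; adjust the names if your backend writes them differently.

```shell
# Hypothetical helper: summarize a failed call from its execution directory.
# Assumes the directory contains files named rc, stderr and stdout.
show_call_failure() {
  dir=$1
  if [ -f "$dir/rc" ]; then
    echo "return code: $(cat "$dir/rc")"
  fi
  for name in stderr stdout; do
    # Only print files that exist and are non-empty.
    if [ -s "$dir/$name" ]; then
      echo "--- last lines of $name ---"
      tail -n 20 "$dir/$name"
    fi
  done
}
```

Called as `show_call_failure /path/to/call/execution`, it prints the recorded return code followed by the tail of each non-empty log.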
-
A Pair only contains two members left and right. One can zip-a-zip though to create a structure with three elements. For example: workflow zip3 { Array[String] a = ["1a", "2a", "3a"] Array[String] b = ["1b&…
-
The backend.providers.MyHPCBackend is an example. For those using a Local backend, the MyHPCBackend should be Local. Assuming one's config file is picked up by cromwell this should allow the Local backend to also access GCS. NOTE: If one doesn't ha…
-
Yes. You would need to have a config file set up for an LSF backend, as the default backend, and then you could run: java -Dconfig.file=my.lsf.conf -jar cromwell.jar run my_pipeline.wdl -I inputs.json
-
For a portable pipeline that may be shared with others running on different systems, I do NOT recommend any of this, but-- There are some steps one could do to avoid cromwell copying inputs. To refer to an existing file input without localizing at …
-
Can you create a new forum post describing more of what you're trying to do / what you expect for caching / what you're currently seeing? There may be an opportunity for a feature that could benefit the wider community. In the meantime, the script …
-
Interesting. What is producing paths like this? (Quote) There are a number of file system errors occurring in the logs you pasted. This may just be a path problem, where you need to specify "more unix-like" paths starting with a /, as /…
-
Hi @manidr, Can you repost your cna_analysis.wdl? I only see 111 lines as posted, yet the error posted says "Unrecognized token on line 123". To test cromwell's parsing, I ran with the following inputs.json and jids.txt and did not get t…
-
This should maybe be migrated to the WDL forum? Issue tracked here with a workaround: https://github.com/broadinstitute/wdl4s/issues/230
-
Hi @dannykwells, Unfortunately we have seen this before. We do not have a good workaround for this situation at the moment. Please follow this ticket for more info and status on the issue.
-
Hi @Redmar_van_den_Berg, Re: (Quote) In the source snippet you mentioned, the SCRIPT_EPILOGUE that runs after INSTANTIATED_COMMAND is configurable. See this section in the example conf. Something like script-epilogue = "chmod -R a+r * &&…
-
Thanks Chris!
-
Hi @alongalor, I'm not sure what became of this original ticket, but in general this error usually occurs as stated above: (Quote) One example I saw just today was a python tool that was configured to write to the root directory / instead of the c…
-
Hi @Hannah66, I haven't used -hold_jid with grid engine before. But looking at google, it appears that command line option specifies the dependencies between jobs inside grid engine. This seems like it would be redundant in the current versions of …
-
Hi @Hannah66, Here's a basic example that runs a hello.wdl using a custom config file on our local SGE cluster. NOTE: Your cluster will likely have different requirements. For example our cluster requires that we specify an -l h_vmem=<memory>…
-
Hi @Hannah66, can you submit this as a new question, so that future visitors of the forum can follow the separate discussion? Thanks!
-
(Quote) I'm not aware of tweaks for the db that relate to reducing memory usage. Most of the tweaks I can think of involve increasing internal queue and batch sizes, where more data is held in memory. The goal of those bigger sql batches is to decre…
-
Providing a default for an optional is redundant, as the optional variable will therefore never be empty. Try out: workflow wf { Boolean is_real = false}
-
Hi @dayzcool, Without details I'm not sure what you mean by underperforming and hanging. For example, cromwell uses combinations of polling and queueing when communicating with the database, which adds some latency to when jobs will start (2 seconds…
-
Hi @idan_g, The README probably has the best explanation that exists, but the simplest example I can think of is the integration tests that cromwell uses. (Quote) -Dconfig.file=myConfig.conf NOTE The first line of your config must be #include &q…
-
Hi @jweinstk, Talking to a couple others, for the size/scope of a workflow that you're trying to run, perhaps you can try and reduce the number of Files cromwell tracks using file-of-file-filenames (fofn). Basically, store the list of files one wan…
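A small sketch of building such a fofn in shell, so that the workflow receives one file listing many paths instead of many separate File inputs. The helper names and the example paths are hypothetical.

```shell
# Hypothetical helper: write a file-of-file-names (fofn), one path per line.
make_fofn() {
  out=$1
  shift
  printf '%s\n' "$@" > "$out"
}

# Hypothetical helper: count the non-empty lines (paths) listed in a fofn.
count_fofn() {
  grep -c . "$1"
}
```

For example, `make_fofn inputs.fofn /data/a.bam /data/b.bam` writes two lines, and a task can then read the paths back line by line.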
-
Hm. I haven't seen this one before. Are you running a relatively recent mysql (5.6+), or have you customized your database in any way (changing the default collation, etc.)? I only ask because this has been working on other databases and without a r…
-
Sorry you ran into this. Confirmed that this was fixed back in version 27. We're also adding a test to ensure Cromwell always returns the error in the future.
-
Hi @awacs, We'll need a little more info to narrow down your specific question. Specifically: * What version of cromwell are you running? java -jar cromwell.jar -version (--version starting in 29) * What does your application.conf's backend stanza…
-
Hi @mxqian, As you're seeing there are significant differences in the design and features of Queue and Cromwell. Some features of cromwell that may be of use for your particular situation: * Cromwell may use a persistent MySQL (compatible, includi…
-
If one is defining a new backend, say LSF, or SGE, then the entire configuration must be specified. However, the Local backend is already defined by Cromwell's internal reference configuration. Change your backend stanza to specify only the values y…
-
In these config based backends that write to a shared file system, such as LSF, the two wdl runtime attributes cpu and memory may be translated to the built-in variables cpu and memory_<unit>, respectively for your submit command. See this sec…
-
Hi @afrieden, GATK 3.x releases are not currently published to Central. But it is possible to install the GATK into your local repository, where Maven can then pick up the GATK as a dependency. You mentioned you had a license, and aimed to "c…
-
@TechnicalVault The exception you mentioned is originating from within HTSJDK. https://github.com/samtools/htsjdk/blob/3d13016bef282595328d4eb6eed9c71d885a0347/src/java/htsjdk/samtools/cram/structure/ReadTag.java#L380 The debug discussion should m…
-
The GATK 3.4 release added .cram read/write support using HTSJDK 1.132. Via the HTSJDK defaults, the GATK 3.4 writes .cram.bai. However, GATK 3.4 does not require these files upon reading, as it does not use the index, and instead always seeks into …
-
TL;DR: mvn -Ddisable.shadepackage verify Background: In addition to Queue's GATK-wrapper codegen, relatively slow scala compilation, etc. there's still a lot of legacy compatibility from our ant days in the Maven scripts. Our mvn verify behaves mo…
-
Kind of like using multiple BAMs or multiple VCFs, if you need to scalac multiple files, you'll have to -S each of them. Java 7's globbing support isn't enabled in the GATK argument parsing, so you'll have to specify each separately. If it's two or…
-
For debugging purposes, can you run a version of this script with a smaller number of inputs, on the order of 5 or 50, instead of 50 thousand? It would help figure out what's going on at this point in the script. Based on the stack dump at that sin…
-
import org.broadinstitute.gatk.queue.function._ should work.
-
Thanks for letting us know, and working out a potential fix! At this point, can you move this conversation to a pull request on github? I think the important bits of your patch were truncated from the end of your "else if" statement. On g…
-
Can you try a) running mvn dependency:purge-local-repository and/or b) installing maven from a tarball? I googled the stack trace and StringUtils Maven and found these discussions with possible suggestions: * https://github.com/airbnb/chronos/issu…
-
Hi Johan, Is it possible to capture the output of 'mvn -X verify' and attach it as a file here or email it? Admittedly it's hard for me to locate a clean environment to test this on, so I'm stabbing in the dark as the git clone and then mvn packag…
-
Contract of the scatterContigIntervals: Pre conditions: * locs[1..x] are sorted by contig then start position * locs[x].stop() < locs[x+1].start(), aka no overlapping locs * locs.size() <= scatterParts.size() * scatterParts[1..y] are in a sp…
-
List[File] are supported as @Input or @Output. Actually the trait / interface Seq[File] that List[File] concretely implements is what's supported, so List[File] works too, but I digress. However, Array[File] are NOT currently supported as @Input or…
-
Interesting. When you list the threads, either in jstack output, or via java's Ctrl-\ and kill -3 SIGQUIT handler, how many are actually GATK?
-
Hi Johan, Do you have an example script we could look at, along with expected command lines to be generated? Phil's _JAVA_OPTIONS and JAVA_TOOL_OPTIONS may just work too, but I haven't used them myself. As a quick-and-dirty alternative, I attached…
-
The READ and CONTIG scatter are not currently optimized based on contig size, but could be. See IntervalUtils for the current implementation of scatterContigIntervals, and the unit tests for example invocations.
-
If your walker overrides isReduceByInterval() and returns true, then the engine will invoke onTraversalDone(List<Pair<GenomeLoc, ReduceType>> results). A simple walker using this design is GCContentByInterval: https://github.com/broadgsa/g…
-
Hi @kmsguire, I'm trying to reproduce the issue with -nt, but in our test cases I'm not having luck recreating different outputs with and without -nt. At the moment, I'm still looking for a test case so we can further diagnose your reported bug. I…
-
Hi @leeyoungwha, I'm trying to reproduce the issue with -nt, but in our test cases I'm not having luck recreating different outputs with and without -nt. If you and @flescai can share your input vcfs that produce errors, along with log files produ…
-
Nothing elegant. Tagging is possible on the function level, but not for the scripts. QScripts are currently fed through the generic Sting argument parser, where, as you see, tags are only sent to the GATK. In your script, the basic way one could inpu…
-
Short answer: Neither -g, nor -G if anyone else is searching, are supported through the Queue interfaces and there currently isn't a way to pass in native args from the command line to the LSF strongly typed API. At the moment we don't have the res…
-
The job names as command lines are less and less important for display purposes when checking job status with programs like qstat and bjobs. The biggest example is that we even considered using job arrays for submission. At that point if the jobs a…
-
Most of the queue.extensions are generated classes once you run 'ant'. The generated classes are Queue compatible versions of the walker command arguments. Instead of looking at the generated scala file you can find the same or better argument descr…