I get an ArrayIndexOutOfBoundsException when using this tool (see attached log).
Is there a possible workaround or a new version I could use?
Can you share a snippet of the data that is causing the error?
If so, instructions on how to upload are here: http://gatkforums.broadinstitute.org/discussion/1894/how-do-i-submit-a-detailed-bug-report
Sure, I will do so as soon as I am back in the office in the next few days.
A general question:
Is there a systematic testing framework (like unit tests) for GATK? I know it is a huge piece of software, but I run into such problems quite often, and many of them could be avoided by systematic testing before release.
The GATK development process includes a comprehensive testing framework (unit tests, integration tests, regression tests, etc.). One set of tests is run every time code modifications are pushed to a development branch, and again on the master branch every time a feature branch is merged; another set is run before release. Most 'real' bugs you are likely to encounter are due to weird corner cases that are not accounted for in our tests. When that happens, the bug fix includes adding a test for that corner case so it can't happen again. But the majority of issues we see on the forum are due to dirty data, formatting errors and so on. We do our best to catch those and return meaningful errors to help people clean up their data, but ultimately those are the user's responsibility.
Thanks for the explanation. That sounds like a good approach, although I would add that good test cases should explicitly cover "corner cases" ;-). But I know that this is the theory, of course, and resources are limited in reality. For such complex software, there is always a tradeoff between features and testing.
I narrowed the problem down to a very long insertion:
19 25359195 . T TGTCTTAGTGTCGTATGGGCTTCCATATTCAACTTTCTTCTATAAGTAGAATATCTACACGTGATGCTCTGGTCTTTCTACTACACGTCTATTGTAGCTTAATCTTTTCCTCGGGGGGGAAGAGGCTGTTTAGAATCATACCTCCAACCGACATTAACCCTGTTGGATTATAACTAGGGGCAAATCCGGATATTGGTATACGGCCCTATTATTCTATGGGTCTACGCACCTAGTAGGCACCCACGCAGTCTCTGGCCCGACCCCA 99 . AC=2;LEN=264;NA=1;NS=1;TYPE=ins GT 1|1
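For anyone who wants to check their own VCF for similar records before running the tool, here is a minimal sketch in Python. It only scans the REF and ALT columns and reports insertions above a length cutoff. The function name `find_long_insertions` and the cutoff of 100 bases are my own assumptions for illustration; the thread does not establish what insertion length actually triggers the exception.

```python
def find_long_insertions(vcf_lines, min_inserted_bases=100):
    """Yield (chrom, pos, inserted_bases) for long insertion alleles.

    vcf_lines is any iterable of VCF text lines (for example an open file).
    min_inserted_bases is an arbitrary, hypothetical cutoff.
    """
    for line in vcf_lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 5:
            continue  # not a complete variant record
        chrom, pos, _id, ref, alt = fields[:5]
        for allele in alt.split(","):
            # Ignore symbolic, breakend, or missing alleles such as <DEL>, "." or "*".
            if not allele or any(base not in "ACGTNacgtn" for base in allele):
                continue
            inserted = len(allele) - len(ref)
            if inserted >= min_inserted_bases:
                yield chrom, int(pos), inserted


# Small in-memory demo with synthetic records (not the data from this thread).
demo_records = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE",
    "19\t100\t.\tT\tTG\t99\t.\t.\tGT\t1|1",
    "19\t200\t.\tT\tT" + "ACGT" * 30 + "\t99\t.\t.\tGT\t1|1",
]
print(list(find_long_insertions(demo_records)))  # prints [('19', 200, 120)]
```

To scan a real file, pass an open file handle instead of the list, e.g. `find_long_insertions(open("my_calls.vcf"))`, where the file name is a placeholder. This is a data check only, not a fix for the exception itself.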
Ah, are you saying that this long insertion causes the exception?
I see -- could you please submit a bug report (following the instructions here: http://www.broadinstitute.org/gatk/guide/article?id=1894) so we can reproduce and fix the issue, and add a new test case?
I have uploaded the required data to your ftp as ftp://email@example.com/johanneskoester-SimulateReadsForVariants-bug.tar.gz
Thanks Johannes, we'll process your bug on Tuesday (Monday is a US holiday).
I have submitted a bug report. Hopefully it will be fixed soon.
Thanks for narrowing down the error to a single site!
This bug should be fixed in the latest nightly build. Please download it here: https://www.broadinstitute.org/gatk/nightly