DiscoverVariantsFromContigAlignmentsSAMSpark error

Hello there!

I am using "DiscoverVariantsFromContigAlignmentsSAMSpark" to call SNPs on contigs from an assembly. The assembly was built with Falcon from PacBio reads. While running the command, I get an invalid-interval error for the SAM file. To prepare the input, I mapped the contigs to the reference genome using minimap2, converted the resulting CRAM file to a SAM file, added read group info, and sorted the SAM file. I also ran "ValidateSamFile" successfully. I am running the latest version of GATK. Here are the command and part of the error message:

Command:

gatk DiscoverVariantsFromContigAlignmentsSAMSpark -R $ref_genome -I $input_file -O ${input_file}_gatk_Vcalled.vcf

Error:

ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 162)
java.lang.IllegalArgumentException: Invalid interval. Contig:16 start:46400198 end:46400197
at org.broadinstitute.hellbender.utils.Utils.validateArg(Utils.java:687)
at org.broadinstitute.hellbender.utils.SimpleInterval.validatePositions(SimpleInterval.java:61)
at org.broadinstitute.hellbender.utils.SimpleInterval.<init>(SimpleInterval.java:37)
at org.broadinstitute.hellbender.tools.spark.sv.discovery.alignment.ContigAlignmentsModifier.splitGappedAlignment(ContigAlignmentsModifier.java:310)
at org.broadinstitute.hellbender.tools.spark.sv.discovery.SvDiscoverFromLocalAssemblyContigAlignmentsSpark$SAMFormattedContigAlignmentParser.lambda

Any help on resolving the issue is appreciated!

Issue · GitHub (filed by Sheila)
Issue Number: 3097
State: open

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @truns
    Hi,

    I will check with the team and get back to you.

    -Sheila

  • truns (Member)

    Hello @Sheila

    I was wondering whether you have had a chance to look into this issue and check with the team.

    Thank you and best!

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @truns
    Hi again,

    Sorry, I have not heard back from the developer yet, but I am messaging him again now. I will respond ASAP. Sorry for the delay, and thanks for posting again.

    -Sheila

  • shuangBroad, Broad (Member, Broadie, Dev)

    Hi @truns, thanks for reporting the issue, and I'm sorry for the delay.

    It looks like a possible edge case that we haven't seen before (and we haven't tested this tool with PacBio reads yet either), so it would be great if you can share the alignments that are overlapping this region:
    Contig:16 start:46400198 end:46400197.
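For anyone who needs to pull out the alignments asked for here, the following is a minimal sketch in plain Python. It is an illustration only, not part of GATK or samtools: the function names (`reference_end`, `overlapping_records`) and the script name in the usage note are hypothetical. It assumes an uncompressed, tab-delimited SAM file and uses only the standard SAM rule that the CIGAR operations M, D, N, = and X advance along the reference. Note that the reported interval has an end one base before its start, so the sketch simply queries the single position 46400198 on contig 16.

```python
import re
import sys

# One CIGAR element: a length followed by an operation code (SAM specification).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume reference bases.
REF_CONSUMING = set("MDN=X")


def reference_end(pos, cigar):
    """Return the 1-based inclusive reference end of an alignment starting at pos."""
    ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    # An alignment that consumes no reference bases is treated as covering one base.
    return pos + max(ref_len, 1) - 1


def overlapping_records(lines, contig, start, end):
    """Yield SAM alignment lines on `contig` that overlap [start, end] (1-based, inclusive)."""
    for line in lines:
        if line.startswith("@"):
            continue  # header lines are skipped
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        rname, pos, cigar = fields[2], int(fields[3]), fields[5]
        if rname != contig or cigar == "*":
            continue  # other contig, or no alignment information
        if pos <= end and reference_end(pos, cigar) >= start:
            yield line.rstrip("\n")


if __name__ == "__main__" and len(sys.argv) == 4:
    # Arguments: SAM path, contig name, 1-based position of interest.
    sam_path, contig, position = sys.argv[1], sys.argv[2], int(sys.argv[3])
    with open(sam_path) as handle:
        for record in overlapping_records(handle, contig, position, position):
            print(record)
```

Assuming the script is saved as `extract_region.py` (a made-up name), it could be run as `python extract_region.py input.sam 16 46400198 > region.txt`. The output contains alignment lines only, so the original SAM header would need to be prepended before the file is usable as a SAM input again.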

    Also, it seems that you are using DiscoverVariantsFromContigAlignmentsSAMSpark to discover SNPs, which the tool is not designed for (admittedly, the tool is badly named, since the name does not make explicit that it is for structural variation).

    Thanks!
