[GATK 4 beta] Allele in genotype T not in the variant context exception in running Mutect2 wdl

Hello,

It looks like the FilterByOrientationBias tool throws the following exception:
java.lang.IllegalStateException: Allele in genotype T not in the variant context [G*, A]
Could you give any advice on it? Thanks a lot!

Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=cromwell-executions/Mutect2_Multi/7daa6f73-b6df-4df9-a3d9-59a8741bbd1c/call-Mutect2/shard-8/Mutect2/09450ff5-b1cd-48a1-b4e0-0bb0e077cab0/call-Filter/execution/tmp.5fk9Hh
[July 12, 2017 5:25:27 PM CDT] FilterByOrientationBias  --output ob_filtered.vcf --preAdapterDetailFile cromwell-executions/Mutect2_Multi/7daa6f73-b6df-4df9-a3d9-59a8741bbd1c/call-Mutect2/shard-8/Mutect2/09450ff5-b1cd-48a1-b4e0-0bb0e077cab0/call-Filter/inputs/cromwell-executions/Mutect2_Multi/7daa6f73-b6df-4df9-a3d9-59a8741bbd1c/call-Mutect2/shard-8/Mutect2/09450ff5-b1cd-48a1-b4e0-0bb0e077cab0/call-CollectSequencingArtifactMetrics/execution/gatk.pre_adapter_detail_metrics --artifactModes G/T --artifactModes C/T --variant /gpfs/data/analysis/projects/thulium/somatic-mutation/mutect2-gatk4/exome/tumor-only/cromwell-executions/Mutect2_Multi/7daa6f73-b6df-4df9-a3d9-59a8741bbd1c/call-Mutect2/shard-8/Mutect2/09450ff5-b1cd-48a1-b4e0-0bb0e077cab0/call-Filter/inputs/cromwell-executions/Mutect2_Multi/7daa6f73-b6df-4df9-a3d9-59a8741bbd1c/call-Mutect2/shard-8/Mutect2/09450ff5-b1cd-48a1-b4e0-0bb0e077cab0/call-MergeVCFs/execution/AIRLBM-tumor-only.vcf  --interval_set_rule UNION --interval_padding 0 --interval_exclusion_padding 0 --readValidationStringency SILENT --secondsBetweenProgressUpdates 10.0 --disableSequenceDictionaryValidation false --createOutputBamIndex true --createOutputBamMD5 false --createOutputVariantIndex true --createOutputVariantMD5 false --lenient false --addOutputSAMProgramRecord true --addOutputVCFCommandLine true --cloudPrefetchBuffer 40 --cloudIndexPrefetchBuffer -1 --disableBamIndexCaching false --help false --version false --showHidden false --verbosity INFO --QUIET false --use_jdk_deflater false --use_jdk_inflater false --disableToolDefaultReadFilters false
[July 12, 2017 5:25:27 PM CDT] Executing as [email protected] on Linux x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_131-b11; Version: 4.beta.1-14-g9d9ca1f-SNAPSHOT
[July 12, 2017 5:26:08 PM CDT] org.broadinstitute.hellbender.tools.exome.FilterByOrientationBias done. Elapsed time: 0.68 minutes.
Runtime.totalMemory()=5816451072
java.lang.IllegalStateException: Allele in genotype T not in the variant context [G*, A]
        at htsjdk.variant.variantcontext.VariantContext.validateGenotypes(VariantContext.java:1360)
        at htsjdk.variant.variantcontext.VariantContext.validate(VariantContext.java:1298)
        at htsjdk.variant.variantcontext.VariantContext.<init>(VariantContext.java:401)
        at htsjdk.variant.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:494)
        at htsjdk.variant.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:488)
        at org.broadinstitute.hellbender.tools.exome.orientationbiasvariantfilter.OrientationBiasFilterer.annotateVariantContextsWithFilterResults(OrientationBiasFilterer.java:216)
        at org.broadinstitute.hellbender.tools.exome.FilterByOrientationBias.onTraversalSuccess(FilterByOrientationBias.java:211)
        at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:840)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:115)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:170)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:189)
        at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:131)
        at org.broadinstitute.hellbender.Main.mainEntry(Main.java:152)
        at org.broadinstitute.hellbender.Main.main(Main.java:230)
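
For readers unfamiliar with the message: the stack trace shows it being raised while a variant record is validated, and the wording indicates that a genotype refers to an allele (T) that is not in the site's allele list ([G*, A]). The following is a minimal Python sketch of that kind of consistency check. It is a paraphrase for illustration, not htsjdk's actual code; the function name and the data layout are assumptions.

```python
def validate_genotypes(site_alleles, genotypes):
    """Raise if any called genotype allele is missing from the site's allele list.

    site_alleles: list of allele strings, e.g. ["G*", "A"] ("*" marks the reference).
    genotypes: dict mapping sample name -> list of allele strings; "." is a no-call.
    Both structures are hypothetical stand-ins for the real in-memory objects.
    """
    allowed = set(site_alleles)
    for sample, alleles in genotypes.items():
        for allele in alleles:
            if allele == ".":  # no-calls are skipped
                continue
            if allele not in allowed:
                raise ValueError(
                    f"Allele in genotype {allele} not in the variant context "
                    f"[{', '.join(site_alleles)}]"
                )


# A consistent record passes silently.
validate_genotypes(["G*", "A"], {"TUMOR": ["G*", "A"]})

# An inconsistent record (genotype carries T, site only lists G* and A) is rejected.
try:
    validate_genotypes(["G*", "A"], {"TUMOR": ["G*", "T"]})
except ValueError as err:
    message = str(err)
```

In other words, the exception points at a record whose genotypes and allele list got out of sync while the record was being rebuilt, rather than at a malformed command line.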

Answers

  • dayzcool Member

    It doesn't seem to happen with gatk-package-4.beta.1-23-g7dba90e-SNAPSHOT

  • Geraldine_VdAuwera (Cambridge, MA) Member, Administrator, Broadie admin
    I think I remember a bug along those lines getting fixed. Sounds like you're all set then? To be clear, are you building from source? There is an official beta release package now.
  • dayzcool Member

    I will use the official beta version and confirm that this issue is fixed as soon as I can. Thanks!

  • dayzcool Member

    Hmm, I think I still need your help. Below is the same kind of error, this time from the official beta version, 4.beta.1.
    I learned that using the most recent snapshot (4.beta.1-23-g7dba90e) doesn't always help. The newest snapshot gives a similar error for certain inputs.
    In summary, I am running Mutect2 using the Mutect2_Multi WDL file on tens of samples in tumor-only mode with a scatter count of 100. I have tried three different builds: 4.beta.1 (official build downloaded from the website), 4.beta.1-23-g7dba90e, and 4.beta.1-14-g9d9ca1f.
    Each build gives this kind of error, but not always for the same chunk of data. For example, the error below occurs when I use 4.beta.1 (official) or 4.beta.1-23-g7dba90e, but not 4.beta.1-14-g9d9ca1f. The original error I posted only happens with 4.beta.1-14-g9d9ca1f.

    One other thing I noticed is that this exception occurs when both -A C/T, -A G/T are specified, though I am not sure if it's always the case. Having either -A C/T or -A G/T didn't give me the exception for one test case.

    Hope that's clear, and feel free to ask if you have any questions.

    Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=cromwell-executions/Mutect2_Multi/412c6e91-599e-4024-9011-6647638c6e9c/call-Mutect2/shard-2/Mutect2/4f24f6cf-259e-4a0d-9642-04f56fc89696/call-Filter/execution/tmp.rZp0w5
    [July 13, 2017 11:27:24 AM CDT] FilterByOrientationBias  --output ob_filtered.vcf --preAdapterDetailFile cromwell-executions/Mutect2_Multi/412c6e91-599e-4024-9011-6647638c6e9c/call-Mutect2/shard-2/Mutect2/4f24f6cf-259e-4a0d-9642-04f56fc89696/call-Filter/inputs/cromwell-executions/Mutect2_Multi/412c6e91-599e-4024-9011-6647638c6e9c/call-Mutect2/shard-2/Mutect2/4f24f6cf-259e-4a0d-9642-04f56fc89696/call-CollectSequencingArtifactMetrics/execution/gatk.pre_adapter_detail_metrics --artifactModes G/T --artifactModes C/T --variant cromwell-executions/Mutect2_Multi/412c6e91-599e-4024-9011-6647638c6e9c/call-Mutect2/shard-2/Mutect2/4f24f6cf-259e-4a0d-9642-04f56fc89696/call-Filter/inputs> /cromwell-executions/Mutect2_Multi/412c6e91-599e-4024-9011-6647638c6e9c/call-Mutect2/shard-2/Mutect2/4f24f6cf-259e-4a0d-9642-04f56fc89696/call-MergeVCFs/execution/ACRLPB-tumor-only.vcf  --interval_set_rule UNION --interval_padding 0 --interval_exclusion_padding 0 --readValidationStringency SILENT --secondsBetweenProgressUpdates 10.0 --disableSequenceDictionaryValidation false --createOutputBamIndex true --createOutputBamMD5 false --createOutputVariantIndex true --createOutputVariantMD5 false --lenient false --addOutputSAMProgramRecord true --addOutputVCFCommandLine true --cloudPrefetchBuffer 40 --cloudIndexPrefetchBuffer -1 --disableBamIndexCaching false --help false --version false --showHidden false --verbosity INFO --QUIET false --use_jdk_deflater false --use_jdk_inflater false --disableToolDefaultReadFilters false
    [July 13, 2017 11:27:24 AM CDT] Executing as [email protected] on Linux 3.10.0-327.36.3.el7.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_131-b11; Version: 4.beta.1
    [July 13, 2017 11:27:51 AM CDT] org.broadinstitute.hellbender.tools.exome.FilterByOrientationBias done. Elapsed time: 0.45 minutes.
    Runtime.totalMemory()=4281860096
    java.lang.IllegalStateException: Allele in genotype C* not in the variant context [G*, T]
            at htsjdk.variant.variantcontext.VariantContext.validateGenotypes(VariantContext.java:1360)
            at htsjdk.variant.variantcontext.VariantContext.validate(VariantContext.java:1298)
            at htsjdk.variant.variantcontext.VariantContext.<init>(VariantContext.java:401)
            at htsjdk.variant.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:494)
            at htsjdk.variant.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:488)
            at org.broadinstitute.hellbender.tools.exome.orientationbiasvariantfilter.OrientationBiasFilterer.annotateVariantContextsWithFilterResults(OrientationBiasFilterer.java:216)
            at org.broadinstitute.hellbender.tools.exome.FilterByOrientationBias.onTraversalSuccess(FilterByOrientationBias.java:211)
            at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:840)
            at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:115)
            at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:170)
            at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:189)
            at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:131)
            at org.broadinstitute.hellbender.Main.mainEntry(Main.java:152)
            at org.broadinstitute.hellbender.Main.main(Main.java:230)
    
  • shlee (Cambridge) Member, Broadie ✭✭✭✭✭

    Hi @dayzcool,

    FilterByOrientationBias --output ob_filtered.vcf --preAdapterDetailFile cromwell-executions/Mutect2_Multi/412c6e91-599e-4024-9011-6647638c6e9c/call-Mutect2/shard-2/Mutect2/4f24f6cf-259e-4a0d-9642-04f56fc89696/call-Filter/inputs/cromwell-executions/Mutect2_Multi/412c6e91-599e-4024-9011-6647638c6e9c/call-Mutect2/shard-2/Mutect2/4f24f6cf-259e-4a0d-9642-04f56fc89696/call-CollectSequencingArtifactMetrics/execution/gatk.pre_adapter_detail_metrics --artifactModes G/T --artifactModes C/T ...

    Can you try putting quotes around your --artifactModes, e.g. --artifactModes 'G/T'? This is the way we use the tool. I'm not sure if it makes a difference but it cannot hurt for you to try.

  • dayzcool Member

    Hi @shlee, thank you for your advice! Unfortunately it does not make a difference in this case, but I'll single-quote the arguments when I use the tool in the future.

  • shlee (Cambridge) Member, Broadie ✭✭✭✭✭
    edited July 2017

    @dayzcool,

    The error sounds like there is a mismatch between a resource and the data:

    • preadapterdetailfile
    • variant

    Let's rule out the remote possibility that this error is related to scattering over a WDL script. Can you recapitulate your error outside of the WDL run? That is, can you run FilterByOrientationBias locally, without scattering, and get the same error message?

    P.S. In the meantime, I've asked a developer to look into this.

  • dayzcool Member

    Hi @shlee, would it be the same if I start over with a scatter count of 1?
    BTW, I think it would be surprising if this were an issue with the scatter count.
    The error occurs for one GATK4 build, but not for another build. For instance, I tried running FilterByOrientationBias locally using the command in the script file generated by Cromwell. I used two different builds (4.beta.1 official and the 4.beta.1-14-g9d9ca1f snapshot), and all the arguments for the FilterByOrientationBias tool were the same. There is a FilterByOrientationBias run where 4.beta.1 (official) fails and the 4.beta.1-14-g9d9ca1f snapshot succeeds. There is also a FilterByOrientationBias run where the 4.beta.1-14-g9d9ca1f snapshot fails and 4.beta.1 (official) succeeds.

  • LeeTL1220 (Arlington, MA) Member, Broadie, Dev ✭✭✭

    @dayzcool Can you send along a VCF that fails or the variants that induce the failure?

  • LeeTL1220 (Arlington, MA) Member, Broadie, Dev ✭✭✭

    @dayzcool @shlee As a warning, there have been no changes to FilterByOrientationBias since before 4.beta.1 ...

  • shlee (Cambridge) Member, Broadie ✭✭✭✭✭

    @dayzcool,

    Instructions for a bug report are in Article#1894. Let us know if the data is private.

    If all the input data and the commands are identical for your four scenarios, then the errors sound like they are random. Are your data and commands different for each failure? In this case, to narrow down the source of error, it would be helpful for us to see each command for the success and error buckets and to know whether the data for the command is identical or not for the runs.

    Also, are all the shards for a run failing or are there successful shards mixed in with the failed shards for a given run? Does one particular genomic interval consistently fail? Just to be clear, upstream of filtering, you are running CollectSequencingArtifactMetrics on the entirety of the sample BAM and NOT scattering?

    These questions are why I ask that you recapitulate your error without scattering. I think running with scatter interval of one should be fine for testing identical scenarios. In any case, you'll have to give us the exact commands that we can run locally on snippets of your files to recapitulate the error.

    Just for the record, for whoever follows up on this:

    @dayzcool wrote:

    "both -A C/T, -A G/T are specified, though I am not sure if it's always the case. Having either -A C/T or -A G/T didn't give me the exception for one test case."

    The tool doc says:

    "For a given base substitution specified with the --artifactModes argument, the tool considers both the forward and reverse complement for filtering. Do NOT specify artifact modes that are reverse complements of each other. Behavior of this tool is undefined when that happens. For example, do not specify C/A and G/T in the same run."

    You are filtering for a C --> T transition and a G --> T transversion. You have not specified the reverse complements of these, which is the situation that could make the tool behave unpredictably. There is some expectation towards Ti/Tv ratios for types of sequencing data; typical sequencing data gives a Ti/Tv ratio of ~2-3, which is higher than the random expectation of 0.5. The tool should not error because such cases are absent.

    We are not entirely sure how to interpret the error message:

    Allele in genotype C* not in the variant context [G*, T]

    and will need to run a debugger to get to the root of it.
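
The tool-doc rule quoted above (do not combine artifact modes that are reverse complements of each other) can be checked mechanically before a run. The sketch below is a hypothetical helper in Python, not part of GATK; it assumes modes are written as single-base REF/ALT strings such as G/T, as in the commands in this thread.

```python
from itertools import combinations

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement_mode(mode):
    """Return the reverse complement of a single-base mode, e.g. 'G/T' -> 'C/A'."""
    ref, alt = mode.split("/")
    return f"{COMPLEMENT[ref]}/{COMPLEMENT[alt]}"


def conflicting_modes(modes):
    """Return the pairs of modes that are reverse complements of each other."""
    return [
        (a, b)
        for a, b in combinations(modes, 2)
        if reverse_complement_mode(a) == b
    ]


# The combination used in this thread contains no reverse-complement pair.
thread_modes = conflicting_modes(["G/T", "C/T"])

# The combination the tool doc warns about does.
doc_example = conflicting_modes(["C/A", "G/T"])
```

Under these assumptions, G/T together with C/T passes the check, while C/A together with G/T (the tool doc's own example) is flagged.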

  • dayzcool Member

    Hi @LeeTL1220, could you advise how I can find the variants that induce the failure? The VCF file is relatively big (tens of MB), so it would be helpful if I could narrow it down to the records that make it fail.
    Has CollectSequencingArtifactMetrics (which generates the preadapterdetailfile) also been unchanged for a while? The preadapterdetailfile could have been generated by a different build, I guess.
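
One generic way to narrow a large VCF down to an offending record is to bisect it: keep the header, test half of the records, and repeat on whichever half still fails. The Python sketch below illustrates the idea under strong assumptions: the failure is reproducible for a given input, a single record is responsible, and `fails` is a caller-supplied, hypothetical predicate that writes the subset to a file, runs the tool, and reports whether it errored. If the failure is intermittent, the predicate would need to run the tool several times per subset.

```python
def split_vcf(text):
    """Split VCF text into header lines (starting with '#') and record lines."""
    lines = [ln for ln in text.splitlines() if ln]
    header = [ln for ln in lines if ln.startswith("#")]
    records = [ln for ln in lines if not ln.startswith("#")]
    return header, records


def find_failing_record(records, fails):
    """Binary-search for the one record that makes `fails` return True.

    Assumes `records` is non-empty, `fails(records)` is True, and exactly
    one record is responsible for the failure.
    """
    lo, hi = 0, len(records)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(records[lo:mid]):
            hi = mid  # culprit is in the first half
        else:
            lo = mid  # otherwise it must be in the second half
    return records[lo]


# Toy demonstration with a fake predicate instead of a real tool run.
toy_header, toy_records = split_vcf(
    "##fileformat=VCFv4.2\n#CHROM\tPOS\nchr1\t100\nchr1\t200\nchr2\t300\n"
)
culprit = find_failing_record(toy_records, lambda subset: "chr1\t200" in subset)
```

Each subset would be written back out with the original header lines before being passed to the tool, so that every test file is still a valid VCF.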

  • dayzcool Member

    @shlee, thanks for your kind explanation. I'll try running with a scatter count of 1 and report back or follow up with a bug report.

  • dayzcool Member

    @shlee and @LeeTL1220,
    I have uploaded the VCF and pre_adapter_detail_metrics files so that you can reproduce the exception using the official build of GATK4 beta. The file name is byoo_FilterByOrientationBias.zip.
    Let me know if you have any issues.

    Issue · Github, by shlee
    Issue Number: 2294. State: closed. Closed By: sooheelee.
  • LeeTL1220 (Arlington, MA) Member, Broadie, Dev ✭✭✭

    @dayzcool Are you running the WDL? Regardless, I doubt this issue is being generated by CollectSequencingArtifactMetrics.

  • dayzcool Member

    @LeeTL1220 I see. Yes, I ran the WDL. Please feel free to advise if you need any further information. Thank you for your help!

  • shlee (Cambridge) Member, Broadie ✭✭✭✭✭

    @dayzcool, I'm processing your bug report.

  • shlee (Cambridge) Member, Broadie ✭✭✭✭✭
    edited July 2017

    Hi @dayzcool,

    I can recapitulate your error. It seems to be random, as you say, which suggests the cause may be some nondeterministic element of the tool code. I ran the command eleven times; five of the runs succeeded and six failed with the error messages you reported. I've asked a developer to fix this, and you can follow the progress at https://github.com/broadinstitute/gatk/issues/3291.
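
Repeating an identical command and counting outcomes, as described above, is easy to script. The following Python sketch is one way to do it, assuming the tool exits with a non-zero status when it throws; the function name is hypothetical and the demo command is a harmless stand-in, not a GATK invocation.

```python
import subprocess
import sys
from collections import Counter


def tally_runs(command, runs):
    """Run `command` (a list of arguments) `runs` times and count the outcomes.

    A zero exit status counts as "ok", anything else as "failed".
    """
    outcomes = Counter()
    for _ in range(runs):
        result = subprocess.run(command, capture_output=True, text=True)
        outcomes["ok" if result.returncode == 0 else "failed"] += 1
    return outcomes


# Stand-in command so the sketch runs anywhere; substitute the real tool
# command line (as a list of arguments) to reproduce the experiment.
demo = tally_runs([sys.executable, "-c", "raise SystemExit(0)"], runs=3)
```

A mix of "ok" and "failed" counts for byte-identical inputs and arguments is what points to nondeterminism in the tool rather than to the data.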

  • shlee (Cambridge) Member, Broadie ✭✭✭✭✭

    P.S. Our developer has just switched the order of FilterMutectCalls and FilterByOrientationBias in the WDL (see the change at https://github.com/broadinstitute/gatk/pull/3289/files). They think that if you filter with FilterMutectCalls first and FilterByOrientationBias second, you will reduce the probability of encountering this random FilterByOrientationBias error. This PR has been merged into master, so https://github.com/broadinstitute/gatk/blob/master/scripts/mutect2_wdl/mutect2.wdl reflects the change. Hopefully, this will enable you to continue with your research while we address the bug.

  • dayzcool Member

    @shlee, thank you. I will use the modified WDL file. I didn't realize it had a nondeterministic aspect.

  • shlee (Cambridge) Member, Broadie ✭✭✭✭✭

    You're welcome. And many thanks to you for bringing this to our attention.

  • dayzcool Member

    @shlee, thanks a lot for your prompt help! Midhat and @dayzcool
