Error in VariantsToVCF

jlduan (Posts: 2; Member)

Hi

I am trying to convert SNPs from the UCSC format to VCF format. I downloaded dbSNP128.txt.gz from UCSC (http://hgdownload.cse.ucsc.edu/goldenPath/mm9/database/snp128.txt.gz). The command I used is:

java -jar GenomeAnalysisTK.jar -R mm9.fa -T VariantsToVCF --variant:OLDDBSNP dbSNP128.txt -o dbsnp128.vcf

Error message:

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.IllegalArgumentException: Duplicate allele added to VariantContext: G
    at org.broadinstitute.sting.utils.variantcontext.VariantContext.makeAlleles(VariantContext.java:1307)
    at org.broadinstitute.sting.utils.variantcontext.VariantContext.<init>(VariantContext.java:290)
    at org.broadinstitute.sting.utils.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:495)
    at org.broadinstitute.sting.gatk.refdata.VariantContextAdaptors$DBSnpAdaptor.convert(VariantContextAdaptors.java:206)
    at org.broadinstitute.sting.gatk.refdata.VariantContextAdaptors.toVariantContext(VariantContextAdaptors.java:64)
    at org.broadinstitute.sting.gatk.walkers.variantutils.VariantsToVCF.getVariantContexts(VariantsToVCF.java:177)
    at org.broadinstitute.sting.gatk.walkers.variantutils.VariantsToVCF.map(VariantsToVCF.java:123)
    at org.broadinstitute.sting.gatk.walkers.variantutils.VariantsToVCF.map(VariantsToVCF.java:83)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:243)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:231)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:248)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:219)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:120)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:67)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:23)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:74)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:281)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:237)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:147)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:94)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.3-4-g57ea19f):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Duplicate allele added to VariantContext: G
ERROR ------------------------------------------------------------------------------------------

According to this post (http://gatkforums.broadinstitute.org/discussion/1275/error-in-haplotype-caller), the "Duplicate allele added to VariantContext" error may be caused by lowercase bases in the reference. I converted all my reference sequences to uppercase, but GATK still reports the same error message.

Thanks in advance.
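
The upper-casing step mentioned in the question is not shown in the thread. A minimal sketch of one way to do it, assuming a plain FASTA reference and using a hypothetical helper name (`uppercase_fasta`), could look like this:

```python
def uppercase_fasta(lines):
    """Yield FASTA lines with sequence lines upper-cased.

    Header lines (starting with '>') are passed through unchanged so that
    contig names keep their original case.
    """
    for line in lines:
        if line.startswith(">"):
            yield line
        else:
            yield line.upper()
```

To apply it to a file, open the input and output in text mode and write each yielded line, for example `out.writelines(uppercase_fasta(src))`. Any existing index or dictionary files for the reference would presumably need to be regenerated afterwards.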

Answers

  • Geraldine_VdAuwera (Posts: 6,286; Administrator, GATK Developer)

    Hi there,

    Could you please try again with the latest version (2.3-9) and let us know if the error persists?

    Geraldine Van der Auwera, PhD

  • jlduan (Posts: 2; Member)

    Hi, I tried the latest 2.3-9. Still the same error.

    ERROR ------------------------------------------------------------------------------------------
    ERROR stack trace

    java.lang.IllegalArgumentException: Duplicate allele added to VariantContext: G
        at org.broadinstitute.sting.utils.variantcontext.VariantContext.makeAlleles(VariantContext.java:1307)
        at org.broadinstitute.sting.utils.variantcontext.VariantContext.<init>(VariantContext.java:290)
        at org.broadinstitute.sting.utils.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:495)
        at org.broadinstitute.sting.gatk.refdata.VariantContextAdaptors$DBSnpAdaptor.convert(VariantContextAdaptors.java:206)
        at org.broadinstitute.sting.gatk.refdata.VariantContextAdaptors.toVariantContext(VariantContextAdaptors.java:64)
        at org.broadinstitute.sting.gatk.walkers.variantutils.VariantsToVCF.getVariantContexts(VariantsToVCF.java:177)
        at org.broadinstitute.sting.gatk.walkers.variantutils.VariantsToVCF.map(VariantsToVCF.java:123)
        at org.broadinstitute.sting.gatk.walkers.variantutils.VariantsToVCF.map(VariantsToVCF.java:83)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:243)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:231)
        at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:248)
        at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:219)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:120)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:67)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:23)
        at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:74)
        at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:281)
        at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:237)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:147)
        at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 2.3-9-ge5ebf34):
    ERROR
    ERROR Please visit the wiki to see if this is a known problem
    ERROR If not, please post the error, with stack trace, to the GATK forum
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR
    ERROR MESSAGE: Duplicate allele added to VariantContext: G
    ERROR ------------------------------------------------------------------------------------------
  • Geraldine_VdAuwera (Posts: 6,286; Administrator, GATK Developer)

    OK, thanks for trying. We'll need you to narrow down the error to the offending record and upload a snippet to our FTP server so that we can reproduce the error and fix the bug. Please see this article on how to do that:

    http://www.broadinstitute.org/gatk/guide/article?id=1894

    Geraldine Van der Auwera, PhD

  • jlduan (Posts: 2; Member)

    Hi

    Uploaded. File name is duanjl.bug.report.2013-1-16.tar.gz

    Thanks.

  • philliprichmond (Posts: 2; Member)

    I need to run the exact same command with dbSNP 132 for mm9. Has this fix been implemented in 2.3-9? If it won't be available until 2.4, when will that be released?

    -Phil

  • Geraldine_VdAuwera (Posts: 6,286; Administrator, GATK Developer)

    We are finalizing the contents of 2.4 and are expecting to release next week if all goes well.

    Geraldine Van der Auwera, PhD
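
The thread asks for the error to be narrowed down to the offending record but does not show how. As a rough sketch only, the snippet below scans a UCSC SNP table for rows whose observed-alleles field lists the same allele more than once, which is one plausible (unconfirmed) trigger for "Duplicate allele added to VariantContext". The helper name `find_duplicate_allele_rows` is hypothetical, and the column positions (chrom at index 1, chromStart at 2, name at 4, observed at 9) are an assumption about the tab-separated snp128 layout that should be checked against the table's schema.

```python
def find_duplicate_allele_rows(lines, observed_col=9):
    """Return (line_number, chrom, chrom_start, name, observed) for suspect rows.

    A row is suspect when its slash-separated observed-alleles field
    (assumed to be column index 9) repeats an allele, ignoring case.
    Column indices are assumptions about the UCSC snp128 table layout.
    """
    hits = []
    for lineno, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= observed_col:
            continue  # skip short or malformed rows
        observed = fields[observed_col]
        alleles = [a.upper() for a in observed.split("/")]
        if len(alleles) != len(set(alleles)):
            hits.append((lineno, fields[1], fields[2], fields[4], observed))
    return hits
```

Any rows it reports are only candidates; cutting the input down to a few rows around one of them and rerunning the original command would show whether that record actually reproduces the error.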