Unified Genotyper

AlexanderB (Posts: 17; Member)

Hi GATK,

I've seen this error on the forum before in connection with the HaplotypeCaller, but not with the UnifiedGenotyper. Any thoughts on how I can fix this?

Best,

Alex

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.IllegalArgumentException: Duplicate allele added to VariantContext: T
    at org.broadinstitute.variant.variantcontext.VariantContext.makeAlleles(VariantContext.java:1335)
    at org.broadinstitute.variant.variantcontext.VariantContext.<init>(VariantContext.java:312)
    at org.broadinstitute.variant.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:478)
    at org.broadinstitute.sting.gatk.walkers.genotyper.ConsensusAlleleCounter.consensusCountsToAlleles(ConsensusAlleleCounter.java:279)
    at org.broadinstitute.sting.gatk.walkers.genotyper.ConsensusAlleleCounter.computeConsensusAlleles(ConsensusAlleleCounter.java:103)
    at org.broadinstitute.sting.gatk.walkers.genotyper.IndelGenotypeLikelihoodsCalculationModel.computeConsensusAlleles(IndelGenotypeLikelihoodsCalculationModel.java:93)
    at org.broadinstitute.sting.gatk.walkers.genotyper.IndelGenotypeLikelihoodsCalculationModel.getInitialAlleleList(IndelGenotypeLikelihoodsCalculationModel.java:245)
    at org.broadinstitute.sting.gatk.walkers.genotyper.IndelGenotypeLikelihoodsCalculationModel.getLikelihoods(IndelGenotypeLikelihoodsCalculationModel.java:114)
    at org.broadinstitute.sting.gatk.walkers.genotyper.UnifiedGenotyperEngine.calculateLikelihoods(UnifiedGenotyperEngine.java:320)
    at org.broadinstitute.sting.gatk.walkers.genotyper.UnifiedGenotyperEngine.calculateLikelihoodsAndGenotypes(UnifiedGenotyperEngine.java:221)
    at org.broadinstitute.sting.gatk.walkers.genotyper.UnifiedGenotyper.map(UnifiedGenotyper.java:352)
    at org.broadinstitute.sting.gatk.walkers.genotyper.UnifiedGenotyper.map(UnifiedGenotyper.java:143)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:268)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:256)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:145)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
    at org.broadinstitute.sting.gatk.executive.ShardTraverser.call(ShardTraverser.java:98)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.4-9-g532efad):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Duplicate allele added to VariantContext: T
ERROR ------------------------------------------------------------------------------------------
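
For readers unfamiliar with the message: the trace shows the exception being raised while a VariantContext is built from the candidate indel alleles, and the message says the allele T appeared twice in that list. The Python below is only an illustrative sketch of that kind of duplicate check, not GATK's code; the function names and the case-insensitive comparison are assumptions made for the example.

```python
def find_duplicate_alleles(alleles):
    """Return the alleles that occur more than once, in order of first repeat.

    Bases are compared case-insensitively (an assumption for this sketch).
    """
    seen = set()
    duplicates = []
    for allele in alleles:
        key = allele.upper()
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    return duplicates


def make_alleles(alleles):
    """Hypothetical stand-in for an allele-list constructor that rejects duplicates."""
    duplicates = find_duplicate_alleles(alleles)
    if duplicates:
        raise ValueError(
            "Duplicate allele added to VariantContext: " + duplicates[0]
        )
    return list(alleles)
```

In this sketch, a candidate list such as `["T", "TA", "T"]` would be rejected, while `["T", "TA"]` would pass.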

Answers

  • Geraldine_VdAuwera (Posts: 5,276; Administrator, GSA Member)

    Hi Alex,

    It's the first time we've seen it with UG too... can you please try running this analysis with the latest nightly build? We've made recent changes that apply to both HC and UG and may have fixed this. See the Downloads page to access the nightlies.

    Geraldine Van der Auwera, PhD

  • AlexanderB (Posts: 17; Member)

    I'm giving it a shot now, I'll post the results as soon as I get them. Thanks!

  • AlexanderB (Posts: 17; Member)

    Same problem:

    ERROR ------------------------------------------------------------------------------------------
    ERROR stack trace

    java.lang.IllegalArgumentException: Duplicate allele added to VariantContext: T
        at org.broadinstitute.variant.variantcontext.VariantContext.makeAlleles(VariantContext.java:1335)
        at org.broadinstitute.variant.variantcontext.VariantContext.<init>(VariantContext.java:312)
        at org.broadinstitute.variant.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:478)
        at org.broadinstitute.sting.gatk.walkers.genotyper.ConsensusAlleleCounter.consensusCountsToAlleles(ConsensusAlleleCounter.java:279)
        at org.broadinstitute.sting.gatk.walkers.genotyper.ConsensusAlleleCounter.computeConsensusAlleles(ConsensusAlleleCounter.java:103)
        at org.broadinstitute.sting.gatk.walkers.genotyper.IndelGenotypeLikelihoodsCalculationModel.computeConsensusAlleles(IndelGenotypeLikelihoodsCalculationModel.java:93)
        at org.broadinstitute.sting.gatk.walkers.genotyper.IndelGenotypeLikelihoodsCalculationModel.getInitialAlleleList(IndelGenotypeLikelihoodsCalculationModel.java:245)
        at org.broadinstitute.sting.gatk.walkers.genotyper.IndelGenotypeLikelihoodsCalculationModel.getLikelihoods(IndelGenotypeLikelihoodsCalculationModel.java:114)
        at org.broadinstitute.sting.gatk.walkers.genotyper.UnifiedGenotyperEngine.calculateLikelihoods(UnifiedGenotyperEngine.java:320)
        at org.broadinstitute.sting.gatk.walkers.genotyper.UnifiedGenotyperEngine.calculateLikelihoodsAndGenotypes(UnifiedGenotyperEngine.java:221)
        at org.broadinstitute.sting.gatk.walkers.genotyper.UnifiedGenotyper.map(UnifiedGenotyper.java:353)
        at org.broadinstitute.sting.gatk.walkers.genotyper.UnifiedGenotyper.map(UnifiedGenotyper.java:143)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:268)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:256)
        at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
        at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:145)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
        at org.broadinstitute.sting.gatk.executive.ShardTraverser.call(ShardTraverser.java:98)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version nightly-2013-04-10-gb25bc5b):
    ERROR
    ERROR Please check the documentation guide to see if this is a known problem
    ERROR If not, please post the error, with stack trace, to the GATK forum
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR
    ERROR MESSAGE: Duplicate allele added to VariantContext: T
    ERROR ------------------------------------------------------------------------------------------

    Here's the more expanded output:

    INFO 11:56:54,144 HelpFormatter - --------------------------------------------------------------------------------------------- INFO 11:56:54,148 HelpFormatter - The Genome Analysis Toolkit (GATK) vnightly-2013-04-10-gb25bc5b, Compiled 2013/04/10 00:01:08 INFO 11:56:54,149 HelpFormatter - Copyright (c) 2010 The Broad Institute INFO 11:56:54,149 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk INFO 11:56:54,152 HelpFormatter - Program Args: -R hs37d5.fa -T UnifiedGenotyper -L 21 -glm BOTH -nt 4 -ped -I /[a set of recalibrated/best practices BAMs ~150 nums].list -o [thevcfname].vcf INFO 11:56:54,152 HelpFormatter - Date/Time: 2013/04/10 11:56:54 INFO 11:56:54,153 HelpFormatter - --------------------------------------------------------------------------------------------- INFO 11:56:54,153 HelpFormatter - --------------------------------------------------------------------------------------------- INFO 11:56:55,047 GenomeAnalysisEngine - Strictness is SILENT INFO 11:56:55,223 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 250 INFO 11:56:55,229 SAMDataSource$SAMReaders - Initializing SAMRecords in serial INFO 11:57:02,362 SAMDataSource$SAMReaders - Init 50 BAMs in last 7.13 s, 50 of 118 in 7.13 s / 0.12 m (7.01 tasks/s). 68 remaining with est. completion in 9.70 s / 0.16 m INFO 11:57:10,355 SAMDataSource$SAMReaders - Init 50 BAMs in last 7.99 s, 100 of 118 in 15.13 s / 0.25 m (6.61 tasks/s). 18 remaining with est. 
completion in 2.72 s / 0.05 m INFO 11:57:13,048 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 17.82 INFO 11:57:26,395 IntervalUtils - Processing 48129895 bp from intervals INFO 11:57:26,416 MicroScheduler - Running the GATK in parallel mode with 4 total threads, 1 CPU thread(s) for each of 4 data thread(s), of 16 processors available on this machine INFO 11:57:26,538 GenomeAnalysisEngine - Creating shard strategy for 118 BAM files INFO 12:01:34,216 GenomeAnalysisEngine - Done creating shard strategy INFO 12:01:34,216 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] INFO 12:01:34,217 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining INFO 12:01:34,578 SAMDataSource$SAMReaders - Initializing SAMRecords in serial INFO 12:01:44,623 SAMDataSource$SAMReaders - Init 50 BAMs in last 10.04 s, 50 of 118 in 10.04 s / 0.17 m (4.98 tasks/s). 68 remaining with est. completion in 13.66 s / 0.23 m INFO 12:01:55,246 SAMDataSource$SAMReaders - Init 50 BAMs in last 10.62 s, 100 of 118 in 20.67 s / 0.34 m (4.84 tasks/s). 18 remaining with est. completion in 3.72 s / 0.06 m INFO 12:01:59,507 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 24.93 INFO 12:01:59,552 SAMDataSource$SAMReaders - Initializing SAMRecords in serial INFO 12:02:04,229 ProgressMeter - 21:9416001 0.00e+00 30.0 s 49.6 w 19.6% 2.6 m 2.1 m INFO 12:02:11,100 SAMDataSource$SAMReaders - Init 50 BAMs in last 11.55 s, 50 of 118 in 11.55 s / 0.19 m (4.33 tasks/s). 68 remaining with est. completion in 15.70 s / 0.26 m INFO 12:02:22,015 SAMDataSource$SAMReaders - Init 50 BAMs in last 10.91 s, 100 of 118 in 22.46 s / 0.37 m (4.45 tasks/s). 18 remaining with est. 
completion in 4.04 s / 0.07 m INFO 12:02:25,904 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 26.35 INFO 12:02:25,935 SAMDataSource$SAMReaders - Initializing SAMRecords in serial INFO 12:02:34,178 SAMDataSource$SAMReaders - Init 50 BAMs in last 8.24 s, 50 of 118 in 8.24 s / 0.14 m (6.07 tasks/s). 68 remaining with est. completion in 11.21 s / 0.19 m INFO 12:02:34,238 ProgressMeter - 21:9459769 9.42e+06 60.0 s 6.0 s 19.7% 5.1 m 4.1 m INFO 12:02:39,592 SAMDataSource$SAMReaders - Init 50 BAMs in last 5.41 s, 100 of 118 in 13.66 s / 0.23 m (7.32 tasks/s). 18 remaining with est. completion in 2.46 s / 0.04 m INFO 12:02:41,481 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 15.55 INFO 12:03:04,247 ProgressMeter - 21:9765665 9.72e+06 90.0 s 9.0 s 20.3% 7.4 m 5.9 m INFO 12:03:34,264 ProgressMeter - 21:9907637 9.85e+06 120.0 s 12.0 s 20.6% 9.7 m 7.7 m INFO 12:04:04,274 ProgressMeter - 21:10085061 1.00e+07 2.5 m 14.0 s 21.0% 11.9 m 9.4 m INFO 12:04:34,284 ProgressMeter - 21:10933429 1.09e+07 3.0 m 16.0 s 22.7% 13.2 m 10.2 m INFO 12:05:04,293 ProgressMeter - 21:10935129 1.09e+07 3.5 m 19.0 s 22.7% 15.4 m 11.9 m INFO 12:05:34,303 ProgressMeter - 21:10941929 1.09e+07 4.0 m 22.0 s 22.7% 17.6 m 13.6 m WARN 12:06:03,450 DiploidExactAFCalc - this tool is currently set to genotype at most 6 alternate alleles in a given context, but the context at 21:10916534 has 9 alternate alleles so only the top alleles will be used; see the --max_alternate_alleles argument INFO 12:06:04,312 ProgressMeter - 21:10944613 1.09e+07 4.5 m 24.0 s 22.7% 19.8 m 15.3 m INFO 12:06:34,322 ProgressMeter - 21:10968997 1.09e+07 5.0 m 27.0 s 22.8% 21.9 m 16.9 m INFO 12:07:04,331 ProgressMeter - 21:10996065 1.09e+07 5.5 m 30.0 s 22.8% 24.1 m 18.6 m WARN 12:07:12,035 DiploidExactAFCalc - this tool is currently set to genotype at most 6 alternate alleles in a given context, but the context at 21:10996309 has 7 alternate alleles so only the top alleles will be used; see 
the --max_alternate_alleles argument INFO 12:07:34,341 ProgressMeter - 21:11026533 1.10e+07 6.0 m 32.0 s 22.9% 26.2 m 20.2 m INFO 12:08:04,349 ProgressMeter - 21:11087885 1.10e+07 6.5 m 35.0 s 23.0% 28.2 m 21.7 m INFO 12:08:34,359 ProgressMeter - 21:11114253 1.11e+07 7.0 m 37.0 s 23.1% 30.3 m 23.3 m INFO 12:09:04,367 ProgressMeter - 21:11154621 1.11e+07 7.5 m 40.0 s 23.2% 32.4 m 24.9 m INFO 12:09:34,377 ProgressMeter - 21:11169205 1.11e+07 8.0 m 43.0 s 23.2% 34.5 m 26.5 m WARN 12:09:59,754 DiploidExactAFCalc - this tool is currently set to genotype at most 6 alternate alleles in a given context, but the context at 21:11114453 has 7 alternate alleles so only the top alleles will be used; see the --max_alternate_alleles argument INFO 12:10:04,386 ProgressMeter - 21:14410737 1.44e+07 8.5 m 35.0 s 29.9% 28.4 m 19.9 m INFO 12:10:34,396 ProgressMeter - 21:14415137 1.44e+07 9.0 m 37.0 s 30.0% 30.0 m 21.0 m INFO 12:11:04,472 ProgressMeter - 21:14439105 1.44e+07 9.5 m 39.0 s 30.0% 31.7 m 22.2 m INFO 12:11:34,482 ProgressMeter - 21:14439205 1.44e+07 10.0 m 41.0 s 30.0% 33.3 m 23.3 m INFO 12:12:04,512 ProgressMeter - 21:14546809 1.45e+07 10.5 m 43.0 s 30.2% 34.7 m 24.2 m INFO 12:12:34,522 ProgressMeter - 21:14771285 1.47e+07 11.0 m 44.0 s 30.7% 35.8 m 24.8 m INFO 12:13:04,531 ProgressMeter - 21:14771485 1.47e+07 11.5 m 46.0 s 30.7% 37.5 m 26.0 m INFO 12:13:34,541 ProgressMeter - 21:14897857 1.48e+07 12.0 m 48.0 s 31.0% 38.8 m 26.8 m INFO 12:14:04,550 ProgressMeter - 21:14916941 1.49e+07 12.5 m 50.0 s 31.0% 40.3 m 27.8 m INFO 12:14:34,560 ProgressMeter - 21:14953709 1.49e+07 13.0 m 52.0 s 31.1% 41.8 m 28.8 m INFO 12:15:04,570 ProgressMeter - 21:15195869 1.51e+07 13.5 m 53.0 s 31.6% 42.8 m 29.3 m INFO 12:15:34,580 ProgressMeter - 21:15277989 1.52e+07 14.0 m 55.0 s 31.7% 44.1 m 30.1 m INFO 12:16:04,589 ProgressMeter - 21:15320141 1.53e+07 14.5 m 56.0 s 31.8% 45.6 m 31.1 m INFO 12:16:34,599 ProgressMeter - 21:15321341 1.53e+07 15.0 m 58.0 s 31.8% 47.1 m 32.1 m INFO 12:17:04,608 
ProgressMeter - 21:15324541 1.53e+07 15.5 m 60.0 s 31.8% 48.7 m 33.2 m INFO 12:17:34,618 ProgressMeter - 21:15338525 1.53e+07 16.0 m 62.0 s 31.9% 50.2 m 34.2 m WARN 12:17:42,095 DiploidExactAFCalc - this tool is currently set to genotype at most 6 alternate alleles in a given context, but the context at 21:15338549 has 8 alternate alleles so only the top alleles will be used; see the --max_alternate_alleles argument INFO 12:18:04,627 ProgressMeter - 21:15340425 1.53e+07 16.5 m 64.0 s 31.9% 51.8 m 35.3 m INFO 12:18:34,637 ProgressMeter - 21:15373893 1.53e+07 17.0 m 66.0 s 31.9% 53.2 m 36.2 m INFO 12:19:04,646 ProgressMeter - 21:15377693 1.53e+07 17.5 m 68.0 s 32.0% 54.8 m 37.3 m INFO 12:19:34,654 ProgressMeter - 21:15408661 1.54e+07 18.0 m 70.0 s 32.0% 56.2 m 38.2 m INFO 12:20:04,664 ProgressMeter - 21:15535733 1.55e+07 18.5 m 71.0 s 32.3% 57.3 m 38.8 m INFO 12:20:34,672 ProgressMeter - 21:15645937 1.56e+07 19.0 m 73.0 s 32.5% 58.4 m 39.4 m WARN 12:20:50,640 DiploidExactAFCalc - this tool is currently set to genotype at most 6 alternate alleles in a given context, but the context at 21:15311685 has 9 alternate alleles so only the top alleles will be used; see the --max_alternate_alleles argument INFO 12:21:04,682 ProgressMeter - 21:15745625 1.57e+07 19.5 m 74.0 s 32.7% 59.6 m 40.1 m WARN 12:21:21,364 DiploidExactAFCalc - this tool is currently set to genotype at most 6 alternate alleles in a given context, but the context at 21:15663777 has 12 alternate alleles so only the top alleles will be used; see the --max_alternate_alleles argument INFO 12:21:34,691 ProgressMeter - 21:15858229 1.58e+07 20.0 m 76.0 s 32.9% 60.7 m 40.7 m INFO 12:22:04,701 ProgressMeter - 21:16189593 1.61e+07 20.5 m 76.0 s 33.6% 60.9 m 40.4 m INFO 12:22:34,711 ProgressMeter - 21:16335649 1.63e+07 21.0 m 77.0 s 33.9% 61.9 m 40.9 m INFO 12:23:04,721 ProgressMeter - 21:16682113 1.66e+07 21.5 m 77.0 s 34.7% 62.0 m 40.5 m INFO 12:23:34,731 ProgressMeter - 21:17150365 1.71e+07 22.0 m 77.0 s 35.6% 61.7 
m 39.7 m WARN 12:23:58,283 DiploidExactAFCalc - this tool is currently set to genotype at most 6 alternate alleles in a given context, but the context at 21:15313811 has 8 alternate alleles so only the top alleles will be used; see the --max_alternate_alleles argument INFO 12:24:04,740 ProgressMeter - 21:17177533 1.71e+07 22.5 m 78.0 s 35.7% 63.0 m 40.5 m INFO 12:24:34,751 ProgressMeter - 21:17214601 1.72e+07 23.0 m 80.0 s 35.8% 64.3 m 41.3 m INFO 12:25:04,760 ProgressMeter - 21:17467545 1.74e+07 23.5 m 80.0 s 36.3% 64.8 m 41.3 m INFO 12:25:34,770 ProgressMeter - 21:17899729 1.78e+07 24.0 m 80.0 s 37.2% 64.5 m 40.5 m INFO 12:26:04,779 ProgressMeter - 21:18265477 1.82e+07 24.5 m 80.0 s 38.0% 64.6 m 40.1 m WARN 12:26:21,583 DiploidExactAFCalc - this tool is currently set to genotype at most 6 alternate alleles in a given context, but the context at 21:15316084 has 8 alternate alleles so only the top alleles will be used; see the --max_alternate_alleles argument INFO 12:26:34,789 ProgressMeter - 21:18890853 1.88e+07 25.0 m 79.0 s 39.2% 63.7 m 38.7 m INFO 12:27:04,798 ProgressMeter - 21:18965989 1.89e+07 25.5 m 80.0 s 39.4% 64.7 m 39.2 m INFO 12:27:34,808 ProgressMeter - 21:18966689 1.89e+07 26.0 m 82.0 s 39.4% 66.0 m 40.0 m INFO 12:28:04,817 ProgressMeter - 21:19158197 1.91e+07 26.5 m 83.0 s 39.8% 66.6 m 40.1 m INFO 12:28:34,827 ProgressMeter - 21:19205549 1.92e+07 27.0 m 84.0 s 39.9% 67.7 m 40.7 m WARN 12:28:42,212 DiploidExactAFCalc - this tool is currently set to genotype at most 6 alternate alleles in a given context, but the context at 21:15317306 has 9 alternate alleles so only the top alleles will be used; see the --max_alternate_alleles argument INFO 12:29:04,835 ProgressMeter - 21:19232333 1.92e+07 27.5 m 86.0 s 40.0% 68.8 m 41.3 m WARN 12:29:10,556 DiploidExactAFCalc - this tool is currently set to genotype at most 6 alternate alleles in a given context, but the context at 21:19212718 has 8 alternate alleles so only the top alleles will be used; see the 
--max_alternate_alleles argument WARN 12:29:25,281 DiploidExactAFCalc - this tool is currently set to genotype at most 6 alternate alleles in a given context, but the context at 21:19161570 has 11 alternate alleles so only the top alleles will be used; see the --max_alternate_alleles argument INFO 12:29:34,844 ProgressMeter - 21:19647317 1.96e+07 28.0 m 85.0 s 40.8% 68.6 m 40.6 m INFO 12:30:04,853 ProgressMeter - 21:19685585 1.96e+07 28.5 m 87.0 s 40.9% 69.7 m 41.2 m INFO 12:30:34,863 ProgressMeter - 21:19716153 1.97e+07 29.0 m 88.0 s 41.0% 70.8 m 41.8 m WARN 12:31:02,068 DiploidExactAFCalc - this tool is currently set to genotype at most 6 alternate alleles in a given context, but the context at 21:19744415 has 9 alternate alleles so only the top alleles will be used; see the --max_alternate_alleles argument INFO 12:31:04,872 ProgressMeter - 21:19775989 1.97e+07 29.5 m 89.0 s 41.1% 71.8 m 42.3 m INFO 12:31:34,882 ProgressMeter - 21:20410481 2.03e+07 30.0 m 88.0 s 42.4% 70.7 m 40.7 m INFO 12:32:04,891 ProgressMeter - 21:20617773 2.06e+07 30.5 m 89.0 s 42.8% 71.2 m 40.7 m INFO 12:32:34,900 ProgressMeter - 21:20763829 2.07e+07 31.0 m 89.0 s 43.1% 71.9 m 40.9 m INFO 12:33:04,910 ProgressMeter - 21:21132077 2.11e+07 31.5 m 89.0 s 43.9% 71.7 m 40.2 m INFO 12:33:34,919 ProgressMeter - 21:21745169 2.17e+07 32.0 m 88.0 s 45.2% 70.8 m 38.8 m INFO 12:34:04,929 ProgressMeter - 21:22177553 2.21e+07 32.5 m 88.0 s 46.1% 70.5 m 38.0 m INFO 12:34:34,938 ProgressMeter - 21:22332393 2.23e+07 33.0 m 88.0 s 46.4% 71.1 m 38.1 m INFO 12:35:04,948 ProgressMeter - 21:22559685 2.25e+07 33.5 m 89.0 s 46.9% 71.5 m 38.0 m INFO 12:35:34,957 ProgressMeter - 21:22696841 2.26e+07 34.0 m 90.0 s 47.2% 72.1 m 38.1 m INFO 12:36:04,967 ProgressMeter - 21:22898949 2.28e+07 34.5 m 90.0 s 47.6% 72.5 m 38.0 m INFO 12:36:34,976 ProgressMeter - 21:23658697 2.36e+07 35.0 m 88.0 s 49.2% 71.2 m 36.2 m INFO 12:37:04,987 ProgressMeter - 21:24779809 2.47e+07 35.5 m 86.0 s 51.5% 69.0 m 33.5 m INFO 12:37:34,996 
ProgressMeter - 21:25434769 2.54e+07 36.0 m 85.0 s 52.8% 68.1 m 32.1 m INFO 12:38:05,005 ProgressMeter - 21:26377257 2.63e+07 36.5 m 83.0 s 54.8% 66.6 m 30.1 m INFO 12:38:35,016 ProgressMeter - 21:26941797 2.69e+07 37.0 m 82.0 s 56.0% 66.1 m 29.1 m INFO 12:39:05,025 ProgressMeter - 21:26971565 2.69e+07 37.5 m 83.0 s 56.0% 66.9 m 29.4 m WARN 12:39:35,004 DiploidExactAFCalc - this tool is currently set to genotype at most 6 alternate alleles in a given context, but the context at 21:26973663 has 9 alternate alleles so only the top alleles will be used; see the --max_alternate_alleles argument INFO 12:39:35,035 ProgressMeter - 21:27074369 2.70e+07 38.0 m 84.0 s 56.3% 67.6 m 29.6 m INFO 12:40:05,043 ProgressMeter - 21:27117321 2.71e+07 38.5 m 85.0 s 56.3% 68.3 m 29.8 m INFO 12:40:35,053 ProgressMeter - 21:27220025 2.72e+07 39.0 m 86.0 s 56.6% 69.0 m 30.0 m INFO 12:41:05,062 ProgressMeter - 21:27365481 2.73e+07 39.5 m 86.0 s 56.9% 69.5 m 30.0 m INFO 12:41:35,071 ProgressMeter - 21:27647209 2.76e+07 40.0 m 87.0 s 57.4% 69.6 m 29.6 m INFO 12:42:05,081 ProgressMeter - 21:27918837 2.79e+07 40.5 m 87.0 s 58.0% 69.8 m 29.3 m INFO 12:42:35,090 ProgressMeter - 21:28214249 2.82e+07 41.0 m 87.0 s 58.6% 69.9 m 28.9 m INFO 12:43:05,099 ProgressMeter - 21:28290685 2.82e+07 41.5 m 88.0 s 58.8% 70.6 m 29.1 m INFO 12:43:35,109 ProgressMeter - 21:28291785 2.82e+07 42.0 m 89.0 s 58.8% 71.5 m 29.5 m INFO 12:44:05,118 ProgressMeter - 21:28295569 2.82e+07 42.5 m 90.0 s 58.8% 72.3 m 29.8 m INFO 12:44:35,128 ProgressMeter - 21:28296069 2.82e+07 43.0 m 91.0 s 58.8% 73.1 m 30.1 m INFO 12:45:05,137 ProgressMeter - 21:28327053 2.83e+07 43.5 m 92.0 s 58.9% 73.9 m 30.4 m INFO 12:45:35,147 ProgressMeter - 21:28565413 2.85e+07 44.0 m 92.0 s 59.4% 74.1 m 30.1 m INFO 12:45:41,842 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
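
The log above contains a number of DiploidExactAFCalc warnings about sites with more alternate alleles than the configured maximum of 6. As a convenience for anyone triaging a similar log, here is a minimal Python sketch that pulls those sites out with a regular expression. It scans the whole text rather than individual lines, so it also works on a log whose line breaks were lost, as happened here. The function name is hypothetical and the pattern is based only on the warning wording visible in this log.

```python
import re

# Matches the part of the warning that names the site and its allele count, e.g.
# "... but the context at 21:10916534 has 9 alternate alleles so only ..."
WARN_PATTERN = re.compile(
    r"context\s+at\s+([^\s:]+):(\d+)\s+has\s+(\d+)\s+alternate\s+alleles"
)


def sites_over_allele_cap(log_text):
    """Return (contig, position, n_alt_alleles) for each matching warning."""
    return [
        (contig, int(position), int(n_alt))
        for contig, position, n_alt in WARN_PATTERN.findall(log_text)
    ]
```

Sorting the result by position makes it easier to compare the flagged sites against the region where the run eventually fails.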

  • AlexanderB (Posts: 17; Member)

    Sorry, that formatting got messed up, not intentional!

  • Geraldine_VdAuwera (Posts: 5,276; Administrator, GSA Member)

    OK, thanks for checking -- we'll look into this. We may need you to upload a bam file snippet for testing; if so I'll let you know.

    Geraldine Van der Auwera, PhD

  • Geraldine_VdAuwera (Posts: 5,276; Administrator, GSA Member)

    Alright, we are going to need a snippet of your data for testing. Can you please submit a bug report as instructed in the following article? Thanks!

    http://www.broadinstitute.org/gatk/guide/article?id=1894

    Geraldine Van der Auwera, PhD

  • AlexanderB (Posts: 17; Member)

    I've uploaded the bug report (under AJB_UG_error.tar.gz). Thanks for the assistance and hopefully it is an easy fix!

  • Geraldine_VdAuwera (Posts: 5,276; Administrator, GSA Member)

    Thanks for uploading, Alex. We'll have a look at this asap. To that end, can you please post the command line that reproduces the error with the snippet?

    Geraldine Van der Auwera, PhD

  • AlexanderB (Posts: 17; Member)

    gatk.sh -R hs37d5.fa -T UnifiedGenotyper -L 21:28560000-28700000 -glm BOTH -nt 4 -ped myPed.ped -I BAMSForGenotyping.list -o 21.raw_MasterGenotypes.vcf

  • Geraldine_VdAuwera (Posts: 5,276; Administrator, GSA Member)

    Hi Alex,

    I've finally had time to have a look at your files; we're going to need a few more things in order to reproduce your error:

    • your reference (unless it is one of the standard human genomes under another name)
    • your ped file if it is necessary to reproduce the error (I would expect not but you included it in the command line above)
    • the bam list file if all the files are needed to reproduce the error (if not, can you narrow it down to one or a few of them?)

    Geraldine Van der Auwera, PhD

  • AlexanderB (Posts: 17; Member)

    i) The reference I've been using is hs37d5 which I can upload, but I'm pretty sure I downloaded it from a Broad server, so I bet you have a copy on hand. The .bam header might imply that this isn't the reference, but I can assure you it is.

    ii) The ped file isn't necessary to reproduce the error (I checked that myself). I can send it to you if you'd like, but I doubt it will do much good.

    iii) All of those bam files are included in my cohort. I haven't tried to narrow it down much more than that. If you'd like me to try to run them individually I can.

    Thanks!

  • Geraldine_VdAuwera (Posts: 5,276; Administrator, GSA Member)

    Ah, right, we do have that reference, never mind.

    OK, I think we can work with this then. I'll let you know how it goes.

    Geraldine Van der Auwera, PhD

  • AlexanderB (Posts: 17; Member)

    Hi Geraldine, just pinging you on how the UG debug is going. Any progress or anything I can help with?

  • AlexanderB (Posts: 17; Member)
    edited April 2013

    Thank you very much! We'll look into things on our end as well so we don't throw any odd CIGARs into the pipeline from now on. Thank you for the phenomenal work! Our thoughts are with you and the rest of the Boston community.

  • AlexanderB (Posts: 17; Member)

    Hey guys, I tried the nightly build from each of the past four or five days on the test chromosomes and still no dice. Should I keep trying every day or wait for the next official release? Thanks!

  • Geraldine_VdAuwera (Posts: 5,276; Administrator, GSA Member)

    I thought I saw the fix for this go into the master codebase, but let me look into it. I'll get back to you asap.

    Geraldine Van der Auwera, PhD

  • AlexanderB (Posts: 17; Member)

    Running with -rf BadCigar did not actually work for me. I tried reprocessing my BAMs with a newer version of Picard, and those did not genotype correctly either (the only SAM validation problem I had was a missing NM tag, which I thought was optional).

  • Geraldine_VdAuwera (Posts: 5,276; Administrator, GSA Member)

    Hmm, sorry to hear that. The NM tags are probably nothing, I wouldn't worry about them. Are you still getting the exact same error? Is it still in the same location as well in your reprocessed bams?

    Geraldine Van der Auwera, PhD

  • AlexanderB (Posts: 17; Member)

    OK, I think I've got a fair idea of how the bug is occurring. When I split my cohort of ~118 BAMs into two sets of 59 and run the genotyper on each set in parallel, the error seems to be gone. When I try to genotype the whole cohort at once, I get the "Duplicate allele added to VariantContext" error (noted above). Could this be some form of overflow error? I also tried genotyping each person individually and that worked fine as well.
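
The cohort split described above can be scripted so that it is easy to repeat with different subset sizes. The following is a minimal Python sketch, assuming the BAM paths are available as a plain list (for example, read from a `.list` file with one path per line); the function name is hypothetical. It is meant as a debugging aid for narrowing down which inputs trigger the error, not as a substitute for joint genotyping of the whole cohort.

```python
def split_bam_list(paths, n_chunks):
    """Split a list of BAM paths into n_chunks contiguous, near-equal parts."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    size, extra = divmod(len(paths), n_chunks)
    chunks = []
    start = 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional path each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(paths[start:end])
        start = end
    return chunks
```

Each chunk can then be written to its own list file and passed to the genotyper with `-I`, as in the command line posted earlier in the thread.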

  • Geraldine_VdAuwera (Posts: 5,276; Administrator, GSA Member)

    Ah, I guess you had two bugs and we only fixed one -- and apparently not the most important. Sorry about that. This is a weird one. Have you been able to determine whether it depends on the number of bams? That is, if you take out bams progressively, is there a point where the error no longer occurs?

    Geraldine Van der Auwera, PhD

  • AlexanderB (Posts: 17; Member)

    Well, would you like the same test set I had sent you before but without the bad cigar strings? The cigar strings for these guys are now flawless (no "0M" in the cigar string) so it should be easy to isolate the error.
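
For anyone wanting to confirm that their own reads are free of zero-length CIGAR elements such as the "0M" mentioned above, here is a minimal Python sketch. It assumes SAM-format text lines as input (tab-separated, with the CIGAR string in the sixth column), for instance the output of `samtools view`; the function names are hypothetical.

```python
import re

# One CIGAR element: a length followed by an operator from the SAM specification.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def zero_length_ops(cigar):
    """Return the operators in a CIGAR string whose length is zero."""
    return [op for length, op in CIGAR_ELEMENT.findall(cigar) if int(length) == 0]


def reads_with_zero_length_ops(sam_lines):
    """Yield (read_name, cigar) for alignment lines with a zero-length element."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # not a complete alignment record
        cigar = fields[5]
        if cigar != "*" and zero_length_ops(cigar):
            yield fields[0], cigar
```

Passing `sys.stdin` as `sam_lines` lets the check run on a stream of alignments without loading a whole file into memory.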

  • Geraldine_VdAuwera (Posts: 5,276; Administrator, GSA Member)

    Sure, let's do that. I'd like to get to the bottom of this.

    Geraldine Van der Auwera, PhD

  • AlexanderB (Posts: 17; Member)

    I uploaded it as UGErrorV2.tgz, but please ignore this for now. I want to try one more thing before I bother you guys : ).

  • Geraldine_VdAuwera (Posts: 5,276; Administrator, GSA Member)

    OK, no worries -- I'm debugging someone else's issue right now :)

    Just let me know what you find and/or whether to go ahead with these.

    Geraldine Van der Auwera, PhD
