Potential ploidy problem in GenotypeGVCFs

grafal (MPI Tuebingen, Member)

Dear Geraldine, Sheila and GATK community,

  • I am trying to run GenotypeGVCFs on a set of 90 individuals whose ploidy ranges from 2 to 5.

  • I didn't filter my BAMs on mapping quality (MQ), because I have fairly short reads and a messy reference; I had hoped to filter on MQ at the variant level instead.

When I first ran GenotypeGVCFs, I got warnings that the number of PLs exceeded the maximum, and the program crashed. So I increased the maximum allowed number of PL values to 300 (--max_num_PL_values 300). The PL warnings are gone, but the error that crashes the genotyping is still there (please see below).

I am not even sure where to start with that so please point me in the right direction :)

INFO 21:05:52,100 HelpFormatter - --------------------------------------------------------------------------------
INFO 21:05:52,104 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.7-0-gcfedb67, Compiled 2016/12/12 11:21:18
INFO 21:05:52,104 HelpFormatter - Copyright (c) 2010-2016 The Broad Institute
INFO 21:05:52,105 HelpFormatter - For support and documentation go to https://software.broadinstitute.org/gatk
INFO 21:05:52,105 HelpFormatter - [Fri Nov 10 21:05:52 MET 2017] Executing on Linux 4.4.37-00001-g8551896 amd64
INFO 21:05:52,106 HelpFormatter - Java HotSpot(TM) 64-Bit Server VM 1.8.0_91-b14
INFO 21:05:52,112 HelpFormatter - Program Args: -T GenotypeGVCFs -R /ebio/abt6_projects8/potato_var/data/reference/Stub_ChrChl_v4.fa -V gvcfs.list -o herb_stub.vcf -nt 24 --max_num_PL_values 300
INFO 21:05:52,121 HelpFormatter - Executing as rgutaker@burrito.eb.local on Linux 4.4.37-00001-g8551896 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_91-b14.
INFO 21:05:52,122 HelpFormatter - Date/Time: 2017/11/10 21:05:52
INFO 21:05:52,123 HelpFormatter - --------------------------------------------------------------------------------
INFO 21:05:52,123 HelpFormatter - --------------------------------------------------------------------------------
INFO 21:05:52,252 GenomeAnalysisEngine - Strictness is SILENT
INFO 21:05:52,401 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO 21:05:52,933 MicroScheduler - Running the GATK in parallel mode with 24 total threads, 1 CPU thread(s) for each of 24 data thread(s), of 64 processors available on this machine
INFO 21:05:53,133 GenomeAnalysisEngine - Preparing for traversal
INFO 21:05:53,146 GenomeAnalysisEngine - Done preparing for traversal
INFO 21:05:53,147 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 21:05:53,148 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 21:05:53,148 ProgressMeter - Location | sites | elapsed | sites | completed | runtime | runtime
WARN 21:05:53,700 StrandBiasTest - StrandBiasBySample annotation exists in input VCF header. Attempting to use StrandBiasBySample values to calculate strand bias annotation values. If no sample has the SB genotype annotation, annotation may still fail.
WARN 21:05:53,702 StrandBiasTest - StrandBiasBySample annotation exists in input VCF header. Attempting to use StrandBiasBySample values to calculate strand bias annotation values. If no sample has the SB genotype annotation, annotation may still fail.
INFO 21:05:53,702 GenotypeGVCFs - Notice that the -ploidy parameter is ignored in GenotypeGVCFs tool as this is automatically determined by the input variant files
WARN 21:05:55,015 HaplotypeScore - Annotation will not be calculated, must be called from UnifiedGenotyper, not org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs
INFO 21:06:23,161 ProgressMeter - ST4.03ch01:23021901 0.0 30.0 s 49.6 w 3.2% 15.7 m 15.2 m
WARN 21:06:26,298 ExactAFCalculator - This tool is currently set to genotype at most 6 alternate alleles in a given context, but the context at ST4.03ch01: 9011866 has 7 alternate alleles so only the top alleles will be used; see the --max_alternate_alleles argument. Unless the DEBUG logging level is used, this warning message is output just once per run and further warnings are suppressed.
INFO 21:06:53,197 ProgressMeter - ST4.03ch01:23114301 0.0 60.0 s 99.3 w 3.2% 31.4 m 30.4 m

ERROR --
ERROR stack trace

java.util.ConcurrentModificationException
at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
at java.util.LinkedList$ListItr.next(LinkedList.java:888)
at org.broadinstitute.gatk.tools.walkers.genotyper.GenotypingEngine.coveredByDeletion(GenotypingEngine.java:426)
at org.broadinstitute.gatk.tools.walkers.genotyper.GenotypingEngine.calculateOutputAlleleSubset(GenotypingEngine.java:387)
at org.broadinstitute.gatk.tools.walkers.genotyper.GenotypingEngine.calculateGenotypes(GenotypingEngine.java:251)
at org.broadinstitute.gatk.tools.walkers.genotyper.UnifiedGenotypingEngine.calculateGenotypes(UnifiedGenotypingEngine.java:392)
at org.broadinstitute.gatk.tools.walkers.genotyper.UnifiedGenotypingEngine.calculateGenotypes(UnifiedGenotypingEngine.java:375)
at org.broadinstitute.gatk.tools.walkers.genotyper.UnifiedGenotypingEngine.calculateGenotypes(UnifiedGenotypingEngine.java:330)
at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.regenotypeVC(GenotypeGVCFs.java:326)
at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.map(GenotypeGVCFs.java:304)
at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.map(GenotypeGVCFs.java:135)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:267)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:255)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
at org.broadinstitute.gatk.engine.executive.ShardTraverser.call(ShardTraverser.java:98)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.7-0-gcfedb67):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions https://software.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Code exception (see stack trace for error itself)
ERROR ------------------------------------------------------------------------------------------
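
For context on the PL limit mentioned above: a sample has one PL per possible unordered genotype, so for ploidy P and N alleles (the reference plus the alternates) the count is the binomial coefficient C(P + N - 1, N - 1). The Python sketch below only illustrates that arithmetic for the ploidies (2 to 5) and the cap of 6 alternate alleles that appear in this post. The function name `num_pl_values` is made up for this example and is not part of GATK.

```python
from math import comb


def num_pl_values(ploidy: int, num_alt_alleles: int) -> int:
    """Number of unordered genotypes (one PL each) for a given ploidy.

    Counts multisets of size `ploidy` drawn from the reference allele
    plus `num_alt_alleles` alternate alleles.
    """
    num_alleles = num_alt_alleles + 1  # include the reference allele
    return comb(ploidy + num_alleles - 1, num_alleles - 1)


if __name__ == "__main__":
    # Ploidies in this data set range from 2 to 5; the log above reports
    # a cap of 6 alternate alleles per site.
    for ploidy in (2, 3, 4, 5):
        for n_alt in (1, 2, 6):
            count = num_pl_values(ploidy, n_alt)
            print(f"ploidy={ploidy} alt_alleles={n_alt} PLs={count}")
```

By this formula a diploid sample at a site with 6 alternate alleles has 28 possible genotypes, while a pentaploid sample at the same site has 462, which is more than the 300 passed to --max_num_PL_values in the command above.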

Answers

  • grafal (MPI Tuebingen, Member)

    Hi Again,

    While waiting for a rescue mission, I re-made all the g.vcfs, this time pretending they are all diploid. While running GenotypeGVCFs, I got errors again, slightly different this time:

    ERROR --
    ERROR stack trace

    java.lang.NullPointerException
    at org.broadinstitute.gatk.tools.walkers.genotyper.GenotypingEngine.coveredByDeletion(GenotypingEngine.java:427)
    at org.broadinstitute.gatk.tools.walkers.genotyper.GenotypingEngine.calculateOutputAlleleSubset(GenotypingEngine.java:387)
    at org.broadinstitute.gatk.tools.walkers.genotyper.GenotypingEngine.calculateGenotypes(GenotypingEngine.java:251)
    at org.broadinstitute.gatk.tools.walkers.genotyper.UnifiedGenotypingEngine.calculateGenotypes(UnifiedGenotypingEngine.java:392)
    at org.broadinstitute.gatk.tools.walkers.genotyper.UnifiedGenotypingEngine.calculateGenotypes(UnifiedGenotypingEngine.java:375)
    at org.broadinstitute.gatk.tools.walkers.genotyper.UnifiedGenotypingEngine.calculateGenotypes(UnifiedGenotypingEngine.java:330)
    at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.regenotypeVC(GenotypeGVCFs.java:326)
    at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.map(GenotypeGVCFs.java:304)
    at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.map(GenotypeGVCFs.java:135)
    at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:267)
    at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:255)
    at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
    at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
    at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144)
    at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
    at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
    at org.broadinstitute.gatk.engine.executive.ShardTraverser.call(ShardTraverser.java:98)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 3.7-0-gcfedb67):
    ERROR
    ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
    ERROR If not, please post the error message, with stack trace, to the GATK forum.
    ERROR Visit our website and forum for extensive documentation and answers to

    Many thanks for your help.
    Rafal
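
The two traces in this thread differ in exception type but can be compared frame by frame. The Python sketch below is one way to pull the exception class and the first GATK frame out of a pasted Java trace. The helper name `summarize_trace` and the shortened trace strings are assumptions made for this example; the frames themselves are copied from the posts above.

```python
import re

# Matches Java stack frames such as "at pkg.Class.method(File.java:123)".
FRAME_RE = re.compile(r"^\s*at\s+([\w.$]+)\(([\w$]+\.java):(\d+)\)")

# Shortened copies of the two traces posted in this thread.
FIRST_TRACE = """java.util.ConcurrentModificationException
at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
at java.util.LinkedList$ListItr.next(LinkedList.java:888)
at org.broadinstitute.gatk.tools.walkers.genotyper.GenotypingEngine.coveredByDeletion(GenotypingEngine.java:426)
"""

SECOND_TRACE = """java.lang.NullPointerException
at org.broadinstitute.gatk.tools.walkers.genotyper.GenotypingEngine.coveredByDeletion(GenotypingEngine.java:427)
"""


def summarize_trace(trace, package_prefix="org.broadinstitute.gatk"):
    """Return (exception class, (method, line)) for the first frame under package_prefix."""
    exception = None
    for line in trace.splitlines():
        stripped = line.strip()
        if exception is None and stripped and not stripped.startswith("at "):
            # The first non-frame line names the exception; drop any ": message" suffix.
            exception = stripped.split(":", 1)[0]
            continue
        match = FRAME_RE.match(line)
        if match and match.group(1).startswith(package_prefix):
            method, _source_file, line_no = match.groups()
            return exception, (method, int(line_no))
    return exception, None


if __name__ == "__main__":
    for name, trace in (("first", FIRST_TRACE), ("second", SECOND_TRACE)):
        print(name, summarize_trace(trace))
```

Run on the traces above, both summaries point to GenotypingEngine.coveredByDeletion, at line 426 for the ConcurrentModificationException and line 427 for the NullPointerException.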
