ERROR MESSAGE: java.lang.Integer cannot be cast to java.lang.Double

Hi,

After running 5 exomes through GATK v3.3 HaplotypeCaller, I saw a very low Ti/Tv ratio in my samples (~2.1), according to the VariantEval report. I tried running VariantFiltration on these samples, but the Ti/Tv ratio did not improve and nothing appeared to be filtered out. I therefore filtered the files with bcftools, after which the Ti/Tv ratio improved to 2.5. When I then ran GenotypeGVCFs on the bcftools-filtered files, I got the following error:

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Double
at java.lang.Double.compareTo(Double.java:49)
at java.util.ComparableTimSort.countRunAndMakeAscending(ComparableTimSort.java:290)
at java.util.ComparableTimSort.sort(ComparableTimSort.java:157)
at java.util.ComparableTimSort.sort(ComparableTimSort.java:146)
at java.util.Arrays.sort(Arrays.java:472)
at java.util.Collections.sort(Collections.java:155)
at org.broadinstitute.gatk.utils.MathUtils.median(MathUtils.java:999)
at org.broadinstitute.gatk.tools.walkers.variantutils.ReferenceConfidenceVariantContextMerger.combineAnnotationValues(ReferenceConfidenceVariantContextMerger.java:73)
at org.broadinstitute.gatk.tools.walkers.variantutils.ReferenceConfidenceVariantContextMerger.merge(ReferenceConfidenceVariantContextMerger.java:158)
at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.map(GenotypeGVCFs.java:202)
at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.map(GenotypeGVCFs.java:121)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:267)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:255)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:99)
at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:310)
at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:121)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:106)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version nightly-2014-11-17-g58cfab1):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: java.lang.Integer cannot be cast to java.lang.Double
ERROR ------------------------------------------------------------------------------------------

Any advice on solving this issue would be much appreciated.

Victoria

Answers

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie, admin)

    Hi Victoria,

    Unfortunately the GVCF tools currently don't work on BCFs. Are you running directly on BCFs or were the files converted?

  • nancySEE (Malaysia; Member)

    Hi Geraldine, I've run into the same problem.
    However, my GVCF files were generated by HaplotypeCaller and then filtered on DP using bcftools (because neither SelectVariants nor VariantFiltration worked well for this). I get the same error when I try to jointly genotype the filtered GVCF files with GenotypeGVCFs.

  • vifehe (Spain; Member)

    Hi @Geraldine_VdAuwera,
    To answer your question:

    "Unfortunately the GVCF tools currently don't work on BCFs. Are you running directly on BCFs or were the files converted?"

    I used the bcftools option that writes VCF output, and I still got this error. Is there any tool in GATK to make sure VCF files are correctly formatted?

    Thanks

  • vifehe (Spain; Member)

    Hi again,

    So I tried GATK's VariantsToVCF tool, with the following command:

    java -Xmx2g -jar $GATK-v3.3 -T VariantsToVCF -R $hg19 -o raw_variants_sample1_bcftoolfiltered-converted.vcf --variant raw_variants_sample1_bcftoolfiltered.vcf --dbsnp $dbsnp_141

    I then tried GenotypeGVCFs on these new VCF files, but again I got the same error:

    ERROR MESSAGE: java.lang.Integer cannot be cast to java.lang.Double

    Any advice?

    Thanks

    Victoria

  • vifehe (Spain; Member)

    Hi @Geraldine_VdAuwera,

    Thanks for your answer; it explains why VariantEval and VQSR worked on these files touched by bcftools but GenotypeGVCFs did not. However, do you think this issue with bcftools-processed files will be solved in an upcoming GATK release?
    I know I can filter the resulting file after GenotypeGVCFs, but I'll soon be combining my data with a colleague's, and we wanted to feed GenotypeGVCFs good-quality raw_variants_*.g.vcf files. So far bcftools is the only good way I've found to do that.

    Thanks

    V

  • vifehe (Spain; Member)

    Hi @Geraldine_VdAuwera,

    "If you really want to do some pre-filtering, maybe I can help you get it done without bcftools. What kind of filtering are you trying to do exactly?"

    I'm basically trying to filter out variants with GQ<50 and DP<6. I tried VariantFiltration, but apart from marking some variants with the corresponding filter, I didn't manage to create a new file containing only the variants that pass this quality threshold. Now I'm reading about SelectVariants, which I'll try next. Am I on the right track?

    Thanks

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie, admin)

    Generally speaking, that's right, SelectVariants can subset the passing variants. That's how those tools are intended to work; you tag variants with various filters using VariantFiltration, then you subset based on those tags (or PASS) with SelectVariants. Sorry if that's not clear in the docs, we'll try to improve that in future.

    That said, I wouldn't recommend ever filtering variants based on sample GQ. The raw GQs coming out of the caller are not always very accurate, hence the recently released genotype refinement workflow, for when you really care about genotyping accuracy (as in, determining if a variant shows up as het or hom-var in a particular sample). Anyway, GQ is not an indicator of the quality of the variant itself. For example, you can have a good quality variant (where you know that the sample is not hom-ref) for which you're not sure if it's het or hom-var. Great variant call, crappy GQ. So filtering on GQ might cause you to throw out good variants just because the sample genotype is not determined with certainty.

    But really, I would strongly recommend against filtering your GVCFs before merging. This is going to cause you to have missing records in your files, so once you genotype everything together you'll have no-calls for those samples if any of the other samples do have a variant at the same site. This completely undermines the purpose of the GVCF-based workflow. I understand why it would seem like a good idea, but please believe me, it's really not.
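
    The tag-then-subset pattern described above can be sketched in a few lines of Python working on plain VCF text lines. This only illustrates the logic and is not how VariantFiltration or SelectVariants are implemented; the function names, the "LowDP" filter name and the use of the site-level INFO DP value are assumptions made for the example. In line with the advice above, it is meant for a final genotyped VCF, not for GVCFs.

```python
def tag_low_depth(line, min_dp=6, filter_name="LowDP"):
    """Tag step: record filter_name in FILTER when INFO DP is below min_dp."""
    cols = line.rstrip("\n").split("\t")
    # INFO is a ';'-separated list of KEY=VALUE entries (flags have no '=').
    info = dict(entry.partition("=")[::2] for entry in cols[7].split(";"))
    dp = info.get("DP")
    if dp is not None and dp.isdigit() and int(dp) < min_dp:
        # Keep any filter that is already set and append ours.
        if cols[6] in (".", "PASS"):
            cols[6] = filter_name
        else:
            cols[6] = cols[6] + ";" + filter_name
    elif cols[6] == ".":
        # Record was evaluated and no filter failed.
        cols[6] = "PASS"
    return "\t".join(cols)


def select_passing(lines):
    """Subset step: keep header lines and records whose FILTER is PASS."""
    return [l for l in lines if l.startswith("#") or l.split("\t")[6] == "PASS"]
```

    The point of the two steps is that tagging never drops a record: the subset step is what produces a new file, and it can be rerun with different criteria without re-tagging.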

  • vifehe (Spain; Member)

    It's certainly true that the docs are a bit confusing, at least for the two tools we're discussing here, but your explanation makes their purpose much clearer. I'll pass the pre-filtering concern on to my colleagues.

    Thanks @Geraldine_VdAuwera for your time and dedication, it's certainly much appreciated.

  • pd3 (Member)

    @Geraldine_VdAuwera This thread was brought to my attention by a comment made here: https://github.com/samtools/htslib/pull/395#issuecomment-240126523. I want to clarify your statement above that "bcftools is changing type". That is not true. Although bcftools can output, say, "45" instead of "45.0", that is not a type change; both are valid floating-point expressions.
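
    To make the distinction concrete, here is a small Python sketch (an illustration only, not GATK or bcftools code). `infer_number` is a hypothetical reader that guesses the type from the text instead of using the Type declared in the VCF header; that GATK 3.x does something equivalent is an assumption suggested by the stack trace, not something confirmed in this thread.

```python
def infer_number(text):
    """Hypothetical reader: try an integer parse first, fall back to float."""
    try:
        return int(text)
    except ValueError:
        return float(text)


values = ["45", "45.0", "59.87"]

# Reading by the declared type (Type=Float): every spelling is a float.
as_declared = [float(v) for v in values]

# Guessing from the text: "45" becomes an int, the others floats.
as_inferred = [infer_number(v) for v in values]

print(as_declared)                               # [45.0, 45.0, 59.87]
print([type(v).__name__ for v in as_inferred])   # ['int', 'float', 'float']
```

    Python will still sort the mixed list, but in Java, sorting a list that mixes Integer and Double objects throws a ClassCastException like the one reported at the top of this thread.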

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie, admin)

    @pd3 Thanks for your comment. I'm afraid the JEXL interpreter we use to evaluate these expressions takes a stricter view on this point. That may be undesirable but I'm not sure there's anything we can do about it.

  • pd3 (Member)

    I just got bitten by this bug myself while using a 3.x version of GATK. Here is a script that works around it: https://github.com/samtools/bcftools/blob/develop/misc/fix-broken-GATK-Double-vs-Integer

    @Geraldine_VdAuwera I understand the technical cause, but that doesn't make it less wrong. There are solutions; for example, the exception could be caught and the value cast to the proper type.

    I understand this is not an issue in GATK4 anymore. Thanks for fixing it.
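
    As a rough idea of what a workaround of this kind can do, here is a minimal Python sketch. It is not the script linked above and makes no claim about how that script works. It assumes the affected values are INFO fields declared as Type=Float in the header, and it rewrites integer-looking values such as "45" as "45.0"; the function names are made up for the example, and FORMAT fields are not handled.

```python
import re

# Header lines such as: ##INFO=<ID=MQ,Number=1,Type=Float,Description="...">
FLOAT_INFO = re.compile(r"^##INFO=<ID=([^,]+),Number=[^,]+,Type=Float")
INTEGER_LIKE = re.compile(r"^-?\d+$")


def float_info_ids(lines):
    """Collect the IDs of INFO fields that the header declares as Type=Float."""
    return {m.group(1) for m in map(FLOAT_INFO.match, lines) if m}


def fix_record(line, float_ids):
    """Append '.0' to integer-looking values of Float-typed INFO fields."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 8:
        return line.rstrip("\n")  # not a full VCF record, leave untouched
    entries = []
    for entry in cols[7].split(";"):
        key, sep, value = entry.partition("=")
        if sep and key in float_ids:
            value = ",".join(
                v + ".0" if INTEGER_LIKE.match(v) else v
                for v in value.split(",")
            )
        entries.append(key + sep + value)
    cols[7] = ";".join(entries)
    return "\t".join(cols)


def fix_vcf(lines):
    """Return VCF lines with Float-typed INFO values written with a decimal point."""
    lines = [l.rstrip("\n") for l in lines]
    float_ids = float_info_ids(lines)
    return [l if l.startswith("#") else fix_record(l, float_ids) for l in lines]
```

    Passing the lines of an uncompressed VCF to `fix_vcf` returns the rewritten lines; header lines and Integer-typed fields such as DP are left as they are.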

  • @pd3 How can I use your GitHub code? I am getting the same error using GATK4:

    java.lang.ClassCastException: java.lang.Double cannot be cast to java.lang.Integer
    at java.lang.Integer.compareTo(Integer.java:52)
    at java.util.ComparableTimSort.countRunAndMakeAscending(ComparableTimSort.java:320)
    at java.util.ComparableTimSort.sort(ComparableTimSort.java:188)
    at java.util.Arrays.sort(Arrays.java:1312)
    at java.util.Arrays.sort(Arrays.java:1506)
    at java.util.ArrayList.sort(ArrayList.java:1454)
    at java.util.Collections.sort(Collections.java:141)

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @KG_Fun_Gen
    Hi,

    What is the command you ran and version of GATK4 you are using? Also, this thread may help.

    -Sheila
