Holiday Notice:
The Frontline Support team will be slow to respond December 17-18 due to an institute-wide retreat and offline December 22- January 1, while the institute is closed. Thank you for your patience during these next few weeks. Happy Holidays!

VariantRecalibrator - Can't do comparison because Locatables' contigs not found in sequence dict

andresguarahinoandresguarahino Member
edited November 15 in Ask the GATK team

Hi,
sorry if the question isn't precise enough.

Here's the command:

tools/gatk-4.0.11.0/gatk VariantRecalibrator -R data/humanRefGenome/hs37d5.fa \
--variant data/mongolian/Mongolian_genome_sorted_noChr2.snp.vcf \
-O data/mongolian/mongolian.recal \
-tranches-file data/mongolian/mongolian.tranches \
-rscript-file data/mongolian/mongolian_plots.R \
--resource hapmap,known=false,training=true,truth=true,prior=15.0:data/mongolian/hapmap_3.3.hg19.sites_noChr2.vcf.gz \
--resource omni,known=false,training=true,truth=true,prior=12.0:data/mongolian/1000G_omni2.5.hg19.sites_noChr2.vcf.gz \
--resource 1000G,known=false,training=true,truth=false,prior=10.0:data/mongolian/1000G_phase1.snps.high_confidence.hg19.sites_noChr2.vcf.gz \
--resource dbsnp,known=true,training=false,truth=false,prior=2.0:data/mongolian/dbsnp_138.hg19_noChr2.vcf.gz \
-an QD -an MQ -an MQRankSum -an ReadPosRankSum -an FS -an DP \
-tranche 90.0 -tranche 99.0 -tranche 99.9 -tranche 100.0 \
-mode SNP

and that's the error:

21:10:45.445 INFO  VariantRecalibrator - Writing out recalibration table...
21:10:45.459 INFO  VariantRecalibrator - Shutting down engine
[November 15, 2018 9:10:45 PM CET] org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator done. Elapsed time: 33.89 minutes.
Runtime.totalMemory()=2090336256
java.lang.IllegalArgumentException: Can't do comparison because Locatables' contigs not found in sequence dictionary
    at org.broadinstitute.hellbender.utils.IntervalUtils.compareContigs(IntervalUtils.java:149)
    at org.broadinstitute.hellbender.utils.IntervalUtils.compareLocatables(IntervalUtils.java:85)
    at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantDatum.lambda$getComparator$2(VariantDatum.java:49)
    at java.util.TimSort.binarySort(TimSort.java:296)
    at java.util.TimSort.sort(TimSort.java:239)
    at java.util.Arrays.sort(Arrays.java:1512)
    at java.util.ArrayList.sort(ArrayList.java:1462)
    at java.util.Collections.sort(Collections.java:175)
    at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantDataManager.writeOutRecalibrationTable(VariantDataManager.java:456)
    at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator.onTraversalSuccess(VariantRecalibrator.java:695)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:968)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
    at org.broadinstitute.hellbender.Main.main(Main.java:289)

The output file (mongolian.recal) doesn't have any variant:
...
##contig=<ID=6,length=171115067,assembly=hs37d5.fa>
##contig=<ID=7,length=159138663,assembly=hs37d5.fa>
##contig=<ID=8,length=146364022,assembly=hs37d5.fa>
##contig=<ID=9,length=141213431,assembly=hs37d5.fa>
##contig=<ID=M,length=16571,assembly=hs37d5.fa>
##contig=<ID=X,length=155270560,assembly=hs37d5.fa>
##contig=<ID=Y,length=59373566,assembly=hs37d5.fa>
##reference=hs37d5
##source=VariantRecalibrator
#CHROM POS ID REF ALT QUAL FILTER INFO
END OF FILE (there isn't other rows).

What am I doing wrong?

Answers

  • shleeshlee CambridgeMember, Broadie, Moderator admin

    Hi @andresguarahino,

    My guess is that there is a mismatch between the contigs in the reference dictionary and contigs in the data. The mismatch could be in the presence/absence and/or the order in which the contigs are listed. For starters, you can try to regenerate your reference dictionary anew, with CreateSequenceDictionary. The next thing to try is to sort your data against the dictionary with SortVcf. Be sure to provide the tool the correct --SEQUENCE_DICTIONARY, otherwise, it will sort based on contig information in the header.

  • Thank you.

    I've created the dictionary and I've tried this command:

    java -jar tools/picard.jar SortVcf I=data/mongolian/Mongolian_genome_sorted_noChr2.snp.vcf O=data/mongolian/Mongolian_genome_sorted_noChr2_SortVcf.vcf SEQUENCE_DICTIONARY=data/humanRefGenome/hs37d5.dict

    which give me this error:

    Exception in thread "main" java.lang.IllegalArgumentException: java.lang.AssertionError: SAM dictionaries are not the same: SAMSequenceRecord(name=10,length=135534747,dict_index=1,assembly=hg19) was found when SAMSequenceRecord(name=2,length=243199373,dict_index=1,assembly=null) was expected.
        at picard.vcf.SortVcf.collectFileReadersAndHeaders(SortVcf.java:127)
        at picard.vcf.SortVcf.doWork(SortVcf.java:96)
        at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:295)
        at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
        at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)
    Caused by: java.lang.AssertionError: SAM dictionaries are not the same: SAMSequenceRecord(name=10,length=135534747,dict_index=1,assembly=hg19) was found when SAMSequenceRecord(name=2,length=243199373,dict_index=1,assembly=null) was expected.
        at htsjdk.samtools.SAMSequenceDictionary.assertSameDictionary(SAMSequenceDictionary.java:169)
        at picard.vcf.SortVcf.collectFileReadersAndHeaders(SortVcf.java:125)
        ... 4 more
    

    Then I can't sort the VCF files. Am I missing something?

  • shleeshlee CambridgeMember, Broadie, Moderator admin

    It appears @andresguarahino, that your data is based on a different reference than the one you are providing. You should use a reference that matches your data.

  • AdelaideRAdelaideR Unconfirmed, Member, Broadie, Moderator admin

    @andresguarahino How did this solution work out for you? This ticket will now be closed but as always, feel free to reach out again. You can respond here with follow-up questions to this issue.

Sign In or Register to comment.