Picard Sort Vcf Error

Hello.

I am using GATK version 3.6 and picard-2.8.2.jar.

I downloaded hapmap_3.3.hg38.vcf from the GATK resource bundle. I then used the command below to remove the chr notation.
awk '{gsub(/^chr/,""); print}' hapmap_3.3.hg38.vcf > no_chr_hapmap_3.3.hg38.vcf.vcf

Before (hapmap_3.3.hg38.vcf)
chr1 2242065 rs263526 T C . PASS AC=724;AF=0.259;AN=2792
chr1 2242417 rs16824926 C . . PASS AN=530
chr1 2242880 rs11581436 A . . PASS AN=540

After (no_chr_hapmap_3.3.hg38.vcf.vcf)
1 6421563 rs4908891 G A . PASS AC=1086;AF=0.389;AN=2792
1 6421782 rs4908892 A G . PASS AC=1692;AF=0.606;AN=2792
1 6421856 rs12078257 T C . PASS AC=368;AF=0.132;AN=2790

Then I used Picard SortVcf to sort no_chr_hapmap_3.3.hg38.vcf.vcf:
java -jar picard-2.8.2.jar SortVcf I=removedChr_HapMap.vcf O=sortedHapMap.vcf SEQUENCE_DICTIONARY=hg38.dict

hg38.dict
@SQ SN:1 LN:248956422 UR:file:/media/ubuntu/Elements/TOOL/hg38.fa M5:2648ae1bacce4ec4b6cf337dcae37816
@SQ SN:10 LN:133797422 UR:file:/media/ubuntu/Elements/TOOL/hg38.fa M5:907112d17fcb73bcab1ed1c72b97ce68
@SQ SN:11 LN:135086622 UR:file:/media/ubuntu/Elements/TOOL/hg38.fa M5:1511375dc2dd1b633af8cf439ae90cec
@SQ SN:12 LN:133275309 UR:file:/media/ubuntu/Elements/TOOL/hg38.fa M5:e81e16d3f44337034695a29b97708fce

I then encountered this error:

Exception in thread "main" java.lang.IllegalArgumentException: java.lang.AssertionError: SAM dictionaries are not the same: SAMSequenceRecord(name=chr1,length=248956422,dict_index=0,assembly=20) was found when SAMSequenceRecord(name=1,length=248956422,dict_index=0,assembly=null) was expected.
at picard.vcf.SortVcf.collectFileReadersAndHeaders(SortVcf.java:126)
at picard.vcf.SortVcf.doWork(SortVcf.java:95)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:205)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:94)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:104)
Caused by: java.lang.AssertionError: SAM dictionaries are not the same: SAMSequenceRecord(name=chr1,length=248956422,dict_index=0,assembly=20) was found when SAMSequenceRecord(name=1,length=248956422,dict_index=0,assembly=null) was expected.
at htsjdk.samtools.SAMSequenceDictionary.assertSameDictionary(SAMSequenceDictionary.java:170)
at picard.vcf.SortVcf.collectFileReadersAndHeaders(SortVcf.java:124)
... 4 more

I have tried many times but still get the same error. Please advise how I can solve this problem.

I would then like to run SelectVariants to extract variants that are missing from HapMap but present in my dataset.

Thank you so much in advance.

Cheers,
Moon

Answers

  • Geraldine_VdAuwera · Cambridge, MA · Member, Administrator, Broadie
    Did you update the index file of your vcf after editing it to remove chr? If not, please do so and see if the error persists.
  • ymoon · Malaysia · Member

    @Geraldine_VdAuwera Good day, and thank you very much for the suggestion.

    I used IGVtools to index removedChr_HapMap.vcf, which generated a new vcf.idx. However, the error persists.

    Exception in thread "main" java.lang.IllegalArgumentException: java.lang.AssertionError: SAM dictionaries are not the same: SAMSequenceRecord(name=chr1,length=248956422,dict_index=0,assembly=20) was found when SAMSequenceRecord(name=1,length=248956422,dict_index=0,assembly=null) was expected.
    at picard.vcf.SortVcf.collectFileReadersAndHeaders(SortVcf.java:126)
    at picard.vcf.SortVcf.doWork(SortVcf.java:95)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:205)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:94)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:104)
    Caused by: java.lang.AssertionError: SAM dictionaries are not the same: SAMSequenceRecord(name=chr1,length=248956422,dict_index=0,assembly=20) was found when SAMSequenceRecord(name=1,length=248956422,dict_index=0,assembly=null) was expected.
    at htsjdk.samtools.SAMSequenceDictionary.assertSameDictionary(SAMSequenceDictionary.java:170)
    at picard.vcf.SortVcf.collectFileReadersAndHeaders(SortVcf.java:124)
    ... 4 more
    Thank you.

  • Geraldine_VdAuwera · Cambridge, MA · Member, Administrator, Broadie

    Ah, the assembly property seems to be different. You'll need to edit it to match (replace null with 20 in the header).
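
    A note on editing the header: the error message also shows the contig name differing (chr1 found, 1 expected). The awk command in the original post only rewrites text at the start of a line, so header lines that begin with `##contig=<ID=chr...` would be left unchanged. The following is a minimal sketch, not a confirmed fix, that strips the prefix in both places. It assumes the header declares contigs in `##contig=<ID=chr...>` form; `strip_chr` is a hypothetical helper name and the file names are placeholders.

```shell
# Sketch: strip the "chr" prefix from header contig lines and from data records.
# strip_chr is a hypothetical helper name, not part of GATK or Picard.
strip_chr() {
  awk '
    /^##contig=<ID=chr/ { sub(/ID=chr/, "ID="); print; next }  # contig declarations in the header
    /^#/                { print; next }                         # every other header line is left alone
                        { sub(/^chr/, ""); print }              # data records: CHROM is the first column
  ' "$1"
}

# Usage (placeholder file names):
# strip_chr hapmap_3.3.hg38.vcf > no_chr_hapmap_3.3.hg38.vcf
```

    The renamed contigs still have to match the names in the .dict file exactly, and any existing index for the VCF would need to be regenerated after the edit.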

  • rturba · Member
    edited February 13

    Hello there! I am also getting a similar error, though I don't know how to fix it. I got the VCF file from a collaborator and I'm running with a reference that I downloaded and indexed, but they are supposed to be exactly the same reference.

        Exception in thread "main" java.lang.IllegalArgumentException: java.lang.AssertionError: SAM dictionaries are not the same: SAMSequenceRecord(name=chrUn,length=62550211,dict_index=5,assembly=null) was found when SAMSequenceRecord(name=chrM,length=15742,dict_index=5,assembly=null) was expected.
                at picard.vcf.SortVcf.collectFileReadersAndHeaders(SortVcf.java:127)
                at picard.vcf.SortVcf.doWork(SortVcf.java:96)
                at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:268)
                at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:98)
                at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:108)
        Caused by: java.lang.AssertionError: SAM dictionaries are not the same: SAMSequenceRecord(name=chrUn,length=62550211,dict_index=5,assembly=null) was found when SAMSequenceRecord(name=chrM,length=15742,dict_index=5,assembly=null) was expected.
                at htsjdk.samtools.SAMSequenceDictionary.assertSameDictionary(SAMSequenceDictionary.java:169)
                at picard.vcf.SortVcf.collectFileReadersAndHeaders(SortVcf.java:125)
                ... 4 more
    

    I mean, isn't the purpose of SortVcf to reorganize the VCF file according to the reference? Why is it complaining that my dictionaries are different? (I apologize in advance for the very newbie question.)

    I was first trying to run SelectVariants on a vcf.gz file, and I got the following error:

        IndexDictionaryUtils - Track variant doesn't have a sequence dictionary built in, skipping dictionary validation!
    

    I decided to run again with an uncompressed VCF file to see if I still got the same error message, but I got a different warning:

            Input files /u/flashscratch/flashscratch2/r/rturba/stickleback/sorel_data/first12/./stickleback12.filtered.vcf and reference have incompatible contigs. Please see https://software.broadinstitute.org/gatk/documentation/article?id=63 for more information. Error details: The contig order in /u/flashscratch/flashscratch2/r/rturba/stickleback/sorel_data/first12/./stickleback12.filtered.vcf and reference is not the same; to fix this please see: (https://www.broadinstitute.org/gatk/guide/article?id=1328),  which describes reordering contigs in BAM and VCF files..
            ##### ERROR   /u/flashscratch/flashscratch2/r/rturba/stickleback/sorel_data/first12/./stickleback12.filtered.vcf contigs = [chrI, chrII, chrIII, chrIV, chrIX, chrUn, chrV, chrVI, chrVII, chrVIII, chrX, chrXI, chrXII, chrXIII, chrXIV, chrXIX, chrXV, chrXVI, chrXVII, chrXVIII, chrXX, chrXXI, chrM]
            ##### ERROR   reference contigs = [chrI, chrII, chrIII, chrIV, chrIX, chrM, chrUn, chrV, chrVI, chrVII, chrVIII, chrX, chrXI, chrXII, chrXIII, chrXIV, chrXIX, chrXV, chrXVI, chrXVII, chrXVIII, chrXX, chrXXI]
    

    Sorry this is so long, but I was trying to contextualize the problem, to see if maybe it was something I did wrong in the previous steps.
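
    The contig lists in the error above differ only in where chrM sits. A quick way to see such differences before running SortVcf or SelectVariants is to list the contig names from the VCF header and from the reference .dict file, in order, and diff them. This is a rough bash sketch under two assumptions: the VCF header contains `##contig=<ID=...>` lines, and the .dict file has the usual `@SQ` records with an `SN:` field. `vcf_contigs` and `dict_contigs` are hypothetical helper names, and `reference.dict` is a placeholder.

```shell
# Sketch (bash): list contig names in header order so the two orderings can be compared.
# vcf_contigs and dict_contigs are hypothetical helper names.
vcf_contigs() {
  # Pull the ID out of each ##contig=<ID=...> header line.
  grep '^##contig=' "$1" | sed -E 's/^##contig=<ID=([^,>]+).*/\1/'
}

dict_contigs() {
  # Print the SN: value of each @SQ record in the .dict file.
  awk '/^@SQ/ { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4) }' "$1"
}

# Usage (placeholder file names); diff prints nothing when names and order agree:
# diff <(vcf_contigs stickleback12.filtered.vcf) <(dict_contigs reference.dict)
```

    If the two lists differ, the VCF header's contig lines need to be brought in line with the reference dictionary before sorting. Picard's UpdateVcfSequenceDictionary tool may be one way to do that, though whether it resolves this particular case is not confirmed in this thread.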

  • Sheila · Broad Institute · Member, Broadie, Moderator

    @rturba
    Hi,

    Are you using the latest version of Picard? Can you also try deleting the VCF index and re-generating it?

    Thanks,
    Sheila

  • I'm using Picard v2.13.2. I tried creating a new index in IGV and running the uncompressed version of the file as well, and I still get the same type of error.

  • I downloaded the newer version and I'm still getting the same error :(
