
How do I fix the error "Sequence dictionaries are not the same size (6671, 242)"?

I am using Picard 2.18.7-1-gb02e42e-SNAPSHOT with Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode).

I ran CreateSequenceDictionary on my genome .fasta file:

java -Xmx2g -jar picard.jar \
CreateSequenceDictionary \
R=.fasta \
O=.dict

This seemed to work and produced a .dict file. However, when I run CollectMultipleMetrics to get metrics on my RNA-seq alignments produced by STAR 2.5.0c:

java -Xmx2g -jar picard.jar \
CollectMultipleMetrics \
R=.fasta \
I=.bam \
O= \
PROGRAM=null \
PROGRAM=CollectAlignmentSummaryMetrics \
PROGRAM=QualityScoreDistribution \
PROGRAM=CollectGcBiasMetrics \
PROGRAM=MeanQualityByCycle \
PROGRAM=CollectInsertSizeMetrics

The run fails with an exception: "Exception in thread "main" htsjdk.samtools.util.SequenceUtil$SequenceListsDifferException: Sequence dictionaries are not the same size (6671, 242)"
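The message reports that two sequence dictionaries being compared have different numbers of entries, presumably the one in the BAM header and the one belonging to the reference. One way to narrow this down is to count and compare the @SQ lines on both sides. The following is only a rough sketch, not a fix: the function names sq_names, sq_count and compare_sq are made up for illustration, and it assumes both inputs are plain-text files containing SAM-style @SQ header lines.

```shell
# Sketch (hypothetical helper names): compare the @SQ entries of a
# SAM-format header file with those of a Picard .dict file.
# Both are plain text, with one tab-separated @SQ line per sequence.

# Print the sorted sequence names (SN: field) found on @SQ lines.
sq_names() {
    grep '^@SQ' "$1" | tr '\t' '\n' | grep '^SN:' | cut -c4- | LC_ALL=C sort
}

# Print the number of @SQ lines in a file.
sq_count() {
    grep -c '^@SQ' "$1"
}

# Report both counts and the names that appear on one side only.
# $1 = header text file, $2 = .dict file
compare_sq() {
    header_names=$(mktemp)
    dict_names=$(mktemp)
    sq_names "$1" > "$header_names"
    sq_names "$2" > "$dict_names"

    echo "header: $(sq_count "$1") sequences"
    echo "dict:   $(sq_count "$2") sequences"
    echo "only in header:"
    LC_ALL=C comm -23 "$header_names" "$dict_names"
    echo "only in dict:"
    LC_ALL=C comm -13 "$header_names" "$dict_names"

    rm -f "$header_names" "$dict_names"
}
```

Assuming samtools is installed, the BAM header could be written to a text file with something like `samtools view -H aligned.bam > header.sam`, followed by `compare_sq header.sam genome.dict` (aligned.bam and genome.dict are placeholder names). If the counts differ, the BAM was most likely aligned against a different set of sequences than the .fasta passed as R=.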

I'd love any help on understanding the problem and how to fix it. Let me know if I can provide any other useful information.


Complete CollectMultipleMetrics output:

14:00:31.203 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file://sw/picard/build/libs/picard.jar!/com/intel/gkl/native/libgkl_compression.so
[Sun Jul 01 14:00:31 GMT-05:00 2018] CollectMultipleMetrics INPUT=.bam OUTPUT= PROGRAM=[CollectAlignmentSummaryMetrics, QualityScoreDistribution, CollectGcBiasMetrics, MeanQualityByCycle, CollectInsertSizeMetrics] REFERENCE_SEQUENCE=.fasta ASSUME_SORTED=true STOP_AFTER=0 METRIC_ACCUMULATION_LEVEL=[ALL_READS] INCLUDE_UNPAIRED=false VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Sun Jul 01 14:00:31 GMT-05:00 2018] Executing as on Linux 3.10.0-693.21.1.el7.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_45-b14; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.18.7-1-gb02e42e-SNAPSHOT
[Sun Jul 01 14:00:31 GMT-05:00 2018] picard.analysis.CollectMultipleMetrics done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=2058354688
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" htsjdk.samtools.util.SequenceUtil$SequenceListsDifferException: Sequence dictionaries are not the same size (6671, 242)
at htsjdk.samtools.util.SequenceUtil.assertSequenceListsEqual(SequenceUtil.java:237)
at htsjdk.samtools.util.SequenceUtil.assertSequenceDictionariesEqual(SequenceUtil.java:320)
at htsjdk.samtools.util.SequenceUtil.assertSequenceDictionariesEqual(SequenceUtil.java:306)
at picard.analysis.SinglePassSamProgram.makeItSo(SinglePassSamProgram.java:107)
at picard.analysis.CollectMultipleMetrics.doWork(CollectMultipleMetrics.java:426)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:282)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)

