MergeBamAlignment help

Hi all,

I am attempting to go through the Dropseq pipeline, but I have changed a few things from the defaults. I aligned my fastq files to the hg38 genome rather than hg19, and I used TopHat rather than STAR to align. However, when I get to the MergeBamAlignment step, which merges the unaligned bam with the aligned bam to re-introduce the tags into the aligned bam file, I keep getting an error and I am unsure how to resolve it.

Both bam files are sorted by queryname, as the pipeline says to do, but I keep getting the following error (I've removed some of the chromosome names, since the full list contains all the contigs and would have been too long):

Exception in thread "main" java.lang.IllegalArgumentException: Do not use this function to merge dictionaries with different sequences in them. Sequences must be in the same order as well. Found [1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 20, 21, 22, 3, 4, 5, 6, 7, 8, 9, ...].
        at htsjdk.samtools.SAMSequenceDictionary.mergeDictionaries(SAMSequenceDictionary.java:305)
        at picard.sam.SamAlignmentMerger.getDictionaryForMergedBam(SamAlignmentMerger.java:197)
        at picard.sam.AbstractAlignmentMerger.mergeAlignment(AbstractAlignmentMerger.java:346)
        at picard.sam.SamAlignmentMerger.mergeAlignment(SamAlignmentMerger.java:181)
        at picard.sam.MergeBamAlignment.doWork(MergeBamAlignment.java:282)
        at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:205)
        at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:94)
        at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:104)

The picard command I'm using is:

picard MergeBamAlignment UNMAPPED_BAM=4571121.blue101/temp/unaligned_mc_tagged_polyA_filtered.bam ALIGNED_BAM=4571121.blue101/temp/aligned.sorted.bam OUTPUT=4565921.blue101/temp/merged.bam REFERENCE_SEQUENCE=/scratch/ea11g10/Dropseq/hg38.fasta PAIRED_RUN=false INCLUDE_SECONDARY_ALIGNMENTS=false CLIP_ADAPTERS=true IS_BISULFITE_SEQUENCE=false ALIGNED_READS_ONLY=false MAX_INSERTIONS_OR_DELETIONS=1 READ1_TRIM=0 READ2_TRIM=0 ALIGNER_PROPER_PAIR_FLAGS=false SORT_ORDER=coordinate PRIMARY_ALIGNMENT_STRATEGY=BestMapq CLIP_OVERLAPPING_READS=true ADD_MATE_CIGAR=true VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json

I created the dict file for the hg38.fasta file using picard CreateSequenceDictionary.

I am using picard version 2.8.3 and the java version is 1.8.9_51.

Any help would be appreciated. Thanks

Answers
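
The exception message says the dictionaries being merged must contain the same sequences in the same order, so one thing worth checking is whether the @SQ lines of the aligned bam header match those of the reference .dict file. Which two dictionaries Picard is comparing here is my reading of the stack trace, not something the post confirms. Below is a minimal Python sketch of that check, assuming both headers have already been dumped to text (for example with `samtools view -H` for the bam; the .dict file is itself a plain-text SAM header). The function names and the toy headers are hypothetical and only illustrate the comparison.

```python
def sequence_names(header_text):
    """Return the SN values of the @SQ lines, in the order they appear."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # SAM header fields are tab-separated TAG:VALUE pairs.
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
                break
    return names


def compare_dictionaries(first_header, second_header):
    """Summarise how the sequence lists of two SAM headers differ."""
    first = sequence_names(first_header)
    second = sequence_names(second_header)
    first_set, second_set = set(first), set(second)
    return {
        "same_order": first == second,
        "same_names": first_set == second_set,
        "only_in_first": [n for n in first if n not in second_set],
        "only_in_second": [n for n in second if n not in first_set],
    }


# Toy headers (made up for illustration): same contigs, different order.
aligned_header = (
    "@HD\tVN:1.5\tSO:queryname\n"
    "@SQ\tSN:1\tLN:100\n"
    "@SQ\tSN:10\tLN:50\n"
    "@SQ\tSN:2\tLN:80\n"
)
reference_header = (
    "@HD\tVN:1.5\n"
    "@SQ\tSN:1\tLN:100\n"
    "@SQ\tSN:2\tLN:80\n"
    "@SQ\tSN:10\tLN:50\n"
)

report = compare_dictionaries(aligned_header, reference_header)
print(report)
```

If `same_names` is false, the two files probably use different contig naming or different contig sets; if only `same_order` is false, the contigs match but are listed in a different order, which by the wording of the error is enough to make the merge fail.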
