
MergeBamAlignment fails; can I use AddOrReplaceReadGroups to debug?

con (Pennsylvania, Member)

Hello,

I've been running a Drop-seq experiment, and the second-to-last command fails:

java -Xmx4000m -jar 3rdParty/picard/picard.jar MergeBamAlignment REFERENCE_SEQUENCE=mm10/mm10.fasta UNMAPPED_BAM=unaligned_mc_tagged_polyA_filtered.bam ALIGNED_BAM=aligned.sorted.bam INCLUDE_SECONDARY_ALIGNMENTS=false PAIRED_RUN=false OUTPUT=merged.bam

which generates this not-so-helpful error message:

Exception in thread "main" java.lang.NullPointerException
at picard.sam.AbstractAlignmentMerger.createNewCigarIfMapsOffEndOfReference(AbstractAlignmentMerger.java:631)
at picard.sam.AbstractAlignmentMerger.createNewCigarsIfMapsOffEndOfReference(AbstractAlignmentMerger.java:654)
at picard.sam.AbstractAlignmentMerger.updateCigarForTrimmedOrClippedBases(AbstractAlignmentMerger.java:686)
at picard.sam.AbstractAlignmentMerger.transferAlignmentInfoToFragment(AbstractAlignmentMerger.java:514)
at picard.sam.AbstractAlignmentMerger.mergeAlignment(AbstractAlignmentMerger.java:410)
at picard.sam.SamAlignmentMerger.mergeAlignment(SamAlignmentMerger.java:138)
at picard.sam.MergeBamAlignment.doWork(MergeBamAlignment.java:248)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:206)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)

I thought I had solved this in http://gatkforums.broadinstitute.org/gatk/discussion/8627/picard-sam-mergebamalignment-fails, but I hadn't.

This command requires two files, which I've examined with Picard's ValidateSamFile.

For unaligned_mc_tagged_polyA_filtered.bam, the error is:

## HISTOGRAM    java.lang.String
Error Type    Count
ERROR:MISSING_PLATFORM_VALUE    1

For aligned.sorted.bam, the error message is:

## HISTOGRAM    java.lang.String
Error Type    Count
ERROR:MISSING_READ_GROUP    1
WARNING:MISSING_TAG_NM    11720805
WARNING:RECORD_MISSING_READ_GROUP    11720805

I think this can be fixed with Picard's AddOrReplaceReadGroups (http://broadinstitute.github.io/picard/command-line-overview.html#AddOrReplaceReadGroups), but that command requires several options whose values I don't know (the defaults didn't work), i.e.

  RGID=4 \
  RGLB=lib1 \
  RGPL=illumina \
  RGPU=unit1 \
  RGSM=20

Illumina is obvious, but I don't know what to put for the other options for a Drop-seq experiment. How can I find these values for Picard's AddOrReplaceReadGroups?

Answers

  • shlee (Cambridge, Member, Broadie ✭✭✭✭✭)

    Hi again @con,

    For a description of read group fields and their implications for GATK tools, see this document and its discussion thread. You only need to add read groups to your unaligned BAM. MergeBamAlignment will apply the read group information from the unaligned BAM to the merged output.
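
As a rough illustration of the advice above, the following sketch adds a read group to the unaligned BAM only. The input file name and jar path come from the question; the output file name and all five read group values (`dropseq_run1`, `lib1`, `unit1`, `sample1`) are placeholders assumed for illustration, not values prescribed by the thread, so substitute whatever describes your own run. The script prints the command and only executes it if the jar and the input BAM are actually present.

```shell
#!/bin/sh
# Sketch: add read group fields to the UNALIGNED BAM before MergeBamAlignment.
# All RG* values and the output file name are hypothetical placeholders.
set -eu

PICARD_JAR="3rdParty/picard/picard.jar"                    # path from the question
UNMAPPED_IN="unaligned_mc_tagged_polyA_filtered.bam"       # file from the question
UNMAPPED_OUT="unaligned_mc_tagged_polyA_filtered.rg.bam"   # assumed output name

RGID="dropseq_run1"   # read group ID: any string unique to this run/lane
RGLB="lib1"           # library: identifies the library prep
RGPL="illumina"       # sequencing platform
RGPU="unit1"          # platform unit: e.g. flowcell and lane
RGSM="sample1"        # sample name

# Paths here contain no spaces, so a plain string is safe to word-split below.
RG_CMD="java -Xmx4000m -jar $PICARD_JAR AddOrReplaceReadGroups INPUT=$UNMAPPED_IN OUTPUT=$UNMAPPED_OUT RGID=$RGID RGLB=$RGLB RGPL=$RGPL RGPU=$RGPU RGSM=$RGSM"

echo "$RG_CMD"

# Execute only when the jar and input BAM exist; otherwise this is a dry run.
if [ -f "$PICARD_JAR" ] && [ -f "$UNMAPPED_IN" ]; then
    $RG_CMD
fi
```

After this step, the original MergeBamAlignment command would be rerun with `UNMAPPED_BAM` pointing at the new file, and the result could be rechecked with ValidateSamFile. Whether this also resolves the NullPointerException is not confirmed in the thread.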
