
Adding readgroups with Picard

Hi,

I am trying to run Picard to add read groups to my BAM file. I ran the command below:

java -jar picard.jar AddOrReplaceReadGroups INPUT=inputfile.bam OUTPUT=outputfile.bam RGID=H0164.2 RGLB=library1 RGPL=illumina RGPU=H0164ALXX140820.2 RGSM=sample1

I get the following output, which ends in an error. Please help.

14:19:49.732 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/usr/local/softw/picard.jar!/com/intel/gkl/native/libgkl_compression.so
[Mon Oct 01 14:19:49 EET 2018] AddOrReplaceReadGroups INPUT=LBK_Lipatov2014_DS2X.bam OUTPUT=LBK_Lipatov2014_DS2X_RG.bam RGID=H0164.2 RGLB=library1 RGPL=illumina RGPU=H0164ALXX140820.2 RGSM=sample1 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Mon Oct 01 14:19:49 EET 2018] Executing as [email protected] on Linux 4.4.0-124-generic amd64; OpenJDK 64-Bit Server VM 1.8.0_181-8u181-b13-0ubuntu0.16.04.1-b13; Deflater: Intel; Inflater: Intel; Picard version: 2.13.2-SNAPSHOT
INFO 2018-10-01 14:19:50 AddOrReplaceReadGroups Created read group ID=H0164.2 PL=illumina LB=library1 SM=sample1

[Mon Oct 01 14:19:50 EET 2018] picard.sam.AddOrReplaceReadGroups done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=2058354688
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing text SAM file. Empty sequence dictionary.; File LBK_Lipatov2014_DS2X.bam; Line 1
Line: HISEQ:143:BD1HWMACXX:4:1104:8951:18415 16 1 9987 37 5M3I3M1D76M * 0 0 ACGTGTGCTCTTCCGATCACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACC 4Y%%CBCFCFCCAFDBA>=HHHCGAHJGDB9IIIGICKJH?D7FIIFIFGIDEEAIJJBIEIKJEHELHKHIJIGJGF?EEEEC>?? X0:i:1 X1:i:0 FF:i:3 RG:Z:LBK XG:i:2 XM:i:3 XN:i:14 XO:i:2 XT:A:N NM:i:18 MD:Z:0N0N0N0N0N0N0N0N0^N0N0N0N0N0N1A69
at htsjdk.samtools.SAMLineParser.reportErrorParsingLine(SAMLineParser.java:457)
at htsjdk.samtools.SAMLineParser.parseLine(SAMLineParser.java:355)
at htsjdk.samtools.SAMTextReader$RecordIterator.parseLine(SAMTextReader.java:268)
at htsjdk.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:255)
at htsjdk.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:228)
at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:576)
at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:548)
at picard.sam.AddOrReplaceReadGroups.doWork(AddOrReplaceReadGroups.java:149)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:268)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:98)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:108)

Answers

  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)

    Hi @picarduser999

    This error appears because the SAM/BAM file is missing its header lines. Header lines start with '@', and Picard tools need them to include the sequence dictionary (the @SQ lines, which list the reference sequences). You can fix this with CreateSequenceDictionary and cat. The commands below use the single picard.jar from your original command; older Picard releases shipped each tool as its own jar (for example CreateSequenceDictionary.jar).

    1. Create the dictionary, say, dict.sam
      java -jar picard.jar CreateSequenceDictionary OUTPUT=dict.sam R=ref.fa

    2. Create a new file (unsorted_file.sam) that has both the dictionary and the aligned reads.
      cat dict.sam > unsorted_file.sam && cat file.sam >> unsorted_file.sam

    3. Sort the SAM file
      java -jar picard.jar SortSam INPUT=unsorted_file.sam OUTPUT=sorted_file.sam SO=coordinate
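
Step 2 above can be sketched as a small shell helper that refuses to prepend the dictionary when the reads file already carries @SQ lines, so the header is not duplicated by accident. This is only a sketch: it assumes file.sam is a plain-text SAM file, and the function names has_sequence_dictionary and prepend_dictionary are hypothetical, not part of Picard.

```shell
#!/bin/sh
# Sketch only: function names and file names are illustrative placeholders.

# Succeeds if the text SAM file contains at least one @SQ header line
# (the tag "@SQ" at the start of a line, followed by a tab).
has_sequence_dictionary() {
    grep -q "$(printf '^@SQ\t')" "$1"
}

# Writes the dictionary followed by the reads to a new file.
# Arguments: dictionary file, headerless reads file, output file.
prepend_dictionary() {
    dict=$1
    reads=$2
    out=$3
    if has_sequence_dictionary "$reads"; then
        echo "$reads already has @SQ lines; not prepending" >&2
        return 1
    fi
    cat "$dict" "$reads" > "$out"
}

# Example call, using the file names from the steps above:
#   prepend_dictionary dict.sam file.sam unsorted_file.sam
```

The result is the same file that the cat command in step 2 produces; the only addition is the check for an existing sequence dictionary.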

    Please let me know if this helps.

    Regards
    Bhanu

  • picarduser999 (Member)
    edited October 2018
    2. Create a new file (unsorted_file.sam) that has both the dictionary and the aligned reads.
      cat dict.sam > unsorted_file.sam && cat file.sam >> unsorted_file.sam

    Here, what is file.sam? Is it the file to which I am trying to add the header?

    But that would be a BAM file. When I try to convert it to SAM, it gives a "no header" error...

  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)

    Hi @picarduser999

    Yes, it is the file you are trying to add the header to.

    Would you please post the exact command you are using to convert BAM to SAM, along with the exact error message?
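
One thing that may be worth checking in the meantime: the trace in the question reports "Error parsing text SAM file" and passes through SAMTextReader, which suggests the .bam input is being read as plain text rather than as a compressed BAM. A minimal shell sketch of that check follows. It relies only on the fact that a real BAM file is BGZF (gzip-compatible) compressed and therefore begins with the bytes 1f 8b; the function name looks_gzip_compressed is hypothetical.

```shell
#!/bin/sh
# Sketch only: the function name is an illustrative placeholder.

# Succeeds if the file starts with the gzip magic bytes 1f 8b,
# which every BGZF-compressed BAM file does.
looks_gzip_compressed() {
    # Dump the first two bytes as hex and strip spaces and newlines.
    magic=$(od -An -tx1 -N 2 "$1" | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Example call, using the file name from the original command:
#   if looks_gzip_compressed inputfile.bam; then
#       echo "compressed (BAM-like)"
#   else
#       echo "not gzip-compressed (possibly text SAM with a .bam extension)"
#   fi
```

If the file turns out not to be compressed, it may already be a headerless text SAM file, in which case no BAM-to-SAM conversion would be needed before the cat step.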

    Regards
    Bhanu
