

# Error running CollectAlignmentSummaryMetrics on a bam generated from .maf file

Member Posts: 5

Hello,
I recently ran an alignment with the LAST tool (http://last.cbrc.jp/, a FASTA aligner for long reads). It produces a .maf file, which I converted to SAM (with http://last.cbrc.jp/doc/maf-convert.html) and then to BAM (with Picard). Up to that point everything looked fine. Next I tried to run Picard CollectAlignmentSummaryMetrics, and it threw this error:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
at picard.analysis.AlignmentSummaryMetricsCollector$GroupAlignmentSummaryMetricsPerUnitMetricCollector$IndividualAlignmentSummaryMetricsCollector.collectQualityData(AlignmentSummaryMetricsCollector.java:329)
at picard.analysis.AlignmentSummaryMetricsCollector$GroupAlignmentSummaryMetricsPerUnitMetricCollector$IndividualAlignmentSummaryMetricsCollector.addRecord(AlignmentSummaryMetricsCollector.java:195)
at picard.analysis.AlignmentSummaryMetricsCollector$GroupAlignmentSummaryMetricsPerUnitMetricCollector.acceptRecord(AlignmentSummaryMetricsCollector.java:127)
at picard.analysis.AlignmentSummaryMetricsCollector$GroupAlignmentSummaryMetricsPerUnitMetricCollector.acceptRecord(AlignmentSummaryMetricsCollector.java:93)
at picard.metrics.MultiLevelCollectorAllReadsDistributor.acceptRecord(MultiLevelCollector.java:192)
at picard.metrics.MultiLevelCollector.acceptRecord(MultiLevelCollector.java:315)
at picard.analysis.AlignmentSummaryMetricsCollector.acceptRecord(AlignmentSummaryMetricsCollector.java:89)
at picard.analysis.CollectAlignmentSummaryMetrics.acceptRead(CollectAlignmentSummaryMetrics.java:147)
at picard.analysis.SinglePassSamProgram.makeItSo(SinglePassSamProgram.java:138)
at picard.analysis.SinglePassSamProgram.doWork(SinglePassSamProgram.java:77)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:208)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)

I am adding the head of the BAM file:

0034a196-edbc-429f-89c4-b5280a486760_Basecall_2D_2d 0 burn-in 1 100 4H21=1D18=.....1D17=2D1X6=6D1X11=2I1X31=19H * 0 0 GGGCGGCGACCTCGCGGGT.....AGCATGCCACG * NM:i:152 AS:i:10909
06c0ff36-09df-4bb3-b952-146fca6f60ae_Basecall_2D_2d 0 burn-in 1 100 8H21=1D3=......2D1=1I57=2D68=1D1=2D42=29H * 0 0 GGGCGGCGACCTCGCGGG...........GCAAGCGTGA * NM:i:402 AS:i:33419

I deleted values in the middle of the SEQ and CIGAR strings because they are very long.

Running ValidateSamFile on this BAM file shows no relevant problems:

## HISTOGRAM java.lang.String
Error Type Count
ERROR:MISSING_READ_GROUP 1
WARNING:RECORD_MISSING_READ_GROUP 2441

For the same sequencing run I also had FASTQ files, which I aligned with bwa. When I ran CollectAlignmentSummaryMetrics on the BAM file from that workflow, it worked fine.
Here is the head of the BAM from that workflow (alignment with bwa using the FASTQ):

0034a196-edbc-429f-89c4-b5280a486760_Basecall_2D_2d 0 burn-in 1 60 4S18M1D1....M6D32M19S * 0 0 TGCTGG...TGTTTGA /)6-,(-.../9/)0,*, MD:Z:18^T..A11G31 NM:i:138 AS:i:1920 XS:i:0
06c0ff36-09df-4bb3-b952-146fca6f60ae_Basecall_2D_2d 0 burn-in 1 60 8S18M1D1...D1M2D42M29S * 0 0 GTATTGC...ATGTGTTTC =.01-)**)./....'-.+*+ MD:Z:18^.^A1^AA42 NM:i:371 AS:i:5836 XS:i:0

As before, I removed the characters in the middle of the long strings.

I hope you can help me with this problem.
Thanks and have a great day.

## Answers

• Cambridge, MA Member, Administrator, Broadie Posts: 11,647 admin

The ArrayIndexOutOfBoundsException error suggests that you may have some malformed reads where the alignment information does not make sense, e.g. a read that maps off the end of a contig or something like that. That could be a bug in the aligner you're using. This seems especially likely considering the BWA alignment appears to be healthy.

Geraldine Van der Auwera, PhD

• Member Posts: 5 edited December 2016

Well, it seems that the only difference between the two BAM files (one from aligning the FASTQ and one from aligning the FASTA; two example records from each are posted in the first post) is that one file has Phred scores and the other has a single "*" in that place. I'm trying to figure out how to work with that, but if anybody has a suggestion I will try it.

BTW, I'm using the latest version of Picard (2.8.1).

• Cambridge Member, Broadie, Moderator Posts: 524 admin

Hi @SDFfASF,

If what you posted is indeed the top of the BAM file, then you are missing an actual header. Also, your error messages say that the BAM is missing read group information, which also indicates a missing header. To start, take a look at this FAQ to see what a BAM header should look like. To add such a header, you can use e.g. Picard's ReplaceSamHeader.
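The header check suggested above can be roughed out in Python. This is a minimal sketch that works on SAM text (not binary BAM); the function name `check_sam_header` and the message strings are hypothetical, and it only looks at the items mentioned in this thread (presence of a header, `@SQ` and `@RG` lines, and the `SM` tag on `@RG`). It is not a substitute for ValidateSamFile.

```python
from itertools import takewhile


def check_sam_header(lines):
    """Return a list of problems found in the header of a SAM text file.

    `lines` is any iterable of SAM lines; header lines are the leading
    lines that start with '@'.
    """
    header = [ln.rstrip("\n") for ln in takewhile(lambda l: l.startswith("@"), lines)]
    if not header:
        return ["no header lines found"]

    problems = []
    record_types = [ln.split("\t", 1)[0] for ln in header]
    if "@SQ" not in record_types:
        problems.append("no @SQ reference sequence line")

    rg_lines = [ln for ln in header if ln.split("\t", 1)[0] == "@RG"]
    if not rg_lines:
        problems.append("no @RG read group line")
    for ln in rg_lines:
        # Tags look like 'ID:1' or 'SM:sample'; keep only the two-letter keys.
        tags = {field.split(":", 1)[0] for field in ln.split("\t")[1:]}
        if "SM" not in tags:
            problems.append("@RG line missing SM tag: " + ln)
    return problems
```

A call such as `check_sam_header(open("test.sam"))` (file name taken from the commands later in the thread) would return an empty list when none of these checks fail.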
• Cambridge, MA Member, Administrator, Broadie Posts: 11,647 admin

I'm pretty sure the problem is caused by that asterisk. What program generated your alignments?

Geraldine Van der Auwera, PhD

• Member Posts: 5

@shlee It had a header; I just left it out of the post by mistake. I will read the FAQ for sure, thanks.

@Geraldine_VdAuwera I'm pretty sure too: when I put some Phred values in place of the asterisk, Picard worked fine. But I thought I saw somewhere in the documentation that Picard does not require quality scores in the BAM/SAM and can work with files where they are replaced with a "*". The tool was LAST; I linked to it in the first post.

#### Issue · Github

January 12 by Sheila
Issue Number 1625
State closed
Closed By vdauwera

• Cambridge Member, Broadie, Moderator Posts: 524 admin edited January 13

Thanks for the feedback. I examined your two sets of records carefully and noticed one interesting difference. The first set (which gives you problems with CollectAlignmentSummaryMetrics) uses extended CIGAR nomenclature (1D17=2D1X6=6D1X11=2I1X31=19H), while the second set (which works fine) does not (M6D32M19S). Would it be possible for you to attach a file of 100 such extended-CIGAR SAM records in a valid BAM file, i.e. with a header, so that we can test whether this is the problem or whether something else is causing the issue? Can you make sure this snippet still gives you the error before attaching it to this thread? Thanks.

• Member Posts: 5 edited January 15

Lately I have been using another tool whose output had a similar problem with Picard. It does not use extended CIGAR and still has the problem with the asterisk, so I will post some test file snippets from this tool's output. The attached files were originally .sam files, but I renamed them to .txt in order to upload them.

Important to note: Picard throws "Error parsing SAM header.
@RG line missing SM tag" with those files but I read that this SM tag is not essential for picard and u can ignore this error by adding "VALIDATION_STRINGENCY=SILENT" which I did. First file [testWithAsterisk.txt] - Original sam file which had asterisk instead of qscore: @HD VN:1.0 SO:unsorted @SQ SN:burn-in LN:48502 @RG ID:1 @PG ID:6 PN:minialign 8915e658-528c-4677-88a8-c2eba6c58fc5_Basecall_2D_2d 16 burn-in 41435 60 31M4D6M2I4M2D11M1D4M1D33M1I1M2D13M1D26M3D4M1D61M2D13M1D11M2D3M1D13M1D1M1D7M1D15M1D10M1D2M2D6M1I3M1D4M1D22M1D31M1D3M1D7M5D10M2D22M1I2M1D16M1I14M1I17M2D12M2I7M1D36M2I2M1D19M1D35M1D39M1D21M1D5M1I78M3D14M2I13M1D8M2D18M2D78M1I11M1I3M2D1M2D2M1I5M1I8M1I36M1D12M1D19M2D10M1I1M1I13M3D7M1D20M1D53M1I27M1I37M1I9M2D139M1I7M2D22M1I19M4D50M1D108M2D26M1I37M2I34M2I9M1I6M1D62M2I11M1I28M1I5M2D49M1D25M1D19M1I10M1D57M1D15M1D17M2I8M1D34M1D15M2D12M1D56M2D9M2I3M4D27M3D12M1D10M1D59M1D29M1D16M2D72M3D8M1D77M1D8M6S * 0 0 CCCTTCACCAAATACTGTGATGAATATATCAAGGGAAAATTACCACGTGGATTGCATCGAGCCGATAAACTGAAGCGGCTAAAGCCAAAGCACGAATCAGATATCTGAAGAACTGTCAGACTTTGAGAAGGATATCTCGCATGGTGGAAGCAATAACCATTCGATTTGCAAATACCGGAACATCTCGGTAACTGCATTCTGCATTAAAAATCAACGCAAAATCGACTTGCCTGCAAAAGAGGAGGATTGCAGCGTGTTTTAATGAGGTCACAGGATCCGCAATGCGGACGGACATCGGGAAACGCCAAGGAGATTATGTACCGAGGAAGAATGTCGCTGACGTATCGCGGTATTCAGAATGATTATCAAGCCCTGTATCAGAGAAGGGTACGAGCTAAAAAGATTCGATACTGGTATTTTGTTCTGAGTCATGAAATACTTGGAGAGGGCAGCTGATTTTGACTTCGGGAGGGAAGCTGCATGATGGGATAAGCATCGGTGCGGTGAATGCAAGAAGATAACCGCTTCCGACCCAATCAACCTTACTGAATCGATGGGGTCTCCGGTGTGAAAGAACACCAACAGGGTGTTACCACTACCGCAGGAAAGGAGGGACGTGTGGCGAGACAGCGACGAAGTATCACCGACATAATCTGCGAAAACTGCAAATACCTTCCAACGAAACGCACCAGTAAACCCAAGCCAACTTGCAAAAAGAATCGACGTAAACCTTCAACTACACGGCTCCTGTGGGATATCCAGTGGCTAAGACGTCGTGCGAGGAAAACAAGGTGATTGACCAAAATCGAAGTTACGAACAAGAAAAGCGTTGAGCAAAGCTAGTCGCGCTTAACTGCGTATTAAAAGCTGCATGTGCTGGAAGTTCACGTGTGTAACACTGCTGCGGAAACTGATGAGCGATCCGAAGCCTGATGCATCAGAGGAAGAAGATGGATAAACAGCGCGAAGACGATGTAAAACGATGAATGCCGGGAATGGTTTCACCCTGCATTCGCTAATCAGTGGTGTTAATACTCCAGAGTGTGGAACCAAGATAGCACCTCGAACGACGAAGTAAAG
AACGCGAAAAAGCGGAAAACAGTAGCAGAAGAAACGACGACGAGAGGAGCAGAAACAGAAAGATAAACTTAAGATTCGAAAACTCGCCTTAAAGCCCCGCAGTTACTGGATTAAATAAGCCCAACAAGCTGCAAACGCCTTCATCAGAGAAAGAGACCGCGACTTTCCCATGTATCGTGCGGAACGCTCACGTCTGTTTCAGTGGGATGCCGGACATTGACAACTGCTGCGGCACCTCAACTCCGATTTAATGAACGCAATATTCACAGCAATGCGTGGTGTGCAACCAGCACAAAAGCGGAAATCTCGTTCCGTATCGCGTCGAACTGATTAGCCGCATCGGGCAGGAAGCAGTAGACGAAATCGAATCAAACCAACCGCCATCGCTGGACTATCGAAGAGGTGCAAGGCGATCAAGGCAGAGTACCAACAGAAACTCATAAAGACCTGCGAAATAGTAGAAGTGAGGCCGCATGGGACGTTCTCTTGTAAAACCATTCCAGACATGCTCGTTGAAACATACGGAAATCAGACAGAAGTAGCACGCGGACTGAAAGTTGTAGTCGCGGGTACGGTCAGAAAATACGTTGATGATAAAAGATGGAAATGCACGCCATCGTCAACGACGTTCTCATGGTTCATCGCGGATGGAGGAAAGAGATGCGCTATTACGAAAAATTGATGGCGGCAAATACCGGAAATATTTGGTAGTTAAGGATCTGCACGGATGCTACACGAACCTGATGAACAAACTGGATACGATTGATTCGACAACAAAAAGACCTGCTTATCTCGGTAAGGACGGCTGGTTGATCGTGGTGTAGAGAACGTTGAATGTTTGAATTAATCACATTCCTGGTTCAGAGCTTGCATGGAAACCATGAGCAAATGATGATTGATGGCTTATCAGAGCGTGGAAACGTGTCACTGGCTTAGCTGGCGGTGGCTGGTTCTTTAATCTCGATGACAAAGAAATTTGGCTAAAGCCTTGCCCATAAAGCAGATGAACTTCCGTTAATCATCGAACTGGTGAGCAAAGATAAAAATATGTTATCTGCCACGCCGATTATCCCTTGACGAATACGAGTTTGAAGCCAGTTGATCATCAGCAGGTAATCTGGAACCGCGAACGAATCAGCAACTCGCGCCGTGGGATCGTGAAAATCAAAGTGCGGACACGTTCATCTTTGGTCATACGCCAGCAGTGAAACCACTCAAGTTTGCCAACCAAATGCATATCGATACCATGCAGTGTTGCAAAA * RG:Z:1 8915e658-528c-4677-88a8-c2eba6c58fc5_Basecall_2D_template 4 * 0 0 * * 0 0 TTGGCAGATAACATATTTTATCTTTTGCTCACCAGTTCGATGATTAACGGAAGTTCATCTGCTTTATGGG * RG:Z:1 8da715a9-3717-4f04-9667-e7e0c2792104_Basecall_2D_2d 16 burn-in 9431 60 
27M4D17M2D15M1D5M1D7M1D9M1I27M1D17M2D8M6I7M3D2M1D1M1D9M2D27M4D9M3D18M2D18M6D15M1D36M1I9M1D4M1I1M3D20M2D16M2D3M1D7M1D8M1D7M3D4M1D11M1D24M1D37M2D9M1D15M1I16M1I9M1I9M1D29M2D5M1D3M1D11M2I5M3D1M1D4M6D5M2I4M4I5M1I10M1D8M2D6M2I16M1D18M5D4M5I4M1D22M2D2M1D6M1D5M1D9M1I4M1D2M2D15M2D6M1I1M3I6M3D11M1D16M1D23M5D9M4D3M2D1M1D3M1I8M2D15M1D8M3I14M1D7M1D1M1D7M1D9M5D2M2D4M1D9M1D3M1D24M2D11M1D7M2D26M6D7M1D3M1D6M1I14M3D12M3D3M1I19M1D9M1D7M1D4M1I1M3D29M1I3M1D13M2D1M1D23M2D18M1D10M1D7M2D13M1D4M2D3M1D7M1D8M1D11M1D4M2D2M3I3M1D17M3I3M1I5M3D12M1D4M2D15M1D7M2D3M3D8M2D4M4D8M1D11M1D18M1D27M2I23M4I1M1I5M2D3M3D23M1D53M2D3M1D3M4D8M2D13M1D33M4D5M1I7M1D2M1D2M1I5M2D7M1D15M1D8M2D9M1I17M4D12M2D20M1I7M2I16M1D1M1D2M1D5M1D2M1I9M2I14M3I4M1D6M4D4M5I3M1D14M2D10M1D1M2D19M2D6M1D15M1D23M6D4M2I5M1I1M2D15M1D10M2I3M1D3M1D54M1D11M1D44M2D3M3D18M1D25M1I9M1D5M1I4M2D3M2D17M1D10M3D9M1I13M2D18M2D2M1I15M1D3M1I9M5D4M1D2M1I9M2I1M1I2M1I5M1I25M2D8M1D28M1D14M1I6M2I7M1I21M4D33M1D3M1D1M4D4M1D3M2D18M1D4M1D3M1D11M1D1M2D5M2D7M2D21M2D3M5I4M2D7M2D7M1D71M1I14M4D5M1I9M1D23M1D13M2D22M1D38M3I7M1I8M1I13M1I2M3I14M3D7M2I36M4D13M1I9M1D7M1D4M1D6M1D4M1D8M1D2M1D2M1I10M2D5M1D13M1D28M4D5M2D18M3I5M4I6M4I3M2I10M1D9M1D16M1I4M2D5M6S * 0 0 
AGCGGCAACCGGCATGACCGTGACGCCAGCACCTCGGTGGTGAAGGCAGAGTACCACGCGACGGGGCCTTCAGCCGGAGGCGCGTAAACGACAAGAGCTTTCGTGCAGGTCTGCGGACAAAACAAGCCACCGTCGGCTTGTCGGTCGGAGACTATCACTGAACGGCGTTGCTGCAGGCCGGGTTATCCGGTGCTCGGTAATGGTGAGTTTGCCGGTTGCAGAAATTACCGCCAGTTAATCCGGAGGTCGACGATGTTCTTAGAAACCGAATCATTTGAACACTAACGTGGTACTGCTGCTTTCTGAACTGTCAGCCCCAGTGCATTGAGCATCGCCTGATGAACGGCGTGCGAACAGGAGTCGACAGCAACCGAAGTTTGCTGTGGAAGATGCCATCGAACCGGCGCGTTTCTGGTGGCGATGTCCCTGTGGCAACCATCCGCGAAGACGCAGATGCCAGTCCATGAATGAAGCCAGTTAAACAGGATTGAGCAGAAGTGCTACCCACCTGGCCGGCGCGCGCATTCACTACGAAAACGTAAAACGCACCTTCTGATGCGCCTTCGCGGATGGTGGAATAATGCCTTGAACAGAGAGGACACGCCGGGCCCGCAAAGCTGAAGCTGCGGGAAAGTGCGGTCATGCGAGCGAGTTTCGCCCTGAAACTGGCGTGGATGGGCGACCGACTGAAAAAGCCAGCGGCTGGAGGTCATCCGGAGTGGTACCGCCGACCACCGCTTTTGAGCACCCATTATTTTCTGATGTTCTGCTGGATATGCACTGGGCTGACGCCGCCATACCTGTTTTAGCGATCCGGATATGATCCGCCGACGGATTTCAGTCTGCCAATGGGCCAGGCTGAGAAGGTTCTTGGGGCGATGCTGCAGCGCAGGGCTTGCCGGAGGTGTCCGCCGGCCCGGACGAAATGAAGACCCCCGCTTCCCAGGATGTGGCGAAGGAGGACACGAATGCCAGATGACAGCATCATGGACTGCAGGAGTCCAGGTATGGCTGGTCCGGTAGCGATTTAATGTTGATTGAGTTCACGCGCTTGGATTTGACGAACAGATGGCGTAGATCAGGCGTCATTTCGGTACGGAAAGTGATGCGAAAAACAGCGGCAGTCGTTGACCGTCGCCGAGCGACAGGCTGGCTGCACAGAAGCGGATTCCGTCGGCGGTAGAAGCCGCCATGCGAGGCCCGGGTGCCGGTTCACCGACGTGGCTGGCCCTGCAGCCAGGGTGGCAAATCCGGCTGATCCTGCCGCACAGGGGGCAGAAGGACTCTCAAATGATCCCATGTTCAGGGGCTTGCCGGTGCGATCACCTGCCGATGGTGGGGGCCACCTCGCTGGTGCGGTGGCGACCGGTGCGCTGGCGGCGCTCATGCCGTAGGGCAACTCAACCCCGTCCGATTCAACAAAACGCTGGTCCTTTCCGGCAATCAGGCGGGACTGACGGCAGATCGTACTGTCCCAGAGCCGCAGGCGGCAGGGCGACGTTTAACCAGACCAGCGAGTCACTCAGCGCGTTAGCGGCGGGGTAGCAGGTGACTCGGGTGCGTCCATCAGCGAGGTGTGGCGTTTCTCTTCCTGCATCCGGCGTGGAGGCAAGGTCGCTGACCTTCCGGAAGCTGACCAGAAGACCCGAGCCGTCAGGGCTGACGGCAGGTCGCGGTGTCCGCAAGGCATGTCGGCGGAGCAGCGGACAGGTATGTCAGAAGCGGTGCGCGTTCCAGCGATGGCCGGGCGATGGGCGGCAGCCGAGGCCGCAGTCAGGTTTGATGACCAGACCGCCGCCTGAAAAAGAACATGGGCGAAAAACCTGGTGGACAGGACTGCGCGGCATTCAAATCAGCAATGGATGCGGTGCTGGATATTGGTCGTCCTGATACCGCGCAGGAGATGCTGATTACGTAGAGGCTGCGGTAAGAAAGCAGACGACATCTGGAATCTGCGCAAGGATGATTATTGTTGATGAAGCGCGGGCGCGTACTGGGAT
AATCGTGAAAAGGCCCGGTCTTGTGCCGAAGTTCGCCAAAAGGCTGAGCAGCAGACTACCAGGACAATGCGCCGTATGGTGAGCAATACCAGCGTCACGGCTGAAACACCAGAAAGGGCGCAAAAGCTACACGAACAGCACGGGCATGCCGAAAGCGAGTAGCCGCCGCATCAGGAAGAACTGAACAAAACACAAAGACGGAAAATCCTGCAGGCGGATTACAACGCGCGATGGCAGAAGTAAGAAAAGGCGATTATGGCAGCGAGCCGGCGTCGGCTGAATCCAGCGTGAAGGTGTCTGCGGGCGATCGTCAGAAGCCTCAGCTCTTGCCGATGCTTCAGGTGAACCCGACGCCGGAAAACGCCGGCAAATGAAAATCAGCCAGCAGCGCCGGGTTGCCAGGTGGTGCGGAGCCAGCTCAGGTACTGGAGGAGGCGGCGCAACGTCGCCAGCTGTCCGGGCACGAGAAAATCCTGCCGGCGCATAAAGATGGAGACGCTTCATTATGCTATTCTGGCTGCATCGGCGACAAGGTTACGTATCAGAGCGCTTGAACGCTGGCGCAGCAGGCGGATAAATTGCACAGCAGCAACGGTGAAAACGGGCCGCCACTGATGTAGAGAGAAACTCGTGCGCCAGACTGACCAGCAGTGCGGTAGACCGGGAAGCCAAACCGGGTGCCTGAAGGAACAGTATGGCGATAATCTGCTGGCGCCAACGTCATGTCAGGAGCAGAAAAGATCAGGCAGCGAGCACAGCTCGCGGGAATGATAGGCAGGCCCGGTCCGCTGGAGTGAGTGGAAGAGAGCGCCAATGACAGCATGTCGCAAAAAGCAGCCATGCAGACCTTTGGTGATGGTGTGCATTGCAGGTGCGGTGAATATGGCGTGATGCTGACGGCAGTGAGCAGAACTTGGCGGCTTCTGTTCC * RG:Z:1 And the command and error it produced: (it didnt output any file) java -jar ~/tools/picard.jar CollectAlignmentSummaryMetrics R=../LambdaRefGenome.fa I=test2.sam O=testSummary4.txt VALIDATION_STRINGENCY=SILENT
[Sun Jan 15 10:53:12 IST 2017] picard.analysis.CollectAlignmentSummaryMetrics REFERENCE_SEQUENCE=../LambdaRefGenome.fa INPUT=test2.sam OUTPUT=testSummary4.txt VALIDATION_STRINGENCY=SILENT MAX_INSERT_SIZE=100000 EXPECTED_PAIR_ORIENTATIONS=[FR] ADAPTER_SEQUENCE=[AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT, AGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG, AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT, AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTG, AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT, AGATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG] METRIC_ACCUMULATION_LEVEL=[ALL_READS] IS_BISULFITE_SEQUENCED=false ASSUME_SORTED=true STOP_AFTER=0 VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Sun Jan 15 10:53:12 IST 2017] Executing as artemd@nshomron.tau.ac.il on Linux 2.6.32-642.1.1.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_66-b17; Picard version: 2.8.1-SNAPSHOT
WARNING 2017-01-15 10:53:12 SinglePassSamProgram File reports sort order 'unsorted', assuming it's coordinate sorted anyway.
[Sun Jan 15 10:53:12 IST 2017] picard.analysis.CollectAlignmentSummaryMetrics done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=504889344
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
at picard.analysis.AlignmentSummaryMetricsCollector$GroupAlignmentSummaryMetricsPerUnitMetricCollector$IndividualAlignmentSummaryMetricsCollector.collectQualityData(AlignmentSummaryMetricsCollector.java:323)
at picard.analysis.AlignmentSummaryMetricsCollector$GroupAlignmentSummaryMetricsPerUnitMetricCollector$IndividualAlignmentSummaryMetricsCollector.addRecord(AlignmentSummaryMetricsCollector.java:189)
at picard.analysis.AlignmentSummaryMetricsCollector$GroupAlignmentSummaryMetricsPerUnitMetricCollector.acceptRecord(AlignmentSummaryMetricsCollector.java:121) at picard.analysis.AlignmentSummaryMetricsCollector$GroupAlignmentSummaryMetricsPerUnitMetricCollector.acceptRecord(AlignmentSummaryMetricsCollector.java:87)
at picard.metrics.MultiLevelCollectorAllReadsDistributor.acceptRecord(MultiLevelCollector.java:192) at picard.metrics.MultiLevelCollector.acceptRecord(MultiLevelCollector.java:315) at picard.analysis.AlignmentSummaryMetricsCollector.acceptRecord(AlignmentSummaryMetricsCollector.java:83) at picard.analysis.CollectAlignmentSummaryMetrics.acceptRead(CollectAlignmentSummaryMetrics.java:147) at picard.analysis.SinglePassSamProgram.makeItSo(SinglePassSamProgram.java:138) at picard.analysis.SinglePassSamProgram.doWork(SinglePassSamProgram.java:77) at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:208) at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95) at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105) Now the second file [testWithQscore.txt] - with the only thing changed is added (fake) qscore values instead of the asterisk: @HD VN:1.0 SO:unsorted @SQ SN:burn-in LN:48502 @RG ID:1 @PG ID:6 PN:minialign 8915e658-528c-4677-88a8-c2eba6c58fc5_Basecall_2D_2d 16 burn-in 41435 60 31M4D6M2I4M2D11M1D4M1D33M1I1M2D13M1D26M3D4M1D61M2D13M1D11M2D3M1D13M1D1M1D7M1D15M1D10M1D2M2D6M1I3M1D4M1D22M1D31M1D3M1D7M5D10M2D22M1I2M1D16M1I14M1I17M2D12M2I7M1D36M2I2M1D19M1D35M1D39M1D21M1D5M1I78M3D14M2I13M1D8M2D18M2D78M1I11M1I3M2D1M2D2M1I5M1I8M1I36M1D12M1D19M2D10M1I1M1I13M3D7M1D20M1D53M1I27M1I37M1I9M2D139M1I7M2D22M1I19M4D50M1D108M2D26M1I37M2I34M2I9M1I6M1D62M2I11M1I28M1I5M2D49M1D25M1D19M1I10M1D57M1D15M1D17M2I8M1D34M1D15M2D12M1D56M2D9M2I3M4D27M3D12M1D10M1D59M1D29M1D16M2D72M3D8M1D77M1D8M6S * 0 0 
CCCTTCACCAAATACTGTGATGAATATATCAAGGGAAAATTACCACGTGGATTGCATCGAGCCGATAAACTGAAGCGGCTAAAGCCAAAGCACGAATCAGATATCTGAAGAACTGTCAGACTTTGAGAAGGATATCTCGCATGGTGGAAGCAATAACCATTCGATTTGCAAATACCGGAACATCTCGGTAACTGCATTCTGCATTAAAAATCAACGCAAAATCGACTTGCCTGCAAAAGAGGAGGATTGCAGCGTGTTTTAATGAGGTCACAGGATCCGCAATGCGGACGGACATCGGGAAACGCCAAGGAGATTATGTACCGAGGAAGAATGTCGCTGACGTATCGCGGTATTCAGAATGATTATCAAGCCCTGTATCAGAGAAGGGTACGAGCTAAAAAGATTCGATACTGGTATTTTGTTCTGAGTCATGAAATACTTGGAGAGGGCAGCTGATTTTGACTTCGGGAGGGAAGCTGCATGATGGGATAAGCATCGGTGCGGTGAATGCAAGAAGATAACCGCTTCCGACCCAATCAACCTTACTGAATCGATGGGGTCTCCGGTGTGAAAGAACACCAACAGGGTGTTACCACTACCGCAGGAAAGGAGGGACGTGTGGCGAGACAGCGACGAAGTATCACCGACATAATCTGCGAAAACTGCAAATACCTTCCAACGAAACGCACCAGTAAACCCAAGCCAACTTGCAAAAAGAATCGACGTAAACCTTCAACTACACGGCTCCTGTGGGATATCCAGTGGCTAAGACGTCGTGCGAGGAAAACAAGGTGATTGACCAAAATCGAAGTTACGAACAAGAAAAGCGTTGAGCAAAGCTAGTCGCGCTTAACTGCGTATTAAAAGCTGCATGTGCTGGAAGTTCACGTGTGTAACACTGCTGCGGAAACTGATGAGCGATCCGAAGCCTGATGCATCAGAGGAAGAAGATGGATAAACAGCGCGAAGACGATGTAAAACGATGAATGCCGGGAATGGTTTCACCCTGCATTCGCTAATCAGTGGTGTTAATACTCCAGAGTGTGGAACCAAGATAGCACCTCGAACGACGAAGTAAAGAACGCGAAAAAGCGGAAAACAGTAGCAGAAGAAACGACGACGAGAGGAGCAGAAACAGAAAGATAAACTTAAGATTCGAAAACTCGCCTTAAAGCCCCGCAGTTACTGGATTAAATAAGCCCAACAAGCTGCAAACGCCTTCATCAGAGAAAGAGACCGCGACTTTCCCATGTATCGTGCGGAACGCTCACGTCTGTTTCAGTGGGATGCCGGACATTGACAACTGCTGCGGCACCTCAACTCCGATTTAATGAACGCAATATTCACAGCAATGCGTGGTGTGCAACCAGCACAAAAGCGGAAATCTCGTTCCGTATCGCGTCGAACTGATTAGCCGCATCGGGCAGGAAGCAGTAGACGAAATCGAATCAAACCAACCGCCATCGCTGGACTATCGAAGAGGTGCAAGGCGATCAAGGCAGAGTACCAACAGAAACTCATAAAGACCTGCGAAATAGTAGAAGTGAGGCCGCATGGGACGTTCTCTTGTAAAACCATTCCAGACATGCTCGTTGAAACATACGGAAATCAGACAGAAGTAGCACGCGGACTGAAAGTTGTAGTCGCGGGTACGGTCAGAAAATACGTTGATGATAAAAGATGGAAATGCACGCCATCGTCAACGACGTTCTCATGGTTCATCGCGGATGGAGGAAAGAGATGCGCTATTACGAAAAATTGATGGCGGCAAATACCGGAAATATTTGGTAGTTAAGGATCTGCACGGATGCTACACGAACCTGATGAACAAACTGGATACGATTGATTCGACAACAAAAAGACCTGCTTATCTCGGTAAGGACGGCTGGTTGATCGTGGTGTAGAGAACGTTGAATGTTTGAATTAATCACATTCCTGGTTCAGAGCTTGCATGGAAACCATGAGCAAATGATGATTGATGGCTTATCAGAGCGTGGAAA
CGTGTCACTGGCTTAGCTGGCGGTGGCTGGTTCTTTAATCTCGATGACAAAGAAATTTGGCTAAAGCCTTGCCCATAAAGCAGATGAACTTCCGTTAATCATCGAACTGGTGAGCAAAGATAAAAATATGTTATCTGCCACGCCGATTATCCCTTGACGAATACGAGTTTGAAGCCAGTTGATCATCAGCAGGTAATCTGGAACCGCGAACGAATCAGCAACTCGCGCCGTGGGATCGTGAAAATCAAAGTGCGGACACGTTCATCTTTGGTCATACGCCAGCAGTGAAACCACTCAAGTTTGCCAACCAAATGCATATCGATACCATGCAGTGTTGCAAAA 1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111 RG:Z:1 8915e658-528c-4677-88a8-c2eba6c58fc5_Basecall_2D_template 4 * 0 0 * * 0 0 TTGGCAGATAACATATTTTATCTTTTGCTCACCAGTTCGATGATTAACGGAAGTTCATCTGCTTTATGGG 1111111111111111111111111111111111111111111111111111111111111111111111 RG:Z:1 8da715a9-3717-4f04-9667-e7e0c2792104_Basecall_2D_2d 16 burn-in 9431 60 27M4D17M2D15M1D5M1D7M1D9M1I27M1D17M2D8M6I7M3D2M1D1M1D9M2D27M4D9M3D18M2D18M6D15M1D36M1I9M1D4M1I1M3D20M2D16M2D3M1D7M1D8M1D7M3D4M1D11M1D24M1D37M2D9M1D15M1I16M1I9M1I9M1D29M2D5M1D3M1D11M2I5M3D1M1D4M6D5M2I4M4I5M1I10M1D8M2D6M2I16M1D18M5D4M5I4M1D22M2D2M1D6M1D5M1D9M1I4M1D2M2D15M2D6M1I1M3I6M3D11M1D16M1D23M5D9M4D3M2D1M1D3M1I8M2D15M1D8M3I14M1D7M1D1M1D7M1D9M5D2M2D4M1D9M1D3M1D24M2D11M1D7M2D26M6D7M1D3M1D6M1I14M3D12M3D3M1I19M1D9M1D7M1D4M1I1M3D29M1I3M1D13M2D1M1D23M2D18M1D10M1D7M2D13M1D4M2D3M1D7M1D8M1D11M1D4M2D2M3I3M1D17M3I3M1I5M3D12M1D4M2D15M1D7M2D3M3D8M2D4M4D8M1D11M1D18M1D27M2I23M4I1M1I5M2D3M3D23M1D53M2D3M1D3M4D8M2D13M1D33M4D5M1I7M1D2M1D2M1I5M2D7M1D15M1D8M2D9M1I17M4D12M2D20M1I7M2I16M1D1M1D2M1D5M1D2M1I9M2I14M3I4M1D6M4D4M5I3M1D14M2D10M1D1M2D19M2D6M1D15M1D23M6D4M2I5M1I1M2D15M1D10M2I3M1D3M1D54M1D11M1D44M2D3M3D18M1D25M1I9M1D5M1I4M2D3M2D17M1D10M3D9M1I13M2D18M2D2M1I15M1D3M1I9M5D4M1D2M1I9M2I1M1I2M1I5M1I25M2D8M1D28M1D14M1I6M2I7M1I21M4D33M1D3M1D1M4D4M1D3M2D18M1D4M1D3M1D11M1D1M2D5M2D7M2D21M2D3M5I4M2D7M2D7M1D71M1I14M4D5M1I9M1
D23M1D13M2D22M1D38M3I7M1I8M1I13M1I2M3I14M3D7M2I36M4D13M1I9M1D7M1D4M1D6M1D4M1D8M1D2M1D2M1I10M2D5M1D13M1D28M4D5M2D18M3I5M4I6M4I3M2I10M1D9M1D16M1I4M2D5M6S * 0 0 AGCGGCAACCGGCATGACCGTGACGCCAGCACCTCGGTGGTGAAGGCAGAGTACCACGCGACGGGGCCTTCAGCCGGAGGCGCGTAAACGACAAGAGCTTTCGTGCAGGTCTGCGGACAAAACAAGCCACCGTCGGCTTGTCGGTCGGAGACTATCACTGAACGGCGTTGCTGCAGGCCGGGTTATCCGGTGCTCGGTAATGGTGAGTTTGCCGGTTGCAGAAATTACCGCCAGTTAATCCGGAGGTCGACGATGTTCTTAGAAACCGAATCATTTGAACACTAACGTGGTACTGCTGCTTTCTGAACTGTCAGCCCCAGTGCATTGAGCATCGCCTGATGAACGGCGTGCGAACAGGAGTCGACAGCAACCGAAGTTTGCTGTGGAAGATGCCATCGAACCGGCGCGTTTCTGGTGGCGATGTCCCTGTGGCAACCATCCGCGAAGACGCAGATGCCAGTCCATGAATGAAGCCAGTTAAACAGGATTGAGCAGAAGTGCTACCCACCTGGCCGGCGCGCGCATTCACTACGAAAACGTAAAACGCACCTTCTGATGCGCCTTCGCGGATGGTGGAATAATGCCTTGAACAGAGAGGACACGCCGGGCCCGCAAAGCTGAAGCTGCGGGAAAGTGCGGTCATGCGAGCGAGTTTCGCCCTGAAACTGGCGTGGATGGGCGACCGACTGAAAAAGCCAGCGGCTGGAGGTCATCCGGAGTGGTACCGCCGACCACCGCTTTTGAGCACCCATTATTTTCTGATGTTCTGCTGGATATGCACTGGGCTGACGCCGCCATACCTGTTTTAGCGATCCGGATATGATCCGCCGACGGATTTCAGTCTGCCAATGGGCCAGGCTGAGAAGGTTCTTGGGGCGATGCTGCAGCGCAGGGCTTGCCGGAGGTGTCCGCCGGCCCGGACGAAATGAAGACCCCCGCTTCCCAGGATGTGGCGAAGGAGGACACGAATGCCAGATGACAGCATCATGGACTGCAGGAGTCCAGGTATGGCTGGTCCGGTAGCGATTTAATGTTGATTGAGTTCACGCGCTTGGATTTGACGAACAGATGGCGTAGATCAGGCGTCATTTCGGTACGGAAAGTGATGCGAAAAACAGCGGCAGTCGTTGACCGTCGCCGAGCGACAGGCTGGCTGCACAGAAGCGGATTCCGTCGGCGGTAGAAGCCGCCATGCGAGGCCCGGGTGCCGGTTCACCGACGTGGCTGGCCCTGCAGCCAGGGTGGCAAATCCGGCTGATCCTGCCGCACAGGGGGCAGAAGGACTCTCAAATGATCCCATGTTCAGGGGCTTGCCGGTGCGATCACCTGCCGATGGTGGGGGCCACCTCGCTGGTGCGGTGGCGACCGGTGCGCTGGCGGCGCTCATGCCGTAGGGCAACTCAACCCCGTCCGATTCAACAAAACGCTGGTCCTTTCCGGCAATCAGGCGGGACTGACGGCAGATCGTACTGTCCCAGAGCCGCAGGCGGCAGGGCGACGTTTAACCAGACCAGCGAGTCACTCAGCGCGTTAGCGGCGGGGTAGCAGGTGACTCGGGTGCGTCCATCAGCGAGGTGTGGCGTTTCTCTTCCTGCATCCGGCGTGGAGGCAAGGTCGCTGACCTTCCGGAAGCTGACCAGAAGACCCGAGCCGTCAGGGCTGACGGCAGGTCGCGGTGTCCGCAAGGCATGTCGGCGGAGCAGCGGACAGGTATGTCAGAAGCGGTGCGCGTTCCAGCGATGGCCGGGCGATGGGCGGCAGCCGAGGCCGCAGTCAGGTTTGATGACCAGACCGCCGCCTGAAAAAGAACATGGGCGAAAAACCTGGTGGACAGGACTGCGC
GGCATTCAAATCAGCAATGGATGCGGTGCTGGATATTGGTCGTCCTGATACCGCGCAGGAGATGCTGATTACGTAGAGGCTGCGGTAAGAAAGCAGACGACATCTGGAATCTGCGCAAGGATGATTATTGTTGATGAAGCGCGGGCGCGTACTGGGATAATCGTGAAAAGGCCCGGTCTTGTGCCGAAGTTCGCCAAAAGGCTGAGCAGCAGACTACCAGGACAATGCGCCGTATGGTGAGCAATACCAGCGTCACGGCTGAAACACCAGAAAGGGCGCAAAAGCTACACGAACAGCACGGGCATGCCGAAAGCGAGTAGCCGCCGCATCAGGAAGAACTGAACAAAACACAAAGACGGAAAATCCTGCAGGCGGATTACAACGCGCGATGGCAGAAGTAAGAAAAGGCGATTATGGCAGCGAGCCGGCGTCGGCTGAATCCAGCGTGAAGGTGTCTGCGGGCGATCGTCAGAAGCCTCAGCTCTTGCCGATGCTTCAGGTGAACCCGACGCCGGAAAACGCCGGCAAATGAAAATCAGCCAGCAGCGCCGGGTTGCCAGGTGGTGCGGAGCCAGCTCAGGTACTGGAGGAGGCGGCGCAACGTCGCCAGCTGTCCGGGCACGAGAAAATCCTGCCGGCGCATAAAGATGGAGACGCTTCATTATGCTATTCTGGCTGCATCGGCGACAAGGTTACGTATCAGAGCGCTTGAACGCTGGCGCAGCAGGCGGATAAATTGCACAGCAGCAACGGTGAAAACGGGCCGCCACTGATGTAGAGAGAAACTCGTGCGCCAGACTGACCAGCAGTGCGGTAGACCGGGAAGCCAAACCGGGTGCCTGAAGGAACAGTATGGCGATAATCTGCTGGCGCCAACGTCATGTCAGGAGCAGAAAAGATCAGGCAGCGAGCACAGCTCGCGGGAATGATAGGCAGGCCCGGTCCGCTGGAGTGAGTGGAAGAGAGCGCCAATGACAGCATGTCGCAAAAAGCAGCCATGCAGACCTTTGGTGATGGTGTGCATTGCAGGTGCGGTGAATATGGCGTGATGCTGACGGCAGTGAGCAGAACTTGGCGGCTTCTGTTCC 
11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111 RG:Z:1 And the command for this one is: (it produced a normal AlignmentSummaryMetrics file) java -jar ~/tools/picard.jar CollectAlignmentSummaryMetrics R=../LambdaRefGenome.fa I=test.sam O=testSummary2.txt VALIDATION_STRINGENCY=SILENT
[Sun Jan 15 10:53:01 IST 2017] picard.analysis.CollectAlignmentSummaryMetrics REFERENCE_SEQUENCE=../LambdaRefGenome.fa INPUT=test.sam OUTPUT=testSummary2.txt VALIDATION_STRINGENCY=SILENT MAX_INSERT_SIZE=100000 EXPECTED_PAIR_ORIENTATIONS=[FR] ADAPTER_SEQUENCE=[AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT, AGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG, AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT, AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTG, AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT, AGATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG] METRIC_ACCUMULATION_LEVEL=[ALL_READS] IS_BISULFITE_SEQUENCED=false ASSUME_SORTED=true STOP_AFTER=0 VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Sun Jan 15 10:53:01 IST 2017] Executing as artemd@nshomron.tau.ac.il on Linux 2.6.32-642.1.1.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_66-b17; Picard version: 2.8.1-SNAPSHOT
WARNING 2017-01-15 10:53:01 SinglePassSamProgram File reports sort order 'unsorted', assuming it's coordinate sorted anyway.
[Sun Jan 15 10:53:01 IST 2017] picard.analysis.CollectAlignmentSummaryMetrics done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=504889344

Hope this helps.

ArtemD.

Hi @SDFfASF,

I can confirm it's the asterisk that causes a problem. The error stack trace shows that this is the function that's choking on your read:

`IndividualAlignmentSummaryMetricsCollector.collectQualityData`

This function looks up the quality scores by the index position of the corresponding base, so if the array is just a single asterisk, the function will error out for any base after the first. That's why you get an ArrayIndexOutOfBoundsException, as explained here.
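To make the failure mode concrete, here is a minimal Python sketch of an index-based quality lookup. It is a toy model, not Picard's actual code: the function name `count_high_quality_bases`, the threshold of 20, and the choice to represent a missing quality string as an empty list are all assumptions made for illustration. Python's `IndexError` plays the role of Java's `ArrayIndexOutOfBoundsException`.

```python
def count_high_quality_bases(bases, quals, threshold=20):
    """Count bases whose quality score meets the threshold.

    Hypothetical helper: qualities are looked up by the index of each base,
    so a quality list shorter than the read cannot be indexed safely.
    """
    count = 0
    for i in range(len(bases)):
        # Raises IndexError as soon as i runs past the end of quals.
        if quals[i] >= threshold:
            count += 1
    return count


# A complete record: one quality score per base.
print(count_high_quality_bases("ACGT", [30, 10, 25, 20]))

# A record without qualities (assumed here to arrive as an empty list).
try:
    count_high_quality_bases("ACGT", [])
except IndexError as err:
    print("lookup failed:", err)
```

Any tool that walks the bases and indexes into the qualities this way needs the two to have the same length, which is why a record with `*` in the quality column trips it up.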

The tricky thing is that many Picard tools have requirements that are different from the majority of tools and are often not documented. The metrics collection tools tend to have the most exhaustive requirements for records being complete, because they access most if not all of the properties of the data. We'll try to document these things more clearly in future.

Geraldine Van der Auwera, PhD


OK, thanks @Geraldine_VdAuwera, that clears up a lot of confusion. I gather that CollectAlignmentSummaryMetrics requires the qscore values in order to calculate a few metrics "for high quality bases", but can I somehow turn this off so that Picard collects all the other metrics that are not related to quality? Or can I ask Picard to assume all bases have the same qscore?

If there is no solution on Picard's end, I guess I would need to either loop over each read in the SAM file and "fake" qscore values of the same length as the read, or (what might be more troublesome to write) for each read go back to the original FASTQ file and copy its qscore values onto the corresponding bases of that read in the SAM file.
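The first option, faking a flat quality string per read, could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names `fill_missing_quals` and `fill_sam` are made up, the flat value of Q30 is arbitrary, and the input is assumed to be tab-delimited SAM text with the standard 11 mandatory columns and Phred+33 quality encoding.

```python
def fill_missing_quals(line, phred=30):
    """Return a SAM line with a flat QUAL string substituted for a '*' QUAL.

    Hypothetical helper: header lines and records that already carry
    qualities are returned unchanged.
    """
    if line.startswith("@"):
        return line  # header line, nothing to do
    fields = line.rstrip("\n").split("\t")
    seq, qual = fields[9], fields[10]  # columns 10 (SEQ) and 11 (QUAL)
    if qual == "*" and seq != "*":
        # Phred+33 encoding: Q30 becomes '?', Q20 becomes '5'.
        fields[10] = chr(phred + 33) * len(seq)
    return "\t".join(fields) + "\n"


def fill_sam(lines, phred=30):
    """Apply fill_missing_quals to every line of an iterable of SAM lines."""
    for line in lines:
        yield fill_missing_quals(line, phred)


# Example with a made-up record that has no qualities.
record = "read1\t0\tlambda\t1\t60\t4M\t*\t0\t0\tACGT\t*\tRG:Z:1\n"
print(fill_missing_quals(record), end="")
```

To process a whole file, `fill_sam` can be fed an open file handle (or `sys.stdin`) and its output written line by line to a new SAM file. The flat value is a placeholder, so any quality-dependent metrics computed afterwards would reflect that placeholder rather than real base qualities.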

edited March 9

Hello @Geraldine_VdAuwera

I am not sure but I think that I have the "asterisk problem" with CollectMultipleMetrics in the "MutationCalling_QC_v1-1_BETA_cfg" pipeline:

-INFO 2017-03-09 17:42:01 SinglePassSamProgram Processed 104,000,000 records. Elapsed time: 00:28:44s. Time for last 1,000,000: 15s. Last read position: X:153,045,203
-[Thu Mar 09 17:42:10 UTC 2017] picard.analysis.CollectMultipleMetrics done. Elapsed time: 30.98 minutes.
-Runtime.totalMemory()=1025507328

I am not sure whether this is the problem, because other people who ran the same analysis with unmapped reads didn't get this kind of error. The Picard version is 2.1.0 and the symbol for this type of read is " * / * ". Do you know if I could do something to fix it?

Thank you for the help,

Hi @elcinchu27, if your data doesn't have qscores you need to add a flat default; see my "accepted" answer at the top of the thread.

Geraldine Van der Auwera, PhD

Hello again @Geraldine_VdAuwera,

As you said, I tried to add flat default values for the qscores but unfortunately I still have the same problem with the pipeline:

java -jar /usr/local/bin/GenomeAnalysisTK.jar \
-T PrintReads \
-R reference.genome \
-I input_bam \
-DBQ 0 \
-o qscores.bam

I used "-DBQ 0" because the default value is -1, and an error message said that it is not possible to use negative values. Do you have any idea what the problem is? Thanks for the help.

The default -1 value is a special-cased value that disables the use of default quals. I'm not sure about 0 but that might be special-cased too. I would recommend using a more realistic qual value, like 20 or 30 instead.

That being said I don't know whether this will actually solve your problem; I'm assuming that your problem is the same as the original poster's, but if it's a different problem then this won't be sufficient. Did you run ValidateSamFile on this data?

Geraldine Van der Auwera, PhD

Yes, I looked for errors and warnings but the output shows "No errors found":

/usr/lib/jvm/java-1.8.0-openjdk-amd64/bin/java -Xmx2g -jar /opt/picard-tools/picard.jar ValidateSamFile \
I=input_bam \
OUTPUT=output_errors.list \
MODE=VERBOSE \
IGNORE_WARNINGS=true

/usr/lib/jvm/java-1.8.0-openjdk-amd64/bin/java -Xmx2g -jar /opt/picard-tools/picard.jar ValidateSamFile \
I=input_bam \
OUTPUT=\$output_warnings_errors.list \
MODE=VERBOSE

I could try the same with those other values and see if it works or not.

edited March 14

I just checked the input and output files of "PrintReads" with "-DBQ 20" and I don't see any difference between them: the asterisk is still in the new BAM generated by the program. I don't know why there is no change between the two files.

1) input.bam:

HWI-ST731_18:2:1101:10003:49500#8@0 77 * 0 0 * * 0 0 TTTTCCATAATAGACGCAACGCGAGCAGTAGACTCATTCTGTTGATAAGCAAGCATCTCATTTTGTGCATATACTT
????II???I??I?I5???I?I????+55?+?+5?+5+?????I???I############################
PG:Z:MarkDuplicates RG:Z:1271ND
@DDDBDBF<;FF?G::@FG>;GB?@?)00B* B*/9)/)8==BCG;;FH############################

2) qscores.bam:
HWI-ST731_18:2:1101:10003:49500#8@0 77 * 0 0 * * 0 0 TTTTCCATAATAGACGCAACGCGAGCAGTAGACTCATTCTGTTGATAAGCAAGCATCTCATTTTGTGCATATACTT
????II???I??I?I5???I?I????+55?+?+5?+5+?????I???I############################
PG:Z:MarkDuplicates RG:Z:1271ND
@DDDBDBF<;FF?G::@FG>;GB?@?)00B* B*/9)/)8==BCG;;FH############################
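
One way to check which column an asterisk actually sits in is to label the mandatory SAM columns and list those whose value is `*`. The sketch below is a hypothetical helper (`asterisk_fields` is not part of Picard or GATK) and assumes tab-delimited SAM text with the 11 mandatory columns in the standard order. The example records are made up.

```python
# The 11 mandatory SAM columns, in the order defined by the SAM format.
SAM_COLUMNS = [
    "QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
    "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL",
]


def asterisk_fields(line):
    """Return the names of the mandatory columns whose value is '*'."""
    fields = line.rstrip("\n").split("\t")
    return [name for name, value in zip(SAM_COLUMNS, fields) if value == "*"]


# Made-up unmapped read that does carry base qualities.
unmapped = "read1\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\tRG:Z:1\n"
print(asterisk_fields(unmapped))

# Made-up mapped read with no base qualities.
no_quals = "read2\t0\tlambda\t1\t60\t4M\t*\t0\t0\tACGT\t*\tRG:Z:1\n"
print(asterisk_fields(no_quals))
```

If the records pasted above follow the standard column order, their asterisks would fall in RNAME, CIGAR and RNEXT, which is expected for an unmapped read, while the QUAL column appears to be populated. In that case a default base quality would have nothing to fill in, which would be consistent with the two files looking identical.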