Picard MarkDuplicates string did not start with a parseable number

ryanfriedman22, Washington University in St. Louis, Member

I'm running RNA-seq data through the GATK pipeline for RNA-seq variant calling and am getting an AbstractOpticalDuplicateFinderCommandLineProgram warning while running MarkDuplicates, with the message:

A field field parsed out of a read name was expected to contain an integer and did not. Read name: C1n.EXACT.TTAT.13482648. Cause: String 'C1n.EXACT.TTAT.13482648' did not start with a parsable number.

I'm not sure why this is happening. When I run the command

samtools view rg_added_sorted.bam | grep C1n.EXACT.TTAT.13482648

the output is:

C1n.EXACT.TTAT.13482648 16 chr1 5196 0 32M * 0 0 TTCGAGATGAACAGCTTGGAGTTCATCAGAGG * RG:Z:id NH:i:7 HI:i:1 nM:i:0 AS:i:31
C1n.EXACT.TTAT.13482648 272 chr3 1813 0 32M * 0 0 TTCGAGATGAACAGCTTGGAGTTCATCAGAGG * RG:Z:id NH:i:7 HI:i:3 nM:i:0 AS:i:31
C1n.EXACT.TTAT.13482648 272 chr3 9145 0 32M * 0 0 TTCGAGATGAACAGCTTGGAGTTCATCAGAGG * RG:Z:id NH:i:7 HI:i:4 nM:i:0 AS:i:31
C1n.EXACT.TTAT.13482648 272 chr4 1347 0 32M * 0 0 TTCGAGATGAACAGCTTGGAGTTCATCAGAGG * RG:Z:id NH:i:7 HI:i:6 nM:i:0 AS:i:31
C1n.EXACT.TTAT.13482648 272 chr10 691 0 32M * 0 0 TTCGAGATGAACAGCTTGGAGTTCATCAGAGG * RG:Z:id NH:i:7 HI:i:7 nM:i:0 AS:i:31
C1n.EXACT.TTAT.13482648 256 chr10 1054683 0 32M * 0 0 CCTCTGATGAACTCCAAGCTGTTCATCTCGAA * RG:Z:id NH:i:7 HI:i:2 nM:i:0 AS:i:31
C1n.EXACT.TTAT.13482648 272 chr11 1443 0 32M * 0 0 TTCGAGATGAACAGCTTGGAGTTCATCAGAGG * RG:Z:id NH:i:7 HI:i:5 nM:i:0 AS:i:31
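
For context, the Picard documentation describes the default read-name handling in MarkDuplicates as splitting the name on colons and taking tile, x and y from 5-field or 7-field names. The read name above contains no colons at all, which would be consistent with the warning. A minimal sketch of that check follows; `count_colon_fields` is a hypothetical helper, and the second read name is a made-up Illumina-style example, not taken from this BAM.

```shell
# Hypothetical helper: count the colon-delimited fields in a read name.
# Per the Picard docs, the default parsing expects 5 or 7 such fields.
count_colon_fields() {
    printf '%s\n' "$1" | awk -F':' '{ print NF }'
}

# Read name from the warning: no colons, so it is a single field.
count_colon_fields "C1n.EXACT.TTAT.13482648"

# Made-up Illumina-style (CASAVA 1.8) name, for comparison only.
count_colon_fields "HWI-ST1234:100:C1ABCACXX:1:1101:1234:5678"
```

The first call prints 1 and the second prints 7. The same documentation says that setting READ_NAME_REGEX=null disables optical duplicate detection; whether that is appropriate for this data, and whether it has anything to do with the later SplitNCigarReads problem, is an assumption that would need checking.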

I'm running the following command as part of a job on a SLURM cluster, using JVM build 25.31-b07, mixed mode:

java -Xmx8G -Xms8G -jar $PICARD_HOME/picard.jar MarkDuplicates I=rg_added_sorted.bam O=dedupped.bam CREATE_INDEX=true VALIDATION_STRINGENCY=SILENT M=output.metrics

There's no stack trace since it's just a warning, but it causes problems later when I run SplitNCigarReads, which complains that the data is malformed.
