
Fishy output from ReduceReads

blakeoft · Connecticut · Posts: 9 · Member

I've been following the best practices guide and I've gotten some odd-looking output from ReduceReads. Here's a sample record:

C100 16 chrM 4934 60 1M * 0 0 T 7 BD:Z:E RG:Z:JC01_L1 BI:Z:L RR:B:c,1 RS:A:1

The odd part is the CIGAR string. Is "1M" a reasonable CIGAR string? Furthermore, before I ran ReduceReads, Picard's ValidateSamFile finished with no errors, but validating the ReduceReads output produces warnings like this:

WARNING: Record 1, Read name 1, NM tag (nucleotide differences) is missing

That warning is reported for records 1 through 100, after which ValidateSamFile reports nothing further.
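
On the CIGAR question: under the SAM format, "1M" simply means one aligned base, which is consistent with the one-base sequence T in the sample record above. A minimal Python sketch of that consistency check follows. The function names are made up for illustration, and it only looks at the CIGAR and SEQ columns, so it is not a substitute for ValidateSamFile.

```python
import re

# CIGAR operations that consume read (query) bases, per the SAM specification.
QUERY_CONSUMING = set("MIS=X")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_query_length(cigar):
    """Return the number of read bases a CIGAR string accounts for."""
    if cigar == "*":
        raise ValueError("CIGAR is unavailable")
    ops = CIGAR_RE.findall(cigar)
    # If re-joining the parsed operations does not reproduce the input,
    # the string contained something that is not a valid length/op pair.
    if "".join(n + op for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return sum(int(n) for n, op in ops if op in QUERY_CONSUMING)


def cigar_matches_seq(sam_line):
    """Check that the CIGAR (column 6) agrees with the length of SEQ (column 10)."""
    fields = sam_line.split()  # real SAM is tab-separated; split() also handles spaces
    cigar, seq = fields[5], fields[9]
    return cigar_query_length(cigar) == len(seq)


# The sample record from this post.
record = "C100 16 chrM 4934 60 1M * 0 0 T 7 BD:Z:E RG:Z:JC01_L1 BI:Z:L RR:B:c,1 RS:A:1"
print(cigar_matches_seq(record))
```

So the CIGAR is at least internally consistent with the record; whether a one-base read is what you would expect is a separate question about what ReduceReads does to the data.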

Here is the command line I used for ReduceReads:

java -Xmx2g -jar $GATK -T ReduceReads -R $genomes/hg19.fa -I $alignments/$lane.dedup.realn.recal.bam -o $alignments/$lane.dedup.realn.recal.reduced.bam

Note that pwd is surrounded by backticks; I just don't know how to stop them from interrupting the code formatting.

Any advice?


Best Answer


  • blakeoft · Connecticut · Posts: 9 · Member

    Thank you for your answer, Geraldine. I misunderstood what ReduceReads actually does. I went back and watched the presentation and it all makes sense now.
