Question about BQSR

I have 100 samples and followed the Best Practices. Every step ran fine until PrintReads, which failed with an error.

The GATK version is 3.8.

The commands I ran:
$BWA mem -t 4 -aM -R '@RG\tID:seq103\tSM:seq103\tPL:ILLUMINA\tLB:seq103' $GENOME 37_1_paird.fq 37_2_paird.fq > seq103.sam
java -XX:+UseSerialGC -jar $ReorderSam.jar I=$sample.sam O=$sample-reorder.sam R=$GENOME
java -jar $SamFormatConverter.jar I=$sample-reorder.sam O=$sample.bam
java -jar $SortSam.jar I=$sample.bam O=$sample-sort.bam SO=coordinate
java -jar $MarkDuplicates.jar MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=2000 I=$sample-sort.bam O=$sample-md.bam M=$sample-md.m
samtools index $sample-md.bam
java -Xmx10240m -jar $GATK -T HaplotypeCaller -R $GENOME -I $sample-md.bam -stand_call_conf 30 --emitRefConfidence GVCF -o $sample.raw1.g.vcf
java -jar $GATK -T BaseRecalibrator -R $GENOME -I seq103-md.bam -knownSites seq103.raw1.g.vcf -o seq103-BQSR.1.grp -bqsrBAQGOP 30 -nct 2
java -jar $GATK -T PrintReads -R $GENOME -I seq103-md.bam -BQSR seq103-BQSR.1.grp -o seq103.b1.bam -nct 2

It stopped at the beginning of PrintReads. The error message is:

ERROR MESSAGE: SAM/BAM/CRAM file htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter@4f6a5cc9 is malformed. Please see more information.

Error details: the BAM file has a read with no stored bases (i.e. it uses '*') which is not supported in the GATK; see the --filter_bases_not_stored argument. Offender: K00132:80:H3WVJBBXX:5:1106:19258:44746
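
The '*' in this error refers to the SEQ field (column 10) of a SAM record. One way to see where such records first appear is to count them in each intermediate file. The following is only a minimal sketch and not part of the original post: `count_no_bases` and `report_no_bases` are hypothetical helper names, and it assumes `samtools` is on the PATH and that the files follow the naming used in the commands above.

```shell
# Count SAM records whose SEQ field (column 10) is '*'.
# Reads SAM text on stdin; header lines (starting with '@') are skipped.
count_no_bases() {
    awk -F'\t' '!/^@/ && $10 == "*" { n++ } END { print n + 0 }'
}

# Hypothetical helper: report the count for each intermediate file of one
# sample, in pipeline order. Assumes samtools is installed and that the
# file names match the commands above.
report_no_bases() {
    sample=$1
    for f in "$sample.sam" "$sample-reorder.sam" "$sample.bam" \
             "$sample-sort.bam" "$sample-md.bam"; do
        [ -e "$f" ] || continue          # skip files that were not kept
        printf '%s\t%s\n' "$f" "$(samtools view "$f" | count_no_bases)"
    done
}
```

Calling `report_no_bases seq103` would print one line per existing file with the number of records that have no stored bases. Older samtools releases may need `-S` to read SAM input.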

So why do the BAM files contain '*'? Is there something wrong with my commands?
Thanks!


Best Answer


  • Sheila (Broad Institute; Member, Broadie, Moderator)


    Can you retrace your steps and find out exactly when the issue occurred? Can you try validating your input BAM file at each step with ValidateSamFile and letting us know which step throws an error? It looks like BaseRecalibrator from your post, but I just want to make sure the error did not arise in a previous step.


  • Thanks, Sheila!
    I think the error arose in the previous steps.
    Yesterday I tried to add two steps before the BaseRecalibrator and it worked.
    The steps I ran are RealignerTargetCreator and IndelRealigner.
    Command lines:
    java -jar $GATK -T RealignerTargetCreator -R $GENOME -I $sample-md.bam -o $sample-md.intervals
    java -jar $GATK -T IndelRealigner -R $GENOME -filterNoBases -targetIntervals $sample-md.intervals -I $sample-md.bam -o $sample-md_rl.bam
    Then I used the new BAM file to run BQSR. It did not report any errors.

    But I think these two steps are not necessary for the Best Practices, right?
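
The error message itself points to `--filter_bases_not_stored`, whose short form `-filterNoBases` is what the IndelRealigner command above passes. A minimal sketch of applying the same filter directly to the two BQSR commands from the question, without the realignment steps, might look like the following. `bqsr_cmds` is a hypothetical helper that only prints the commands for review. This has not been run against the poster's data, and the file names are taken from the original commands.

```shell
# Hypothetical helper: print (do not run) the BQSR commands from the question
# with the -filterNoBases read filter added. $GATK and $GENOME are the same
# variables used in the original commands and are left unexpanded on purpose.
bqsr_cmds() {
    sample=$1
    printf '%s\n' \
        "java -jar \$GATK -T BaseRecalibrator -R \$GENOME -filterNoBases -I $sample-md.bam -knownSites $sample.raw1.g.vcf -o $sample-BQSR.1.grp -bqsrBAQGOP 30 -nct 2" \
        "java -jar \$GATK -T PrintReads -R \$GENOME -filterNoBases -I $sample-md.bam -BQSR $sample-BQSR.1.grp -o $sample.b1.bam -nct 2"
}

bqsr_cmds seq103
```

After review, the printed lines can be pasted into a shell where `$GATK` and `$GENOME` are set.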

