Question about BQSR
I have 100 samples and followed the Best Practices. Every step ran fine, but PrintReads failed with an error.
The GATK version is 3.8.
The commands I ran:
$BWA mem -t 4 -aM -R '@RG\tID:seq103\tSM:seq103\tPL:ILLUMINA\tLB:seq103' $GENOME 37_1_paird.fq 37_2_paird.fq > seq103.sam
java -XX:+UseSerialGC -jar $ReorderSam.jar I=$sample.sam O=$sample-reorder.sam R=$GENOME
java -jar $SamFormatConverter.jar I=$sample-reorder.sam O=$sample.bam
java -jar $SortSam.jar I=$sample.bam O=$sample-sort.bam SO=coordinate
java -jar $MarkDuplicates.jar MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=2000 I=$sample-sort.bam O=$sample-md.bam M=$sample-md.m
samtools index $sample-md.bam
java -Xmx10240m -jar $GATK -T HaplotypeCaller -R $GENOME -I $sample-md.bam -stand_call_conf 30 --emitRefConfidence GVCF -o $sample.raw1.g.vcf
java -jar $GATK -T BaseRecalibrator -R $GENOME -I seq103-md.bam -knownSites seq103.raw1.g.vcf -o seq103-BQSR.1.grp -bqsrBAQGOP 30 -nct 2
java -jar $GATK -T PrintReads -R $GENOME -I seq103-md.bam -BQSR seq103-BQSR.1.grp -o seq103.b1.bam -nct 2
It stopped at the beginning of PrintReads. The error message is:
ERROR MESSAGE: SAM/BAM/CRAM file [email protected]f6a5cc9 is malformed. Please see https://software.broadinstitute.org/gatk/documentation/article?id=1317 for more information.
Error details: the BAM file has a read with no stored bases (i.e. it uses '*') which is not supported in the GATK; see the --filter_bases_not_stored argument. Offender: K00132:80:H3WVJBBXX:5:1106:19258:44746
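To see which records the error refers to, the BAM can be dumped as SAM text and filtered on the SEQ column (column 10 in the SAM format), where '*' means the bases are not stored. The following is only a minimal sketch: the function name find_reads_without_bases is made up for illustration and is not part of GATK or samtools. It prints the read name and FLAG of each such record, so it is possible to check whether these are, for example, secondary alignments (FLAG bit 0x100 set).

```shell
# Sketch: list reads whose SEQ field is '*' in SAM text read from stdin.
# find_reads_without_bases is a hypothetical helper name, not a GATK or samtools command.
find_reads_without_bases() {
    # Skip header lines (they start with '@'); column 10 of a SAM record is SEQ.
    # Print the read name (column 1) and FLAG (column 2) of every matching record.
    awk -F '\t' '!/^@/ && $10 == "*" { print $1 "\t" $2 }'
}

# Example usage (assumes samtools is on PATH and seq103-md.bam exists):
# samtools view -h seq103-md.bam | find_reads_without_bases | head
```

If the offending read K00132:80:H3WVJBBXX:5:1106:19258:44746 shows up in this output, the FLAG value next to it indicates what kind of alignment record it is.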
So why does the BAM file contain reads with '*' in place of the bases? Is there something wrong with my commands?