Generate figures from BQSR in GATK version 4

Dear all,

I am using GATK 4 and would like to create the figures that show the effect of BQSR on my BAM files.

I created the recalibration table and applied it with:
gatk BaseRecalibrator -I .bam -R <ref.fa> \
--known-sites .vcf -O .tab
gatk PrintReads -I .bam -R <ref.fa> \
--bqsr-recal-file .tab -O .bam

I then followed a manual for this step:
gatk BaseRecalibrator -I .bam -R <ref.fa> \
--known-sites .vcf --bqsr-recal-file .tab -O .tab

But I get the error:
A USER ERROR has occurred: bqsr-recal-file is not a recognized option

If I use --BQSR I get
A USER ERROR has occurred: BQSR is not a recognized option

The original command in the manual was for version 3:
java -jar GenomeAnalysisTK.jar \
-T BaseRecalibrator \
-R <ref.fa> \
-I .bam \
-knownSites .vcf \
-BQSR .tab \
-o .tab

followed by:
java -jar GenomeAnalysisTK.jar \
-T AnalyzeCovariates \
-R <ref.fa> \
-before .tab \
-after .tab \
-plots .pdf

What would be the correct syntax? Does this procedure carry over from GATK 3 to GATK 4?

Thank you

Answers

  • dbecker · Munich · Member ✭✭

    Hi,

    You create the recalibration table using BaseRecalibrator and recalibrate your BAM with ApplyBQSR.

    gatk BaseRecalibrator \
                -R ${reference} \
                -I ${sample}.bam \
                -O ${sample}_recal_data.table \
                --known-sites ${knownsite_dbsnp} \
                --known-sites ${knownsite_mills} \
                --known-sites ${knownsite_1KG}
    
    gatk ApplyBQSR \
                -R ${reference} \
                -I ${sample}.bam \
                -O ${sample}_recal_reads.bam \
                --bqsr-recal-file ${sample}_recal_data.table 
    

    I don't create the plots myself because it takes too much time. As I understand it, though, you can run BaseRecalibrator on the recalibrated BAM to create a second-pass table, and then plot both tables with AnalyzeCovariates.

    gatk BaseRecalibrator \
                -R ${reference} \
                -I ${sample}_recal_reads.bam \
                -O ${sample}_recal_data_second.table \
                --known-sites ${knownsite_dbsnp} \
                --known-sites ${knownsite_mills} \
                --known-sites ${knownsite_1KG}
    
    gatk AnalyzeCovariates \
         -before ${sample}_recal_data.table \
         -after ${sample}_recal_data_second.table \
         -plots AnalyzeCovariates.pdf
    

    Best,
    Daniel
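
The four steps above can be collected into one script. The following is only a sketch: the file names (`ref.fa`, `sample`, `known.vcf`), the `run` helper, and the `DRY_RUN` switch are illustrative assumptions, not part of GATK. Only the tool names and options already shown in this thread are used. By default it prints the commands instead of running them, so they can be checked first.

```shell
#!/usr/bin/env sh
set -eu

# Hypothetical placeholders: replace with your own files.
reference="${reference:-ref.fa}"
sample="${sample:-sample}"
known_sites="${known_sites:-known.vcf}"

# Helper (not a GATK feature): print the command when DRY_RUN=1,
# which is the default here; otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

bqsr_with_plots() {
  # First pass: build the recalibration table from the original BAM.
  run gatk BaseRecalibrator -R "$reference" -I "${sample}.bam" \
      --known-sites "$known_sites" -O "${sample}_recal_data.table"
  # Apply the table to the reads with ApplyBQSR.
  run gatk ApplyBQSR -R "$reference" -I "${sample}.bam" \
      --bqsr-recal-file "${sample}_recal_data.table" -O "${sample}_recal_reads.bam"
  # Second pass: build a table from the recalibrated BAM.
  run gatk BaseRecalibrator -R "$reference" -I "${sample}_recal_reads.bam" \
      --known-sites "$known_sites" -O "${sample}_recal_data_second.table"
  # Plot the before/after comparison.
  run gatk AnalyzeCovariates -before "${sample}_recal_data.table" \
      -after "${sample}_recal_data_second.table" \
      -plots "${sample}_AnalyzeCovariates.pdf"
}

bqsr_with_plots
```

Setting `DRY_RUN=0` in the environment would execute the commands instead of printing them, assuming `gatk` is on the PATH.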

  • Gigiux · Member
    Accepted Answer

    Thank you! Looks like it worked, although it requires the R packages ggplot2, gplots, reshape, and gsalib.
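
One way to check for those R packages before running AnalyzeCovariates is sketched below. The helper name `r_check_expr` is made up for illustration; it only builds a small R expression that lists which of the named packages are not installed, and the script runs it through `Rscript` if that is available on the PATH.

```shell
#!/usr/bin/env sh
# Packages named in this thread as needed for the plots.
pkgs="ggplot2 gplots reshape gsalib"

# Hypothetical helper: build an R expression that reports which of the
# given packages are missing from the local R library.
r_check_expr() {
  quoted=""
  for p in "$@"; do
    quoted="${quoted:+$quoted, }\"$p\""
  done
  printf 'miss <- setdiff(c(%s), rownames(installed.packages())); if (length(miss)) cat("missing:", miss, "\\n") else cat("all present\\n")' "$quoted"
}

# Word splitting of $pkgs is intentional: one argument per package.
r_expr=$(r_check_expr $pkgs)

if command -v Rscript >/dev/null 2>&1; then
  Rscript -e "$r_expr"
else
  echo "Rscript not found on PATH; install R first" >&2
fi
```

Any package reported as missing can then be installed from within R, for example with `install.packages()`.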

  • dbecker · Munich · Member ✭✭
    edited October 2018

    Hi,

    You need a few things besides GATK to run the Best Practices. You can find more information here:

    https://gatkforums.broadinstitute.org/gatk/discussion/2899/howto-install-all-software-packages-required-to-follow-the-gatk-best-practices

    Best,
    Daniel
