Generate figures from BQSR in GATK version 4

Dear all,

I am using GATK 4 and I would like to create the figures that show the effect of BQSR on my BAM files.

I created the recalibration table and applied it with:
gatk BaseRecalibrator -I .bam -R <ref.fa> \
--known-sites .vcf -O .tab
gatk PrintReads -I .bam -R <ref.fa> \
--bqsr-recal-file .tab -O .bam

I then followed a manual for this step:
gatk BaseRecalibrator -I .bam -R <ref.fa> \
--known-sites .vcf --bqsr-recal-file .tab -O .tab

But I get the error:
A USER ERROR has occurred: bqsr-recal-file is not a recognized option

If I use --BQSR I get
A USER ERROR has occurred: BQSR is not a recognized option

The original command in the manual was for version 3:
java -jar GenomeAnalysisTK.jar \
-T BaseRecalibrator \
-R <ref.fa> \
-I .bam \
-knownSites .vcf \
-BQSR .tab \
-o .tab

followed by:
java -jar GenomeAnalysisTK.jar \
-T AnalyzeCovariates \
-R <ref.fa> \
-before .tab \
-after .tab \
-plots .pdf

What would be the correct syntax in GATK 4? Is this procedure still supported when moving from GATK 3 to GATK 4?

Thank you

Answers

  • dbecker, Munich, Member ✭✭✭

    Hi,

    You create the recalibration table with BaseRecalibrator and recalibrate your BAM with ApplyBQSR.

    gatk BaseRecalibrator \
                -R ${reference} \
                -I ${sample}.bam \
                -O ${sample}_recal_data.table \
                --known-sites ${knownsite_dbsnp} \
                --known-sites ${knownsite_mills} \
                --known-sites ${knownsite_1KG}
    
    gatk ApplyBQSR \
                -R ${reference} \
                -I ${sample}.bam \
                -O ${sample}_recal_reads.bam \
                --bqsr-recal-file ${sample}_recal_data.table 
    

    I don't create the plots myself since it takes too much time. But as I understand it, you can run BaseRecalibrator on the recalibrated BAM to create the second-pass table, and then plot both tables with AnalyzeCovariates.

    gatk BaseRecalibrator \
                -R ${reference} \
                -I ${sample}_recal_reads.bam \
                -O ${sample}_recal_data_second.table \
                --known-sites ${knownsite_dbsnp} \
                --known-sites ${knownsite_mills} \
                --known-sites ${knownsite_1KG}
    
    gatk AnalyzeCovariates \
         -before ${sample}_recal_data.table \
         -after ${sample}_recal_data_second.table \
         -plots AnalyzeCovariates.pdf
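
The four commands above could be chained in one small script. What follows is only a sketch: it assumes `gatk` is on the PATH and that file names follow the `${sample}_...` pattern used above. `run_step`, `bqsr_with_plots` and the `DRY_RUN` variable are hypothetical names made up for this illustration, not part of GATK.

```shell
#!/bin/sh
# Sketch only: chains the BaseRecalibrator / ApplyBQSR / AnalyzeCovariates
# commands shown above. run_step, bqsr_with_plots and DRY_RUN are
# hypothetical helpers for this example, not GATK features.

# Print the command when DRY_RUN=1, otherwise execute it.
run_step() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: bqsr_with_plots <reference.fa> <sample-prefix> <known-sites.vcf>...
bqsr_with_plots() {
    if [ "$#" -lt 3 ]; then
        echo "usage: bqsr_with_plots <reference.fa> <sample-prefix> <known-sites.vcf>..." >&2
        return 2
    fi
    reference=$1
    sample=$2
    shift 2

    # Turn each remaining argument into a "--known-sites <vcf>" pair.
    n=$#
    while [ "$n" -gt 0 ]; do
        set -- "$@" --known-sites "$1"
        shift
        n=$((n - 1))
    done

    # First pass: table from the original BAM.
    run_step gatk BaseRecalibrator -R "$reference" -I "${sample}.bam" \
        -O "${sample}_recal_data.table" "$@" || return 1
    # Apply the recalibration.
    run_step gatk ApplyBQSR -R "$reference" -I "${sample}.bam" \
        -O "${sample}_recal_reads.bam" \
        --bqsr-recal-file "${sample}_recal_data.table" || return 1
    # Second pass: table from the recalibrated BAM.
    run_step gatk BaseRecalibrator -R "$reference" -I "${sample}_recal_reads.bam" \
        -O "${sample}_recal_data_second.table" "$@" || return 1
    # Plot before/after.
    run_step gatk AnalyzeCovariates \
        -before "${sample}_recal_data.table" \
        -after "${sample}_recal_data_second.table" \
        -plots AnalyzeCovariates.pdf
}
```

After sourcing the file, calling `bqsr_with_plots` with `DRY_RUN=1` set only prints the four commands, which is a cheap way to check the file names before starting the long-running steps.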
    

    Best,
    Daniel

  • Gigiux, Member
    Accepted Answer

    Thank you! It looks like it worked, although AnalyzeCovariates requires the R packages ggplot2, gplots, reshape and gsalib.
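
One way to find out beforehand whether those R packages are present is a quick check from the shell. This is a rough sketch that assumes `Rscript` is on the PATH; `r_check_expr` and `check_r_packages` are hypothetical helper names invented for this example.

```shell
#!/bin/sh
# Sketch only: report which R packages cannot be loaded.
# r_check_expr and check_r_packages are hypothetical helper names.

# Build an R expression that prints every listed package that is not installed.
r_check_expr() {
    pkgs=""
    for p in "$@"; do
        pkgs="${pkgs:+$pkgs, }\"$p\""
    done
    printf 'pk <- c(%s); cat(pk[!sapply(pk, requireNamespace, quietly = TRUE)], sep = "\\n")\n' "$pkgs"
}

# Return 0 if all packages load, 1 if some are missing, 2 if Rscript is absent.
check_r_packages() {
    if ! command -v Rscript >/dev/null 2>&1; then
        echo "Rscript not found on PATH" >&2
        return 2
    fi
    missing=$(Rscript -e "$(r_check_expr "$@")")
    if [ -n "$missing" ]; then
        echo "Missing R packages:" $missing >&2
        return 1
    fi
    return 0
}
```

For the packages named above the call would be `check_r_packages ggplot2 gplots reshape gsalib`. Anything it reports as missing can then be installed from within R, for instance with `install.packages`.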

  • dbecker, Munich, Member ✭✭✭
    edited October 2018

    Hi,

    You need a few things besides GATK to run the Best Practices. You can get more information here:

    https://gatkforums.broadinstitute.org/gatk/discussion/2899/howto-install-all-software-packages-required-to-follow-the-gatk-best-practices

    Best,
    Daniel
