How can I run BaseRecalibrator with an empty VCF file?

Dear all,
I have aligned my sequences against a made-up genome composed of several different genomes. Now I need to re-map (BQSR) the alignments using GATK. The command to do so is:
gatk BaseRecalibrator \
-R {ref}.fa \
-I {deduplicated_alignment}.bam \
-O {deduplicated_alignment}_recalibration.table \
--known-sites {ref}.vcf

Since the reference genome is essentially fake, there is no data on genome variability (or rather, it would take years to track down all the publications on the genetic variability of the many genomes I have pasted together).
Can I run this step WITHOUT the VCF file? Or does it make sense to create an empty VCF file? In that case, what values should I give to the different columns of the file?
Thank you

Answers

  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)

    Hi @Gigiux

    The BQSR algorithm treats every reference mismatch as an indication of error. However, real genetic variation is also expected to mismatch the reference, so it is critical to give the tool a database of known polymorphic sites (known sites) so that it can skip over them. If you gave it an empty VCF file, running BQSR would defeat its own purpose.

    Regards
    Bhanu
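
The effect described in the answer above can be illustrated with a toy calculation. The following is a minimal Python sketch, not GATK's actual implementation: the function `empirical_quality`, the toy data, and the add-one smoothing are all assumptions made for illustration, and the real BQSR model additionally groups observations by covariates such as read group and reported quality. It only shows how real variants left unmasked get counted as errors and pull the estimated quality down.

```python
import math


def empirical_quality(observations, known_sites):
    """Phred-scaled empirical quality from (position, read_base, ref_base) tuples.

    Positions in known_sites are skipped, so real variants at those
    positions are not counted as errors.
    """
    mismatches = 0
    total = 0
    for pos, read_base, ref_base in observations:
        if pos in known_sites:
            continue  # known polymorphic site: ignore it entirely
        total += 1
        if read_base != ref_base:
            mismatches += 1
    # Add-one smoothing (an assumption of this sketch) avoids log10(0)
    # when no mismatches are observed.
    error_rate = (mismatches + 1) / (total + 2)
    return -10 * math.log10(error_rate)


# Toy data: 1000 reference positions, all 'A' in the reference.
variant_positions = set(range(0, 1000, 100))  # 10 real variants
error_positions = {5}                         # 1 genuine sequencing error

observations = []
for pos in range(1000):
    is_mismatch = pos in variant_positions or pos in error_positions
    observations.append((pos, "G" if is_mismatch else "A", "A"))

# Known sites supplied: only the genuine error counts against quality.
q_masked = empirical_quality(observations, known_sites=variant_positions)

# Empty known-sites set (the "empty VCF" case): every real variant
# is counted as an error as well.
q_unmasked = empirical_quality(observations, known_sites=set())

print(f"with known sites:    Q = {q_masked:.1f}")
print(f"without known sites: Q = {q_unmasked:.1f}")
```

In this toy setup the estimate without known sites comes out lower than the estimate with them, because the ten real variants are treated as ten extra errors.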

  • Gigiux (Member)

    I understand that, but as I said, there are no known variants for this reference, so the VCF would be empty. Should I skip BQSR altogether, or is there another way of re-mapping the reads that does not require a VCF file? Thanks

  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)

    Hi @Gigiux

    BQSR does not do re-mapping. It only recalculates the base quality scores so that they are more accurate.
    Having said that, if you do not have known-sites information, then it's best to skip the BQSR step. This comes with the caveat that you will encounter false positives.

    Regards
    Bhanu
