Changing GATK base quality score input

I am working with low-quality data from a non-model organism. The average base quality in my reads is about 28. Are there any flags in GATK4 (or GATK3) that will lower the base quality threshold so that GATK assigns higher GQ values to these bases?

Thank you

Answers

  • bhanuGandham (Cambridge, MA; Member, Administrator, Broadie, Moderator)

    Hi @Nicole_S_T

    Can you give me some more details about which GATK tool you are using or need help with?

  • Nicole_S_T (Member)
    Hi @bhanuGandham,

    I am first creating GVCFs from the BAM files for each sample:
    gatk --java-options -Xmx1g HaplotypeCaller -R {input.ref} -I {input.bam} -ERC GVCF -O {output}

    Then I combine the GVCFs and finally genotype them to create a final VCF:
    gatk --java-options -Xmx1g GenotypeGVCFs -R {input.ref} -V {input.gvcf} -O {output}

    Within one of these steps, is it possible to adjust the base quality threshold? With my sequencing technology and sample quality, none of the bases are above a quality score of 30, which results in low genotype qualities in my final VCF.

    Thank you,
    Nicole
  • bhanuGandham (Cambridge, MA; Member, Administrator, Broadie, Moderator)
    edited March 7

    Hi @Nicole_S_T

    You could use the --standard-min-confidence-threshold-for-calling option in GenotypeGVCFs to set the minimum phred-scaled confidence threshold at which variants should be called.

    or

    You could also use the --min-base-quality-score option in HaplotypeCaller to set the minimum base quality required to consider a base for variant calling.
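For illustration, here is a rough sketch of where the two options would sit in the commands posted earlier in this thread. The file names and both threshold values are placeholders I made up for the example, not recommended settings, and the GVCF-combining step is left out. The script only builds and prints the commands so they can be reviewed before anything is run.

```shell
#!/usr/bin/env bash
# Sketch only: all file names and threshold values are illustrative assumptions.
REF="ref.fasta"              # hypothetical reference genome
BAM="sample.bam"             # hypothetical per-sample BAM
GVCF="sample.g.vcf.gz"       # hypothetical per-sample GVCF
COMBINED="combined.g.vcf.gz" # hypothetical combined GVCF (combining step not shown)
VCF="final.vcf.gz"           # hypothetical final VCF
MIN_BQ=10                    # assumed value for --min-base-quality-score
CALL_CONF=20                 # assumed value for --standard-min-confidence-threshold-for-calling

# Step 1: per-sample GVCF, setting the minimum base quality required
# for a base to be considered in variant calling.
HC_CMD="gatk --java-options -Xmx1g HaplotypeCaller -R $REF -I $BAM -ERC GVCF --min-base-quality-score $MIN_BQ -O $GVCF"

# Step 2: genotyping, setting the minimum phred-scaled confidence
# at which variants should be called.
GG_CMD="gatk --java-options -Xmx1g GenotypeGVCFs -R $REF -V $COMBINED --standard-min-confidence-threshold-for-calling $CALL_CONF -O $VCF"

# Print the commands for review instead of running them.
echo "$HC_CMD"
echo "$GG_CMD"
```

Once the printed commands look right for your data, they can be pasted into a terminal or a workflow rule. Keeping the thresholds in variables makes it easier to rerun with different values and compare the resulting GQ distributions.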

  • Nicole_S_T (Member)
    Thank you @bhanuGandham,

    I will try those flags. To follow up, I am working with Ion Torrent sequencing data that seems to systematically assign lower base quality scores. Is there a way to let GATK know what my expected range of quality scores is, so that it knows that the highest quality assigned to any base is, for example, 28?

    Best,
    Nicole
  • bhanuGandham (Cambridge, MA; Member, Administrator, Broadie, Moderator)

    Hi @Nicole_S_T

    I am not sure if GATK has a function for that other than the flags I mentioned.
