Joint Genotyping

Hi Team,

This is a follow-up to /discussion/5304/haplotypecaller-treatment-of-scaffolds

I am running joint genotyping on a set of ~80 individuals (scaffold by scaffold).
I am using -nt 16.

I have the following problems:

A| I'm getting a lot of these warnings (I didn't see as many when I ran HaplotypeCaller). Do I need to worry?

ExactAFCalculator - this tool is currently set to genotype at most 6 alternate alleles in a given context, but the context at scaffold_3:4434354 has 7 alternate alleles so only the top alleles will be used; see the --max_alternate_alleles argument

B| The job ends with the following:

INFO 04:11:46,918 ProgressMeter - scaffold_3:6037801 5037839.0 2.3 h 27.1 m 10.4% 21.8 h 19.5 h
INFO 04:11:54,984 ProgressMeter - done 6037839.0 2.3 h 22.6 m 10.4% 21.8 h 19.5 h
INFO 04:11:54,984 ProgressMeter - Total runtime 8198.11 secs, 136.64 min, 2.28 hours

So the elapsed time is 2.3 h and the job reports "done", but the progress meter still shows 10.4% complete with 19.5 h remaining. Is there a problem with multithreading?

Thanks!
Alexander

Answers

  • robertb (toronto, Member)

    I'm using GATK version 3.5, and HaplotypeCaller is giving me the same warning even though I'm calling a single low-coverage sample taken from 1000GP. Is that still to be expected? Thanks.

    WARN 20:19:27,210 ExactAFCalculator - this tool is currently set to genotype at most 6 alternate alleles in a given context, but the context at chr2:89853227 has 9 alternate alleles so only the top alleles will be used; see the --max_alternate_alleles argument

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @robertb
    Hi,

    There is no need to worry about the WARN message. It's just letting you know there are more than 6 alternate alleles at the position. You can have a look at the position in IGV to see what those alternate alleles are, or you can set the --max_alternate_alleles value to a higher number. HaplotypeCaller is designed to be very sensitive, so any potential alternate alleles are reported in the GVCF. The default value is set to keep run time shorter.

    -Sheila
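
To decide which positions to inspect in IGV, or how high --max_alternate_alleles would need to be, it can help to pull the affected sites out of the log first. The following is a minimal sketch, not part of GATK: the function names `parse_warnings` and `summarize` are hypothetical, and the regular expression assumes the warning is worded exactly as in the messages quoted in this thread (other GATK versions may phrase it differently).

```python
import re
from collections import Counter

# Pattern modelled on the warning text quoted in this thread (an assumption;
# other GATK versions may word the message differently).
WARN_RE = re.compile(
    r"ExactAFCalculator - .*?at most (?P<limit>\d+) alternate alleles"
    r".*?context at (?P<contig>[^\s:]+):(?P<pos>\d+) has (?P<n_alt>\d+) alternate alleles"
)


def parse_warnings(lines):
    """Return a (contig, position, n_alt, limit) tuple for each matching warning line."""
    sites = []
    for line in lines:
        m = WARN_RE.search(line)
        if m:
            sites.append(
                (m["contig"], int(m["pos"]), int(m["n_alt"]), int(m["limit"]))
            )
    return sites


def summarize(sites):
    """Map each observed alternate-allele count to the number of sites showing it."""
    return dict(Counter(n_alt for _, _, n_alt, _ in sites))


# Demo using the two warnings quoted in this thread.
demo_log = [
    "ExactAFCalculator - this tool is currently set to genotype at most 6 "
    "alternate alleles in a given context, but the context at "
    "scaffold_3:4434354 has 7 alternate alleles so only the top alleles "
    "will be used; see the --max_alternate_alleles argument",
    "WARN 20:19:27,210 ExactAFCalculator - this tool is currently set to "
    "genotype at most 6 alternate alleles in a given context, but the "
    "context at chr2:89853227 has 9 alternate alleles so only the top "
    "alleles will be used",
]

demo_sites = parse_warnings(demo_log)
for contig, pos, n_alt, limit in demo_sites:
    # Each line gives a locus that can be pasted into IGV.
    print(f"{contig}:{pos}\t{n_alt} alt alleles (limit {limit})")

# The largest count seen is the smallest limit that would silence every warning.
print("largest alt-allele count seen:", max(n for _, _, n, _ in demo_sites))
```

For a real run, the list of lines could come from the captured log instead, for example `parse_warnings(open("genotype.log"))`, where `genotype.log` is a placeholder file name.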