
unified genotyper for pooled data > Allele Number

_soo_ (Seoul, Korea; Member)


I want to analyze pooled sequencing data, and I found that GATK's UnifiedGenotyper (UG) has an option for this: --ploidy.
So I set --ploidy 100, because this pool contains 50 individuals (100 chromosomes).
I also added -glm GENERALPLOIDYINDEL for indel calling.

Now I am looking at the raw VCF file, and I don't understand it,
because AN is exactly 100 at every site.
As a result, the AF, AC and MLEAF values also look odd to me.

I have one more question.
I want to run a few more steps to trim the raw VCF file, using SelectVariants and VariantFiltration.
But they don't seem to have any option for ploidy. How can I proceed without losing any information about ploidy?
Should I just leave out options like -selectType?

I hope you have a solution for my problem.
Thank you in advance.



  • delangel (Broad Institute; Member)

    Hi there, I don't quite understand your question. If you run with -ploidy 100, then AN should always equal 100 (unless a site has no coverage), because that is the number of chromosomes you are calling together. Similarly, AC is the estimated alternate allele count in your pool, so it should always be between 0 and 100.
    SelectVariants and VariantFiltration are not ploidy-aware; they just work with all variants. If your input VCF comes from a higher-ploidy individual, the output will be the same.
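
To make the relationship between AN, AC and AF in the reply above concrete, here is a minimal Python sketch that checks one INFO field from a single-pool VCF against the ploidy used for calling. The function names and the example INFO string are made up for illustration and are not GATK output; the sketch assumes a single pooled sample, so that AN is expected to equal the ploidy wherever the site is covered.

```python
def parse_info(info):
    """Turn a VCF INFO string such as 'AC=7;AF=0.070;AN=100' into a dict."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value  # flag fields without '=' map to an empty string
    return fields


def summarize_pooled_site(info, ploidy):
    """Compare AN and AC with the pool ploidy and recompute AC/AN per alt allele."""
    fields = parse_info(info)
    an = int(fields["AN"])
    # AC holds one count per alternate allele, comma-separated
    acs = [int(x) for x in fields["AC"].split(",")]
    return {
        # For one pooled sample, AN should equal the ploidy at covered sites
        "an_matches_ploidy": an == ploidy,
        # Each alternate allele count must lie between 0 and AN
        "ac_in_range": all(0 <= ac <= an for ac in acs),
        # Allele frequency implied by the counts (empty if AN is 0)
        "af_from_counts": [ac / an for ac in acs] if an > 0 else [],
    }


# Hypothetical INFO field for a pool of 50 diploid individuals (ploidy 100)
EXAMPLE_INFO = "AC=7;AF=0.070;AN=100;DP=2500"
print(summarize_pooled_site(EXAMPLE_INFO, ploidy=100))
```

With ploidy 100, a site reporting AN=100 and AC=7 implies an alternate allele frequency of 7/100 in the pool, so AN being 100 at every covered site is the expected result rather than a sign of a problem.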
