unified genotyper for pooled data > Allele Number

_soo_ Seoul Korea Member

Hi,

I want to analyze pooled sequencing data, and I found that GATK has an option for this in UG: --ploidy.
So I set --ploidy 100, because this pool contains 50 individuals (100 chromosomes).
I also added -glm GENERALPLOIDYINDEL for indel calling.

Now I am looking at the raw VCF file, and I don't understand it,
because AN is 100 at every site.
So the AF, AC and MLEAF values also look weird to me.

I have one more question.
I want to run a few more steps to trim the raw VCF file, using SelectVariants and VariantFiltration.
But they don't seem to have any option for ploidy. How can I proceed without losing any information about ploidy?
Should I just avoid options like -selectType?

I hope you have a solution for my problem.
Thank you in advance.

Soo

Answers

  • delangel Broad Institute Member

    Hi there. I don't quite understand your question. If you run with -ploidy 100, then AN should always equal 100 (unless a site has no coverage), because that is the number of chromosomes you are calling together. Similarly, AC is the estimated alternate allele count in your pool, so it should always be between 0 and 100.
    SelectVariants and VariantFiltration are not ploidy-aware; they just work with all variants. If your input VCF comes from a higher-ploidy individual, the output will be the same.
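
To make the arithmetic in the answer above concrete, here is a minimal Python sketch. It assumes a single pooled sample with coverage at a biallelic site, so that AN equals the ploidy passed to the caller, and it treats AF as simply AC divided by AN. The function names and the example INFO string are made up for illustration; they are not GATK output or part of any GATK API.

```python
def pool_allele_stats(n_individuals, ploidy_per_individual, alt_count):
    """Return (AN, AF) for one pooled sample at a biallelic site.

    AN is the number of chromosomes called together; AF is AC / AN.
    """
    an = n_individuals * ploidy_per_individual
    if not 0 <= alt_count <= an:
        raise ValueError("AC must lie between 0 and AN")
    return an, alt_count / an


def parse_info(info):
    """Split a VCF INFO string such as 'AC=7;AF=0.07;AN=100' into a dict."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


# 50 diploid individuals pooled together -> 100 chromosomes.
an, af = pool_allele_stats(50, 2, 7)
print(an, af)

# Hypothetical INFO string, used only to show the consistency check.
info = parse_info("AC=7;AF=0.07;AN=100")
print(int(info["AC"]) / int(info["AN"]) == float(info["AF"]))
```

Under these assumptions, an AN of 100 at every covered site is what you would expect rather than a sign of a problem. MLEAF is reported separately in the VCF, so it need not match AC/AN exactly.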
