Aneuploidy samples

Hideo (Member)
edited July 2012 in Ask the GATK team

Leishmania has 36 chromosomes, but their copy number is unpredictable for each strain, and chromosome copy number can change very quickly. So what is the optimal ploidy setting for organisms with extensive aneuploidy? So far we have used only the diploid setting. Some samples consistently have more heterozygous SNPs on higher-copy chromosomes, but this relationship does not hold in many other samples: overall, there is no strong correlation between chromosome copy number and the abundance of heterozygous SNPs.


  • Hideo (Member)

    Or, more specifically, can we change the ploidy setting for each chromosome while detecting variations?

  • Mark_DePristo (Broad Institute, Member)

    Guillermo may chime in, but I believe you will have to call each chromosome separately with a different ploidy setting in UnifiedGenotyper (UG). This would generalize to any intervals. If it were me, I'd create intervals of haploid copy number, diploid, etc., then call each of these with UG using -L and combine the resulting VCFs. We need to make this more convenient in the future.

  • delangel (Broad Institute, Member)

    Indeed - the current use case for the -ploidy argument in UnifiedGenotyper is to assume a single ploidy throughout. As Mark said, you should call each chromosome (or interval, or set of chromosomes sharing the same ploidy) separately using different -ploidy arguments.
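
The per-ploidy calling described in the two replies above could be scripted roughly as follows. This is only a sketch that builds the command lines, one UnifiedGenotyper call per group of chromosomes sharing a ploidy, without running them. The function name, the jar path, the output file names and the chromosome names are all hypothetical; only the -T, -R, -I, -L, -ploidy and -o arguments are taken from GATK 3 usage. Combining the resulting VCFs is left out.

```python
from collections import defaultdict


def build_ug_commands(ploidy_by_chrom, reference, bam,
                      gatk_jar="GenomeAnalysisTK.jar"):
    """Return {ploidy: command list}, one UnifiedGenotyper call per ploidy.

    ploidy_by_chrom maps a chromosome name to its assumed copy number.
    The jar path and output file names are placeholders, not GATK defaults.
    """
    # Group chromosomes that share the same ploidy.
    groups = defaultdict(list)
    for chrom, ploidy in ploidy_by_chrom.items():
        groups[ploidy].append(chrom)

    commands = {}
    for ploidy in sorted(groups):
        cmd = ["java", "-jar", gatk_jar,
               "-T", "UnifiedGenotyper",
               "-R", reference,
               "-I", bam,
               "-ploidy", str(ploidy)]
        # One -L per chromosome restricts this call to that ploidy group.
        for chrom in groups[ploidy]:
            cmd += ["-L", chrom]
        cmd += ["-o", f"calls.ploidy{ploidy}.vcf"]
        commands[ploidy] = cmd
    return commands


# Hypothetical example: two disomic chromosomes and one trisomic chromosome.
example = build_ug_commands(
    {"chr1": 2, "chr2": 3, "chr3": 2}, "ref.fasta", "sample.bam")
for ploidy, cmd in example.items():
    print(" ".join(cmd))
```

Each command list can be passed to `subprocess.run` once the paths are real. Grouping by ploidy rather than by chromosome keeps the number of calls small when most chromosomes share a copy number.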

  • Thank you for the replies. So in principle, population analysis of Leishmania, which has extensive aneuploidy, does not make sense, since a chromosome can have a very different copy number among the parasites in a population.
    Also, our experiments suggest that ploidy changes within several generations in Leishmania, so it is difficult to come up with a proper model.

  • In a coming version, would it be possible for GATK to automatically adjust the ploidy value for each chromosome if the user provides the most abundant ploidy status? For reasonable samples, it is easy to determine the ploidy value of a chromosome just from its median read depth. I do not think there are many organisms with ubiquitous aneuploidy, but for those that exist this would be useful.
    [First check the depth, then assign a ploidy value to each chromosome, and then do the analysis ...]
    But, biologically speaking, if aneuploidy is so ubiquitous, then SNPs are probably dominated by the diploid/monosomy status, since extra heterozygous SNPs will be washed away. I think that is the case for Leishmania.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Hi Hideo,

    That's an interesting feature idea. Right now we don't have the resources to make it a priority, but if you or someone else wants to implement it and send us a patch, we'd be happy to check it out and consider including it in a future release.

  • Since I have real data from over 200 samples with aneuploidy, I could possibly write it if I knew the guidelines and someone could tell me which section of the GATK code to look at to get started. But does it need to be in Java? I can use Java but have not used it for a long time. It seems the module could be really short: get a median depth from a portion of the genome that is longer than any possible indel (1 Mb?), and then assign ploidy values to the chromosomes after properly normalising the depths. [Properly normalising means just dividing each chromosome's depth by the median depth of all chromosomes. It is discussed in our paper.] It is very simple and it works most of the time.
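
The normalisation described in the post above might look like this as a small sketch. It assumes the per-chromosome median depths have already been computed elsewhere, and that the most abundant ploidy is supplied by the user (2 by default). The function name, the rounding to the nearest integer and the floor of 1 are assumptions for illustration and are not taken from the thread or from GATK.

```python
from statistics import median


def estimate_ploidy(median_depth_by_chrom, baseline_ploidy=2):
    """Estimate an integer ploidy per chromosome from median read depths.

    Each chromosome's depth is divided by the median depth across all
    chromosomes, then scaled by the most abundant (baseline) ploidy.
    """
    genome_median = median(median_depth_by_chrom.values())
    if genome_median <= 0:
        raise ValueError("median depth across chromosomes must be positive")

    ploidy = {}
    for chrom, depth in median_depth_by_chrom.items():
        scaled = baseline_ploidy * depth / genome_median
        # Assumption: round to the nearest integer and never go below 1,
        # since a caller needs a positive ploidy value.
        ploidy[chrom] = max(1, round(scaled))
    return ploidy


# Hypothetical depths: chr2 looks trisomic, chr4 tetrasomic, chr5 monosomic.
depths = {"chr1": 30, "chr2": 45, "chr3": 30, "chr4": 60, "chr5": 15}
print(estimate_ploidy(depths))
```

Depth ratios that fall halfway between two integers are ambiguous, so in practice such chromosomes would probably need to be inspected by hand rather than rounded automatically.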

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)
    edited August 2012

    I'm afraid it has to be in Java, yes. See the new Developer Zone category; we have started migrating the existing developer documentation there. Hopefully it should be enough to get you started.

    As a caveat, some articles may need to be updated slightly, so if you have trouble finding something referenced in the articles, or some commands don't give the expected results, please post a comment on those articles and we will check/update as necessary.

    Good luck!
