Downsampling in GenotypeGVCFs

MattB (Newcastle, Member)

The documentation for GenotypeGVCFs states that it is employing downsampling:

Downsampling settings
This tool applies the following downsampling settings by default.

To coverage: 1,000

I also see this in my output: "INFO 14:51:47,705 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000"

What specifically is it downsampling: the genotype likelihoods per locus, or per active region? Can this behaviour be changed via -dcov, or would changing it be inadvisable?

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)

    Hi there,

    Apologies for the confusion. GenotypeGVCFs doesn't actually do any downsampling; what you're seeing, both in the documentation and in the console output, is an artifact of how the code is structured. In programming terms, GenotypeGVCFs is a subtype of LocusWalker, i.e. a tool that traverses the genome one position at a time. Because of that, it inherits the properties of LocusWalkers, such as the ability to downsample reads, even though it does not use those properties itself.

    So this is not something you need to worry about or act on; the setting has no effect in this tool. I'll see if we can clean this up to avoid confusion in the future, but it may be difficult to allocate effort to it, since the effect is harmless and we have other, more important development priorities.
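
To make the inheritance point concrete, here is a minimal Java sketch of how a subclass can carry and report a setting it never reads. All class, field, and method names below are hypothetical stand-ins invented for illustration; this is not GATK source code, and the depth-capping logic is a deliberately crude placeholder for real downsampling.

```java
import java.util.List;

// Hypothetical names throughout; this is NOT GATK source code.

/** Stand-in for a base class of tools that traverse the genome by position. */
abstract class LocusWalkerSketch {
    // Default inherited by every subclass, whether or not it consumes reads.
    protected int targetCoverage = 1000;

    /** A generic engine could log this for any walker, used or not. */
    String describeDownsampling() {
        return "Downsampling Settings: Method: BY_SAMPLE, Target Coverage: " + targetCoverage;
    }

    abstract int process(List<Integer> input);
}

/** A walker that consumes reads, so the coverage cap changes its result. */
class ReadPileupWalkerSketch extends LocusWalkerSketch {
    @Override
    int process(List<Integer> readDepths) {
        int kept = 0;
        for (int depth : readDepths) {
            kept += Math.min(depth, targetCoverage); // crude cap per position
        }
        return kept;
    }
}

/** A walker that consumes variant records and never consults targetCoverage. */
class GenotypeGvcfsSketch extends LocusWalkerSketch {
    @Override
    int process(List<Integer> recordsPerSite) {
        int total = 0;
        for (int n : recordsPerSite) {
            total += n; // nothing is dropped; the inherited cap is ignored
        }
        return total;
    }
}

class InheritedSettingsDemo {
    public static void main(String[] args) {
        LocusWalkerSketch tool = new GenotypeGvcfsSketch();
        System.out.println(tool.describeDownsampling()); // still reports the default
        System.out.println(tool.process(List.of(5000, 3)));
    }
}
```

In this sketch both subclasses print the same downsampling line, but only the read-based one lets the value influence its output, which mirrors why the log message and documentation entry can appear for a tool that does no downsampling.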

  • MattB (Newcastle, Member)

    Hey, thanks for clearing that up. I did have a sneaking suspicion that it was an inherited artefact, but I always find sneaking suspicions the hardest to ignore.
