UnifiedGenotyper for less than 1X coverage

Is it possible to call variants with UnifiedGenotyper from data with less than 1X coverage per sample? We have a total of 25 samples, and each has a coverage of < 1X. Does setting the dcov parameter to 50 or lower work?

Best Answer


  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)


    First, you should consider using HaplotypeCaller instead of UnifiedGenotyper; it is a much more accurate variant caller.

    Second, what do you mean by <1x coverage? Can you describe your experimental design in a little more detail? That will help us give you the appropriate recommendations for your analysis.

  • gdal (Member)

    Thank you. Which parameters should we use for these data when running HaplotypeCaller? The samples were sequenced on 2 lanes on the Illumina platform in paired-end mode. We merged the paired-end reads and mapped them to the human reference genome using bwa; the average whole-genome coverage per sample is approximately 1X. Do you have any recommendations that would allow us to use GATK with these data, or is it impossible with such low coverage?

  • yzqheart, China (Member)

    @Geraldine_VdAuwera said:
    To be honest, that's much lower coverage than what our tools are designed to handle, but it might work if you call variants on all samples together and use very low confidence thresholds. To do this, supply all of your samples together to HaplotypeCaller (either as separate -I arguments or as a list of samples passed to a single -I argument), run in normal multisample mode (don't use the -ERC settings described in the Best Practices recommendations), and set -stand_call_conf 10 -stand_emit_conf 0. You may need to experiment with these last two settings to find appropriate values for your project. You may also need to use some more advanced arguments like --minPruning to recover enough sensitivity. You can read more about them here:


    I also have 6 low-coverage samples (2X each). I ran GATK in normal multisample mode (no -ERC settings, with all samples supplied together via -I arguments) as you described. Now I want to know each sample's genotype. Can I use GenotypeGVCFs to get each sample's genotype? Can GenotypeGVCFs be used on output from HC normal mode?
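
The multisample invocation described in the quoted advice can be sketched as a small Python helper that only assembles the command line. This is a rough illustration, not an official recipe: it assumes GATK 3.x-style syntax (`-T HaplotypeCaller`, `-stand_emit_conf`), which newer releases may not accept, and the jar path, file names, and the helper name `build_haplotypecaller_cmd` are all hypothetical.

```python
import shlex


def build_haplotypecaller_cmd(reference, bams, output_vcf,
                              call_conf=10, emit_conf=0, min_pruning=None,
                              gatk_jar="GenomeAnalysisTK.jar"):
    """Assemble a GATK 3.x-style HaplotypeCaller command for joint calling.

    All samples are passed together, one -I argument per BAM, with no
    -ERC setting (normal multisample mode). The jar name is a placeholder.
    """
    if not bams:
        raise ValueError("at least one BAM file is required")
    cmd = ["java", "-jar", gatk_jar, "-T", "HaplotypeCaller", "-R", reference]
    for bam in bams:
        cmd += ["-I", bam]
    # Low confidence thresholds, as suggested in the thread; tune per project.
    cmd += ["-stand_call_conf", str(call_conf),
            "-stand_emit_conf", str(emit_conf)]
    if min_pruning is not None:
        # Optional: a lower pruning threshold may recover some sensitivity.
        cmd += ["--minPruning", str(min_pruning)]
    cmd += ["-o", output_vcf]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    example = build_haplotypecaller_cmd(
        "human_ref.fasta", ["sample01.bam", "sample02.bam"], "cohort.vcf",
        min_pruning=1)
    print(" ".join(shlex.quote(arg) for arg in example))
```

Returning the command as an argument list rather than a single string keeps it safe to hand to `subprocess.run` without shell quoting issues.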


  • Sheila, Broad Institute (Member, Broadie, Moderator)


    The output of HaplotypeCaller in normal mode is a VCF that already contains the genotypes for each sample. Have a look at this document that describes the VCF content.
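
As a rough illustration of reading per-sample genotypes out of such a multisample VCF, the sketch below pulls the GT field for every sample column using plain text parsing. It assumes an uncompressed, tab-delimited VCF with a standard `#CHROM` header line; the function name `sample_genotypes` and the example record are made up for illustration and are not real output.

```python
def sample_genotypes(vcf_lines):
    """Yield (chrom, pos, ref, alt, {sample: GT}) for each VCF record.

    Minimal text parsing only: expects tab-delimited lines with the
    standard 9 fixed columns followed by one column per sample.
    """
    samples = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        format_keys = fields[8].split(":")
        if "GT" not in format_keys:
            continue  # record carries no genotype information
        gt_index = format_keys.index("GT")
        calls = {}
        for name, entry in zip(samples, fields[9:]):
            values = entry.split(":")
            # Treat a truncated sample entry as a missing call.
            calls[name] = values[gt_index] if gt_index < len(values) else "./."
        yield fields[0], int(fields[1]), fields[3], fields[4], calls


# Invented example record, not real HaplotypeCaller output.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsampleA\tsampleB",
    "chr1\t1000\t.\tA\tG\t35.2\t.\t.\tGT:AD:DP\t0/1:1,1:2\t./.",
]

if __name__ == "__main__":
    for record in sample_genotypes(EXAMPLE_VCF):
        print(record)
```

At very low coverage many samples will have no reads at a given site, so missing calls (`./.`) like the one for `sampleB` are to be expected.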


