CombineGVCFs loses genotypes

dbecker, Munich, Member ✭✭✭


I want to use CombineGVCFs to merge the g.vcf files I get from HaplotypeCaller. When I do that, the called genotypes vanish. Here is a variant in the g.vcf of one sample before merging:

NC_000001       13273   .       G       C,<NON_REF>     1712.77 .       

And the same variant in the merged file (it is the second sample):

NC_000001       13273   .       G       C,<NON_REF>     .       .      
./.:.:0:0:0:0,0,0,0,0,0 ./.:.:54:99:35:0,102,1268,102,1268,1268
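
To make the per-sample columns above easier to read, here is a minimal Python sketch that decodes one of them. The FORMAT column is cut off in this excerpt, so the sketch assumes that the first colon-separated field is GT and the last one is PL; that is only an assumption about this file. The mapping from PL position to genotype follows the diploid ordering in the VCF specification. The function names are made up for illustration and are not part of GATK.

```python
def genotype_order(n_alleles):
    """Diploid genotypes in VCF PL order: genotype j/k sits at index k*(k+1)//2 + j."""
    return [(j, k) for k in range(n_alleles) for j in range(k + 1)]


def decode_sample(sample_field, alleles):
    """Split one sample column into its GT string and a genotype -> PL map.

    Assumption: GT is the first and PL the last colon-separated field.
    """
    fields = sample_field.split(":")
    gt = fields[0]
    pls = [int(value) for value in fields[-1].split(",")]
    order = genotype_order(len(alleles))
    if len(pls) != len(order):
        raise ValueError("PL count does not match the number of alleles")
    named = {f"{alleles[j]}/{alleles[k]}": pl for (j, k), pl in zip(order, pls)}
    return gt, named


# REF and ALT alleles of the record at NC_000001:13273
alleles = ["G", "C", "<NON_REF>"]
gt, pls = decode_sample("./.:.:54:99:35:0,102,1268,102,1268,1268", alleles)
print("GT:", gt)
for genotype, pl in pls.items():
    print(genotype, pl)
```

Read this way, the GT field of the second sample is a no-call (./.), but the record still carries one PL value per possible genotype.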

The GATK command line is:

/opt/gatk/ --java-options -Xmx32G CombineGVCFs
-R GRCh38_latest_genomic_final.fa
-V 17450281-WholeExome-171218_NS500396_0299_AHYV7NBGX3_raw_variants.g.vcf
-V 17380470-WholeExome-171218_NS500396_0299_AHYV7NBGX3_raw_variants.g.vcf
-V 17470830-WholeExome-171218_NS500396_0299_AHYV7NBGX3_raw_variants.g.vcf
-V 17470788-WholeExome-171218_NS500396_0299_AHYV7NBGX3_raw_variants.g.vcf
-V 17370765-WholeExome-171218_NS500396_0299_AHYV7NBGX3_raw_variants.g.vcf
-V 17370767-WholeExome-171218_NS500396_0299_AHYV7NBGX3_raw_variants.g.vcf
-V 17370768-WholeExome-171218_NS500396_0299_AHYV7NBGX3_raw_variants.g.vcf
-O cohort.g.vcf

When I run ValidateVariants on the merged file, I get the following error:

A USER ERROR has occurred: 
Input /srv/nfs/ngsdata/GATK/171218_NS500396_0299_AHYV7NBGX3/_gatk/cohort.g.vcf 
fails strict validation: one or more of the ALT allele(s) for the record at position NC_000001:13273 are not observed at all in the sample genotypes of type

Any ideas?

Thanks and best regards,

  • shlee, Cambridge, Member, Broadie ✭✭✭✭✭

    Hi @dbecker,

    Genotype with GenotypeGVCFs. Given your small number of samples, you can skip CombineGVCFs and directly genotype all of your sample gvcfs with GenotypeGVCFs.

  • dbecker, Munich, Member ✭✭✭


    So is it normal that the called genotypes vanish? Why is ValidateVariants giving me an error then?

    Also, we have a much larger number of samples, but I always merge my current run and then merge the result into our global cohort. Is this a bad approach?


  • dbecker, Munich, Member ✭✭✭


    I'm actually at the GATK workshop in Montreal at the moment and therefore have no access to my data.
    Every command I used is from GATK 4.0, but I'm pretty sure I missed the --validate-GVCF option. I'll try that when I'm back in the office in April.


  • shlee, Cambridge, Member, Broadie ✭✭✭✭✭

    @dbecker, I hope you are enjoying the workshop.

  • dbecker, Munich, Member ✭✭✭

    I don't get the error anymore when using the --validate-GVCF option. Thanks! Now I have a new problem, but I already found a bug report for it (https://github.com/broadinstitute/gatk/issues/4525). I'm looking forward to a fix for that.


  • Sheila, Broad Institute, Member, Broadie ✭✭✭✭✭

    Hi Daniel,

    The fix should be in soon. A developer is working on it currently.


