GenotypeGVCFs: no records in VCF

kirill84 (Canada; Member)
edited July 2017 in Ask the GATK team

Dear GATK team,

I am having trouble calling genotypes on the *.gvcf files produced by HaplotypeCaller in GVCF mode.
When I run GenotypeGVCFs (GATK 3.5), the resulting VCF file contains only the header and no records.
I had no such problem before.

Could you advise on possible causes of the issue and how to fix it?

Here is the command and output:

java -Xmx12g -Djava.io.tmpdir=./tmp -jar GenomeAnalysisTK.jar \
    -T GenotypeGVCFs \
    -R reference.fa \
    --variant sample1.g.vcf \
    --variant sample2.g.vcf \
    --variant sample3.g.vcf \
    --variant sample4.g.vcf \
    --variant sample5.g.vcf \
    --variant sample6.g.vcf \
    --variant sample7.g.vcf \
    --variant sample8.g.vcf \
    --num_threads 4 \
    -o TEST.gt.vcf

Note: there are SNPs/INDELs in the sample*.g.vcf files.

##fileformat=VCFv4.2
##ALT=<ID=NON_REF,Description="Represents any possible alternative allele at this location">
##FILTER=<ID=LowQual,Description="Low quality">
##FORMAT=<ID=AD,Number=.,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=MIN_DP,Number=1,Type=Integer,Description="Minimum DP observed within the GVCF block">
##FORMAT=<ID=PGT,Number=1,Type=String,Description="Physical phasing haplotype information, describing how the alternate alleles are phased in relation to one another">
##FORMAT=<ID=PID,Number=1,Type=String,Description="Physical phasing ID information, where each unique ID within a given sample (but not across samples) connects records within a phasing group">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification">
##FORMAT=<ID=RGQ,Number=1,Type=Integer,Description="Unconditional reference genotype confidence, encoded as a phred quality -10*log10 p(genotype call is wrong)">
##FORMAT=<ID=SB,Number=4,Type=Integer,Description="Per-sample component statistics which comprise the Fisher's Exact Test to detect strand bias.">
...
...
...
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  sample1    sample2    sample3    sample4    sample5    sample6    sample7    sample8

Thank you!


Answers

  • shlee (Cambridge; Member, Broadie, Moderator)

    Hi @kirill84,

    Can you try running the command without the --num_threads option (that is, single-threaded) and see if you still get an empty output? Also, be sure to use the latest release for GenotypeGVCFs, either v3.7 or GATK4-BETA, to rule out version-specific bugs that may have since been fixed.
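
The checks suggested in this thread could be scripted roughly as follows. This is only a sketch, assuming a POSIX shell with awk and the file names used in the question. The function names and the output name TEST.single.gt.vcf are hypothetical, chosen here for illustration. The rerun keeps the flags from the original command and only drops --num_threads.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, file names come from the post.

# Print the number of data records (non-header, non-empty lines) in a VCF.
# A result of 0 means the file contains only header lines.
count_records() {
    awk '!/^#/ && NF { n++ } END { print n + 0 }' "$1"
}

# Print the number of gVCF records whose ALT column (field 5) holds something
# other than the bare <NON_REF> placeholder, i.e. candidate variant sites
# rather than reference blocks.
count_variant_sites() {
    awk -F'\t' '!/^#/ && $5 != "<NON_REF>" && $5 != "." { n++ } END { print n + 0 }' "$1"
}

# Rerun GenotypeGVCFs with the original flags but without --num_threads.
# Assumes the inputs match sample*.g.vcf in the current directory.
rerun_single_threaded() {
    set --
    for gvcf in sample*.g.vcf; do
        set -- "$@" --variant "$gvcf"
    done
    java -Xmx12g -Djava.io.tmpdir=./tmp -jar GenomeAnalysisTK.jar \
        -T GenotypeGVCFs \
        -R reference.fa \
        "$@" \
        -o TEST.single.gt.vcf   # hypothetical output name
}
```

Calling count_variant_sites on each sample*.g.vcf checks that the inputs contain variant sites. Calling count_records on the output after rerun_single_threaded shows whether the result is still header-only, in which case it prints 0.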
