GenotypeGVCFs: no records in VCF

kirill84, Canada, Member
edited July 2017 in Ask the GATK team

Dear GATK team,

I am having trouble calling genotypes on the *.gvcf files produced by HaplotypeCaller in GVCF mode.
When I run GenotypeGVCFs (GATK 3.5), the resulting VCF file contains only the header and no records.
I have not had this problem before.

Could you advise on the possible cause of the issue and how to fix it?

Here is the command and output:

java -Xmx12g  -Djava.io.tmpdir=./tmp -jar GenomeAnalysisTK.jar \
    -T GenotypeGVCFs \
    -R reference.fa \
    --variant  sample1.g.vcf \
    --variant  sample2.g.vcf \
    --variant  sample3.g.vcf \
    --variant  sample4.g.vcf \
    --variant  sample5.g.vcf \
    --variant  sample6.g.vcf \
    --variant  sample7.g.vcf \
    --variant  sample8.g.vcf \
    --num_threads 4 \
    -o TEST.gt.vcf

Note: the sample*.g.vcf files do contain SNPs/INDELs.

##fileformat=VCFv4.2
##ALT=<ID=NON_REF,Description="Represents any possible alternative allele at this location">
##FILTER=<ID=LowQual,Description="Low quality">
##FORMAT=<ID=AD,Number=.,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=MIN_DP,Number=1,Type=Integer,Description="Minimum DP observed within the GVCF block">
##FORMAT=<ID=PGT,Number=1,Type=String,Description="Physical phasing haplotype information, describing how the alternate alleles are phased in relation to one another">
##FORMAT=<ID=PID,Number=1,Type=String,Description="Physical phasing ID information, where each unique ID within a given sample (but not across samples) connects records within a phasing group">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification">
##FORMAT=<ID=RGQ,Number=1,Type=Integer,Description="Unconditional reference genotype confidence, encoded as a phred quality -10*log10 p(genotype call is wrong)">
##FORMAT=<ID=SB,Number=4,Type=Integer,Description="Per-sample component statistics which comprise the Fisher's Exact Test to detect strand bias.">
...
...
...
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  sample1    sample2    sample3    sample4    sample5    sample6    sample7    sample8

Thank you!

Answers

  • shlee, Cambridge, Member, Broadie, Moderator, admin

    Hi @kirill84,

    Can you try running the command without the --num_threads option and see whether the output is still empty? Also, be sure to use the latest release of GenotypeGVCFs, either v3.7 or GATK4-BETA, to rule out version-specific bugs that may have since been fixed.
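A minimal sketch of that check, assuming the same jar, reference, and JVM options as in the original command. The function names `genotype_single_threaded` and `count_vcf_records` are hypothetical helpers introduced here for illustration, not part of GATK. The first reruns GenotypeGVCFs with the `--num_threads` option simply left out; the second counts the non-header lines in a VCF, so a header-only output reports 0. Whether threading is actually the cause is not established in this thread; this only makes the suggested test easy to repeat.

```shell
# Hypothetical helper: rerun GenotypeGVCFs without --num_threads.
# Usage: genotype_single_threaded OUT.vcf sample1.g.vcf sample2.g.vcf ...
genotype_single_threaded() {
    out=$1
    shift
    n=$#
    # Rewrite the argument list "f1 f2 ..." as "--variant f1 --variant f2 ..."
    while [ "$n" -gt 0 ]; do
        set -- "$@" --variant "$1"
        shift
        n=$((n - 1))
    done
    # Same options as the original command, minus --num_threads.
    java -Xmx12g -Djava.io.tmpdir=./tmp -jar GenomeAnalysisTK.jar \
        -T GenotypeGVCFs \
        -R reference.fa \
        "$@" \
        -o "$out"
}

# Hypothetical helper: count data records (non-empty lines not starting with '#').
# grep -c exits non-zero when the count is 0, hence the "|| true".
count_vcf_records() {
    grep -c '^[^#]' "$1" || true
}
```

For example, `genotype_single_threaded TEST.gt.vcf sample*.g.vcf` followed by `count_vcf_records TEST.gt.vcf` shows whether the single-threaded run still yields a header-only file.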