GenotypeGVCFs

lawal, United Kingdom, Member

I was trying to combine the GVCF files for all my samples into a single VCF output using the command below:
java -d64 -Xmx48g -jar ${GATK}/GenomeAnalysisTK.jar \
-R ${REF} \
-T GenotypeGVCFs \
--variant A.g.vcf \
--variant B.g.vcf \
--variant C.g.vcf \
-stand_emit_conf 30 \
-stand_call_conf 30 \
-o genotype.vcf

but I got this error message:
“The following invalid GT allele index was encountered in the file: END=21994810”. I have tried to locate where the problem is coming from, but I do not understand the message. Could you please advise?

Answers

  • lawal, United Kingdom, Member

    Thank you Tommy. I found this in A.g.vcf only. I remember that I ran out of disk space partway through the run; I freed up more space later and regenerated A.g.vcf.

    1 21991582 . T . . END=21991582 GT:. END=21994810 GT:DP:GQ:MIN_DP:PL 0/0:36:96:35:0,96,1440

  • Sheila, Broad Institute, Member, Broadie, admin
    Accepted Answer

    @lawal
    Hi,

    So, the line you posted is from the regenerated GVCF? The issue is that there is an END position where the GT field should be.

    Did you restart HaplotypeCaller from the beginning after you ran out of disk space? Can you confirm you are using the latest version of GATK? You may just need to run HaplotypeCaller on sample A again to get a clean GVCF.

    Thanks,
    Sheila

  • lawal, United Kingdom, Member

    @Sheila Yes, I am using the latest GATK version, and I did restart HaplotypeCaller from the beginning. @Geraldine_VdAuwera, thank you; I will redo the job as advised to get a clean GVCF.
