vcf invalid GT allele index

Hi GATK team,
I tried to subset a VCF file with SelectVariants, but I got this error: The following invalid GT allele index was encountered in the file: "0. The subsetted VCF contains only the header lines and the column header line, with no variant site records. I then ran ValidateVariants on the VCF file, and everything seemed OK. I am struggling to understand what is wrong with my VCF. Can you please help? :)

Answers

  • fengtao Member
    > @bhanuGandham said:
    > Hi @fengtao
    >
    > Please post the exact command you are using, the entire error log, and the version of GATK you are using.
    > Thank you.
    >
    > Regards
    > Bhanu

    Hi Bhanu,

    I am using gatk-4.0.8.1.

    My command is:
    gatk SelectVariants -R ~/hulianlian_2018.12/RcScaffold28543.fa --variant ~/hulianlian_2018.12/final.pass.SNP28543.t0304_26283to33795.vcf -O ~/hulianlian_2018.12/genotype_vcf/masaimala -sn masaimala

    The error message is:

    htsjdk.tribble.TribbleException$InternalCodecException: The following invalid GT allele index was encountered in the file: "0
    at htsjdk.variant.vcf.AbstractVCFCodec.oneAllele(AbstractVCFCodec.java:476)
    at htsjdk.variant.vcf.AbstractVCFCodec.parseGenotypeAlleles(AbstractVCFCodec.java:500)
    at htsjdk.variant.vcf.AbstractVCFCodec.createGenotypeMap(AbstractVCFCodec.java:743)
    at htsjdk.variant.vcf.AbstractVCFCodec$LazyVCFGenotypesParser.parse(AbstractVCFCodec.java:132)
    at htsjdk.variant.variantcontext.LazyGenotypesContext.decode(LazyGenotypesContext.java:158)
    at htsjdk.variant.variantcontext.LazyGenotypesContext.getGenotypes(LazyGenotypesContext.java:148)
    at htsjdk.variant.variantcontext.GenotypesContext.iterator(GenotypesContext.java:465)
    at org.broadinstitute.hellbender.tools.walkers.variantutils.SelectVariants.initalizeAlleleAnyploidIndicesCache(SelectVariants.java:624)
    at org.broadinstitute.hellbender.tools.walkers.variantutils.SelectVariants.apply(SelectVariants.java:563)
    at org.broadinstitute.hellbender.engine.VariantWalkerBase.lambda$traverse$0(VariantWalkerBase.java:151)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
    at org.broadinstitute.hellbender.engine.VariantWalkerBase.traverse(VariantWalkerBase.java:149)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:979)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:137)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:182)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:201)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
    at org.broadinstitute.hellbender.Main.main(Main.java:289)

    Thanks,
    Tao
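
The exception text suggests that the parser met a GT allele token beginning with a double quote ("0). As a rough way to locate such records before rerunning SelectVariants, the sketch below scans VCF lines and reports every GT value whose alleles are not a non-negative integer or "." (missing). It is a plain-text check written for illustration, not part of GATK or htsjdk; the function name find_bad_gt and the demo sample names s1 and s2 are hypothetical.

```python
import re

def find_bad_gt(lines):
    """Yield (line_number, sample, gt) for each GT value with an invalid allele token."""
    samples = []
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        if len(fields) < 10:
            continue  # record has no genotype columns
        keys = fields[8].split(":")
        if "GT" not in keys:
            continue
        gt_pos = keys.index("GT")
        for i, col in enumerate(fields[9:]):
            parts = col.split(":")
            if gt_pos >= len(parts):
                continue
            gt = parts[gt_pos]
            # Alleles are separated by "/" (unphased) or "|" (phased).
            alleles = re.split(r"[/|]", gt)
            if not all(re.fullmatch(r"\d+|\.", a) for a in alleles):
                name = samples[i] if i < len(samples) else f"column {i + 10}"
                yield lineno, name, gt

# Hypothetical three-line VCF; the last genotype carries a stray double quote.
demo = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:12\t\"0/1:9",
]
for hit in find_bad_gt(demo):
    print(hit)
```

For a real file, the same function can be fed an open file handle, for example `with open(path) as fh: hits = list(find_bad_gt(fh))`. A hit only shows where the text is malformed; it does not explain how the stray character got there.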
  • fengtao Member
    Hi Bhanu,

    I finally figured out the problem with my VCF: the header lines were incomplete.

    Another question:

    I want to generate FASTA sequences for all genotypes from a VCF file that contains more than 400 samples. I found a post on Biostars that does this, but it requires running the process once per genotype, which means 400 times in my case. Is there a way to do this for all samples at once?

    Best regards,
    Tao
  • bhanuGandham Cambridge MA Member, Administrator, Broadie, Moderator

    Hi @fengtao

    Another user had a similar question. Please follow this thread for the suggested solution.
    https://gatkforums.broadinstitute.org/gatk/discussion/8035/vcf-to-fasta
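
Without presuming what the linked thread recommends, here is a minimal sketch of what handling all samples in one pass could look like: read the VCF once and keep one editable copy of the reference sequence per sample. It makes several simplifying assumptions that may not suit real data: a single contig held in memory, SNP records only (indels and symbolic alleles are skipped), only the first allele of each genotype is used, and a missing genotype becomes N. The names consensus_per_sample and to_fasta, and the demo data, are hypothetical and not part of GATK.

```python
import re

BASES = set("ACGTN")

def consensus_per_sample(ref_seq, contig, vcf_lines):
    """Return {sample: sequence}, applying each sample's first GT allele at SNP sites."""
    samples, seqs = [], {}
    for line in vcf_lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("##"):
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]
            seqs = {s: list(ref_seq) for s in samples}  # one mutable copy per sample
            continue
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        keys = fields[8].split(":")
        if chrom != contig or "GT" not in keys:
            continue
        alleles = [ref] if alt == "." else [ref] + alt.split(",")
        if any(a.upper() not in BASES for a in alleles):
            continue  # assumption: only single-base alleles are handled
        gt_pos = keys.index("GT")
        for sample, col in zip(samples, fields[9:]):
            first = re.split(r"[/|]", col.split(":")[gt_pos])[0]
            # Missing call -> N; otherwise look up the allele by its index.
            base = "N" if first == "." else alleles[int(first)]
            seqs[sample][pos - 1] = base  # VCF positions are 1-based
    return {s: "".join(chars) for s, chars in seqs.items()}

def to_fasta(seqs, width=60):
    """Format {name: sequence} as FASTA text with fixed-width lines."""
    out = []
    for name, seq in seqs.items():
        out.append(f">{name}")
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"

# Hypothetical 8-base reference and three SNP records for two samples.
demo_ref = "ACGTACGT"
demo_vcf = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2",
    "scaf1\t2\t.\tC\tT\t.\tPASS\t.\tGT\t1/1\t0/1",
    "scaf1\t5\t.\tA\tG,C\t.\tPASS\t.\tGT\t0/0\t2|1",
    "scaf1\t7\t.\tG\tA\t.\tPASS\t.\tGT\t./.\t1/1",
]
demo_seqs = consensus_per_sample(demo_ref, "scaf1", demo_vcf)
print(to_fasta(demo_seqs), end="")
```

Taking only the first allele discards heterozygous information. Whether that is acceptable, or whether IUPAC ambiguity codes or separate haplotype sequences are needed, depends on the downstream analysis.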
