Please add an explicit type tag :NAME

Hi,

I am using the VariantsToTable walker to convert my vcfs to tabular format. However, I keep getting the following error:

Invalid command line: No tribble type was provided on the command line and the type of the file could not be determined dynamically. Please add an explicit type tag :NAME listing the correct type from among the supported types

The problem is that I don't understand what is meant by the advice. How do you actually provide the :NAME tag in the command line - I have tried a number of ways and nothing seems to work and I can't find any reference to this tag in the documentation.

Best wishes,

Kath

Answers

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie admin)

    Hi Kath,

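    You attach the tag to the argument name, after a colon, e.g. --variant:OLDDBSNP dbSNP128.txt. Here OLDDBSNP is the type tag for the file dbSNP128.txt.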

  • Kath (Member)

    Thanks for that. Now it is showing the following error:

    Your input file has a malformed header: We never saw the required CHROM header line (starting with one #) for the input VCF file

    But I have checked, and the file does have a correct header line starting with #CHROM. The VCF has been filtered using SelectVariants. Might this have introduced something into the header that VariantsToTable doesn't like?

    Kath

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie admin)

    You mean the VCF output by SelectVariants is causing the issue? That's worrying -- maybe something went wrong at the SelectVariants step. You definitely wouldn't have needed to specify the type for a regular VCF. Did you do any processing on that VCF file with a non-GATK tool? And can you tell me what version you're using?

  • Kath (Member)

    The only non-GATK tools I have used on the VCFs are snpeff and snpsift, which don't normally cause problems with VariantsToTable. I'm using v2.7-2 of GATK.

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie admin)

    Do you have the intermediate VCFs before/after each processing step? If so it would be good to test them to see at what stage the malformation issue happened.
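One way to run that test without GATK is to scan each intermediate file for lines that appear before the #CHROM line but do not start with #. The sketch below is only an illustration: the function name find_stray_lines is made up for this example, and it assumes plain-text (uncompressed) VCFs.

```python
import sys
from typing import List, Tuple


def find_stray_lines(vcf_path: str) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every non-blank line that appears
    before the #CHROM line and does not start with '#'.

    If the file has no #CHROM line at all, every non-'#' line is reported,
    which includes the variant records themselves.
    """
    stray = []
    with open(vcf_path, "r", encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if line.startswith("#CHROM"):
                break  # header is finished; stop scanning
            if line.strip() and not line.startswith("#"):
                stray.append((number, line.rstrip("\n")))
    return stray


if __name__ == "__main__":
    # Pass the intermediate VCFs in processing order (paths are up to you).
    for path in sys.argv[1:]:
        problems = find_stray_lines(path)
        status = "OK" if not problems else f"{len(problems)} stray line(s)"
        print(f"{path}: {status}")
        for number, text in problems[:5]:
            print(f"  line {number}: {text[:80]}")
```

The first file in the chain that reports stray lines points to the step that introduced them.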

  • Kath (Member)

    Ah, I've had a look at the headers myself and it seems that SelectVariants adds the following lines to the start of the header (none of which begin with a #):

    INFO 08:20:18,462 HelpFormatter - --------------------------------------------------------------------------------
    INFO 08:20:18,464 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.7-2-g6bda569, Compiled 2013/08/28 16:30:29
    INFO 08:20:18,464 HelpFormatter - Copyright (c) 2010 The Broad Institute
    INFO 08:20:18,465 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
    INFO 08:20:18,468 HelpFormatter - Program Args: -T SelectVariants -R /ifs/mirror/genomes/bwa/human_g1k_v37.fa --variant variants/processed.Trio8.haplotypeCaller.snpsift.vcf -select vc.getGenotype("processed.Trio8-AF023").getDP()>=10&&vc.getGenotype("processed.Trio8-AF024").getDP()>=10&&vc.getGenotype("processed.Trio8-AF022").getPL().0>20&&vc.getGenotype("processed.Trio8-AF022").getPL().1==0&&vc.getGenotype("processed.Trio8-AF022").getPL().2>0&&vc.getGenotype("processed.Trio8-AF023").getPL().0==0&&vc.getGenotype("processed.Trio8-AF023").getPL().1>20&&vc.getGenotype("processed.Trio8-AF023").getPL().2>20&&vc.getGenotype("processed.Trio8-AF024").getPL().0==0&&vc.getGenotype("processed.Trio8-AF024").getPL().1>20&&vc.getGenotype("processed.Trio8-AF024").getPL().2>20&&vc.getGenotype("processed.Trio8-AF022").getAD().1>=3
    INFO 08:20:18,468 HelpFormatter - Date/Time: 2013/11/06 08:20:18
    INFO 08:20:18,469 HelpFormatter - --------------------------------------------------------------------------------
    INFO 08:20:18,469 HelpFormatter - --------------------------------------------------------------------------------
    INFO 08:20:18,536 ArgumentTypeDescriptor - Dynamically determined type of variants/processed.Trio8.haplotypeCaller.snpsift.vcf to be VCF
    INFO 08:20:19,143 GenomeAnalysisEngine - Strictness is SILENT
    INFO 08:20:19,341 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
    INFO 08:20:19,390 RMDTrackBuilder - Loading Tribble index from disk for file variants/processed.Trio8.haplotypeCaller.snpsift.vcf
    INFO 08:20:19,663 GenomeAnalysisEngine - Preparing for traversal
    INFO 08:20:19,690 GenomeAnalysisEngine - Done preparing for traversal
    INFO 08:20:19,691 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO 08:20:19,691 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining

    There were also a couple of lines in the midst of the vcf beginning:
    INFO 08:20:53,637 ProgressMeter etc.

    I deleted all these suspect lines and now VariantsToTable works.

    Best wishes,

    Kath
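For anyone who wants to do the same clean-up without editing the file by hand, here is a minimal sketch. It assumes the stray lines all look like the ones quoted above (a log level, then a HH:MM:SS,mmm timestamp); the levels other than INFO and the function name clean_vcf are assumptions made for this example. It only removes whole log lines and cannot repair a record that a log message has split in two.

```python
import re

# Matches lines such as "INFO 08:20:53,637 ProgressMeter - ...".
# Only INFO appears in the thread; the other levels are assumed.
LOG_LINE = re.compile(r"^(INFO|WARN|ERROR|DEBUG)\s+\d{2}:\d{2}:\d{2},\d{3}\s")


def clean_vcf(src_path: str, dst_path: str) -> int:
    """Copy src_path to dst_path, skipping log-style lines.

    Returns the number of lines that were dropped.
    """
    removed = 0
    with open(src_path, "r", encoding="utf-8", errors="replace") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if LOG_LINE.match(line):
                removed += 1
            else:
                dst.write(line)
    return removed
```

Writing to a new file keeps the original untouched, so the number of dropped lines can be checked before the cleaned copy is used.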

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie admin)

    Hi Kath,

    Those "extra" lines are the log output. You're experiencing this issue because you're presumably writing all output to stdout and then redirecting it to a file, instead of using the -o argument to produce the output file.
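If the command is launched from a script, the same separation can be made explicit: the VCF goes to the path given with -o, and the log goes to its own file. This is a rough sketch only. The jar name GenomeAnalysisTK.jar, the function names, and all paths are placeholders assumed for this example; the -T, -R, --variant and -o arguments are the ones that appear in this thread.

```python
import subprocess
from typing import List


def select_variants_cmd(reference: str, variant: str, output_vcf: str,
                        jar: str = "GenomeAnalysisTK.jar") -> List[str]:
    """Build a SelectVariants command that writes the VCF via -o.

    The jar location is an assumption; adjust it to your installation.
    """
    return [
        "java", "-jar", jar,
        "-T", "SelectVariants",
        "-R", reference,
        "--variant", variant,
        "-o", output_vcf,  # VCF goes here, not to stdout
    ]


def run_with_log(cmd: List[str], log_path: str) -> None:
    """Run cmd, sending everything printed to stdout/stderr to log_path."""
    with open(log_path, "w", encoding="utf-8") as log:
        subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT, check=True)
```

Any -select expressions would be appended to the list returned by select_variants_cmd before it is passed to run_with_log.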

  • Kath (Member)

    Good point. So I am.
    Thanks!

  • liuxiii (Member)
    Hi,
    I am using gatk -T CombineVariants to combine some *.ann.vcf files produced by snpEff, but I get the following error:
    Please add an explicit type tag :NAME listing the correct type from among the supported types:

    Is there a --variant:NAME type tag available for the .ann.vcf files produced by snpEff?

    Thanks,
    Xin
  • AdelaideR (Unconfirmed, Member, Broadie, Moderator admin)

    @liuxiii

    This seems to be a problem with the snpEff files, not necessarily with the GATK tool.

    For example, SnpEff can be run to list gene IDs instead of gene names. Please check how the parameters were set for SnpEff by looking at this document

    A couple of hints about what might have happened:

    SnpEff allows user defined intervals to be annotated. This is achieved using the -interval file.bed command line option, which can be used multiple times in the same command line (it accepts files in TXT, BED, BigBed, VCF, GFF formats). Any variant that intersects an interval defined in those files, will be annotated using the "name" field (fourth column) in the input bed file. 
    
    You can obtain gene IDs instead of gene names by using the command line option -geneId. Note: This is only for the old 'EFF' field ('ANN' field always shows both gene name and gene ID). 
    

    It is important to use VCF files in a format that is compatible with the GATK tools.

    Please check this discussion here
