
Invalid command line: No tribble type was provided on the command line and the type of the file coul

Dear GATK support team,

I have created g.vcf files from 2000 samples. Following GATK's best practices, I want to combine them in sets of 200 samples before proceeding with the joint genotyping. When doing this, some of my batches produced the following error:

INFO 11:02:17,217 HelpFormatter - --------------------------------------------------------------------------------
INFO 11:02:17,220 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.3-0-g37228af, Compiled 2014/10/24 01:07:22
INFO 11:02:17,220 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 11:02:17,220 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 11:02:17,226 HelpFormatter - Program Args: -T CombineGVCFs -R /groups/owncloud_ftdgc/data/simonj/files/Resources/hg19/ucsc.hg19.fasta --variant batch13.list -o /groups/owncloud_ftdgc/data/simonj/files/gVCF_files/RS_data/RS_batch13.g.vcf
INFO 11:02:17,233 HelpFormatter - Executing as [email protected] on Linux 2.6.32-358.el6.x86_64 amd64; OpenJDK 64-Bit Server VM 1.7.0_75-mockbuild_2015_01_20_23_39-b00.
INFO 11:02:17,233 HelpFormatter - Date/Time: 2015/04/14 11:02:17
INFO 11:02:17,233 HelpFormatter - --------------------------------------------------------------------------------
INFO 11:02:17,233 HelpFormatter - --------------------------------------------------------------------------------
INFO 11:24:50,224 GATKRunReport - Uploaded run statistics report to AWS S3
srun: error: compute-01: task 0: Exited with exit code 1

ERROR ------------------------------------------------------------------------------------------
ERROR A USER ERROR has occurred (version 3.3-0-g37228af):
ERROR
ERROR This means that one or more arguments or inputs in your command are incorrect.
ERROR The error message below tells you what is the problem.
ERROR
ERROR If the problem is an invalid argument, please check the online documentation guide
ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
ERROR
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
ERROR
ERROR MESSAGE: Invalid command line: No tribble type was provided on the command line and the type of the file could not be determined dynamically. Please add an explicit type tag :NAME listing the correct type from among the supported types:
ERROR Name FeatureType Documentation
ERROR BCF2 VariantContext (this is an external codec and is not documented within GATK)
ERROR VCF VariantContext (this is an external codec and is not documented within GATK)
ERROR VCF3 VariantContext (this is an external codec and is not documented within GATK)
ERROR ------------------------------------------------------------------------------------------

The command line I used was:

java -Xmx"$MEM"g -jar "$GATK" \
-T CombineGVCFs \
-R "$REFERENCE" \
--variant batch13.list \
-o "$OUT"

Thanks!

Answers

  • simonsanchezjsimonsanchezj GermanyMember

    Great, thank you both for your help!

  • HeidiLingHeidiLing Member ✭✭
    edited June 2016
  • simonsanchezjsimonsanchezj GermanyMember

    Hello,

    I am having this problem again. I am using GATK v3.6-0 and getting the following error:

    ERROR MESSAGE: Invalid command line: No tribble type was provided on the command line and the type of the file 'Pipeline_test/gVCF_list.txt' could not be determined dynamically. Please add an explicit type tag :NAME listing the correct type from among the supported types:
    ERROR Name FeatureType Documentation
    ERROR BCF2 VariantContext (this is an external codec and is not documented within GATK)
    ERROR VCF VariantContext (this is an external codec and is not documented within GATK)
    ERROR VCF3 VariantContext (this is an external codec and is not documented within GATK)
    ERROR ------------------------------------------------------------------------------------------

    'Pipeline_test/gVCF_list.txt' is a text file looking like this:
    Pipeline_test/01-102.raw.g.vcf
    Pipeline_test/01-164.raw.g.vcf
    Pipeline_test/04-219.raw.g.vcf
    Pipeline_test/05-301.raw.g.vcf
    Pipeline_test/06-209.raw.g.vcf
    Pipeline_test/99-007.raw.g.vcf

    The command I am trying to run is:

    module load java/1.8.0
    srun java -Xmx"$MEM"g -jar "$GATK" \
    -T GenotypeGVCFs \
    -nt "$PROC" \
    -R "$REFERENCE" \
    --variant "$GVCF" \
    --dbsnp "$DBSNP" \
    --disable_auto_index_creation_and_locking_when_reading_rods \
    --max_alternate_alleles "$MAX_ALTERNATE_ALLELES" \
    -o "$OUTPUT"All_samples_raw.snps.indels.vcf

    Thanks!
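A note for later readers of this post: as far as I know, GATK 3 only treats an input as a list of files when its name ends in `.list`. A file ending in `.txt` is read as if it were a VCF itself, which would produce exactly this message. The following is a minimal sketch of a workaround, assuming the extension is the cause here. The helper name `as_gatk_list` is made up for illustration and is not part of GATK.

```shell
# Hypothetical helper (not part of GATK): make a copy of a text file of
# g.vcf paths whose name ends in .list, and print the name of that copy.
# The body runs in a subshell, so its variables do not leak to the caller.
as_gatk_list() (
    src="$1"
    case "$src" in
        *.list)
            dest="$src"                    # already has the right extension
            ;;
        *)
            dir=$(dirname -- "$src")
            base=$(basename -- "$src")
            dest="$dir/${base%.*}.list"    # swap the last extension for .list
            cp -- "$src" "$dest"
            ;;
    esac
    printf '%s\n' "$dest"
)

# Usage with the path from the post above:
#   GVCF=$(as_gatk_list Pipeline_test/gVCF_list.txt)
#   ... --variant "$GVCF" ...
```

Copying rather than renaming leaves the original list untouched, so other scripts that point at the `.txt` file keep working.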

  • AngieAngie Member

    @Geraldine_VdAuwera said:
    @simonsanchezj You may be in luck as we recently added a feature to make that error message include the name of the file that is causing the problem. If you try running again using the latest nightly build (see Download page) you should see the name of the bad file included in the message. That will allow you to identify it more easily than by testing every one of your 2000 files!

    Once you have identified the problem file, try deleting the index file and running again. GATK will regenerate an index automatically. If you're lucky, the problem was just a corrupted index and this will fix it. If the file itself is bad you may need to regenerate the gvcf itself.

    Hi, I'm running into the same problem, but gatk/3.7 doesn't give me the name of the corrupted file, and I have 150 gvcf files. Is this feature disabled in this GATK version?

  • AngieAngie Member

    Sorry for the double posting. I forgot to mention that even after deleting all the index files and running the command line again, it's still giving me the same error message. Does that mean that at least one of the gvcf files is corrupted? If the feature that @Geraldine_VdAuwera mentioned were active, it would let me know which one it is, but I don't see any sample name in the output.

    Here is the command line:

    java -Xmx60g -jar /cluster/software/VERSIONS/gatk-3.7/GenomeAnalysisTK.jar \
    -T CombineGVCFs \
    -R $REF \
    --variant $INPUT/sample1.g.vcf \
    -o $OUTPUT/trial.g.vcf 2> $OUTPUT/trial.g.vcf.out

    many thanks for your help

  • Geraldine_VdAuweraGeraldine_VdAuwera Cambridge, MAMember, Administrator, Broadie admin
    Hmm it should still work. Can you post the actual error message?
  • AngieAngie Member

    Many thanks Geraldine, but it was my mistake in the command; once I corrected it, the error message contained the name of the corrupted file.
    Do you mind if I ask a probably really silly question? It was a g.vcf file that was corrupted. Why does that happen? The command I ran created several g.vcf files, and in the end only one was corrupted. Could it have something to do with the capacity of the nodes? I've seen that problematic files get created when the nodes are too busy, but I don't know if that could actually be the problem.

    Thanks again! :)

  • Geraldine_VdAuweraGeraldine_VdAuwera Cambridge, MAMember, Administrator, Broadie admin
    Ah, great. Usually when this happens it's a system blip, e.g. a copy operation gets interrupted or something gets written over part of something else.
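
For anyone hunting a damaged file among many, a rough pre-check along these lines can narrow things down before launching CombineGVCFs. It is only a sketch under my own assumptions: uncompressed g.vcf files, a list with one path per line, and a hypothetical helper name `check_gvcfs`. It catches only obvious damage, namely a missing header or a last record whose column count differs from the `#CHROM` line (as an interrupted copy would typically leave), and it is no substitute for GATK's own validation.

```shell
# Hypothetical helper: print one line per suspicious file in a list of
# uncompressed g.vcf paths; exit status is non-zero if any file was flagged.
check_gvcfs() (
    status=0
    # "|| [ -n "$f" ]" also handles a last line without a trailing newline
    while IFS= read -r f || [ -n "$f" ]; do
        [ -n "$f" ] || continue            # skip blank lines in the list
        if [ ! -s "$f" ]; then
            echo "MISSING_OR_EMPTY $f"; status=1; continue
        fi
        # The first line of a VCF must be the fileformat declaration
        if ! head -n 1 < "$f" | grep -q '^##fileformat=VCF'; then
            echo "BAD_HEADER $f"; status=1; continue
        fi
        # Column count of the #CHROM header line vs. the very last line
        cols=$(awk -F '\t' '/^#CHROM/ {print NF; exit}' < "$f")
        last=$(tail -n 1 < "$f" | awk -F '\t' '{print NF}')
        if [ -z "$cols" ] || [ "$cols" != "$last" ]; then
            echo "TRUNCATED? $f"; status=1
        fi
    done < "$1"
    exit "$status"
)

# Usage with the list from the first post:
#   check_gvcfs batch13.list
```

A file flagged here is worth regenerating; a file that passes may still be damaged in the middle, so the file name in the GATK error message remains the more reliable pointer.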