
VariantsToTable

apallav2 Posts: 7 Member
edited September 2013 in Ask the GATK team

Hi,
I have a VCF file whose ID field holds both dbSNP IDs and COSMIC IDs (GATK -> VCF -> VariantAnnotator, to append COSMIC IDs to the ID field already populated with dbSNP IDs; I want it this way for my own reasons).
When I pass this VCF (with the appended IDs) as the --variant argument to VariantsToTable, it first complains that no Tribble type was supplied. If I tweak the command to --variant:vcf,<input.vcf>, it runs but produces empty output.

When I supply a regular VCF produced by GATK, it runs fantastically. Can somebody change this behaviour? Or can I get the source code so that I can tweak it to work with the VCF format I want? Thanks.

With regular VCF file:

$ java -jar  GenomeAnalysisTK-2.5-2-gf57256b/GenomeAnalysisTK.jar -R  hg19.masked.fasta -T VariantsToTable --variant input.sorted.vcf -o table  -F CHROM -F POS -F REF -F ALT -F ID -F QUAL -F MQ -F DP -F AF -F AD -AMD
INFO 16:46:44,633 HelpFormatter - --------------------------------------------------------------------------------
INFO 16:46:44,635 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.5-2-gf57256b, Compiled 2013/05/01 09:27:02
INFO 16:46:44,635 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 16:46:44,635 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 16:46:44,637 HelpFormatter - Program Args: -R hg19.masked.fasta -T VariantsToTable --variant input.sorted.vcf -o table -F CHROM -F POS -F REF -F ALT -F ID -F QUAL -F MQ -F DP -F AF -F AD -AMD
INFO 16:46:44,637 HelpFormatter - Date/Time: 2013/08/29 16:46:44
INFO 16:46:44,637 HelpFormatter - --------------------------------------------------------------------------------
INFO 16:46:44,637 HelpFormatter - --------------------------------------------------------------------------------
INFO 16:46:44,642 ArgumentTypeDescriptor - Dynamically determined type of input.sorted.vcf to be VCF
INFO 16:46:44,672 GenomeAnalysisEngine - Strictness is SILENT
INFO 16:46:44,713 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO 16:46:44,727 RMDTrackBuilder - Loading Tribble index from disk for file input.sorted.vcf
INFO 16:46:44,861 GenomeAnalysisEngine - Creating shard strategy for 0 BAM files
INFO 16:46:44,870 GenomeAnalysisEngine - Done creating shard strategy
INFO 16:46:44,870 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 16:46:44,870 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining
INFO 16:46:53,708 ProgressMeter - done 3.46e+05 8.0 s 25.0 s 98.7% 8.0 s 0.0 s
INFO 16:46:53,709 ProgressMeter - Total runtime 8.84 secs, 0.15 min, 0.00 hours
INFO 16:46:54,228 GATKRunReport - Uploaded run statistics report to AWS S3

$ wc -l table
346436 table

$ wc -l input.sorted.vcf
346491 input.sorted.vcf
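
The two counts above differ by 55 lines. That would be consistent with the VCF carrying header lines (those starting with #) and the table carrying a single header row, assuming one table row per VCF record. A minimal sketch for comparing only the data lines; the function names count_vcf_records and count_table_rows are made up for illustration:

```shell
# Hypothetical helpers, not part of GATK.

# Count VCF data records: every line that does not start with '#'.
count_vcf_records() {
    grep -vc '^#' "$1"
}

# Count table rows, skipping the single header row on line 1.
count_table_rows() {
    tail -n +2 "$1" | wc -l | tr -d ' '
}

# Usage (file names taken from the commands above):
#   count_vcf_records input.sorted.vcf
#   count_table_rows table
```

If the two numbers agree, every record made it into the table.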

~~~~~~~~~~

With altered vcf:

$ java -jar GenomeAnalysisTK.jar -R  hg19.masked.fasta -T VariantsToTable --variant input.sorted.dbsnp-cosmic.vcf -o table  -F CHROM -F POS -F REF -F ALT -F ID -F QUAL -F MQ -F DP -F AF -F AD -AMD

INFO 17:03:55,412 HelpFormatter - --------------------------------------------------------------------------------
INFO 17:03:55,413 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.5-2-gf57256b, Compiled 2013/05/01 09:27:02
INFO 17:03:55,413 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 17:03:55,413 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 17:03:55,416 HelpFormatter - Program Args: -R hg19.masked.fasta -T VariantsToTable --variant input.sorted.dbsnp-cosmic.vcf -o table -F CHROM -F POS -F REF -F ALT -F ID -F QUAL -F MQ -F DP -F AF -F AD -AMD
INFO 17:03:55,416 HelpFormatter - Date/Time: 2013/08/29 17:03:55
INFO 17:03:55,416 HelpFormatter - --------------------------------------------------------------------------------
INFO 17:03:55,416 HelpFormatter - --------------------------------------------------------------------------------
INFO 17:03:56,114 GATKRunReport - Uploaded run statistics report to AWS S3
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 2.5-2-gf57256b):
##### ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
##### ERROR Please do not post this error to the GATK forum
##### ERROR
##### ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
##### ERROR Visit our website and forum for extensive documentation and answers to
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: Invalid command line: No tribble type was provided on the command line and the type of the file could not be determined dynamically. Please add an explicit type tag :NAME listing the correct type from among the supported types:
##### ERROR Name FeatureType Documentation
##### ERROR BCF2 VariantContext http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_variant_bcf2_BCF2Codec.html
##### ERROR VCF VariantContext http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_variant_vcf_VCFCodec.html
##### ERROR VCF3 VariantContext http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_variant_vcf_VCF3Codec.html
##### ERROR ------------------------------------------------------------------------------------------
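
The message above says the file type could not be determined dynamically. One thing that may be worth checking, on the assumption that the type detection looks for a ##fileformat=VCFv4 line at the very start of the file, is whether the annotation step left that first line intact. Nothing in this post confirms that this is the cause, so treat it as a sketch; check_vcf_header is a hypothetical helper name:

```shell
# Hypothetical helper, not part of GATK: report whether a file
# begins with the VCF v4 magic header line.
check_vcf_header() {
    # Read only the first line of the file given as $1.
    first_line=$(head -n 1 "$1")
    case "$first_line" in
        "##fileformat=VCFv4"*) echo "ok: $first_line" ;;
        *) echo "unexpected first line: $first_line" ;;
    esac
}

# Usage (file name taken from the command above):
#   check_vcf_header input.sorted.dbsnp-cosmic.vcf
```

Running it on both the regular and the altered VCF would show whether their first lines differ.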

$ java -jar GenomeAnalysisTK.jar -R hg19.masked.fasta -T VariantsToTable --variant:vcf,input.sorted.dbsnp-cosmic.vcf -o table -F CHROM -F POS -F REF -F ALT -F ID -F QUAL -F MQ -F DP -F AF -F AD -AMD
INFO 17:04:19,779 HelpFormatter - --------------------------------------------------------------------------------
INFO 17:04:19,781 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.5-2-gf57256b, Compiled 2013/05/01 09:27:02
INFO 17:04:19,781 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 17:04:19,781 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 17:04:19,783 HelpFormatter - Program Args: -R hg19.masked.fasta -T VariantsToTable --variant:vcf,input.sorted.dbsnp-cosmic.vcf -o table -F CHROM -F POS -F REF -F ALT -F ID -F QUAL -F MQ -F DP -F AF -F AD -AMD
INFO 17:04:19,783 HelpFormatter - Date/Time: 2013/08/29 17:04:19
INFO 17:04:19,784 HelpFormatter - --------------------------------------------------------------------------------
INFO 17:04:19,784 HelpFormatter - --------------------------------------------------------------------------------
INFO 17:04:19,816 GenomeAnalysisEngine - Strictness is SILENT
INFO 17:04:19,858 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO 17:04:19,898 GenomeAnalysisEngine - Creating shard strategy for 0 BAM files
INFO 17:04:19,908 GenomeAnalysisEngine - Done creating shard strategy
INFO 17:04:19,908 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 17:04:19,908 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining
INFO 17:04:20,101 ProgressMeter - done 0.00e+00 0.0 s 53.3 h 100.0% 0.0 s 0.0 s
INFO 17:04:20,101 ProgressMeter - Total runtime 0.19 secs, 0.00 min, 0.00 hours
INFO 17:04:20,550 GATKRunReport - Uploaded run statistics report to AWS S3

$ wc -l table
1 table

$ cat table
CHROM POS REF ALT ID QUAL MQ DP AF AD