Generate an index for a VCF file

Sakhaa Member
edited August 13 in Ask the GATK team
Hello everybody,
I am trying to generate an .idx file for a VCF file using the IndexFeatureFile tool:

"gatk IndexFeatureFile -F hapmap_3.3.b37.vcf -O hapmap_3.3.b37.vcf.idx"

and I got the following error:

"A USER ERROR has occurred: Cannot read hapmap_3.3.b37.vcf because no suitable codecs found"
Can anyone help me solve this?

Answers

  • Sakhaa Member
    Hi @bhanuGandham
    This is the header of the log:
    "gatk ValidateVariants -V hapmap_3.3.b37.vcf
    Using GATK jar /sw/csi/gatk/4.1.2.0/el7.5_binary_Py2.7env/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /sw/csi/gatk/4.1.2.0/el7.5_binary_Py2.7env/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar ValidateVariants -V hapmap_3.3.b37.vcf
    19:14:11.550 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/sw/csi/gatk/4.1.2.0/el7.5_binary_Py2.7env/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    "

    I also tried the ValidateVariants tool:
    "gatk ValidateVariants -V hapmap_3.3.b37.vcf"
    and I got the same error:
    "A USER ERROR has occurred: Cannot read file:///ibex/scratch/alsaedsb/Task_w1/ref/hapmap_3.3.b37.vcf because no suitable codecs found"
  • Sakhaa Member
    @bhanuGandham
    and this is the list of reference files
  • bhanuGandham Cambridge MA Member, Administrator, Broadie, Moderator admin

    @Sakhaa
    Please post the header of hapmap_3.3.b37.vcf

  • Sakhaa Member
    @bhanuGandham
    I cannot open the header of this file, but below is the header of 1000G_omni2.5.b37.vcf.gz, which has the same problem:

    "##fileformat=VCFv4.1
    ##FILTER=<ID=PASS,Description="All filters passed">
    ##FILTER=<ID=NOT_POLY_IN_1000G,Description="Alternate allele count = 0">
    ##FILTER=<ID=badAssayMapping,Description="The mapping information for the SNP assay is internally inconsistent in the chip metadata">
    ##FILTER=<ID=dup,Description="Duplicate assay at same position with worse Gentrain Score">
    ##FILTER=<ID=id10,Description="Within 10 bp of an known indel">
    ##FILTER=<ID=id20,Description="Within 20 bp of an known indel">
    ##FILTER=<ID=id5,Description="Within 5 bp of an known indel">
    ##FILTER=<ID=id50,Description="Within 50 bp of an known indel">
    ##FILTER=<ID=refN,Description="Reference base is N. Assay is designed for 2 alt alleles">
    ##FORMAT=<ID=GC,Number=.,Type=Float,Description="Gencall Score">
    ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
    ##FilterLiftedVariants="analysis_type=FilterLiftedVariants input_file=[] read_buffer_size=null phone_home=STANDARD read_filter=[] intervals=null excludeIntervals=null interval_set_rule=UNION interval_merging=ALL reference_sequence=/humgen/1kg/reference/human_g1k_v37.fasta rodBind=[] nonDeterministicRandomSeed=false downsampling_type=BY_SAMPLE downsample_to_fraction=null downsample_to_coverage=1000 baq=OFF baqGapOpenPenalty=40.0 performanceLog=null useOriginalQualities=false defaultBaseQualities=-1 validation_strictness=SILENT unsafe=null num_threads=1 read_group_black_list=null pedigree=[] pedigreeString=[] pedigreeValidationType=STRICT allow_intervals_with_unindexed_bam=false disable_experimental_low_memory_sharding=false logging_level=INFO log_to_file=null help=false variant=(RodBinding name=variant source=./0.451323408008651.sorted.vcf) out=org.broadinstitute.sting.gatk.io.stubs.VCFWriterStub NO_HEADER=org.broadinstitute.sting.gatk.io.stubs.VCFWriterStub sites_only=org.broadinstitute.sting.gatk.io.stubs.VCFWriterStub filter_mismatching_base_and_quals=false"
    ##INFO=<ID=CR,Number=.,Type=Float,Description="SNP Callrate">
    ##INFO=<ID=GentrainScore,Number=.,Type=Float,Description="Gentrain Score">
    ##INFO=<ID=HW,Number=.,Type=Float,Description="Hardy-Weinberg Equilibrium">
    ##contig=<ID=1,length=249250621,assembly=b37>
    ##contig=<ID=10,length=135534747,assembly=b37>
    ##contig=<ID=11,length=135006516,assembly=b37>
    ##contig=<ID=12,length=133851895,assembly=b37>
    ##contig=<ID=13,length=115169878,assembly=b37>
    ##contig=<ID=14,length=107349540,assembly=b37>
    ##contig=<ID=15,length=102531392,assembly=b37>
    ##contig=<ID=16,length=90354753,assembly=b37>
    ##contig=<ID=17,length=81195210,assembly=b37>
    ##contig=<ID=18,length=78077248,assembly=b37>
    ##contig=<ID=19,length=59128983,assembly=b37>
    ##contig=<ID=2,length=243199373,assembly=b37>
    ##contig=<ID=20,length=63025520,assembly=b37>
    ##contig=<ID=21,length=48129895,assembly=b37>
    ##contig=<ID=22,length=51304566,assembly=b37>
    ##contig=<ID=3,length=198022430,assembly=b37>
    ##contig=<ID=4,length=191154276,assembly=b37>
    ##contig=<ID=5,length=180915260,assembly=b37>
    ##contig=<ID=6,length=171115067,assembly=b37>
    ##contig=<ID=7,length=159138663,assembly=b37>
    ##contig=<ID=8,length=146364022,assembly=b37>
    ##contig=<ID=9,length=141213431,assembly=b37>
    ##contig=<ID=GL000191.1,length=106433,assembly=b37>
    ##contig=<ID=GL000192.1,length=547496,assembly=b37>
    ##contig=<ID=GL000193.1,length=189789,assembly=b37>
    ##contig=<ID=GL000194.1,length=191469,assembly=b37>
    ##contig=<ID=GL000195.1,length=182896,assembly=b37>
    ##contig=<ID=GL000196.1,length=38914,assembly=b37>
    ##contig=<ID=GL000197.1,length=37175,assembly=b37>
    ##contig=<ID=GL000198.1,length=90085,assembly=b37>
    ##contig=<ID=GL000199.1,length=169874,assembly=b37>
    ##contig=<ID=GL000200.1,length=187035,assembly=b37>
    ##contig=<ID=GL000201.1,length=36148,assembly=b37>
    ##contig=<ID=GL000202.1,length=40103,assembly=b37>
    ##contig=<ID=GL000203.1,length=37498,assembly=b37>
    ##contig=<ID=GL000204.1,length=81310,assembly=b37>
    ##contig=<ID=GL000205.1,length=174588,assembly=b37>
    ##contig=<ID=GL000206.1,length=41001,assembly=b37>
    ##contig=<ID=GL000207.1,length=4262,assembly=b37>
    ##contig=<ID=GL000208.1,length=92689,assembly=b37>
    ##contig=<ID=GL000209.1,length=159169,assembly=b37>
    ##contig=<ID=GL000210.1,length=27682,assembly=b37>
    ##contig=<ID=GL000211.1,length=166566,assembly=b37>
    ##contig=<ID=GL000212.1,length=186858,assembly=b37>
    ##contig=<ID=GL000213.1,length=164239,assembly=b37>
    ##contig=<ID=GL000214.1,length=137718,assembly=b37>
    ##contig=<ID=GL000215.1,length=172545,assembly=b37>
    ##contig=<ID=GL000216.1,length=172294,assembly=b37>
    ##contig=<ID=GL000217.1,length=172149,assembly=b37>
    ##contig=<ID=GL000218.1,length=161147,assembly=b37>
    ##contig=<ID=GL000219.1,length=179198,assembly=b37>
    ##contig=<ID=GL000220.1,length=161802,assembly=b37>
    ##contig=<ID=GL000221.1,length=155397,assembly=b37>
    ##contig=<ID=GL000222.1,length=186861,assembly=b37>
    ##contig=<ID=GL000223.1,length=180455,assembly=b37>
    ##contig=<ID=GL000224.1,length=179693,assembly=b37>
    ##contig=<ID=GL000225.1,length=211173,assembly=b37>
    ##contig=<ID=GL000226.1,length=15008,assembly=b37>
    ##contig=<ID=GL000227.1,length=128374,assembly=b37>
    ##contig=<ID=GL000228.1,length=129120,assembly=b37>
    ##contig=<ID=GL000229.1,length=19913,assembly=b37>
    ##contig=<ID=GL000230.1,length=43691,assembly=b37>
    ##contig=<ID=GL000231.1,length=27386,assembly=b37>
    ##contig=<ID=GL000232.1,length=40652,assembly=b37>
    ##contig=<ID=GL000233.1,length=45941,assembly=b37>
    ##contig=<ID=GL000234.1,length=40531,assembly=b37>
    ##contig=<ID=GL000235.1,length=34474,assembly=b37>
    ##contig=<ID=GL000236.1,length=41934,assembly=b37>
    ##contig=<ID=GL000237.1,length=45867,assembly=b37>
    ##contig=<ID=GL000238.1,length=39939,assembly=b37>
    ##contig=<ID=GL000239.1,length=33824,assembly=b37>
    ##contig=<ID=GL000240.1,length=41933,assembly=b37>
    ##contig=<ID=GL000241.1,length=42152,assembly=b37>
    ##contig=<ID=GL000242.1,length=43523,assembly=b37>
    ##contig=<ID=GL000243.1,length=43341,assembly=b37>
    ##contig=<ID=GL000244.1,length=39929,assembly=b37>
    ##contig=<ID=GL000245.1,length=36651,assembly=b37>
    ##contig=<ID=GL000246.1,length=38154,assembly=b37>
    ##contig=<ID=GL000247.1,length=36422,assembly=b37>
    ##contig=<ID=GL000248.1,length=39786,assembly=b37>
    ##contig=<ID=GL000249.1,length=38502,assembly=b37>
    ##contig=<ID=MT,length=16569,assembly=b37>
    ##contig=<ID=X,length=155270560,assembly=b37>
    ##contig=<ID=Y,length=59373566,assembly=b37>
    ##reference=file:///humgen/1kg/reference/human_g1k_v37.fasta
    ##reference=human_b36_both.fasta
    ##source=infiniumFinalReportConverterV1.0
    ##bcftools_viewVersion=1.9+htslib-1.9
    ##bcftools_viewCommand=view -h 1000G_omni2.5.b37.vcf.gz; Date=Wed Aug 14 21:27:07 2019
    #CHROM POS ID REF ALT QUAL FILTER INFO"
  • bhanuGandham Cambridge MA Member, Administrator, Broadie, Moderator admin
    edited August 15

    @Sakhaa

    1) What error do you see when you try to read the header of the hapmap_3.3.b37.vcf file?
    2) Please validate the 1000G_omni2.5.b37.vcf.gz file using ValidateVariants.
    3) Please post the exact command and the entire error log you get when indexing 1000G_omni2.5.b37.vcf.gz.
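For anyone following along with the first question above (what happens when you try to read the header), here is a minimal shell sketch for inspecting what a file's first bytes look like. The function name `vcf_sniff` is hypothetical and not part of GATK; it assumes `head -c`, `od`, and `tr` are available. It only reports whether the file starts with the gzip magic bytes, with a `##fileformat=VCF` line, or with neither. The thread does not establish that this is the cause of the "no suitable codecs found" error, so treat it as one thing to rule out.

```shell
#!/bin/sh
# Hypothetical helper (not a GATK tool): classify a file by its first bytes.
vcf_sniff() {
    f=$1
    if [ ! -r "$f" ]; then
        echo "unreadable"
        return 1
    fi
    # First two bytes as hex; gzip and bgzip streams both start with 1f 8b.
    magic=$(head -c 2 "$f" | od -An -tx1 | tr -d ' \n')
    # First 16 characters, with NUL bytes dropped so the shell can hold them.
    first=$(head -c 16 "$f" | tr -d '\0')
    if [ "$magic" = "1f8b" ]; then
        echo "gzip"
    elif [ "$first" = "##fileformat=VCF" ]; then
        echo "vcf-text"
    else
        echo "unknown"
    fi
}
```

Usage would be `vcf_sniff hapmap_3.3.b37.vcf`. A result of "gzip" for a file named .vcf means the extension and the content disagree, and "unknown" means the file does not begin with a VCF header line; either result is worth including alongside the command and log requested above.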
