Attention:
The frontline support team will be unavailable to answer questions on April 15th and 17th 2019. We will be back soon after. Thank you for your patience and we apologize for any inconvenience!

[ERROR] SelectVariants

cbaocbao Member, Broadie ✭✭

Hi,

I want to run SelectVariants on FC using my own vcf. But, I got the following ERROR. Any suggestions are welcome. Thanks!

Best,
Chunyang

22:08:56.550 INFO SelectVariants - Initializing engine 22:08:57.487 INFO FeatureManager - Using codec VCFCodec to read file file:///cromwell_root/fc-b9cc3b48-69c9-4013-8eba-ff7973fc7bc9/gnomad.genomes.r2.0.2.sites.vcf.gz 22:08:58.200 INFO SelectVariants - Shutting down engine [February 4, 2018 10:08:58 PM UTC] org.broadinstitute.hellbender.tools.walkers.variantutils.SelectVariants done. Elapsed time: 2.55 minutes. Runtime.totalMemory()=74297344 htsjdk.tribble.TribbleException: Contig 1 does not have a length field. at htsjdk.variant.vcf.VCFContigHeaderLine.getSAMSequenceRecord(VCFContigHeaderLine.java:80) at htsjdk.variant.vcf.VCFHeader.getSequenceDictionary(VCFHeader.java:206) at org.broadinstitute.hellbender.engine.FeatureDataSource.getSequenceDictionary(FeatureDataSource.java:400) at org.broadinstitute.hellbender.engine.FeatureManager.lambda$getAllSequenceDictionaries$5(FeatureManager.java:282) at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) at java.util.Iterator.forEachRemaining(Iterator.java:116) at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) at org.broadinstitute.hellbender.engine.FeatureManager.getAllSequenceDictionaries(FeatureManager.java:284) at org.broadinstitute.hellbender.engine.GATKTool.validateSequenceDictionaries(GATKTool.java:588) at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:563) at org.broadinstitute.hellbender.engine.VariantWalker.onStartup(VariantWalker.java:43) at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:134) at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:179) at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:198) at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:153) at org.broadinstitute.hellbender.Main.mainEntry(Main.java:195) at org.broadinstitute.hellbender.Main.main(Main.java:277) Using GATK jar /root/gatk.jar defined in environment variable GATK_LOCAL_JAR Running: java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 -Xmx2g -jar /root/gatk.jar SelectVariants -V /cromwell_root/fc-b9cc3b48-69c9-4013-8eba-ff7973fc7bc9/gnomad.genomes.r2.0.2.sites.vcf.gz -O selected.vcf --lenient

Tagged:

Best Answers

  • cbaocbao ✭✭
    Accepted Answer

    Thank you! I fixed the issue by myself. My VCF header is not correct.

  • cbaocbao ✭✭
    Accepted Answer

    Hi, @Sheila,

    My VCF header is not in a "correct" format because there is no length field in the line start with "##contig". So, I just use awk to add contig lenght to my VCF header. Moreover, I think it also works if I use "-R ref.fasta" in a GATK cmd. Thanks a lot!

    contig_length=cat Homo_sapiens_assembly19.fasta.fai | \ grep ^[0-9X] | awk '{printf("##contig=<ID=%s,length=%d>\n",$1,$2);}'

    cat my.vcf | \
    grep '#' | grep -v '^##contig=' | \
    awk -v c="${contig_length}" '/^#CHROM/ {printf c"\n"} {print}' > new.vcf`

    More detials: https://www.biostars.org/p/198660/

    Thanks,
    Chunyang

Answers

  • SheilaSheila Broad InstituteMember, Broadie, Moderator admin

    @cbao
    Hi,

    I think this is a VCF index issue. Can you try deleting the index and regenerating it? This thread may help too.

    -Sheila

  • cbaocbao Member, Broadie ✭✭

    Thank you! I will try it out.

  • cbaocbao Member, Broadie ✭✭

    I tried. But, it does work ... I got the same ERROR, "Contig 1 does not have a length field."

    My cmd:
    gatk --java-options "-Xmx${command_mem}g" SelectVariants -V ${input_vcf} -O ${output_name}.vcf.gz

    input_vcf is an uncompressed plain text.

  • cbaocbao Member, Broadie ✭✭
    Accepted Answer

    Thank you! I fixed the issue by myself. My VCF header is not correct.

  • cbaocbao Member, Broadie ✭✭
    Accepted Answer

    Hi, @Sheila,

    My VCF header is not in a "correct" format because there is no length field in the line start with "##contig". So, I just use awk to add contig lenght to my VCF header. Moreover, I think it also works if I use "-R ref.fasta" in a GATK cmd. Thanks a lot!

    contig_length=cat Homo_sapiens_assembly19.fasta.fai | \ grep ^[0-9X] | awk '{printf("##contig=<ID=%s,length=%d>\n",$1,$2);}'

    cat my.vcf | \
    grep '#' | grep -v '^##contig=' | \
    awk -v c="${contig_length}" '/^#CHROM/ {printf c"\n"} {print}' > new.vcf`

    More detials: https://www.biostars.org/p/198660/

    Thanks,
    Chunyang

Sign In or Register to comment.