Problem for CombineGVCFs

ArvinWu (Las Vegas, NV), Member


I am trying to combine two gVCFs, but I get the following error message.

INFO 11:23:01,758 GenomeAnalysisEngine - Strictness is SILENT
INFO 11:23:01,940 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO 11:23:02,319 GenomeAnalysisEngine - Preparing for traversal
INFO 11:23:02,332 GenomeAnalysisEngine - Done preparing for traversal
INFO 11:23:02,332 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 11:23:02,333 ProgressMeter - Location | sites | elapsed | sites | completed | runtime | runtime
WARN 11:23:02,482 StrandBiasTest - No StrandBiasBySample annotation or read data was found. Strand bias annotations will not be output.
WARN 11:23:02,483 InbreedingCoeff - Annotation will not be calculated. InbreedingCoeff requires at least 10 unrelated samples.
WARN 11:23:02,484 StrandBiasTest - No StrandBiasBySample annotation or read data was found. Strand bias annotations will not be output.
INFO 11:23:13,500 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.IllegalStateException: Key END found in VariantContext field INFO at 1:12346047 but this key isn't defined in the VCFHeader. We require all VCFs to have complete VCF headers by default.
at htsjdk.variant.vcf.VCFEncoder.fieldIsMissingFromHeaderError(
at htsjdk.variant.vcf.VCFEncoder.encode(
at htsjdk.variant.variantcontext.writer.VCFWriter.add(
at ava:200)
at )

I have checked the files many times.
I am pretty sure that neither of the two gVCFs contains the key "END".
Why does this happen? How can I solve it?
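
For anyone who wants to verify the same claim, here is a minimal sketch of a check, not an official GATK utility. It scans the text of an uncompressed VCF for two things: a header line that defines END, and records whose INFO column uses END. The function name `scan_for_end` and the sample records are made up for illustration; only the position 1:12346047 comes from the error message above.

```python
import io


def scan_for_end(lines):
    """Return (header_defines_end, records_using_end) for VCF text lines."""
    header_defines_end = False
    records_using_end = 0
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            # Meta-information line: look for the INFO definition of END.
            if line.startswith("##INFO=<ID=END,"):
                header_defines_end = True
        elif line and not line.startswith("#"):
            # Data record: the INFO column is the 8th tab-separated field.
            fields = line.split("\t")
            if len(fields) >= 8:
                keys = {entry.split("=", 1)[0] for entry in fields[7].split(";")}
                if "END" in keys:
                    records_using_end += 1
    return header_defines_end, records_using_end


# Hypothetical two-record example; the header has no END definition.
example = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t12346047\t.\tA\t<NON_REF>\t.\t.\tEND=12346050\n"
    "1\t12346051\t.\tG\tT\t50\t.\tDP=10\n"
)
print(scan_for_end(example))  # prints (False, 1)
```

A result of `(False, n)` with `n > 0` would match the situation the exception describes: a record carries END in INFO while the header does not define it. To run this on a real file, pass an open text file handle instead of the `io.StringIO` example.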

Best Answers


  • ArvinWu (Las Vegas, NV), Member

    The gVCF files that I want to combine were produced with the GATK tool HaplotypeCaller.
    I originally had only VCF files, which cannot be combined with GATK's CombineGVCFs.
    So I first converted the VCFs to BAM files with GATK's SimulateReadsForVariants,
    and then ran GATK's HaplotypeCaller on those BAM files to produce the gVCF files.
    However, when I try to combine the gVCFs, the tool gives the error above.
    What can I do now?

  • ArvinWu (Las Vegas, NV), Member

    The GATK version is 3.5.0.
    I want to combine all the VCF files, but the VCF files provided by the company cannot be combined with the GATK tool CombineGVCFs.
    So I converted them back to BAM files in order to create gVCFs.
    Adding ##INFO=<ID=END,Number=1,Type=Integer,Description="Stop position of the interval"> to the gVCF files works.

    However, I still have a question. I have checked all the combined files and did not find the term "END" in any of them.
    Why does the GATK tool still require it?
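
The header workaround described in the post above can be sketched as follows. This is only an illustration of adding that one line to an uncompressed VCF's header when it is missing. It is not a statement that this is the recommended fix, and `add_end_header` is a made-up helper name. The header line itself is copied from the post.

```python
END_HEADER = (
    '##INFO=<ID=END,Number=1,Type=Integer,'
    'Description="Stop position of the interval">'
)


def add_end_header(lines):
    """Return VCF lines with the END INFO definition added if it is absent.

    The new line is placed directly before the #CHROM column-header line.
    Trailing newlines are stripped from every returned line.
    """
    lines = [line.rstrip("\n") for line in lines]
    if any(line.startswith("##INFO=<ID=END,") for line in lines):
        return lines  # already defined, nothing to do
    patched = []
    inserted = False
    for line in lines:
        if not inserted and line.startswith("#CHROM"):
            patched.append(END_HEADER)
            inserted = True
        patched.append(line)
    return patched


# Hypothetical input with an incomplete header.
vcf_lines = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t12346047\t.\tA\t<NON_REF>\t.\t.\tEND=12346050\n",
]
print("\n".join(add_end_header(vcf_lines)))
```

Calling the function twice is harmless, because the second call sees the existing definition and returns the lines unchanged. Compressed or indexed files would need to be decompressed first and re-indexed afterwards, which this sketch does not cover.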

  • Geraldine_VdAuwera (Cambridge, MA), Member, Administrator, Broadie
    @ArvinWu your workflow is very confusing and I think it's probably wrong. What are you trying to achieve?
  • ArvinWu (Las Vegas, NV), Member

    I saw the webpage

    and it shows how to merge VCFs using CombineGVCFs, so I ran CombineGVCFs. It reported an error when combining the VCFs.
    So I converted the VCFs back to BAM files and regenerated the gVCFs.
    Am I doing something wrong? What should I do to combine all the VCF files?

  • Geraldine_VdAuwera (Cambridge, MA), Member, Administrator, Broadie

    No, I mean what is the scientific objective you aim to achieve? What is your experiment design? And what data are you working with? Without this information we cannot provide adequate guidance.

  • ArvinWu (Las Vegas, NV), Member

    I am working on human data.
    I received VCF files from the company that contain multiple families.
    I want to find SNPs within one family that has two parents and two sick children.
    It would also be great to find SNPs at the same location in several different families.
    The final goal is to combine the VCF files of all the sick people and all the healthy people to see whether any SNP can be found.
    I am not sure which tools I can use, so when I saw the webpage above, I followed its steps to process my data.

  • ArvinWu (Las Vegas, NV), Member

    Thanks a lot for your suggestion. :)
