

Bug in creation outputvcf file after Beagle

Tiphaine (Member)
edited October 2012 in Ask the GATK team

Hi,

I used Beagle to phase my data, but I have a problem with some indels.

Example:

Input VCF:

2       68599872        .       ATG     A       14.40   PASS    AC=1;AC1=1;AF=0.028

Input for Beagle created by ProduceBeagleInput:

2:68599872 TG - 1.0000 0.0000 0.0000 ......

Output VCF created by BeagleOutputToVCF:

2       68599872        .       ATG     .       14.40   BGL_RM_WAS_-    AC1=1;AF1=0.02965.....

Error message from CombineVariants:

MESSAGE: Badly formed variant context at location 68599872 in contig 2. Reference length must be at most one base shorter than location size

Can you help me?

Tiphaine


Answers

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)

    Have you validated the vcf file output by Beagle? If it fails you may need to contact the authors of Beagle -- if their tool is producing bad vcf files, we can't help with that. But if the problem is on our end we'll do what we can.

  • ebanks, Broad Institute (Member, Broadie, Dev)

    Actually, this looks like it may be a bug in our code. We'll take a quick look and get back to you with some feedback.

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)

    Yeah, a shot of caffeine later I realized our BeagleOutputToVCF may be the culprit, since that's what's making the VCF. Sorry about that.

  • ebanks, Broad Institute (Member, Broadie, Dev)

    Okay, it looks like Beagle claimed that your site was monomorphic so BeagleOutputToVCF is filtering your site and setting the ALT allele to "." (not polymorphic). This looks reasonable.
    So the problem you are getting must be in CombineVariants. Are you using the latest version of the GATK? If so, what is your command-line? (And if not, please update to the latest version)
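
To see how many records in a BeagleOutputToVCF output have the same shape as the one above (ALT set to "." while REF is longer than one base), a small script can scan the file. This is only a sketch: the function name `monomorphic_multibase_records` is made up for illustration, it assumes tab-separated VCF lines, and whether these records are the ones that actually trigger the CombineVariants error is an assumption, not something established in this thread.

```python
def monomorphic_multibase_records(vcf_lines):
    """Yield (chrom, pos, ref, filter) for records with ALT '.' and a multi-base REF.

    Hypothetical helper for illustration; it is not part of the GATK.
    """
    for line in vcf_lines:
        # Skip blank lines and header/meta lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A VCF record has at least 8 fixed columns; ignore anything shorter.
        if len(fields) < 8:
            continue
        chrom, pos, _id, ref, alt, _qual, filt = fields[:7]
        if alt == "." and len(ref) > 1:
            yield chrom, int(pos), ref, filt


# Record modelled on the one posted above (INFO column shortened).
example = "2\t68599872\t.\tATG\t.\t14.40\tBGL_RM_WAS_-\tAC1=1"
print(list(monomorphic_multibase_records([example])))
```

To scan a whole file, pass an open file handle instead of the one-element list.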

  • Sorry, it is not the latest version but v2.0-35-g2d70733.
    My command line is:
    java -jar $GATK_HOME/GenomeAnalysisTK.jar -R $RefGen -T BeagleOutputToVCF -V $VcfFile -beagleR2:BEAGLE $r2 -beaglePhased:BEAGLE $phase -beagleProbs:BEAGLE $probs -o $vcfBeagle -U LENIENT_VCF_PROCESSING -et NO_ET -K $GATK_KEY

  • ebanks, Broad Institute (Member, Broadie, Dev)

    Sorry, it's the CombineVariants command line that we need.

  • Tiphaine (Member)
    edited November 2012

    I run it in a PBS script, so I define a value for each variable beforehand.

     java -jar $GATK_HOME/GenomeAnalysisTK.jar -R $RefGen -T CombineVariants -U LENIENT_VCF_PROCESSING --out $outputFile -V:input1 $input1 -V:input2 $input2 -V:input3 $input3 -V:input4 $input4 -V:input5 $input5 -V:input6 $input6 -V:input7 $input7 -V:input8 $input8 -V:input9 $input9 -V:input10 $input10 -V:input11 $input11 -V:input12 $input12 -V:input13 $input13 -V:input14 $input14 -V:input15 $input15 -V:input16 $input16 -V:input17 $input17 -V:input18 $input18 -V:input19 $input19 -V:input20 $input20 -V:input21 $input21 -V:input22 $input22 -V:inputX $inputX -genotypeMergeOptions PRIORITIZE -priority input1,input2,input3,input4,input5,input6,input7,input8,input9,input10,input11,input12,input13,input14,input15,input16,input17,input18,input19,input20,input21,input22,inputX -et NO_ET -K $GATK_KEY
    
  • ebanks, Broad Institute (Member, Broadie, Dev)

    I would try it with the latest version of the GATK. If it still fails, then I recommend trying to find a subset of the data with which you can replicate this error (i.e. just 2 of your input VCF files to CombineVariants) and then post the records at 68599872 here so we can help you figure out where the problem is.
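
One way to collect the records to post is to pull every line at the failing position out of the candidate input files. The sketch below is only an illustration: the names `open_vcf` and `records_at` are hypothetical, the files are assumed to be plain or gzip-compressed tab-separated VCFs, and it matches on the POS column only, so a record that starts earlier and merely spans the position would be missed.

```python
import gzip


def open_vcf(path):
    """Open a plain-text or gzip-compressed VCF for reading as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


def records_at(paths, chrom, pos):
    """Return (path, record) pairs for every record at chrom:pos in the given files.

    Hypothetical helper for illustration; it is not part of the GATK.
    """
    hits = []
    for path in paths:
        with open_vcf(path) as handle:
            for line in handle:
                # Skip header/meta lines.
                if line.startswith("#"):
                    continue
                fields = line.rstrip("\n").split("\t")
                if len(fields) >= 2 and fields[0] == chrom and fields[1] == str(pos):
                    hits.append((path, line.rstrip("\n")))
    return hits
```

Calling `records_at([file_a, file_b], "2", 68599872)` on two of the CombineVariants inputs (placeholder variable names) would return the lines at the position named in the error message, ready to paste into a reply.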
