# Bug in creation outputvcf file after Beagle

Posts: 53Member
edited October 2012

Hi,

I used Beagle to phase my data, but I have a problem with some indels.

Example:

Input VCF:

2       68599872        .       ATG     A       14.40   PASS    AC=1;AC1=1;AF=0.028


Input for Beagle created by ProduceBeagleInput:

2:68599872 TG - 1.0000 0.0000 0.0000 ......


Output VCF created by BeagleOutputToVCF:

2       68599872        .       ATG     .       14.40   BGL_RM_WAS_-    AC1=1;AF1=0.02965.....


Error message from CombineVariants:

MESSAGE: Badly formed variant context at location 68599872 in contig 2. Reference length must be at most one base shorter than location size


Can you help me?

Tipahine


Have you validated the vcf file output by Beagle? If it fails you may need to contact the authors of Beagle -- if their tool is producing bad vcf files, we can't help with that. But if the problem is on our end we'll do what we can.

Geraldine Van der Auwera, PhD

• Posts: 678GATK Developer mod

Actually, this looks like it may be a bug in our code. We'll take a quick look and get back to you with some feedback.

Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT

Yeah, a shot of caffeine later I realized our BeagleOutputToVCF may be the culprit, since that's what's making the VCF. Sorry about that.

Geraldine Van der Auwera, PhD

• Posts: 678GATK Developer mod

Okay, it looks like Beagle claimed that your site was monomorphic so BeagleOutputToVCF is filtering your site and setting the ALT allele to "." (not polymorphic). This looks reasonable.
So the problem you are getting must be in CombineVariants. Are you using the latest version of the GATK? If so, what is your command-line? (And if not, please update to the latest version)

Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT
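
To check how many records in a BeagleOutputToVCF output have the shape discussed above (ALT set to "." while REF is still a multi-base indel allele), something like the following minimal sketch could be used. It is not part of the GATK; the function name is hypothetical, and the sample record reuses the first seven columns from the thread with a placeholder "." in the INFO column, since the original INFO field was truncated. Whether such records are what triggers the CombineVariants error is not established in this thread.

```python
def find_monomorphic_multibase_records(vcf_lines):
    """Yield (chrom, pos, ref, filter) for records with ALT '.' and a multi-base REF.

    Hypothetical helper for illustration only; it reads plain VCF text lines
    and looks at the first seven columns (CHROM POS ID REF ALT QUAL FILTER).
    """
    for line in vcf_lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 7:
            continue
        chrom, pos, _id, ref, alt, _qual, filt = fields[:7]
        # ALT "." means no alternate allele; a REF longer than one base
        # means the site was originally an indel-style record.
        if alt == "." and len(ref) > 1:
            yield chrom, int(pos), ref, filt


# Sample input: the record from the thread (INFO replaced by a placeholder)
# plus an ordinary SNP record that should not be reported.
sample = [
    "##fileformat=VCFv4.1",
    "2\t68599872\t.\tATG\t.\t14.40\tBGL_RM_WAS_-\t.",
    "2\t100\t.\tA\tG\t50\tPASS\t.",
]

for hit in find_monomorphic_multibase_records(sample):
    print(hit)
```

Running the same function over each of the per-chromosome inputs would show which files contain such records before they are passed to CombineVariants.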

• Posts: 53Member

Sorry, it is not the latest version but v2.0-35-g2d70733.
My command line is:
java -jar $GATK_HOME/GenomeAnalysisTK.jar -R $RefGen -T BeagleOutputToVCF -V $VcfFile -beagleR2:BEAGLE $r2 -beaglePhased:BEAGLE $phase -beagleProbs:BEAGLE $probs -o $vcfBeagle -U LENIENT_VCF_PROCESSING -et NO_ET -K $GATK_KEY

• Posts: 678GATK Developer mod

Sorry, it's the CombineVariants command line that we need.

Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT

• Posts: 53Member
edited November 2012

I run it in a PBS script, so the value of each variable is defined earlier in the script:

java -jar $GATK_HOME/GenomeAnalysisTK.jar -R $RefGen -T CombineVariants -U LENIENT_VCF_PROCESSING --out $outputFile \
  -V:input1 $input1 -V:input2 $input2 -V:input3 $input3 -V:input4 $input4 -V:input5 $input5 -V:input6 $input6 \
  -V:input7 $input7 -V:input8 $input8 -V:input9 $input9 -V:input10 $input10 -V:input11 $input11 -V:input12 $input12 \
  -V:input13 $input13 -V:input14 $input14 -V:input15 $input15 -V:input16 $input16 -V:input17 $input17 -V:input18 $input18 \
  -V:input19 $input19 -V:input20 $input20 -V:input21 $input21 -V:input22 $input22 -V:inputX $inputX \
  -genotypeMergeOptions PRIORITIZE \
  -priority input1,input2,input3,input4,input5,input6,input7,input8,input9,input10,input11,input12,input13,input14,input15,input16,input17,input18,input19,input20,input21,input22,inputX \
  -et NO_ET -K $GATK_KEY

• Posts: 678GATK Developer mod

I would try it with the latest version of the GATK. If it still fails, then I recommend trying to find a subset of the data with which you can replicate this error (i.e. just 2 of your input VCF files to CombineVariants) and then post the records at 68599872 here so we can help you figure out where the problem is.

Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT
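
For collecting the records at a single position from several input VCFs, as suggested above, a small script along these lines might help. This is only a sketch: the function and the script name in the usage comment are hypothetical, and it assumes uncompressed, plain-text VCF files.

```python
import sys


def records_at(lines, chrom, pos):
    """Return the data lines whose CHROM and POS columns match chrom and pos."""
    hits = []
    for line in lines:
        # Header lines never describe a variant record.
        if line.startswith("#"):
            continue
        # Only the first two columns (CHROM, POS) are needed for matching.
        fields = line.split(None, 2)
        if len(fields) >= 2 and fields[0] == chrom and fields[1] == str(pos):
            hits.append(line.rstrip("\n"))
    return hits


def main(argv):
    # Usage (hypothetical script name):
    #   python records_at.py 2 68599872 input1.vcf input2.vcf
    chrom, pos, paths = argv[1], int(argv[2]), argv[3:]
    for path in paths:
        with open(path) as handle:
            for record in records_at(handle, chrom, pos):
                # Prefix each record with its source file for easy comparison.
                print(f"{path}\t{record}")


# Only run the command-line entry point when enough arguments are given.
if __name__ == "__main__" and len(sys.argv) >= 4:
    main(sys.argv)
```

The output pairs each matching record with the file it came from, which makes it easier to see which input contributes the record at 68599872 when narrowing the problem down to two files.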