Hi,
I used Beagle to phase my data, but I have a problem with some indels.
Example:
Input VCF:
2 68599872 . ATG A 14.40 PASS AC=1;AC1=1;AF=0.028
Input for beagle created by ProduceBeagleInput:
2:68599872 TG - 1.0000 0.0000 0.0000 ......
Output vcf created by BeagleOutputToVCF:
2 68599872 . ATG . 14.40 BGL_RM_WAS_- AC1=1;AF1=0.02965.....
Error message from CombineVariants:
MESSAGE: Badly formed variant context at location 68599872 in contig 2. Reference length must be at most one base shorter than location size
Can you help me?
Tipahine
Answers
Have you validated the vcf file output by Beagle? If it fails you may need to contact the authors of Beagle -- if their tool is producing bad vcf files, we can't help with that. But if the problem is on our end we'll do what we can.
Geraldine Van der Auwera, PhD
Actually, this looks like it may be a bug in our code. We'll take a quick look and get back to you with some feedback.
Eric Banks, PhD -- Group Leader, Methods Development, MPG, Broad Institute of Harvard and MIT
Yeah, a shot of caffeine later I realized our BeagleOutputToVCF may be the culprit, since that's what's making the VCF. Sorry about that.
Geraldine Van der Auwera, PhD
Okay, it looks like Beagle claimed that your site was monomorphic, so BeagleOutputToVCF is filtering your site and setting the ALT allele to "." (not polymorphic). This looks reasonable. So the problem you are getting must be in CombineVariants. Are you using the latest version of the GATK? If so, what is your command line? (And if not, please update to the latest version.)
Eric Banks, PhD -- Group Leader, Methods Development, MPG, Broad Institute of Harvard and MIT
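To see how many sites are affected, it may help to list every record that looks like the one above: ALT set to "." while REF still spans more than one base. The following is only a minimal sketch, not part of the GATK; the function name `monomorphic_multibase_records` is hypothetical, and it assumes an uncompressed VCF whose first five columns are CHROM, POS, ID, REF and ALT. Whether every such record triggers the CombineVariants error is an assumption, not something confirmed in this thread.

```python
def monomorphic_multibase_records(vcf_lines):
    """Yield (chrom, pos, ref) for records with ALT '.' and a multi-base REF.

    Hypothetical helper for illustration only; it is not a GATK function.
    """
    for line in vcf_lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("#"):
            continue
        # VCF is tab-delimited; splitting on any whitespace also accepts
        # the space-separated records pasted in this thread.
        fields = line.split()
        if len(fields) < 5:
            continue
        chrom, pos, _id, ref, alt = fields[:5]
        if alt == "." and len(ref) > 1:
            yield chrom, int(pos), ref
```

It could be called as `list(monomorphic_multibase_records(open(path)))`, where `path` points at the VCF written by BeagleOutputToVCF.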
Sorry, it is not the latest version; it is v2.0-35-g2d70733. My command line is:
java -jar $GATK_HOME/GenomeAnalysisTK.jar -R $RefGen -T BeagleOutputToVCF -V $VcfFile -beagleR2:BEAGLE $r2 -beaglePhased:BEAGLE $phase -beagleProbs:BEAGLE $probs -o $vcfBeagle -U LENIENT_VCF_PROCESSING -et NO_ET -K $GATK_KEY
Sorry, it's the CombineVariants command line that we need.
Eric Banks, PhD -- Group Leader, Methods Development, MPG, Broad Institute of Harvard and MIT
I use it in a PBS script, so before that I define a value for each variable.
I would try it with the latest version of the GATK. If it still fails, then I recommend trying to find a subset of the data with which you can replicate this error (i.e. just 2 of your input VCF files to CombineVariants) and then post the records at 68599872 here so we can help you figure out where the problem is.
Eric Banks, PhD -- Group Leader, Methods Development, MPG, Broad Institute of Harvard and MIT
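One way to collect the records at 68599872 from each input file for posting is a short script like the one below. It is a rough sketch under stated assumptions: the helper name `records_at` is hypothetical, the VCF files are assumed to be uncompressed text, and the contig "2" and position 68599872 are taken from the error message above.

```python
import sys


def records_at(vcf_lines, contig, position):
    """Return the data lines whose CHROM and POS match the given site.

    Hypothetical helper for illustration only; it is not a GATK function.
    """
    matches = []
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip header lines
        # Only the first two columns (CHROM, POS) are needed.
        fields = line.split(None, 2)
        if len(fields) >= 2 and fields[0] == contig and fields[1] == str(position):
            matches.append(line.rstrip("\n"))
    return matches


if __name__ == "__main__":
    # Pass the VCF files given to CombineVariants as arguments.
    for path in sys.argv[1:]:
        with open(path) as handle:
            for record in records_at(handle, "2", 68599872):
                print(path, record, sep="\t")
```

Running it on each input to CombineVariants prints the file name next to every matching record, which makes it easier to see which pair of files is enough to reproduce the error.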