Standardize the INFO with VariantAnnotator for a merged VCF file

I have merged VCF files from different projects (using vcf-merge from VCFtools) and kept only the lines for one region. The files were not produced by the same pipeline, so they do not carry the same INFO annotations.
I would like to standardize the merged file with your tool VariantAnnotator, but I get this error message:

ERROR ------------------------------------------------------------------------------------------
ERROR A USER ERROR has occurred (version 2.2-9-g54ae978):
ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
ERROR Please do not post this error to the GATK forum
ERROR
ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Error parsing line: 4243534, for input source: /scratch/cbrc/B10/results/extract/B10_myregion_all.vcf
ERROR ------------------------------------------------------------------------------------------

But my file has only 2075 lines.
Can your tool standardize merged VCF files from different projects?
Can you help me?

Thanks,

Tiphaine

Answers

  • Tiphaine (Member)
    edited December 2012

    It works, but when I use CombineVariants and some of the VCF files already contain the dbSNP ID, I get the same ID n times (n being the number of VCF files that have the dbSNP ID). Could you add a step to your algorithm that removes the duplicate IDs?
    It would be a plus.

    example :

    3 47021780 rs59321380;rs59321380 G T 115 PASS ABHom=0.996;AC=2;AC1=2......

    3 47021837 rs62246379;rs62246379,rs62246379 G A 999 PASS ABHet=....

  • ebanks, Broad Institute (Member, Broadie, Dev)

    I can't reproduce that error locally with the latest version of the code...

  • Tiphaine (Member)

    Argh, it is true, I don't have the latest version, only 2.2-9.
    It is not easy to change the version of your software every week while we are analysing the data.

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie admin)

    Hi Tiphaine, we realize this is inconvenient, but unfortunately it is an inescapable downside of cutting-edge research software that is under active development. We are looking at ways to make the updating process a little easier.

  • Tiphaine (Member)

    I understand it, thanks
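
For anyone stuck on an older GATK version who hits the duplicated dbSNP IDs described above, the ID column can be cleaned up after merging. What follows is only a minimal sketch, not part of GATK: the function names `dedupe_ids` and `dedupe_vcf_line` are made up for illustration, and it assumes a tab-delimited VCF whose third column is the ID. IDs are normally separated by semicolons, but since the example above also shows a comma, the sketch splits on both; that is an assumption about the merged file, not a rule.

```python
import re


def dedupe_ids(id_field):
    """Collapse repeated IDs in a VCF ID value, keeping first-seen order."""
    if id_field == ".":
        return id_field  # "." means "no ID"; leave it alone
    seen = []
    # Assumption: IDs may be separated by ';' (the usual case) or ','
    # (as seen in the merged example in this thread).
    for token in re.split(r"[;,]", id_field):
        if token and token != "." and token not in seen:
            seen.append(token)
    return ";".join(seen) if seen else "."


def dedupe_vcf_line(line):
    """Return a VCF line with its ID column (third field) deduplicated."""
    if line.startswith("#"):
        return line  # header and meta lines are passed through unchanged
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        return line  # not a data line we understand; do not touch it
    fields[2] = dedupe_ids(fields[2])
    return "\t".join(fields) + ("\n" if line.endswith("\n") else "")
```

To apply it, read the merged file line by line, pass each line through `dedupe_vcf_line`, and write the result to a new file, leaving the original untouched.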
