Choice of known_indels.vcf on the Google Cloud bucket

Dear all:

After remapping whole-genome sequencing data to the GRCh38 reference assembly, I would like to do local realignment around indels, and I am wondering which known indels file to use. I saw this one on the Google Cloud bucket: Homo_sapiens_assembly38.known_indels.vcf.gz. Does it contain the 1000 Genomes Phase 3 indels? Can I use it for local realignment?

I noticed that the Best Practices pipeline doesn't have a realignment step. I am using GATK version 3.3, so I would like to do local realignment.

Many thanks


  • bhanuGandham (Cambridge MA) Member, Administrator, Broadie, Moderator

    Hi @zhangyd10

    We no longer support version 3.3. Please use the latest release, GATK 4.1.1.0.

  • zhangyd10 Member
    Dear bhanuGandham:

    How do I do local realignment with GATK 4.1.1.0? Is there a tutorial?

    Many thanks,
    Yidong Zhang
  • zhangyd10 Member
    Dear @bhanuGandham

    I am following a pipeline from an article to realign WGS reads to GRCh38 in an ALT-aware manner, so I still want to use GATK version 3, since GATK4 no longer seems to do realignment. The question I actually want to ask is: is it OK to use Homo_sapiens_assembly38.known_indels.vcf.gz from the Google Cloud bucket for local realignment with GATK3's "IndelRealigner"?

    Many thanks
    Yidong Zhang
  • bshifaw Member, Broadie, Moderator

    No, there isn't a tutorial on local realignment in GATK 4.1.1.0, but we do have some GATK3 docs on the tool here and here. I'm not sure whether the file was specifically designed for local realignment, but judging by its name I would assume it contains known indel variants, which is what you'll need to run the tool.
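
For readers looking for the shape of the GATK3 workflow discussed above, here is a minimal sketch that only assembles the two command lines (RealignerTargetCreator, then IndelRealigner) and prints them; it does not run GATK. The helper name `gatk3_realign_commands`, the jar path, the reference and BAM names, and the output file names are all placeholders assumed for illustration. Only the known indels file name comes from this thread, and whether that file suits a particular analysis is not confirmed here.

```python
import shlex


def gatk3_realign_commands(jar, reference, bam, known_indels,
                           intervals="realign_targets.intervals",
                           out_bam="realigned.bam"):
    """Build (but do not run) the two GATK3 local-realignment commands.

    Hypothetical helper: every path passed in is a placeholder.
    """
    base = ["java", "-jar", jar]

    # Step 1: find intervals that may need realignment, using known indels.
    create_targets = base + [
        "-T", "RealignerTargetCreator",
        "-R", reference,
        "-I", bam,
        "-known", known_indels,
        "-o", intervals,
    ]

    # Step 2: realign reads over the intervals produced by step 1.
    realign = base + [
        "-T", "IndelRealigner",
        "-R", reference,
        "-I", bam,
        "-known", known_indels,
        "-targetIntervals", intervals,
        "-o", out_bam,
    ]
    return create_targets, realign


if __name__ == "__main__":
    commands = gatk3_realign_commands(
        jar="GenomeAnalysisTK.jar",                    # assumed jar location
        reference="Homo_sapiens_assembly38.fasta",     # assumed reference name
        bam="sample.bam",                              # assumed input BAM
        known_indels="Homo_sapiens_assembly38.known_indels.vcf.gz",
    )
    for cmd in commands:
        # Print shell-quoted commands for inspection before running them.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing the commands rather than executing them makes it easy to check the paths first; the same known indels file is passed to both steps so that target creation and realignment agree.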
