How to get deletions?

sridhar28 Posts: 13 Member

Dear GATK Users,

Could anybody tell me how to identify deletions from a BAM file using a GATK module? I used UnifiedGenotyper and I am getting a list like this:

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT human

gi|262 48155 . G A 80.77 . AC=1;AF=0.500;AN=2;BaseQRankSum=0.103;DP=10;Dels=0.00;FS=0.000;HaplotypeScore=0.0000;MLEAC=1;MLEAF=0.500;MQ=28.61;MQ0=0;MQRankSum=-1.453;QD=8.08;ReadPosRankSum=-0.336 GT:AD:DP:GQ:PL 0/1:5,5:10:99:109,0,146

Thanks, Sridhar

Answers

  • aeonsim Posts: 45 Member

    If you are using UnifiedGenotyper, make sure you add the option -glm BOTH to call both SNPs and indels, or -glm INDEL to call only indels. If you're only working with a few samples, you really should use the GATK HaplotypeCaller for calling indels.
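A minimal sketch of how the -glm option fits into a full UnifiedGenotyper command line, assuming a GATK 3.x jar. The file names (GenomeAnalysisTK.jar, ref.fasta, sample.bam, calls.vcf) and the helper name are placeholders for illustration, not taken from the thread:

```python
import subprocess

# Genotype likelihood models discussed in this thread.
GLM_CHOICES = {"SNP", "INDEL", "BOTH"}


def unified_genotyper_cmd(reference, bam, out_vcf, glm="BOTH",
                          jar="GenomeAnalysisTK.jar"):
    """Build a GATK 3.x UnifiedGenotyper command as an argument list.

    All paths are hypothetical placeholders supplied by the caller.
    """
    if glm not in GLM_CHOICES:
        raise ValueError("glm must be one of %s" % sorted(GLM_CHOICES))
    return [
        "java", "-jar", jar,
        "-T", "UnifiedGenotyper",
        "-R", reference,   # reference FASTA the BAM was aligned to
        "-I", bam,         # input alignments
        "-glm", glm,       # BOTH = SNPs and indels, INDEL = indels only
        "-o", out_vcf,     # output VCF
    ]


cmd = unified_genotyper_cmd("ref.fasta", "sample.bam", "calls.vcf", glm="BOTH")
print(" ".join(cmd))
# To actually run it (requires Java and the GATK jar to be present):
# subprocess.run(cmd, check=True)
```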

    In a VCF, a 6-base insertion looks something like:

    gi|262      48155     .     G       GACTGAT      80.77 . AC=1;...

    while a deletion looks like:

    gi|262      48155     .     GACACTGG      G    80.77 . AC=1;...

    To make them easier to spot, you can add a variant type value (SNP, INS, DEL, etc.) to the INFO field using SnpSift, which is part of the SnpEff suite. There may also be a tool to do that with GATK.
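As a rough illustration of how the variant type follows from the REF and ALT columns, here is a small sketch that assumes the standard VCF convention: an insertion has an ALT allele longer than REF, and a deletion has a REF allele longer than ALT. The function names are made up for this example, and symbolic alleles such as <DEL> are deliberately not handled:

```python
def classify_variant(ref, alt):
    """Classify one REF/ALT pair by comparing allele lengths."""
    if alt in (".", "") or alt.startswith("<"):
        return "NONE"  # no alternate allele, or a symbolic one we skip
    if len(ref) == len(alt):
        return "SNP" if len(ref) == 1 else "MNP"
    return "INS" if len(alt) > len(ref) else "DEL"


def find_deletions(vcf_lines):
    """Yield (chrom, pos, ref, alt) for every deletion allele in a VCF."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref = fields[0], int(fields[1]), fields[3]
        # A record may list several ALT alleles separated by commas.
        for alt in fields[4].split(","):
            if classify_variant(ref, alt) == "DEL":
                yield chrom, pos, ref, alt


example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "gi|262\t48155\t.\tG\tA\t80.77\t.\tAC=1",
    "gi|262\t48155\t.\tG\tGACTGAT\t80.77\t.\tAC=1",
    "gi|262\t48155\t.\tGACACTGG\tG\t80.77\t.\tAC=1",
]
for record in find_deletions(example):
    print(record)
```

To scan a real file, pass an open file handle instead of the example list; gzipped VCFs can be opened with gzip.open(path, "rt").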

  • Geraldine_VdAuwera Posts: 5,276 Administrator, GSA Member

    Yes, there is a variant type annotation; see the documentation for usage.

    Geraldine Van der Auwera, PhD

  • sridhar28 Posts: 13 Member

    Thanks for the reply, aeonsim and Geraldine.

    Is it necessary to use a dbSNP VCF file when running UnifiedGenotyper or VariantAnnotator? I am getting an error when I use the dbSNP file that I downloaded from ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/VCF/00-All.vcf.gz:

    ERROR MESSAGE: Input files /illumina/data/galaxy/reference/00-All.vcf and reference have incompatible contigs: No overlapping contigs found.
    ERROR /illumina/data/galaxy/reference/00-All.vcf contigs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, Y, MT]
    ERROR reference contigs = [gi|262359905|ref|NG_005905.2|]

    Please advise.

  • Geraldine_VdAuwera Posts: 5,276 Administrator, GSA Member

    No, it's not necessary, but it can be informative.

    The error you are getting is because you used a version of dbSNP that is not compatible with the reference you are using. It looks like you are using a custom reference, not one of the standard human ones. Is that correct?

    Geraldine Van der Auwera, PhD
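The incompatibility can be checked by hand before running GATK. The sketch below compares the contig names used in a VCF with those in a reference FASTA. It is a rough pre-check under the assumption that both files are plain text and that a FASTA contig name is the header text up to the first whitespace; it is not how GATK itself performs the check, and the function names and example lines are illustrative only:

```python
def vcf_contigs(vcf_lines):
    """Return contig names from the CHROM column, in order of appearance."""
    names = []
    seen = set()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom = line.split("\t", 1)[0]
        if chrom not in seen:
            seen.add(chrom)
            names.append(chrom)
    return names


def fasta_contigs(fasta_lines):
    """Return contig names from FASTA headers (text after '>' up to whitespace)."""
    names = []
    for line in fasta_lines:
        if line.startswith(">"):
            header = line[1:].split()
            if header:
                names.append(header[0])
    return names


def shared_contigs(vcf_names, ref_names):
    """Return the contig names present in both lists."""
    ref_set = set(ref_names)
    return [name for name in vcf_names if name in ref_set]


# Illustrative inputs mirroring the names in the error message above.
dbsnp_lines = ["##fileformat=VCFv4.1", "1\t100\trs0\tG\tA\t.\t.\t.",
               "MT\t200\trs1\tC\tT\t.\t.\t."]
ref_lines = [">gi|262359905|ref|NG_005905.2| example description", "ACGT"]

overlap = shared_contigs(vcf_contigs(dbsnp_lines), fasta_contigs(ref_lines))
print(overlap if overlap else "No overlapping contigs found")
```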

  • sridhar28 Posts: 13 Member

    Hello Geraldine, thanks for the reply. You are right, I am using a custom reference. How can I get a customised dbSNP file?

    Thanks, Sridhar

  • Geraldine_VdAuwera Posts: 5,276 Administrator, GSA Member

    The typical way is to make liftover chain files that map the correspondence between the standard reference and your custom one. This is not an easy process, however, and we do not provide guidance on how to do it. If you want to do this, you will need to look at the liftover files we provide in our resource bundle and figure out how to do the equivalent for your reference. Good luck!

    Geraldine Van der Auwera, PhD
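To give a feel for what such a mapping does, here is a sketch of the simplest possible case only: a custom contig that corresponds to one ungapped, same-strand stretch of a standard chromosome. This is an assumption made for illustration. Real chain files also describe gaps, multiple blocks and reverse-strand alignments, none of which this handles, and the coordinates and names below are invented:

```python
def lift_record(vcf_line, src_contig, src_start, src_end, dst_contig):
    """Shift one VCF record from a chromosome region onto a custom contig.

    Assumes position 1 of dst_contig corresponds to src_start on src_contig
    (1-based, inclusive coordinates, same strand, no gaps). Returns the
    rewritten line, or None if the record is not fully inside the region.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos, ref = fields[0], int(fields[1]), fields[3]
    end = pos + len(ref) - 1  # last reference base covered by the record
    if chrom != src_contig or pos < src_start or end > src_end:
        return None
    fields[0] = dst_contig
    fields[1] = str(pos - src_start + 1)
    return "\t".join(fields)


# Hypothetical region: chromosome "17", positions 1000-2000, mapped to "custom".
inside = "17\t1005\trs0\tG\tA\t.\t.\t."
outside = "17\t5000\trs1\tC\tT\t.\t.\t."
print(lift_record(inside, "17", 1000, 2000, "custom"))
print(lift_record(outside, "17", 1000, 2000, "custom"))
```

Records that return None fall outside the mapped region and would simply be dropped from the customised file.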