The current GATK version is 3.6-0
Examples: Monday, today, last week, Mar 26, 3/26/04

Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

Powered by Vanilla. Made with Bootstrap.

UnifiedGenotyper missed indels calling on 100 base pair reads

malwanamalwana Member Posts: 7

Hi, I'm using The Genome Analysis Toolkit (GATK) v2.3-9-ge5ebf34

I was running the UnifiedGenotyper tools on a 100bp paired end read with the the command below:

Program Args: -T UnifiedGenotyper -R ref.fasta -I realignedBwa.bam -glm BOTH -log variantsCalling.log -o bwa_variants.vcf

I noticed that only SNPs were called and not a single indel was called. however, when i used a 150bp paired end read with the similar command above it worked fine, the SNPs and Indels were called.

To make sure that the realignedBWA.bam file that i used was not corrupt/malformed, i used two different other pileup program on this bam file and the SNPs and Indels were called nicely. Hope you can take a look at this. I attached together the log file just in case

thanks in advance

log
log
variantsCalling.log
18K

Best Answers

Answers

  • Geraldine_VdAuweraGeraldine_VdAuwera Administrator, Dev Posts: 10,704 admin

    Hi there,

    Can you tell me what happens if you set the -glm argument to INDEL?

    Geraldine Van der Auwera, PhD

  • malwanamalwana Member Posts: 7

    If the -glm argument is set up to INDEL, nothing is called up except for the 150bp which called up the indel

    while if i set up the argument to BOTH as well as SNP, only the SNPs were called.

  • Geraldine_VdAuweraGeraldine_VdAuwera Administrator, Dev Posts: 10,704 admin

    Hmm. Try using the --output_mode EMIT_ALL_SITES argument and see if the indels get called and with what quality scores.

    Geraldine Van der Auwera, PhD

  • Geraldine_VdAuweraGeraldine_VdAuwera Administrator, Dev Posts: 10,704 admin

    Oh, scratch that, that option doesn't work for indels.

    Have you tried calling it with the HaplotypeCaller? It is more advanced than the UG.

    Geraldine Van der Auwera, PhD

  • malwanamalwana Member Posts: 7

    I've been trying the HaplotypeCaller, and i'm not sure if it's my procedure/methods are wrong, but not a single SNPs or indel was called. the only output are the progress meter and no calling (for both 100bp and 150bp)

  • Geraldine_VdAuweraGeraldine_VdAuwera Administrator, Dev Posts: 10,704 admin

    That's very odd. What kind of dataset are you running on?

    Geraldine Van der Auwera, PhD

  • malwanamalwana Member Posts: 7

    I simulated mutation read from a reference genome using DWGSIM, which produced reads (used the parameter to generate the read as if coming from illumina) which I align using several aligner including BWA..

  • malwanamalwana Member Posts: 7

    Sorry for the long delay, it's holiday here.. I'll take a quick look and reply back, thanks a lot for helping out

  • Geraldine_VdAuweraGeraldine_VdAuwera Administrator, Dev Posts: 10,704 admin

    No worries, enjoy your holiday!

    Geraldine Van der Auwera, PhD

  • malwanamalwana Member Posts: 7

    The base qualities are good enough, higher than the default minimum of 17 needed by the UG. A check on BQSR.log showed only 253 reads were filtered out from 20257 (1.25%) and samtools flagstat showed a 100% QC-passed reads.

    Should I try to change any parameters involving the minimum base qualities because as for now i'm using the default.

  • malwanamalwana Member Posts: 7

    ok, my bad.. it turns out the DP for indels is lower than the minimum thus making the indels being ommited.

    Thanks for helping out pdexheimer and Geraldine.

  • Geraldine_VdAuweraGeraldine_VdAuwera Administrator, Dev Posts: 10,704 admin

    Glad to hear your problem is solved.

    Geraldine Van der Auwera, PhD

Sign In or Register to comment.