# Indel Calling

Member Posts: 6

I have Ion Torrent data and am trying to call a variant that I know exists (confirmed by Sanger sequencing). At the position of the known indel, I have a depth of roughly 80-90 (in two different runs), and 20-23% of those reads carry the insertion. Which parameters should I adjust to get this indel called? I don't mind a large number of false positives.

I've tried several iterations, with indel realignment both using known indels (1000G_phase1 and Mills_and_1000G_gold_standard) and without them. I have also tried iterations with these flags set in UnifiedGenotyper: `-stand_call_conf 30.0 -stand_emit_conf 0.0 --min_base_quality_score 0 -glm BOTH --dbsnp dbsnp_137.b37.vcf -nt -rf BadCigar -minIndelCnt 3 -minIndelFrac 0.15`. I have also attempted to use HaplotypeCaller: `-stand_call_conf 30.0 -stand_emit_conf 0.0 --dbsnp dbsnp_137.b37.vcf -rf BadCigar`

Any suggestions would be great.

Hi there,

Is your indel quite big? If so you may need to use HaplotypeCaller and override the default ActiveRegion size to increase the callable size.

You can also try running in GENOTYPE_GIVEN_ALLELES mode to force a call.

Geraldine Van der Auwera, PhD

• Member Posts: 6

Thank you Geraldine. I will give this a try. The indel is just a dupT.

• Member Posts: 6

Geraldine,

I ran UnifiedGenotyper with the parameters below, with no input indel VCFs in the indel realignment step, and got no variants.

`-stand_call_conf 30.0 -stand_emit_conf 0.0 --min_base_quality_score 0 -glm BOTH --dbsnp $dbSNPRef -nt $numThreads -rf BadCigar -minIndelCnt 3 -minIndelFrac 0.15 --genotyping_mode GENOTYPE_GIVEN_ALLELES`

I'm obviously missing something. It appears that I should have included the `--alleles` flag. What should I pass to it? I don't quite understand `RodBinding[VariantContext]`. Could I pass a BED file of regions where I want to force a call, so that I can then go back and see what kind of quality scores prevented the dupT from being called in the first place?

Thank you.

Ah yes, the point of GGA mode is that you provide known variant sites with specific ALT alleles that you are interested in, and ask the GATK to evaluate whether they are present in your samples. To do this you pass in a VCF containing the sites/alleles of interest with the --alleles argument, and typically you also pass the same VCF in via the -L argument to restrict calling to those sites (otherwise GATK will try to call the rest of the genome as well in normal discovery mode).
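For anyone following along, here is a minimal sketch of what that setup could look like for a single-base T duplication. Everything specific in it is an assumption for illustration: the file names, the contig `1`, the position `1000`, and the assumption that the reference base at that position is `T` (so the duplication is written as REF=`T`, ALT=`TT`). The script only prints the GATK 3 command unless `RUN=1` is set, so you can check it before running anything.

```shell
#!/bin/sh
set -eu

# Hypothetical placeholders: substitute your own files.
REF="reference.fasta"
BAM="sample.realigned.bam"
GATK_JAR="GenomeAnalysisTK.jar"
ALLELES_VCF="force_call_sites.vcf"

# Write a minimal sites-only VCF with one record.
# Arguments: output file, contig, position, REF allele, ALT allele.
write_alleles_vcf() {
  {
    printf '##fileformat=VCFv4.1\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
    printf '%s\t%s\t.\t%s\t%s\t.\t.\t.\n' "$2" "$3" "$4" "$5"
  } > "$1"
}

# Print the command by default; execute it only when RUN=1.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# UnifiedGenotyper in GGA mode: the same VCF supplies the alleles
# (--alleles) and restricts calling to those sites (-L).
gga_call() {
  run java -jar "$GATK_JAR" -T UnifiedGenotyper \
    -R "$REF" -I "$BAM" \
    -glm BOTH \
    --genotyping_mode GENOTYPE_GIVEN_ALLELES \
    --alleles "$ALLELES_VCF" \
    -L "$ALLELES_VCF" \
    --output_mode EMIT_ALL_SITES \
    -o gga_calls.vcf
}

# Placeholder site: a dupT written as T -> TT at 1:1000.
write_alleles_vcf "$ALLELES_VCF" 1 1000 T TT
gga_call
```

The `--output_mode EMIT_ALL_SITES` setting is added here so that a record is written for the site even if the genotype comes back homozygous reference, which makes it easier to inspect the annotations at that position.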

This may still not produce the call you want; if so you can use the experimental reference likelihoods in the UG, or the HaplotypeCaller's reference confidence model to get an idea of what the GATK thinks is going on at those sites.
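As a rough sketch of the HaplotypeCaller route, the command below asks for per-base reference confidence over a small window around the site. The file names and the interval `1:900-1100` are hypothetical placeholders, and the script only prints the command unless `RUN=1` is set. Depending on the GATK 3 release, additional index arguments (`--variant_index_type LINEAR --variant_index_parameter 128000`) may be required when emitting reference confidence, so treat this as a starting point rather than a definitive invocation.

```shell
#!/bin/sh
set -eu

# Hypothetical placeholders: substitute your own files and interval.
REF="reference.fasta"
BAM="sample.realigned.bam"
GATK_JAR="GenomeAnalysisTK.jar"
REGION="1:900-1100"

# Print the command by default; execute it only when RUN=1.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# HaplotypeCaller with the reference confidence model, emitting one
# record per base in the window instead of collapsed reference blocks.
hc_refconf() {
  run java -jar "$GATK_JAR" -T HaplotypeCaller \
    -R "$REF" -I "$BAM" \
    -L "$REGION" \
    --emitRefConfidence BP_RESOLUTION \
    -o site_refconf.g.vcf
}

hc_refconf
```

The per-base records in the output carry genotype likelihoods for the reference versus a non-reference allele, which gives some indication of how strongly the data at the indel position support the reference call.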

Geraldine Van der Auwera, PhD