Detecting indels at sites of genome editing by CRISPR/Cas9

Hobbs · Saint Louis, MO · Member

Hello,

Thank you for this great resource you have made available to the scientific community.

Background: We created targeted mutations in a human cell line using CRISPR/Cas9. We PCR-amplified across the target site and sequenced the PCR product using Ion Torrent. TMAP was used for alignment to hg19, and Integrated Genome Browser was used to view the alignments graphically. We can visualize deletions shorter than 30 bases, but the published literature indicates that we should see deletions and insertions of up to 150 bases (or even more) around the cut site. Some investigators have used a combination of bwa-mem and GATK for this purpose, but they did not provide details of their analyses.

Clarification/query: I have gone through the GATK tutorial videos (from March 2015) and perused the GATK Best Practices, and it appears that HaplotypeCaller (HC) is the best tool for detecting larger indels. What parameters do I need to tweak in HC for this purpose?

I surmise that I would also have to use the correct parameters in bwa-mem so that such deletions are present in the output BAM file for HC to be able to 'see' them.

I don't see a recommended pipeline for this type of analysis on the GATK website.

I also looked through all 39 pages of the forum but did not see this particular question asked.

Thank you!

Cheers

Comments

  • TechnicalVault · Cambridge, UK · Member ✭✭✭
    edited September 2015

    I wonder if what you're seeing here is an interaction between GATK and the change in the way BWA-MEM maps longer indels. From what I understand, if the indel exceeds a certain size, BWA-MEM represents it with a supplementary alignment rather than putting the indel in the CIGAR string. If that is what is happening, you'd see a wall of soft-clipped primary reads and hard-clipped supplementary alignments on either side of the indel.
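
One way to check for the pattern described above is to look at the SAM flag and CIGAR of each alignment. The following is a minimal sketch in plain Python, assuming the BAM has been converted to SAM text first (for example with samtools view). The function name `summarize_sam_record` and the two example records are made up for illustration; only the 0x800 supplementary flag bit, the CIGAR operators, and the `SA:Z:` tag come from the SAM format itself.

```python
import re

SUPPLEMENTARY = 0x800  # SAM flag bit marking a supplementary alignment
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def summarize_sam_record(line):
    """Report flag/CIGAR features of one SAM line that hint at a split-read indel."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(fields[5])]
    return {
        "name": fields[0],
        "supplementary": bool(flag & SUPPLEMENTARY),
        "soft_clipped": sum(n for n, op in ops if op == "S"),
        "hard_clipped": sum(n for n, op in ops if op == "H"),
        # Indels that the aligner kept inside the CIGAR string
        "longest_deletion": max((n for n, op in ops if op == "D"), default=0),
        "longest_insertion": max((n for n, op in ops if op == "I"), default=0),
        # The SA tag lists the other pieces of a split alignment
        "has_sa_tag": any(f.startswith("SA:Z:") for f in fields[11:]),
    }


# Hypothetical records: a soft-clipped primary and its hard-clipped supplementary
primary = "read1\t0\tchr1\t1000\t60\t60M90S\t*\t0\t0\t*\t*\tSA:Z:chr1,1210,+,60H90M,60,0;"
supplementary = "read1\t2048\tchr1\t1210\t60\t60H90M\t*\t0\t0\t*\t*\tSA:Z:chr1,1000,+,60M90S,60,0;"

for record in (primary, supplementary):
    print(summarize_sam_record(record))
```

Many reads with `has_sa_tag` set and large `soft_clipped` values piling up at the cut site would be consistent with the large indel being split across alignments instead of appearing as a D or I operation in the CIGAR.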

  • Sheila · Broad Institute · Member, Broadie, Moderator, admin

    @Hobbs
    Hi,

    Have you tried using BWA and following our Best Practices? HaplotypeCaller should be able to call the larger indels without you having to tweak the parameters.

    -Sheila

  • Hobbs · Saint Louis, MO · Member

    Hi,
    I was confused by this bit on the Best Practices Page: "The GATK Best Practices workflows provide step-by-step recommendations for performing variant discovery analysis in high-throughput sequencing (HTS) data. They enable discovery of SNPs and small indels (no size limit in theory but adjustments may be required to call indels > 50 bp) in DNA and RNAseq."

    I was wondering what those adjustments were.

    I am very new to GATK and the command line, so it is going to be a while before I can answer your question about the results from using BWA with the GATK Best Practices. My current plan is to use bwa-mem for alignment, then mark duplicates using Picard (v 1.136) on the Galaxy website, and then call variants with HC in GATK (initial testing reveals that the installation on my Mac works!). If everything goes well, I hope to answer your question shortly!

    Cheers

  • Sheila · Broad Institute · Member, Broadie, Moderator, admin

    @Hobbs
    Hi,

    Geraldine just reminded me that increasing the active region size will help you call longer indels. https://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_gatk_tools_walkers_haplotypecaller_HaplotypeCaller.php#--activeRegionMaxSize

    There may be other tweaks, but changing the active region size will be a good start.

    Let us know how things are going, and Good Luck!

    -Sheila
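
For anyone landing here later, the active-region suggestion above might look roughly like this on the command line. This is only a sketch that builds the argument list in Python and prints it without running anything. It assumes the GATK 3.x invocation style (`GenomeAnalysisTK.jar` with `-T`, `-R`, `-I`, `-o`); the file names and the value 500 are placeholders picked for illustration, not recommendations from this thread. Only the `--activeRegionMaxSize` argument itself comes from the documentation link above.

```python
import shlex


def haplotypecaller_command(reference, bam, out_vcf, active_region_max_size):
    """Build a GATK 3.x-style HaplotypeCaller argument list (not executed here)."""
    return [
        "java", "-jar", "GenomeAnalysisTK.jar",
        "-T", "HaplotypeCaller",
        "-R", reference,
        "-I", bam,
        "-o", out_vcf,
        # Argument named in the linked HaplotypeCaller documentation
        "--activeRegionMaxSize", str(active_region_max_size),
    ]


# Placeholder inputs; substitute your own reference, BAM, and output paths
cmd = haplotypecaller_command("hg19.fasta", "amplicon.bam", "amplicon.vcf", 500)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the command as a list makes it easy to try several active-region sizes in a loop and compare which indels each run reports.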

  • Hobbs · Saint Louis, MO · Member

    Thanks very much! I will try that.
