STR annotation

May I know how is this defined?

Best Answer

  • delangel, Broad Institute
    Accepted Answer

    If the STR flag is present in the INFO field, the indel is a tandem repeat. In that case, the RU annotation gives the bases that form the repeat unit, and RPA gives the number of repeats for each allele, reference first.
    For example, if (CA)^3 becomes (CA)^4 (a 2 bp insertion), then STR is present, RU = CA, and RPA = 3,4.
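As a minimal sketch of how these three annotations fit together, the Python snippet below pulls STR, RU, and RPA out of an INFO string and converts the repeat counts into a length change in base pairs. The function names `parse_str_info` and `indel_sizes` are hypothetical helpers written for this illustration, not part of GATK, and the snippet assumes RPA lists the reference allele first, as in the example above.

```python
def parse_str_info(info):
    """Return (is_str, repeat_unit, repeats_per_allele) from a VCF INFO string."""
    flags = set()
    values = {}
    for entry in info.split(";"):
        if "=" in entry:
            key, value = entry.split("=", 1)
            values[key] = value
        elif entry:
            # Entries without "=" are flags, such as STR.
            flags.add(entry)
    is_str = "STR" in flags
    ru = values.get("RU")
    rpa = [int(n) for n in values["RPA"].split(",")] if "RPA" in values else []
    return is_str, ru, rpa


def indel_sizes(ru, rpa):
    """Length change in bp of each ALT allele relative to REF.

    Assumes rpa[0] is the reference repeat count. Negative values are deletions.
    """
    ref_repeats = rpa[0]
    return [(alt_repeats - ref_repeats) * len(ru) for alt_repeats in rpa[1:]]


is_str, ru, rpa = parse_str_info("STR;RU=CA;RPA=3,4")
print(is_str, ru, rpa, indel_sizes(ru, rpa))  # True CA [3, 4] [2]
```

With this convention, a contraction such as RU=A with RPA=2,1 yields -1, i.e. a 1 bp deletion relative to the reference.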

Answers

  • ambarrio, Member

    Dear @delangel,

    I wonder, then: if I get a lot of negative values, is this because those STRs are deletions relative to the reference genome?

    Best

  • xiucz, Member

    How does the STR flag get set in the INFO column? Please see the IGV screenshot and the Mutect2 tumor-only
    output here.

    chrX    15349851    .   TA  T   .   PASS    CONTQ=93;DP=26;ECNT=1;GERMQ=18;MBQ=35,36;MFRL=256,222;MMQ=60,60;MPOS=40;POPAF=7.30;RPA=2,1;RU=A;SAAF=0.283,0.273,0.308;SAPP=0.017,0.026,0.957;STR;TLOD=22.22    GT:AD:AF:DP:F1R2:F2R1:OBAM:OBAMRC   0/1:18,8:0.321:26:9,3:9,5:false:false
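A quick way to see what the STR annotations on a record like this claim is to compare the REF/ALT length difference with the difference implied by RU and RPA. The sketch below does that for an abbreviated copy of the record above (only a few INFO keys are kept for brevity). `check_str_record` is a hypothetical helper, not a GATK function, and it assumes plain sequence alleles (no symbolic alleles such as `<DEL>` or `*`) and that RPA lists the reference first.

```python
def check_str_record(vcf_line):
    """Compare REF/ALT length changes with those implied by RU and RPA."""
    fields = vcf_line.split()
    ref, alts, info = fields[3], fields[4].split(","), fields[7]
    entries = info.split(";")
    values = dict(e.split("=", 1) for e in entries if "=" in e)
    ru = values["RU"]
    rpa = [int(n) for n in values["RPA"].split(",")]
    # Length change seen directly in the alleles.
    observed = [len(alt) - len(ref) for alt in alts]
    # Length change implied by the repeat counts (reference count is rpa[0]).
    expected = [(n - rpa[0]) * len(ru) for n in rpa[1:]]
    return {
        "is_str": "STR" in entries,
        "ru": ru,
        "rpa": rpa,
        "observed_bp": observed,
        "expected_bp": expected,
    }


# Abbreviated version of the record above.
record = "chrX 15349851 . TA T . PASS DP=26;RPA=2,1;RU=A;STR;TLOD=22.22"
result = check_str_record(record)
print(result["observed_bp"], result["expected_bp"])  # [-1] [-1]
```

Here both values are -1: the call is a 1 bp deletion that takes a run of two A's in the reference down to one.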
    

    Thank you.

  • xiucz, Member
    edited June 2019

    Can any GATKers give me some suggestions?
    Thank you very much.

  • bhanuGandham, Cambridge MA (Member, Administrator, Broadie, Moderator)

    Hi @xiucz

    Apologies for the delay in replying. To clarify, is your question "why did the tool flag this as an STR"? Could you also provide BAM and bamout screenshots of this region for comparison?

  • bhanuGandham, Cambridge MA (Member, Administrator, Broadie, Moderator)

    Also, please post the version of Mutect2 you used and the exact command.

  • xiucz, Member
    edited June 2019

    Hi, @bhanuGandham

    Thank you for your reply.

    Yes, my question is "why did the tool flag an STR here"? It seems that there is only one repeated allele.
    Screenshot and commands:

    ~/gatk-4.1.0.0/gatk Mutect2 \
    -R ~/database/hg19/gatk_bundle/ucsc.hg19.fasta \
    -I ~/sample.recalibrated.bam -tumor samplename \
    -L ~/exon.bed --interval-padding 250 \
    --germline-resource ~/af-only-gnomad.raw.sites.hg19.vcf.gz \
    --tmp-dir ~/tmpdir -O sample.vcf \
    -bamout sample.recalibrated.mt2.bam
    
    ~/gatk-4.1.0.0/gatk GetPileupSummaries \
    -I ~/sample.recalibrated.bam \
    -L ~/exon.bed --interval-padding 250 \
    -V ~/small_exac_common_3_hg19.vcf.gz \
    -O sample.tumor.getpileupsummaries.table
    
    
    ~/gatktools/gatk-4.1.0.0/gatk CalculateContamination \
    -I sample.tumor.getpileupsummaries.table \
    -O sample.calculatecontamination.table
    
    
    ~/gatk-4.1.0.0/gatk FilterMutectCalls \
    -V sample.vcf \
    --contamination-table sample.calculatecontamination.table \
    -O sampple.mutect2_oncefilter.vcf
    

    Xiucz

  • bhanuGandham, Cambridge MA (Member, Administrator, Broadie, Moderator)

    Hi @xiucz

    This is what the Mutect2 dev said:
    M2 rejects a variant only when the "slippage" filter (previously "str_contraction") is applied; the STR INFO flag by itself does not mean rejection. The STR INFO flag is set for any indel of one or more units of a repeat, even if, as in this case, there are as few as two repeat units in the reference. M2 would never filter such a short "STR," however. Currently we don't filter STRs shorter than 8 reference bases, and we don't necessarily filter longer ones. We are about to release (within two months) a much-improved polymerase slippage filter.
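To make the size remark concrete, here is a small sketch that estimates how many reference bases a repeat covers and compares that with the 8-base figure quoted above. It is only an illustration of the statement, not Mutect2's filtering logic: the function names are hypothetical, the reference span is assumed to be the reference repeat count (first RPA value) times the repeat unit length, and the threshold is copied from the reply rather than read from the tool.

```python
MIN_REF_BASES = 8  # figure quoted in the reply above; an assumption, not a Mutect2 constant


def reference_repeat_bases(ru, rpa):
    """Reference bases covered by the repeat, assuming rpa[0] is the reference count."""
    return rpa[0] * len(ru)


def too_short_for_slippage_filter(ru, rpa, min_ref_bases=MIN_REF_BASES):
    """True if the repeat is shorter than the quoted threshold.

    False only means the site is long enough to be considered;
    per the reply, longer repeats are not necessarily filtered.
    """
    return reference_repeat_bases(ru, rpa) < min_ref_bases


# The chrX record from this thread: RU=A, RPA=2,1
print(reference_repeat_bases("A", [2, 1]))         # 2
print(too_short_for_slippage_filter("A", [2, 1]))  # True
```

For the record in this thread, the repeat covers 2 reference bases, which is well under the quoted threshold, so the STR flag there is informational and the call keeps its PASS status.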
