Error in GATK4 MUTECT2

Diora (Member)
edited February 6 in Ask the GATK team

Hello,
I am calling somatic mutations from RNA-seq data with GATK4.
Mutect2 fails with this error::

 BAM header sample names [S1]does not contain given tumor sample name S1

This is how I assigned read groups and sample names::

 S1.dedupped.bam:  
 java -Dlog4j.configurationFile="log4j2.xml" -jar ${PICARD}/picard.jar AddOrReplaceReadGroups \
 I=${WHERE}/Aligned.out.sam O=${WHERE}/rg_added_sorted.bam \
 SO=coordinate RGID="@E00461_116_EM170602261_1" RGLB=S1_S13 RGPL=ILLUMINA \
 RGPU="@E00461_116_EM170602261_1.S1_S13" RGSM=S1

And this is the command that throws the error::

   S1.vcf.gz: S1.bam
   ${GATK4}/gatk Mutect2 \
   -R ${hg38}.fasta \
    -I S1.bam \
    -tumor S1 \
    -O S1_pon.vcf.gz

However, 'S1' doesn't show up in my S1.bam.

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @Diora
    Hi,

    Can you post the BAM header that shows the @RG lines?

    Thanks,
    Sheila

  • E00461:116:GW170602261:1:2111:12713:2276 163 chr20 80381 60 151M = 80387 157 CCTTGATCCTGAATCAACAGACCACTTGCAGATATACTTCACAGCCCACGCTGACTCTGCCAAGCACAGACAACCACTGGGCCCCAGGGGAGCTGCAGGTCTCCTGGTCACCTAATCTTTTTTTTTTTTATACTTTAAGTTTTTGGGTACA @A;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;6;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A;B;=;;;;;;=B;;;;;;;;;;;;;;;;;;;<;;;;;;;;;;=;=;BA;A=;;;;;.-B;?=A. MC:Z:151M BD:Z:OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO PG:Z:MarkDuplicates RG:Z:@E00461_116_GW170602261_1 NH:i:1 BI:Z:MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMLMMMMMMMMMMMMMLMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMLMMMMMMMMMMMMMMMMMMMMMMLMMMMMMMMMMMMM HI:i:1 nM:i:1 MQ:i:60 AS:i:298
    E00461:116:GW170602261:1:2111:12713:2276 83 chr20 80387 60 151M = 80381 -157 TCCTGAATCAACAGACCACTTGCAGATATACTTCACAGCCCACGCTGACTCTGCCAAGCACAGACAACCACTGGGCCCCAGGGGAGCTGCAGGTCTCCTGGTCACCTAATCTTTTTTTTTTTTATACTTTAAGTTTTGGGGTACATGTGCA A;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;AA MC:Z:151M BD:Z:OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO PG:Z:MarkDuplicates RG:Z:@E00461_116_GW170602261_1 NH:i:1 BI:Z:MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMLMMMMMMLMMMMMMMMMMMMMMMMMMMMMM HI:i:1 nM:i:1 MQ:i:60 AS:i:298

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @Diora
    Hi,

    I need to see the header, not the records, so I can see what your SM tags look like. You can print just the header lines with samtools view -H.

    -Sheila
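
The header check described above can be sketched in shell as follows. This is only an illustration: `rg_samples` is a hypothetical helper name, the file name `S1.bam` is taken from the Mutect2 command in the question, and `samtools` is assumed to be on the PATH.

```shell
# Print the unique SM (sample name) values found in @RG header lines read from stdin.
# rg_samples is a hypothetical helper, not part of samtools or GATK.
rg_samples() {
  awk -F'\t' '/^@RG/ {
    # Header fields are tab-separated TAG:VALUE pairs; keep only the SM values.
    for (i = 2; i <= NF; i++)
      if ($i ~ /^SM:/) print substr($i, 4)
  }' | sort -u
}

# Typical use (assumes samtools is installed and S1.bam is the file given to Mutect2):
#   samtools view -H S1.bam | grep '^@RG'   # show the raw read-group lines
#   samtools view -H S1.bam | rg_samples    # show only the sample names
```

The names printed by the second command are the ones the value passed to -tumor is compared against.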

  • Sorry.

    This is the @RG line in the header (one sample):
    "@E00461_116_EM170602261_1" RGLB=S1_S13 RGPL=ILLUMINA
    RGPU="@E00461_116_EM170602261_1.S1_S13" RGSM=S1

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @Diora
    Hi,

    I am not sure whether this will help, but perhaps try removing the "RG" prefix from the LB, PL, PU and SM arguments? Have a look at this dictionary entry for more information on setting read groups.

    -Sheila
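
After the read groups have been set again, one way to confirm that the sample name in the header matches the name given to -tumor is an exact string comparison. The sketch below assumes `samtools` is available; `check_tumor_name` is a hypothetical helper, and `S1` and `S1.bam` are the names used in the question. A stray space or other hidden character in the SM value makes the comparison fail, and `od -c` can make such characters visible.

```shell
# Exit 0 if the given sample name equals an SM value in the @RG lines on stdin exactly.
# check_tumor_name is a hypothetical helper, not part of GATK or samtools.
check_tumor_name() {
  awk -F'\t' -v want="$1" '
    /^@RG/ { for (i = 2; i <= NF; i++) if ($i == "SM:" want) found = 1 }
    END { exit found ? 0 : 1 }'
}

# Typical use (assumes samtools is installed):
#   samtools view -H S1.bam | check_tumor_name S1 && echo "S1 found" || echo "S1 missing"
#   samtools view -H S1.bam | grep '^@RG' | od -c | head   # reveal hidden characters
```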
