Mutect2 version 4.0.11.0 still gives an empty VCF with one sample for the PoN, and filters 1000 sites

I get an empty VCF for the PoN and, to my surprise, it filters 1000 sites. Thanks a lot.

Answers

  • bhanuGandham Member, Administrator, Broadie, Moderator admin

    Hi @picard_gatk_mj

    Please post the exact command you are using, the version of gatk and the entire error log.

    Regards
    Bhanu

  • davidben Boston Member, Broadie, Dev ✭✭

    @picard_gatk_mj We'll want to know both your Mutect2 command and the command you used to generate your pon.

  • manba Member

    I validated my BAM with your GATK ValidateSamFile command, and it reported 'Value was put into PairInfoMap more than once'. I did this because GATK4 Mutect2 throws a NullPointerException.

  • manba Member

    Thanks a lot. Since I am forbidden from asking new questions, I want to ask here about the post on the differences between GATK3 and GATK4 Mutect2. It has a table describing the bases G and C that I cannot understand. Thanks a lot.

  • bhanuGandham Member, Administrator, Broadie, Moderator admin

    Hi @manba

    Most mate pair information errors can be fixed with FixMateInformation.
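
    As a rough illustration of that fix, the sketch below assembles a FixMateInformation call through the gatk wrapper and runs it only when asked to. The file names `input.bam` and `fixed.bam` and the helpers `fix_mate_command` and `run` are placeholders, not from this thread, and the `-I`/`-O` spelling assumes the GATK4-style argument syntax.

```python
import shlex
import subprocess


def fix_mate_command(in_bam, out_bam):
    """Build the argv for Picard FixMateInformation via the gatk wrapper.

    in_bam and out_bam are hypothetical paths; adjust them to your data.
    """
    return ["gatk", "FixMateInformation", "-I", in_bam, "-O", out_bam]


def run(cmd, dry_run=True):
    """Print the command line; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default, so nothing is executed unless you opt in.
    run(fix_mate_command("input.bam", "fixed.bam"))
```

    Re-running ValidateSamFile on the output is a reasonable way to check whether the mate-pair error is gone.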

    Here is a document that clearly describes the differences between mutect2 in gatk3 and gatk4.

    Regards
    Bhanu

  • manba Member

    Thanks.
    I am talking about this link. Because I cannot post links or pictures, I have to type out the table:

    germline resource - G PoN - G normal G
    tumor C C C C C C
    somatic? Y Y Y N Y N

    Another major difference is in site versus allele filtering against the germline resource. GATK3 MuTect2 prefilters sites in the germline resource regardless of the allele in the tumor. GATK4 Mutect2 distinguishes alleles in the germline resource and only filters the site if the tumor allele matches. If the alleles are different, then the tool considers the allele a putative somatic mutation.

    Can you explain the bolded sentence using the table? I cannot relate the table to what the sentence says.

    What's more, the article says "GATK4 breaks off filtering into a separate tool, FilterMutectCalls." Does this mean that in GATK4, if I just use Mutect2 to get the VCF and do not run FilterMutectCalls, it is a serious mistake? Am I right?

    Another question: FilterMutectCalls needs the result of CalculateContamination, and CalculateContamination needs "-I tumor-pileups.table, " but I do not know how to get the tumor-pileups.table.

    thanks a lot

  • bhanuGandham Member, Administrator, Broadie, Moderator admin

    Hi @manba

    Another major difference is in site versus allele filtering against the germline resource. GATK3 MuTect2 prefilters sites in the germline resource regardless of the allele in the tumor. GATK4 Mutect2 distinguishes alleles in the germline resource and only filters the site if the tumor allele matches. If the alleles are different, then the tool considers the allele a putative somatic mutation.

    This means that GATK3 MuTect2 will filter a variant as germline if that position exists in the normal. GATK4 Mutect2, in contrast, requires both the position and the allele to match in order to filter it as germline. If the allele differs from the one in the normal, the variant is considered a potential somatic variant.
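
    The site-versus-allele distinction in the quoted passage can be shown with a toy model. This is only a sketch of the idea, not Mutect2's actual code: the resource dictionary, the coordinates, and both function names are made up for illustration.

```python
def site_level_filter(site, tumor_alt, resource):
    """GATK3-style toy rule: filter whenever the site is in the resource.

    tumor_alt is deliberately ignored, which is the point of the contrast.
    """
    return site in resource


def allele_level_filter(site, tumor_alt, resource):
    """GATK4-style toy rule: filter only if the tumor allele also matches."""
    return tumor_alt in resource.get(site, set())


# Hypothetical germline resource: site -> set of alt alleles recorded there.
resource = {("chr1", 1000): {"G"}, ("chr1", 2000): {"C"}}

cases = [
    (("chr1", 1000), "C"),  # site is known, but the allele differs
    (("chr1", 2000), "C"),  # site is known and the allele matches
    (("chr1", 3000), "C"),  # site is absent from the resource
]
for site, alt in cases:
    print(site, alt,
          "site-level filtered:", site_level_filter(site, alt, resource),
          "allele-level filtered:", allele_level_filter(site, alt, resource))
```

    Only the first case separates the two rules: the site-level rule filters it, while the allele-level rule keeps the tumor C as a putative somatic mutation.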

    GATK4 breaks off filtering into a separate tool, FilterMutectCalls.

    In GATK4, Mutect2 is focused mostly on calling and does some minimal upfront filtering of obvious non-somatic sites. However, it leaves the majority of filtering to FilterMutectCalls.

    "-I tumor-pileups.table, " but I do not know how to get the tumor-pileups.table.

    In GATK4, we recommend including cross-sample contamination estimates from CalculateContamination when filtering with FilterMutectCalls. CalculateContamination, in turn, relies on the results of GetPileupSummaries and can incorporate information from the matched normal, if available, when calculating the contamination in the tumor sample.
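
    To make the order of these tools concrete, here is a minimal sketch that only assembles and prints the command lines; it does not run them. All file names (`tumor.bam`, `normal.bam`, `common_sites.vcf.gz`, `unfiltered.vcf.gz`, and the output tables) and the helper `contamination_commands` are placeholders. Passing the common-sites VCF to `-L` as well as `-V` is an assumption: some GATK releases require intervals for GetPileupSummaries, so check the tool documentation for your version.

```python
import shlex


def contamination_commands(tumor_bam, common_sites_vcf, unfiltered_vcf,
                           normal_bam=None):
    """Return the gatk command lines, in order, as argv lists."""
    samples = [(tumor_bam, "tumor-pileups.table")]
    if normal_bam is not None:
        samples.append((normal_bam, "normal-pileups.table"))

    cmds = []
    # 1. Summarize pileups at common germline sites for each sample.
    #    This is the step that produces tumor-pileups.table.
    for bam, table in samples:
        cmds.append(["gatk", "GetPileupSummaries",
                     "-I", bam,
                     "-V", common_sites_vcf,
                     "-L", common_sites_vcf,  # assumption, see lead-in
                     "-O", table])

    # 2. Estimate contamination in the tumor; add the normal if present.
    calc = ["gatk", "CalculateContamination", "-I", "tumor-pileups.table"]
    if normal_bam is not None:
        calc += ["-matched", "normal-pileups.table"]
    calc += ["-O", "contamination.table"]
    cmds.append(calc)

    # 3. Filter the raw Mutect2 calls using the contamination estimate.
    cmds.append(["gatk", "FilterMutectCalls",
                 "-V", unfiltered_vcf,
                 "--contamination-table", "contamination.table",
                 "-O", "filtered.vcf.gz"])
    return cmds


if __name__ == "__main__":
    for cmd in contamination_commands("tumor.bam", "common_sites.vcf.gz",
                                      "unfiltered.vcf.gz",
                                      normal_bam="normal.bam"):
        print(shlex.join(cmd))
```

    Leaving out `normal_bam` gives the tumor-only variant of the same chain.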

    All this information is already provided in the document that was provided to you. I hope this helps. Sorry, this is the maximum bandwidth I can provide, and I will be closing this issue now.

    Regards
    Bhanu

  • manba Member

    All this information is already provided in the document that was provided to you. I hope this helps. Sorry, this is the maximum bandwidth I can provide, and I will be closing this issue now.

    thanks a lot

  • manba Member

    I understand it as follows: GATK3 first compares the normal or PoN with the reference, and if they differ, the site is treated as germline and filtered. GATK4 also compares the normal or PoN with the reference, but if they differ, the site is filtered only when the tumor allele is the same as the one in the normal.
