Mutect2 version 4.0.11.0 still gives an empty VCF with one sample for the PON, and filters 1000 sites

I get an empty VCF for the PON and, to my surprise, it filters 1000 sites. Thanks a lot.

Answers

  • bhanuGandham (Cambridge MA) · Member, Administrator, Broadie, Moderator, admin

    Hi @picard_gatk_mj

    Please post the exact command you are using, the version of GATK, and the entire error log.

    Regards
    Bhanu

  • davidben (Boston) · Member, Broadie, Dev ✭✭✭

    @picard_gatk_mj We'll want to know both your Mutect2 command and the command you used to generate your pon.

  • bhanuGandham (Cambridge MA) · Member, Administrator, Broadie, Moderator, admin

    Hi @manba

    Most mate pair information errors can be fixed with FixMateInformation.

    Here is a document that clearly describes the differences between Mutect2 in GATK3 and GATK4.

    Regards
    Bhanu

  • bhanuGandham (Cambridge MA) · Member, Administrator, Broadie, Moderator, admin

    Hi @manba

    Another major difference is in site versus allele filtering against the germline resource. GATK3 MuTect2 prefilters sites in the germline resource regardless of the allele in the tumor. GATK4 Mutect2 distinguishes alleles in the germline resource and only filters the site if the tumor allele matches. If the alleles are different, then the tool considers the allele a putative somatic mutation.

    This means that GATK3 MuTect2 filters a variant as germline if its position is present in the germline resource, whatever the allele. GATK4 Mutect2 filters a variant as germline only if both the position and the allele match. If the tumor allele differs from the allele in the resource, the variant is treated as a potential somatic variant.
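    The site-versus-allele distinction can be illustrated with a toy sketch. This is not how either tool is implemented: the real tools work on VCF records and population allele frequencies, and the `GERMLINE` dictionary, the coordinates, and both function names below are hypothetical, chosen only to show the matching rule.

```python
from typing import Dict, Set, Tuple

Site = Tuple[str, int]  # (contig, 1-based position)

# Hypothetical toy germline resource: site -> ALT alleles recorded at that site.
GERMLINE: Dict[Site, Set[str]] = {
    ("chr17", 1000): {"T"},
}


def site_level_match(site: Site, germline: Dict[Site, Set[str]]) -> bool:
    """GATK3-style rule: the site is in the resource, whatever the allele."""
    return site in germline


def allele_level_match(site: Site, alt: str, germline: Dict[Site, Set[str]]) -> bool:
    """GATK4-style rule: the site is in the resource AND the ALT allele matches."""
    return alt in germline.get(site, set())


# Hypothetical tumor calls: (site, ALT allele observed in the tumor).
tumor_calls = [
    (("chr17", 1000), "T"),  # same site, same allele as the resource
    (("chr17", 1000), "G"),  # same site, different allele
    (("chr17", 2000), "A"),  # site absent from the resource
]

for site, alt in tumor_calls:
    print(site, alt,
          "site-level:", site_level_match(site, GERMLINE),
          "allele-level:", allele_level_match(site, alt, GERMLINE))
```

    The second call is the interesting case: the site-level rule flags it because the position is in the resource, while the allele-level rule does not because `G` is not the recorded allele, so it stays a putative somatic mutation.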

    GATK4 breaks off filtering into a separate tool, FilterMutectCalls.

    In GATK4, Mutect2 is focused mostly on calling and does some minimal upfront filtering of obvious non-somatic sites. However, it leaves the majority of filtering to FilterMutectCalls.

    "-I tumor-pileups.table," but I do not know how to get the tumor-pileups.table.

    In GATK4, we recommend including cross-sample contamination estimates from CalculateContamination when filtering with FilterMutectCalls. CalculateContamination, in turn, relies on the results of GetPileupSummaries and can incorporate information from the matched normal, if available, when calculating the contamination in the tumor sample.
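    As a rough sketch of how these steps chain together, the snippet below only builds and prints the three command lines; it does not run GATK. All file names other than `tumor-pileups.table` are hypothetical placeholders, and the helper function names are made up for illustration. Argument requirements differ between releases (for example, some releases require `-L` for GetPileupSummaries and `-R` for FilterMutectCalls), so check the tool documentation for your GATK version.

```python
import shlex
from typing import List, Optional


def get_pileup_summaries(bam: str, common_sites_vcf: str, out_table: str) -> List[str]:
    """Summarize read support at common germline sites (the same VCF doubles as intervals)."""
    return ["gatk", "GetPileupSummaries",
            "-I", bam, "-V", common_sites_vcf, "-L", common_sites_vcf,
            "-O", out_table]


def calculate_contamination(tumor_table: str, out_table: str,
                            normal_table: Optional[str] = None) -> List[str]:
    """Estimate contamination in the tumor, optionally using the matched normal pileups."""
    cmd = ["gatk", "CalculateContamination", "-I", tumor_table]
    if normal_table is not None:
        cmd += ["-matched", normal_table]
    return cmd + ["-O", out_table]


def filter_mutect_calls(unfiltered_vcf: str, contamination_table: str,
                        out_vcf: str) -> List[str]:
    """Filter Mutect2 calls, passing in the contamination estimate."""
    return ["gatk", "FilterMutectCalls",
            "-V", unfiltered_vcf,
            "--contamination-table", contamination_table,
            "-O", out_vcf]


# Hypothetical inputs: tumor.bam, normal.bam, common-sites.vcf.gz, unfiltered.vcf.gz.
commands = [
    get_pileup_summaries("tumor.bam", "common-sites.vcf.gz", "tumor-pileups.table"),
    get_pileup_summaries("normal.bam", "common-sites.vcf.gz", "normal-pileups.table"),
    calculate_contamination("tumor-pileups.table", "contamination.table",
                            normal_table="normal-pileups.table"),
    filter_mutect_calls("unfiltered.vcf.gz", "contamination.table", "filtered.vcf.gz"),
]

for cmd in commands:
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

    The first printed command is the one that produces `tumor-pileups.table`, which is then passed to CalculateContamination with `-I`. Without a matched normal, omit the `normal_table` argument and the second GetPileupSummaries call.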

    All of this information is already covered in the document that was provided to you. I hope this helps. Sorry, this is the maximum bandwidth I can provide, and I will be closing this issue now.

    Regards
    Bhanu
