This is a tricky question with no simple answer, but in general we advise sticking with the default priors unless your cohort is very small. If you find that the ploidy calls with the default priors are not realistic, you can always go back and change them.
We have also moved to a new forum website, so for follow-up questions, please post in the new forum, which you can find here.
No, our tools do not output a "0" for the alternate allele column. Can you please post the entire VCF record for those sites? Please also post the BAM and bamout files for those positions.
The germline resource acts differently from the panel of normals. Presence in the panel of normals causes a variant to be filtered, whereas presence in the germline resource alone does not. Rather, Mutect2 uses the population allele frequency (AF) INFO field from the resource to populate the POP_AF annotation, which is then used by Mutect2's probabilistic models for germline variants and contamination to decide whether to filter the call. Our most recent documentation for these models is here: https://github.com/broadinstitute/gatk/blob/5a74c30628cb87ff8db87f0db64e18b7bbdd767a/docs/mutect/mutect.pdf
To save runtime, Mutect2 does not bother genotyping variants that are almost certainly non-somatic. That is, if evidence in the pileup of bases shows a lot of variant reads in the normal or if the population allele frequency from the germline resource is very large (in tumor-only mode), Mutect2 pre-filters the variant without bothering to do the expensive steps of local assembly and realignment. The argument --genotype-germline-sites overrides this, so that all evidence of variation triggers assembly, realignment, and somatic genotyping. In other words, by default you don't see every rejected germline variant in the VCF; with --genotype-germline-sites you do. They still get the germline filter, of course, but you see them.
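For reference, here is a minimal Python sketch of how the flag could be added to a tumor-only Mutect2 invocation. The helper name and all file names are hypothetical placeholders; only the Mutect2 arguments themselves (-R, -I, -O, --germline-resource, --genotype-germline-sites) come from the tool.

```python
def mutect2_command(reference, tumor_bam, germline_resource, output_vcf,
                    genotype_germline_sites=False):
    """Build a tumor-only Mutect2 argument list (hypothetical helper)."""
    cmd = [
        "gatk", "Mutect2",
        "-R", reference,
        "-I", tumor_bam,
        "--germline-resource", germline_resource,
        "-O", output_vcf,
    ]
    if genotype_germline_sites:
        # Ask Mutect2 to assemble and genotype likely-germline sites too,
        # so they show up in the VCF (still carrying the germline filter).
        cmd += ["--genotype-germline-sites", "true"]
    return cmd


if __name__ == "__main__":
    # Placeholder file names; substitute your own inputs.
    print(" ".join(mutect2_command(
        "ref.fasta", "tumor.bam", "af-only-gnomad.vcf.gz", "unfiltered.vcf.gz",
        genotype_germline_sites=True)))
```

The list can be passed to subprocess.run as-is once the placeholders are replaced with real paths.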
@sabaferdous Could you post a few more screenshots of that same C->A on chr17, with just one BAM at a time, sorted by base so that the A is on top, for each of: the untrimmed cfDNA tumor sample, the untrimmed bamout, and the trimmed cfDNA tumor sample?
Also, could you post the VCF records from FilterMutectCalls for each of the dropped calls?
Hello, @nitha. Generally speaking, SplitVcfs is a simpler tool that operates at a lower level than SelectVariants.
If all you are doing is splitting the indels and SNPs from your VCF files, then either tool will probably serve you equally well.
However, you should be aware that if any sites have indels and SNPs together on the same line, SplitVcfs will give you an error.
SelectVariants is better suited to handle these cases, so it is probably the better choice for the kind of work you described.
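If you want to check in advance whether your VCF has any of these mixed sites, something like the following Python sketch could help. It is a simplified classifier based only on REF and ALT lengths, not the logic either tool actually uses, and the function name is made up for illustration.

```python
def classify_site(ref, alts):
    """Classify a VCF record as 'SNP', 'INDEL', 'MIXED', or 'OTHER'.

    Simplified assumption: an ALT of different length than REF is an indel;
    an ALT of the same length differing at exactly one base is a SNP
    (this covers padded REF alleles at multi-allelic sites).
    """
    kinds = set()
    for alt in alts:
        if alt in (".", "*") or alt.startswith("<"):
            continue  # skip missing, spanning-deletion, and symbolic alleles
        if len(alt) != len(ref):
            kinds.add("INDEL")
        elif sum(a != b for a, b in zip(ref, alt)) == 1:
            kinds.add("SNP")
        else:
            kinds.add("OTHER")  # e.g. MNPs
    if kinds == {"SNP"}:
        return "SNP"
    if kinds == {"INDEL"}:
        return "INDEL"
    if kinds == {"SNP", "INDEL"}:
        return "MIXED"
    return "OTHER"


if __name__ == "__main__":
    print(classify_site("A", ["G"]))         # SNP
    print(classify_site("AT", ["A"]))        # INDEL
    print(classify_site("AT", ["A", "AC"]))  # MIXED: deletion plus SNP on one line
```

Any record reported as MIXED is the kind of site described above, where a SNP and an indel share one line.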
You should use AD/DP because AF is the maximum-likelihood estimate of the allele fraction given that the variant exists. That is, its purpose is to characterize variants, not to serve as a filtering criterion.
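As a rough illustration, AD/DP can be computed from the FORMAT and sample columns of a VCF record like this. This is only a sketch: the function name is hypothetical, it assumes both AD and DP are present with no missing values, and the example record is invented.

```python
def alt_fractions_from_ad_dp(format_field, sample_field):
    """Return AD/DP for each alternate allele of one sample.

    format_field: the FORMAT column, e.g. "GT:AD:AF:DP"
    sample_field: the matching sample column, e.g. "0/1:90,10:0.12:100"
    Assumes AD and DP are both present and contain no missing values.
    """
    values = dict(zip(format_field.split(":"), sample_field.split(":")))
    allele_depths = [int(x) for x in values["AD"].split(",")]
    depth = int(values["DP"])
    if depth == 0:
        return [0.0] * (len(allele_depths) - 1)
    # AD lists the reference depth first, then one depth per alternate allele.
    return [alt_depth / depth for alt_depth in allele_depths[1:]]


if __name__ == "__main__":
    # Invented record: AF says 0.12, but the raw read counts give 10/100.
    print(alt_fractions_from_ad_dp("GT:AD:AF:DP", "0/1:90,10:0.12:100"))  # [0.1]
```

Note that the result can differ from the AF field, as in the invented example, because AF is a model-based estimate rather than a raw read ratio.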
That is our latest one.
The tool clearly told you what is wrong. Just take a look at this line:
Input file c2.vcf has sample entries that don't match the other files.
A short quote from the docs:
The input files must have the same sample and contig lists. An index file is created
and a sequence dictionary is required by default.
I don't know what you want to do, but a look at this thread may be helpful: https://gatkforums.broadinstitute.org/gatk/discussion/53/combining-variants-from-different-files-into-one
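To find out which file is the odd one out before rerunning the tool, you could compare the sample columns of the #CHROM header lines yourself. Below is a minimal Python sketch; the function names are hypothetical, and treating sample order as significant is an assumption on my part.

```python
def samples_from_header(lines):
    """Return the sample names from the #CHROM header line of a VCF."""
    for line in lines:
        if line.startswith("#CHROM"):
            # The first nine columns are fixed; sample names follow them.
            return line.rstrip("\n").split("\t")[9:]
    raise ValueError("no #CHROM header line found")


def files_with_mismatched_samples(headers):
    """Return names of files whose sample list differs from the first file's.

    headers maps a file name to an iterable of that file's header lines.
    Lists are compared as-is, so a different sample order counts as a mismatch.
    """
    names = list(headers)
    reference = samples_from_header(headers[names[0]])
    return [name for name in names[1:]
            if samples_from_header(headers[name]) != reference]


if __name__ == "__main__":
    fixed = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT"
    headers = {
        "c1.vcf": ["##fileformat=VCFv4.2", fixed + "\tS1\tS2"],
        "c2.vcf": ["##fileformat=VCFv4.2", fixed + "\tS1\tS3"],
    }
    print(files_with_mismatched_samples(headers))  # ['c2.vcf']
```

For real files, pass the lines of each opened VCF (for example via open(path) or gzip.open(path, "rt")) instead of the in-memory lists used here.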
@pavle_marinkovic It looks like we forgot to update the WDL when we recently changed FilterAlignmentArtifacts to realign locally-assembled unitigs. We don't run this tool very often, and it slipped through our tests. Your fix, which we will duplicate, is correct.
Hi @mack812, I will write a more detailed response to you tomorrow, but I wanted to let you know that we found that running in --mitochondria-mode recovered the variant. Can you give that a try?