MNP and HaplotypeCaller GVCF mode
I am attempting to run HaplotypeCaller in a way that will merge adjacent SNPs into MNPs.
To do so, I set --max-mnp-distance to 1 or 2.
This worked well when I did not use GVCF mode.
However, when I attempted this in GVCF mode I got the following error:
A USER ERROR has occurred: Illegal argument value: Non-zero maxMnpDistance is incompatible with GVCF mode.
(I am using GATK 220.127.116.11).
I am not sure I understand this conceptually:
If my callset contains two (or more) heterozygous SNPs at adjacent genomic sites, they can only be determined to constitute a single MNP if both SNPs originate from the same chromosome/haplotype.
This is determined by phasing the callset, which, as explained in the "Purpose and operation of Read-backed Phasing" page, is only enabled when HaplotypeCaller is run in GVCF or BP_RESOLUTION mode.
Following this reasoning, it appears to me that merging SNPs into MNPs only makes sense in one of these modes, since otherwise SNPs from different haplotypes could be merged erroneously.
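To make my concern concrete, here is a toy Python sketch of the merge rule I have in mind. It is not GATK's implementation: the `HetSnp` class, its `alt_haplotype` field, and the `can_merge_into_mnp` function are hypothetical names of my own, and I am assuming that "distance" means the difference between the two positions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HetSnp:
    """A heterozygous SNP call (hypothetical structure, for illustration only)."""
    pos: int                      # 1-based genomic position
    ref: str                      # reference base
    alt: str                      # alternate base
    alt_haplotype: Optional[int]  # 0 or 1 if phased, None if unphased


def can_merge_into_mnp(a: HetSnp, b: HetSnp, max_mnp_distance: int) -> bool:
    """Return True only if the two SNPs are close enough AND their
    alternate alleles are known to sit on the same haplotype."""
    # Without phase information we cannot tell whether the alt alleles
    # are on the same chromosome, so merging would be unsafe.
    if a.alt_haplotype is None or b.alt_haplotype is None:
        return False
    # Assumption: distance is the difference in positions, so adjacent
    # sites are at distance 1.
    distance = abs(b.pos - a.pos)
    close_enough = 0 < distance <= max_mnp_distance
    return close_enough and a.alt_haplotype == b.alt_haplotype


# Two adjacent het SNPs at positions 100 and 101.
same_hap = can_merge_into_mnp(HetSnp(100, "A", "T", 0), HetSnp(101, "C", "G", 0), 1)
diff_hap = can_merge_into_mnp(HetSnp(100, "A", "T", 0), HetSnp(101, "C", "G", 1), 1)
unphased = can_merge_into_mnp(HetSnp(100, "A", "T", None), HetSnp(101, "C", "G", None), 1)
```

In this sketch only the `same_hap` case would be merged. The `unphased` case is the one that worries me outside GVCF mode.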
Therefore, I do not understand why MNP merging is possible without enabling GVCF mode but is incompatible with GVCF mode.
I would be very grateful for an explanation.