If you happen to see a question you know the answer to, please do chime in and help your fellow community members. We appreciate your help!
Test-drive the GATK tools and Best Practices pipelines on Terra
Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.
How to mark specific individual genotype as No Call
I have been trying to find a way to mark specific individual genotypes as No Call.
I know that in VariantFiltration it is possible to add the option --setFilteredGTToNocall in order to mark filtered genotypes as no call. However, in my case, there is no available filter corresponding to my criteria. Let me explain. I have some diploid-haploid paired related samples, i.e. I expect the haploid individual to share one of the two alleles from its diploid related. Therefore, for positions where there are discordant genotypes, I would like to mark individual genotypes as as no call in order to not include potentials errors in my data. I couldn't find a way to apply this rational to the pipeline of VariantFiltration or SelectVariants...
Eventually, I manage to manipulate the .vcf file by myself to mark any discordant genotypes as . or ./. However, when later I want to select the positions for which there are maximum of 0.2 ratio of no call (using the option --maxNOCALLfraction of SelectVariants from the nightly build version), there was an error message "there aren't enough comumns for line...". I suppose that it's because I did manipulate the vcf file by myself and there could have been errors (although I'm quite confident with my script).
Therefore, I think it's better to not manipulate the vcf file and try to do things using the pipeline....but how?