VCF contigs don't match reference genome
I am working with SNPs from fungal RNA-seq data that I called using the GATK Best Practices pipeline. I have 12 VCF files of SNPs that I called against REFERENCE1.fa. I also have a VCF file from a collaborator with SNPs that were called against the same reference genome (REFERENCE1.fa). My goal is to combine all of my VCF files with my collaborator's file for phylogenetic tree analysis.

I was able to combine my 12 VCF files, but when I try to combine the result with my collaborator's file, I get an error saying that the contigs don't match. I looked more closely at both sets of VCF files and noticed that two contigs are present in my file but not in theirs. These turned out to be the contigs for unmapped scaffolds and the mitochondrial genome.

I thought things might work if I removed these two contigs from my VCF file, so I used SelectVariants to do that and tried the merge again. Now I get an error saying that my VCF contigs do not match my reference genome. Is there a way to remove these contigs from my reference genome as well, so that I won't get this error?
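The contig comparison described above can be sketched roughly as follows. This is a minimal illustration, not a GATK tool: it assumes both VCFs declare their contigs in `##contig=<ID=...>` header lines, and the function names and the file names in the usage comment are hypothetical.

```python
import gzip
import re

# Matches VCF header lines such as: ##contig=<ID=scaffold_1,length=12345>
CONTIG_RE = re.compile(r"^##contig=<.*?\bID=([^,>]+)")


def contig_ids(lines):
    """Collect contig IDs, in order, from the ## meta lines of a VCF header."""
    ids = []
    for line in lines:
        if not line.startswith("##"):
            break  # the #CHROM line ends the meta-information section
        match = CONTIG_RE.match(line)
        if match:
            ids.append(match.group(1))
    return ids


def read_contig_ids(path):
    """Read contig IDs from a plain-text or gzip/bgzip-compressed VCF."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return contig_ids(handle)


def compare_contigs(mine, theirs):
    """Report which contig IDs appear in only one of the two headers."""
    mine_set, theirs_set = set(mine), set(theirs)
    return {
        "only_in_mine": [c for c in mine if c not in theirs_set],
        "only_in_theirs": [c for c in theirs if c not in mine_set],
    }


# Hypothetical usage:
#   diff = compare_contigs(read_contig_ids("mine.vcf"),
#                          read_contig_ids("collaborator.vcf.gz"))
#   print(diff)
```

This only compares contig names. It does not check contig lengths or ordering, which may also need to agree between the two headers and the reference dictionary.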