Reference and Known input files in GATK hg38
1) The dbSNP151 VCF file states that it uses GRCh38.p7 as its reference. When I use dbSNP151 in GATK4, should I use this specific reference build, or can I use whichever build I want, e.g. GRCh38.p12 (the latest)?
2) Can I use any GRCh38.p* build in VariantRecalibrator together with the resource files from the bundle that are used in this step (1000G_phase1.snps.high_confidence.hg38.vcf.gz, 1000G_omni2.5.hg38.vcf.gz, hapmap_3.3.hg38.vcf.gz, etc.)? Or should I use them only with the specific hg38 reference file from the bundle?
3) Can I use 1000G.phase3.integrated.sites_only.no_MATCHED_REV.hg38.vcf in VariantRecalibrator instead of 1000G_phase1.snps.high_confidence.hg38.vcf.gz? What exactly is the first file? It is in the cloud bundle but not in the FTP bundle.
4) If I want to use the latest and best release of each file, which files should I use at each step?
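For context on questions 1 and 2: one rough way to see whether a known-sites VCF and a reference FASTA even describe the same contigs is to compare the `##contig` lines in the VCF header with the reference's `.fai` index. The sketch below is only an illustration under assumptions. The function names are hypothetical, the header and index lines are toy data rather than real bundle contents, and some VCFs carry no `##contig` lines at all, in which case this check says nothing. Matching names and lengths is a sanity check, not proof that two builds are interchangeable.

```python
import re


def parse_vcf_contigs(header_lines):
    """Collect {contig_name: length or None} from '##contig=<...>' header lines."""
    contigs = {}
    for line in header_lines:
        if not line.startswith("##contig="):
            continue
        name = re.search(r"ID=([^,>]+)", line)
        length = re.search(r"length=(\d+)", line)
        if name:
            contigs[name.group(1)] = int(length.group(1)) if length else None
    return contigs


def parse_fai(fai_lines):
    """Collect {contig_name: length} from .fai lines (tab-separated: name, length, ...)."""
    contigs = {}
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            contigs[fields[0]] = int(fields[1])
    return contigs


def compare_contigs(vcf_contigs, ref_contigs):
    """Return (contigs missing from the reference, contigs whose lengths differ)."""
    missing = sorted(n for n in vcf_contigs if n not in ref_contigs)
    mismatched = sorted(
        n for n, length in vcf_contigs.items()
        if n in ref_contigs and length is not None and length != ref_contigs[n]
    )
    return missing, mismatched


# Toy inputs (illustrative only, not taken from any real bundle file).
vcf_header = [
    "##fileformat=VCFv4.2",
    "##contig=<ID=chr1,length=248956422>",
    "##contig=<ID=chrM,length=16569>",
    "##contig=<ID=chrUn_demo,length=1000>",
]
fai_index = [
    "chr1\t248956422\t6\t60\t61",
    "chrM\t16571\t253105708\t60\t61",
]

missing, mismatched = compare_contigs(parse_vcf_contigs(vcf_header), parse_fai(fai_index))
print("missing from reference:", missing)
print("length mismatch:", mismatched)
```

With real files, the header lines would come from reading the VCF until the first line that does not start with `##`, and the index lines from the `.fai` file next to the reference FASTA.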