Problem with the hg38 bundle
I downloaded the hg38 bundle from:
When I aligned my data (with BWA-MEM) to the FASTA file Homo_sapiens_assembly38.fasta, I noticed that there are many regions throughout the genome, including genic regions, to which no reads were mapped.
I suspect that the many extra contigs in the FASTA file (named *_decoy or *_alt) are highly similar to parts of the primary assembly, so reads from those regions align equally well to more than one place and end up with mapping quality 0.
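One way to check this suspicion is to count, per contig, how many primary alignments have mapping quality 0. The following is a minimal Python sketch, assuming the alignments are available as SAM text lines (for example, the output of `samtools view` for an affected region). The function name `mapq_zero_summary` is hypothetical and not part of BWA or GATK.

```python
from collections import Counter


def mapq_zero_summary(sam_lines):
    """Return {contig: (mapq0_count, total_count)} for primary mapped reads.

    `sam_lines` is any iterable of SAM-formatted text lines.
    """
    total = Counter()
    zero = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800) records
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        contig = fields[2]  # RNAME column
        mapq = int(fields[4])  # MAPQ column
        total[contig] += 1
        if mapq == 0:
            zero[contig] += 1
    return {contig: (zero[contig], total[contig]) for contig in total}
```

If most reads in an apparently empty region turn out to be present but with mapping quality 0, that would be consistent with the multi-mapping explanation rather than with the reads being missing.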
When I instead used the FASTA file that I downloaded from Ensembl, Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa, the problem disappeared and I was able to call variants properly.