
Problem with the hg38 bundle

GilHornung (Weizmann Institute, Member)

Hi,

I downloaded the hg38 bundle from:
https://console.cloud.google.com/storage/browser/genomics-public-data/resources/broad/hg38/v0/

When I aligned my data (BWA-MEM) to the FASTA file Homo_sapiens_assembly38.fasta, I noticed that there are many regions (including genic regions) throughout the genome to which no reads were mapped.
I suspect that the many contigs in the fasta file (named *_decoy or *_alt) have high similarity to parts of the genome, and hence I get multiple alignments with mapping quality 0.
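One way to check this suspicion is to count how many mapped alignments have mapping quality 0, split by whether they land on an *_alt contig, a *_decoy contig, or anything else. The following is only a minimal sketch, assuming SAM text records as input (for example the output of `samtools view`); the function names and the example records are hypothetical and made up for illustration.

```python
from collections import Counter


def contig_class(name):
    """Classify a contig by the suffixes mentioned in the post (*_alt, *_decoy)."""
    if name.endswith("_alt"):
        return "alt"
    if name.endswith("_decoy"):
        return "decoy"
    return "other"


def mapq_zero_summary(sam_lines):
    """Return {contig class: (mapped alignments, alignments with MAPQ 0)}."""
    total = Counter()
    zero = Counter()
    for line in sam_lines:
        # Skip blank lines and SAM header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])   # column 2: FLAG
        rname = fields[2]       # column 3: reference sequence name
        mapq = int(fields[4])   # column 5: mapping quality
        # Skip unmapped reads (FLAG bit 0x4) and records without a reference.
        if flag & 0x4 or rname == "*":
            continue
        cls = contig_class(rname)
        total[cls] += 1
        if mapq == 0:
            zero[cls] += 1
    return {cls: (total[cls], zero[cls]) for cls in total}


# Made-up records for illustration only (not real alignment results).
example = [
    "@SQ\tSN:chr1\tLN:248956422",
    "r1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*",
    "r2\t0\tchr1\t200\t0\t50M\t*\t0\t0\t*\t*",
    "r3\t0\tchr1_KI270762v1_alt\t10\t0\t50M\t*\t0\t0\t*\t*",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
print(mapq_zero_summary(example))
```

On real data, the same function could be fed `sys.stdin` when the script is run at the end of a `samtools view` pipe. A high share of MAPQ 0 alignments in the "other" class would be consistent with the suspicion above, though it would not by itself prove that the alt or decoy contigs are the cause.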

When I used a FASTA file that I downloaded from Ensembl, Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa, the problem disappeared, and I was able to read variants properly.
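To see how the two references differ in composition, one could list the sequence names in each FASTA file and count how many end in _alt or _decoy. This is a rough sketch, assuming an uncompressed FASTA read line by line; the helper names and the example header lines are hypothetical.

```python
from collections import Counter


def fasta_contig_names(lines):
    """Yield the sequence name (first word after '>') of each FASTA header line."""
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                yield parts[0]


def count_by_suffix(names):
    """Count contig names ending in _alt, ending in _decoy, or neither."""
    counts = Counter()
    for name in names:
        if name.endswith("_alt"):
            counts["alt"] += 1
        elif name.endswith("_decoy"):
            counts["decoy"] += 1
        else:
            counts["other"] += 1
    return dict(counts)


# Illustrative lines only; a real reference has many more sequences.
example_fasta = [
    ">chr1\n",
    "NNNNACGT\n",
    ">chr1_KI270762v1_alt\n",
    "ACGT\n",
    ">chrUn_JTFH01000001v1_decoy\n",
    "ACGT\n",
]
print(count_by_suffix(fasta_contig_names(example_fasta)))
```

For a real file, an open file handle can be passed in place of the example list, e.g. `count_by_suffix(fasta_contig_names(open(path)))`, once for each reference.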

Gil
