Problem with the hg38 bundle

GilHornung, Weizmann Institute, Member

Hi,

I downloaded the hg38 bundle from:
https://console.cloud.google.com/storage/browser/genomics-public-data/resources/broad/hg38/v0/

When I aligned my data (BWA-MEM) to the FASTA file Homo_sapiens_assembly38.fasta, I noticed that there are many regions throughout the genome, including genic regions, to which no reads were mapped.
I suspect that the many contigs in the FASTA file named *_decoy or *_alt are highly similar to parts of the genome, so reads from those regions align to multiple locations and end up with mapping quality 0.
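
One way to check this suspicion is to count how many primary alignments have mapping quality 0 and how many of those involve a *_alt or *_decoy contig, either as the reported contig or as an alternative hit in the XA tag that BWA-MEM writes. The following is only a rough sketch in Python: the function name and the suffix list are made up for illustration, and it assumes plain-text SAM records, that the extra contigs really all end in _alt or _decoy, and that alternative hits are present in XA.

```python
from collections import Counter

# Assumption: alt/decoy contigs can be recognized by these name suffixes.
ALT_SUFFIXES = ("_alt", "_decoy")


def summarize_mapq_zero(sam_lines):
    """Count primary alignments, MAPQ-0 alignments, and MAPQ-0 alignments
    that touch an alt/decoy contig (hypothetical helper, not part of any tool)."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete SAM record
            continue
        flag, rname, mapq = int(fields[1]), fields[2], int(fields[4])
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800) records.
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        counts["primary_mapped"] += 1
        if mapq != 0:
            continue
        counts["mapq0"] += 1
        # Collect the reported contig plus any alternative hits listed in XA.
        contigs = [rname]
        for tag in fields[11:]:
            if tag.startswith("XA:Z:"):
                # XA entries look like "contig,±pos,CIGAR,NM;" repeated.
                contigs += [hit.split(",")[0] for hit in tag[5:].split(";") if hit]
        if any(c.endswith(ALT_SUFFIXES) for c in contigs):
            counts["mapq0_with_alt_or_decoy_hit"] += 1
    return counts
```

It could be fed the text output of `samtools view` for one of the empty regions, for example by passing `sys.stdin` as `sam_lines`. If most MAPQ-0 reads in such a region also hit an alt or decoy contig, that would support the explanation above.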

When I instead used the FASTA file Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa, which I downloaded from Ensembl, the problem disappeared and I was able to call variants properly.

Gil
