How to create fasta file of sequences using VCF informations?
Hi GATK team,
I would like to retrieve the variants sequences for some genes of interest. Basically I have a BAM file of reads aligned to a reference genome, that I have used to create a VCF file. I realized that, for a given position, the BAM file has more reads than the number of reads counted in the VCF file for this same given position. Probably du to filtering step that is taking into account only reads with good quality etc...
Is there a way to select reads that had been taken into account in the VCF file? Where can I see the reads that participate to the variant and the one that had been filtered out?
Is there a ways to reconstruct fasta file with sequence of the variant by using the VCF file? I have a GFF file in hand too if it can help.
I am kind of lost right now and I can't find this informations anywhere on the web.
Thanks a lot for your help.