ASEReadCounter: constructing a heterozygous VCF input file for all nucleotides in a region of interest
My aim is to identify every bi-allelic SNP in my region of interest.
I then want to determine whether there is allelic imbalance in all my RNA-seq samples.
I have already filtered the BAM files to the region of interest and managed to produce a VCF file for the matching region.
However, the ASEReadCounter documentation describes the VCF input only as:
"A VCF file with specific sites to process."
Is it possible to construct a VCF file (for my region of interest) that has bi-allelic information for every single base? How can this be done, given that each position has 3 possible alternative alleles? Would this actually work in ASEReadCounter?
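To make the question concrete, here is a minimal Python sketch of the kind of file I have in mind. It writes one bi-allelic record per position and alternative base, so three records per position. Everything in it is an assumption for illustration: the function names, the placeholder sample name `SAMPLE`, the made-up `0/1` genotype, and the reduced header (no `##contig` lines). I do not know whether ASEReadCounter accepts several records at the same position, which is part of what I am asking.

```python
BASES = "ACGT"

# Reduced header for illustration only; a real file would also need ##contig lines.
HEADER_LINES = [
    "##fileformat=VCFv4.2",
    '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
]


def biallelic_records(contig, start, ref_seq):
    """Yield one VCF data line per (position, alternative base) pair.

    `start` is the 1-based position of the first base of `ref_seq`.
    Every record carries a placeholder heterozygous genotype (0/1).
    """
    for offset, ref in enumerate(ref_seq.upper()):
        if ref not in BASES:
            continue  # skip N and other ambiguity codes
        for alt in BASES:
            if alt == ref:
                continue
            pos = start + offset
            yield "\t".join(
                [contig, str(pos), ".", ref, alt, ".", ".", ".", "GT", "0/1"]
            )


def write_vcf(path, contig, start, ref_seq, sample="SAMPLE"):
    """Write the header and all records to `path`; `sample` is a placeholder name."""
    columns = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
               "FORMAT", sample]
    with open(path, "w") as handle:
        for line in HEADER_LINES:
            handle.write(line + "\n")
        handle.write("\t".join(columns) + "\n")
        for record in biallelic_records(contig, start, ref_seq):
            handle.write(record + "\n")
```

For a region of N unambiguous bases this produces 3N records, all claiming a heterozygous genotype that has not actually been called from any sample.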
I would like to avoid making a separate VCF file for each sample and instead use one shared VCF file as input for ASEReadCounter across all samples. Do I have the right understanding? In other words, where is the original VCF input file supposed to come from?