Test-drive the GATK tools and Best Practices pipelines on Terra
Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.
Ensembl ref and GATK
I have a cancer rna-seq bam file that I would like to use to call variants using MuTect2, but I ran into a couple of errors that I would appreciate some assistance with. Thank you.
The chr notations in my bam file is absent in the reference fasta (Homo_sapiens_assembly19.fasta). I'd rather not change my bam file header using sed and samtools reheader and rather append chr notation into the reference file. Will this mess up the corresponding snp (00-All.vcf)?
Is there a karytypically sorted Ensembl reference (and its corresponding dbSNP vcf) that I can use for GATK workflow?
The biggest issue is "Error: Discordant contig lengths: read MT LN=16571, ref MT LN=16569" when I use picard's reordersam function. This is because the read was aligned against Ensembl reference. Is there a quick fix to resolve this?
EDIT: I should mention that realigning my seq against GATK's reference is not an option because I don't have the original fastq files available.