SVPreprocess Mismatch found between genome mask and reference
I'm trying to execute the SVPreprocess step on 48 samples I have.
I've gotten the program to run, but two of the SVPreprocess*.out files contain errors/run failures.
In the first, SVPreprocess-5.out, the job being run is:
Program Name: org.broadinstitute.sv.apps.ComputeGenomeSizes
Program Args: -O /home/dfermin/genomeStrip.gf48/metadata/genome_sizes.txt -R /home/dfermin/conf/hs37d5.fa -genomeMaskFile /home/dfermin/apps/masks/Homo_sapiens_assembly19.mask.101.fasta
The error message is:
Exception in thread "main" org.broadinstitute.sv.commandline.ArgumentException: Mismatch found between genome mask and reference sequence: Interval hs37d5:1-35477943 not found in genome mask
I'm guessing the issue is either:
1) A discrepancy between the chromosome 1 header line of the FASTA file used for the initial BWA alignment (/home/dfermin/conf/hs37d5.fa) and the header line of the FASTA file in
2) A discrepancy in the size of chromosome 1 in these two files.
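One way to check both guesses would be to compare the sequence names and lengths in the two FASTA files directly. This is only a minimal sketch, assuming both files are uncompressed plain-text FASTA and that the sequence name is the first whitespace-delimited token on each header line; fasta_lengths and compare_fastas are hypothetical helper names, not part of Genome STRiP.

```python
import sys


def fasta_lengths(path):
    """Return {sequence name: total residue count} for a plain-text FASTA file."""
    lengths = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Assumption: the name is the first token after '>'.
                fields = line[1:].split()
                name = fields[0] if fields else ""
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths


def compare_fastas(ref_path, mask_path):
    """List name and length mismatches between a reference and a mask FASTA."""
    ref = fasta_lengths(ref_path)
    mask = fasta_lengths(mask_path)
    problems = []
    for name, length in ref.items():
        if name not in mask:
            problems.append(f"{name}: in reference but missing from mask")
        elif mask[name] != length:
            problems.append(
                f"{name}: reference length {length}, mask length {mask[name]}"
            )
    for name in mask:
        if name not in ref:
            problems.append(f"{name}: in mask but missing from reference")
    return problems


# Usage: python compare_fastas.py reference.fa mask.fasta
if __name__ == "__main__" and len(sys.argv) == 3:
    for message in compare_fastas(sys.argv[1], sys.argv[2]):
        print(message)
```

Run against /home/dfermin/conf/hs37d5.fa and the mask file, any line it prints would point to a sequence that is named differently, missing, or of a different length in one of the two files.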
The second error is in SVPreprocess-6.out:
Program Name: org.broadinstitute.sv.apps.ComputeGCProfiles
Program Args: -O /home/dfermin/genomeStrip.gf48/metadata/gcprofile/reference.gcprof.zip -R /home/dfermin/conf/hs37d5.fa -md /home/dfermin/genomeStrip.gf48/metadata -writeReferenceProfile true -genomeMaskFile /home/dfermin/apps/masks/Homo_sapiens_assembly19.mask.101.fasta -configFile /home/dfermin/apps/genomeSTRIP/conf/genstrip_parameters.txt
The error here is:
Exception in thread "main" java.lang.RuntimeException: Invalid sequence position: hs37d5:201
I have no idea what's causing this one.
Both errors appear to stem from the FASTA files involved, but I can't figure out how to correct them.