Running the HaplotypeCaller
I cannot figure out why HaplotypeCaller is not working for me. Could someone please help?
I want to call variants on chromosome 20 only. I used a BAM and BAM.BAI file from 1000 Genomes (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/data/HG00096/alignment/). For the reference genome I used only chromosome 20 from this site (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/). I created the dict and index files with Picard tools.
If I run the HaplotypeCaller like this:
java -jar GenomeAnalysisTK.jar -R chr20.fa -T HaplotypeCaller -I HG00096.chrom20.ILLUMINA.bwa.GBR.low_coverage.20120522.bam
I get the following error:
##### ERROR MESSAGE: Badly formed genome loc: Contig 1 given as location, but this contig isn't present in the Fasta sequence dictionary
If I run the same with the -L argument (-L 20) I get:
#### ERROR MESSAGE: Input files reads and reference have incompatible contigs: The following contigs included in the intervals to process have different indices in the sequence dictionaries for the reads vs. the reference: . As a result, the GATK engine will not correctly process reads from these contigs. You should either fix the sequence dictionaries for your reads so that these contigs have the same indices as in the sequence dictionary for your reference, or exclude these contigs from your intervals. This error can be disabled via -U ALLOW_SEQ_DICT_INCOMPATIBILITY, however this is not recommended as the GATK engine will not behave correctly..
##### ERROR reads contigs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, Y, MT, GL000207.1, GL000226.1, GL000229.1, GL000231.1, GL000210.1, GL000239.1, GL000235.1, GL000201.1, GL000247.1, GL000245.1, GL000197.1, GL000203.1, GL000246.1, GL000249.1, GL000196.1, GL000248.1, GL000244.1, GL000238.1, GL000202.1, GL000234.1, GL000232.1, GL000206.1, GL000240.1, GL000236.1, GL000241.1, GL000243.1, GL000242.1, GL000230.1, GL000237.1, GL000233.1, GL000204.1, GL000198.1, GL000208.1, GL000191.1, GL000227.1, GL000228.1, GL000214.1, GL000221.1, GL000209.1, GL000218.1, GL000220.1, GL000213.1, GL000211.1, GL000199.1, GL000217.1, GL000216.1, GL000215.1, GL000205.1, GL000219.1, GL000224.1, GL000223.1, GL000195.1, GL000212.1, GL000222.1, GL000200.1, GL000193.1, GL000194.1, GL000225.1, GL000192.1, NC_007605, hs37d5]
##### ERROR reference contigs = 
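To see what the "different indices" part of this error refers to, the contig names and their positions in the two sequence dictionaries can be compared directly. The following is only a minimal sketch: the function names `contig_names` and `compare_dictionaries` are hypothetical, and the header text is a shortened toy example with placeholder lengths, not the real files. In practice the reads header could come from `samtools view -H` on the BAM and the reference side from the `.dict` file.

```python
def contig_names(header_text):
    """Return contig names from the @SQ lines of a SAM header or .dict file, in order."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Fields are tab-separated; SN: carries the contig name.
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
                break
    return names


def compare_dictionaries(reads_header, reference_dict):
    """Return (shared contig names, {name: (reads index, reference index)} where they differ)."""
    reads = contig_names(reads_header)
    ref = contig_names(reference_dict)
    shared = [name for name in ref if name in reads]
    mismatched = {
        name: (reads.index(name), ref.index(name))
        for name in shared
        if reads.index(name) != ref.index(name)
    }
    return shared, mismatched


# Toy data (assumption): three contigs on the reads side, one on the reference side.
# LN values are placeholders, not real contig lengths.
reads_header = "\n".join(f"@SQ\tSN:{n}\tLN:1000" for n in ["1", "2", "20"])
reference_dict = "@HD\tVN:1.4\n@SQ\tSN:20\tLN:1000"

shared, mismatched = compare_dictionaries(reads_header, reference_dict)
print("shared contigs:", shared)
print("index mismatches (reads, reference):", mismatched)
```

In the reads contig list above, "20" is the 20th entry (index 19), whereas a reference that holds only one contig has it at index 0, so the name matches but the position does not.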
Additionally, I renamed the chromosome from "chr20" to "20" in the FASTA, FASTA index, and dictionary files, because previously GATK reported:
##### ERROR MESSAGE: Input files reads and reference have incompatible contigs: No overlapping contigs found.
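For illustration, a rename of a FASTA contig from "chr20" to "20" could look like the sketch below. This is not the exact procedure used here: `rename_fasta_contig` is a hypothetical helper, and it only touches the FASTA header line. The index and dictionary files would need to carry the same name, for example by regenerating them from the renamed FASTA.

```python
def rename_fasta_contig(fasta_lines, old_name, new_name):
    """Yield FASTA lines, renaming the header whose contig name equals old_name."""
    for line in fasta_lines:
        if line.startswith(">"):
            # The contig name is the first whitespace-delimited token after '>'.
            name, _, rest = line[1:].rstrip("\n").partition(" ")
            if name == old_name:
                line = ">" + new_name + ((" " + rest) if rest else "") + "\n"
        yield line


# Toy input (assumption): a two-line FASTA record, not the real chr20.fa.
example = [">chr20\n", "ACGTACGT\n"]
renamed = list(rename_fasta_contig(example, "chr20", "20"))
print("".join(renamed), end="")
```

Matching on the whole name rather than a prefix keeps a header such as ">chr2" from being changed when renaming "chr20".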
I really don't know what I am doing wrong! Any help is appreciated!