Test-drive the GATK tools and Best Practices pipelines on Terra
Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.
Obtaining the "AB" information from HaplotypeCaller and GVCFs
Using GATK 3.1-1, I seem to be unable to get the "AB" (AlleleBalance) annotation for the calls using the HaplotypeCaller -> GenotypeGVCFs pipeline, and I'm not sure how to get it. Our current pipeline (GATK 2.7-4, UnifiedGenotyper) requires this field to perform filtering, so this annotation is essential for us to upgrade to GATK 3.1.
My current pipeline of commands is as follows:
$ GenomeAnalysisTK-3.1-1 -T HaplotypeCaller -R human_g1k_v37_decoy.fasta --dbsnp dbsnp_137.b37.vcf.gz -I -L targets.GRCh37.bed -stand_emit_conf 10 -mbq 20 --downsample_to_coverage 300 -ERC GVCF -pairHMM VECTOR_LOGLESS_CACHING -variant_index_type LINEAR -variant_index_parameter 128000 -A HaplotypeScore -A MappingQualityRankSumTest -A ReadPosRankSumTest -A FisherStrand -A GCContent -A AlleleBalanceBySample -A AlleleBalance -A QualByDepth -o sample_gvcf.vcf.gz
$ GenomeAnalysisTK-3.1-1 -T GenotypeGVCFs -R human_g1k_v37_decoy.fasta --dbsnp dbsnp_137.b37.vcf.gz -V -L targets.GRCh37.bed -A QualByDepth -A HaplotypeScore -A MappingQualityRankSumTest -A ReadPosRankSumTest -A FisherStrand -A GCContent -A AlleleBalanceBySample -o joint_vcf.vcf.gz
Above, note that if "-A AlleleBalance" is given to GenotypeGVCFs, GATK crashes with a NullPointerException (AlleleBalance.java, line 66).
The command above is heavily adapted from the current pipeline; do you know what I might be doing wrong with the new and improved HaplotypeCaller?
Thanks so much for your help, and if you need any further information, please let me know.