Problem with allele specific annotation AS_QualByDepth (AS_QD) during variant calling
Hi GATK team,
First a big thank you for all your hard work in developing the tool and supporting the users!
I am trying out the allele-specific (AS) annotations in version 3.6. While I have gotten a few other AS annotations to show up properly in my VCF, I am having trouble getting AS_QualByDepth in particular.
For example, I tried to call variants on a few samples at a specific locus with a "T" homopolymer run. I first ran HaplotypeCaller in GVCF mode for each sample:
java -jar GenomeAnalysisTK.jar \
    -T HaplotypeCaller \
    --emitRefConfidence GVCF -variant_index_type LINEAR -variant_index_parameter 128000 \
    -R ref_fasta \
    -I sample_$i \
    -L chr1:10348759-10348801 \
    -A AS_StrandOddsRatio -A AS_FisherStrand -A AS_QualByDepth \
    -A AS_BaseQualityRankSumTest -A AS_ReadPosRankSumTest -A AS_MappingQualityRankSumTest \
    -o sample_$i.gvcf
I then ran GenotypeGVCFs on all the samples together:
java -jar GenomeAnalysisTK.jar \
    -T GenotypeGVCFs \
    -R ref_fasta \
    -V gvcf_list \
    -L chr1:10348759-10348801 \
    -A AS_StrandOddsRatio -A AS_FisherStrand -A AS_QualByDepth \
    -A AS_BaseQualityRankSumTest -A AS_ReadPosRankSumTest -A AS_MappingQualityRankSumTest \
    -o out.vcf
In the header of the final joint-called VCF, the following AS annotations all showed up:
##INFO=<ID=AS_BaseQRankSum,Number=A,Type=Float,Description="allele specific Z-score from Wilcoxon rank sum test of each Alt Vs. Ref base qualities">
##INFO=<ID=AS_FS,Number=A,Type=Float,Description="allele specific phred-scaled p-value using Fisher's exact test to detect strand bias of each alt allele">
##INFO=<ID=AS_MQRankSum,Number=A,Type=Float,Description="Allele-specific Mapping Quality Rank Sum">
##INFO=<ID=AS_QD,Number=1,Type=Float,Description="Allele-specific Variant Confidence/Quality by Depth">
##INFO=<ID=AS_RAW_BaseQRankSum,Number=1,Type=String,Description="raw data for allele specific rank sum test of base qualities">
##INFO=<ID=AS_RAW_MQRankSum,Number=1,Type=String,Description="Allele-specific raw data for Mapping Quality Rank Sum">
##INFO=<ID=AS_RAW_ReadPosRankSum,Number=1,Type=String,Description="allele specific raw data for rank sum test of read position bias">
##INFO=<ID=AS_ReadPosRankSum,Number=A,Type=Float,Description="allele specific Z-score from Wilcoxon rank sum test of each Alt vs. Ref read position bias">
##INFO=<ID=AS_SB_TABLE,Number=1,Type=String,Description="Allele-specific forward/reverse read counts for strand bias tests">
##INFO=<ID=AS_SOR,Number=A,Type=Float,Description="Allele specific strand Odds Ratio of 2x|Alts| contingency table to detect allele specific strand bias">
However, the INFO column of the record contains the other AS annotations but not AS_QD:
chr1 10348779 . AT A,ATT 981.29 . AC=4,2;AF=0.333,0.167;AN=12;AS_BaseQRankSum=-1.087,-2.521;AS_FS=3.986,7.378;AS_MQRankSum=-1.130,-2.349;AS_ReadPosRankSum=-1.192,-1.396;AS_SOR=0.415,0.254;BaseQRankSum=-6.350e-01;ClippingRankSum=0.00;DP=627;ExcessHet=14.6052;FS=6.378;MLEAC=4,2;MLEAF=0.333,0.167;MQ=59.95;MQRankSum=0.00;QD=1.94;ReadPosRankSum=-1.050e-01;SOR=0.352 GT:AD:DP:GQ:PL 0/1:44,9,7:63:81:81,0,1033,93,844,1165 0/1:71,11,8:99:47:47,0,1659,110,1414,1803 0/1:54,15,7:81:99:205,0,1239,280,1087,1635 0/1:69,25,12:106:99:311,0,1603,336,1306,2058 0/2:55,11,22:94:99:291,233,1636,0,943,1294 0/2:61,11,14:91:14:92,14,1473,0,1071,1468
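For anyone who wants to check this kind of mismatch programmatically, here is a minimal Python sketch that lists INFO IDs declared in the header but absent from a record. The function names are made up for illustration, and the header lines and record are an abbreviated subset of the ones pasted above (only three header lines and a few INFO keys), so this is only a rough check, not a replacement for a real VCF parser.

```python
import re

def declared_info_ids(header_lines):
    """Return the set of IDs declared in ##INFO header lines."""
    ids = set()
    for line in header_lines:
        match = re.match(r"##INFO=<ID=([^,>]+)", line)
        if match:
            ids.add(match.group(1))
    return ids

def record_info_keys(record_line):
    """Return the set of keys found in the INFO column (8th field) of a record."""
    # Splitting on any whitespace; a real VCF is tab-delimited.
    fields = record_line.split()
    return {entry.split("=", 1)[0] for entry in fields[7].split(";")}

# Abbreviated subset of the header and record from this post.
header = [
    '##INFO=<ID=AS_FS,Number=A,Type=Float,Description="allele specific phred-scaled p-value using Fisher\'s exact test to detect strand bias of each alt allele">',
    '##INFO=<ID=AS_QD,Number=1,Type=Float,Description="Allele-specific Variant Confidence/Quality by Depth">',
    '##INFO=<ID=AS_SOR,Number=A,Type=Float,Description="Allele specific strand Odds Ratio of 2x|Alts| contingency table to detect allele specific strand bias">',
]
record = "chr1 10348779 . AT A,ATT 981.29 . AC=4,2;AF=0.333,0.167;AN=12;AS_FS=3.986,7.378;AS_SOR=0.415,0.254;QD=1.94"

# IDs that the header promises but the record does not carry.
missing = sorted(declared_info_ids(header) - record_info_keys(record))
print(missing)
```

On this subset, the only declared ID missing from the record is AS_QD, which matches what I see by eye in the full output.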
I also checked the individual sample gVCFs. Similarly, AS_QD appears in the header but not in the INFO column. I am wondering whether this is a bug or whether I am doing something wrong.
Another curious thing I noticed is that in the VCF header, the other final AS annotations all have "Number=A", but AS_QD has "Number=1". I don't know whether this might be causing a problem.
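To make the Number observation concrete: in the VCF format, Number=A means one value per ALT allele, while Number=1 means a single value. Below is a small Python sketch that pulls the declared Number for each INFO ID and counts how many comma-separated values each key actually carries in a record. The helper names are hypothetical, the input is again an abbreviated subset of the header and record above, and it does not show whether the Number=1 declaration has anything to do with AS_QD going missing.

```python
import re

def declared_numbers(header_lines):
    """Map each INFO ID to its declared Number (e.g. 'A' or '1')."""
    numbers = {}
    for line in header_lines:
        match = re.match(r"##INFO=<ID=([^,>]+),Number=([^,>]+)", line)
        if match:
            numbers[match.group(1)] = match.group(2)
    return numbers

def observed_value_counts(record_line):
    """Return (number of ALT alleles, {INFO key: number of values})."""
    # Splitting on any whitespace; a real VCF is tab-delimited.
    fields = record_line.split()
    n_alt = len(fields[4].split(","))
    counts = {}
    for entry in fields[7].split(";"):
        key, _, value = entry.partition("=")
        counts[key] = len(value.split(",")) if value else 0
    return n_alt, counts

# Abbreviated subset of the header and record from this post.
header = [
    '##INFO=<ID=AS_FS,Number=A,Type=Float,Description="...">',
    '##INFO=<ID=AS_QD,Number=1,Type=Float,Description="...">',
    '##INFO=<ID=AS_SOR,Number=A,Type=Float,Description="...">',
]
record = "chr1 10348779 . AT A,ATT 981.29 . AC=4,2;AF=0.333,0.167;AN=12;AS_FS=3.986,7.378;AS_SOR=0.415,0.254;QD=1.94"

numbers = declared_numbers(header)
n_alt, counts = observed_value_counts(record)
for info_id, number in sorted(numbers.items()):
    # counts.get(...) is None when the key is absent from the record.
    print(info_id, "declared Number =", number, "observed values =", counts.get(info_id))
```

For this site there are two ALT alleles (A and ATT), and AS_FS and AS_SOR each carry two values, consistent with Number=A. AS_QD is declared with Number=1 and has no values at all in the record.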