HaplotypeCaller: -G Standard -G AS_Standard

MUHAMMADSOHAILRAZA, Beijing Institute of Genomics, CAS (Member)
edited July 2016 in GenomeSTRiP

Hi everyone,

I am running HaplotypeCaller from GATK 3.6. My command line is:
java -Xmx20g -jar ./GATK-3.6/GenomeAnalysisTK.jar -T HaplotypeCaller \
-nct 30 -rf BadCigar -log $LOG/file.log \
-R $REF \
-I $BQSR/BQSR_Realign_Dedup_Sort_sample_PE.bam \
-D $KNOWN/dbsnp_138.b37.vcf \
-ERC GVCF \
--variant_index_type LINEAR \
--variant_index_parameter 128000 \
-G Standard -G AS_Standard \
-o $OUTPUT/RAW_sample_snp_indels_AS.g.vcf

In the output VCF file, "ClippingRankSum" is missing from both the INFO header section and the per-variant INFO field. Moreover, the INFO header section lists allele-specific tags:
INFO=<ID=AS_InbreedingCoeff,Number=A,Type=Float,Description="allele specific heterozygosity as estimated from the genotype likelihoods per-sample when compared against the Hardy-Weinberg ex
INFO=<ID=AS_QD,Number=1,Type=Float,Description="Allele-specific Variant Confidence/Quality by Depth">
INFO=<ID=AS_RAW_BaseQRankSum,Number=1,Type=String,Description="raw data for allele specific rank sum test of base qualities">
INFO=<ID=AS_RAW_MQ,Number=A,Type=Float,Description="Allele-specfic raw data for RMS Mapping Quality">
INFO=<ID=AS_RAW_MQRankSum,Number=1,Type=String,Description="Allele-specific raw data for Mapping Quality Rank Sum">
INFO=<ID=AS_RAW_ReadPosRankSum,Number=1,Type=String,Description="allele specific raw data for rank sum test of read position bias">
INFO=<ID=AS_SB_TABLE,Number=1,Type=String,Description="Allele-specific forward/reverse read counts for strand bias tests">

But they are missing from the INFO field of the variant records:

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample1 Sample2 Sample3 Sample4 Sample5 Sample6 Sample7 Sample8 Sample9 Sample10 Sample11 Sample12

1 12783 . G A 1436.11 . AC=5;AF=0.278;AN=18;BaseQRankSum=3.76;DP=136;ExcessHet=9.5122;FS=0.000;MLEAC=7;MLEAF=0.389;MQ=26.36;MQRankSum=-7.480e-01;QD=12.60;ReadPosRankSum=0.067;SOR=0.765 GT:AD:DP:GQ:PL ./.:0,0:0:.:0,0,0 ./.:0,0:0:.:0,0,0 0/0:15,0:15:0:0,0,306 0/0:3,0:3:0:0,0,22 0/0:1,0:1:3:0,3,29 0/0:3,0:3:0:0,0,30 0/1:10,4:.:80:80,0,243 0/1:12,9:.:99:217,0,254 ./.:0,0:0:.:0,0,0 0/1:7,10:.:99:262,0,134 0/1:6,5:.:99:124,0,142 0/1:20,31:.:99:790,0,432

1 12807 . C T 222.05 . AC=2;AF=0.100;AN=20;BaseQRankSum=3.75;DP=222;ExcessHet=3.2451;FS=0.000;InbreedingCoeff=-0.1126;MLEAC=2;MLEAF=0.100;MQ=26.32;MQRankSum=-1.806e+00;QD=2.27;ReadPosRankSum=1.74;SOR=0.061 GT:AD:DP:GQ:PL ./.:0,0:0:.:0,0,0 ./.:0,0:0:.:0,0,0 0/0:27,0:27:81:0,81,855 0/0:13,0:13:39:0,39,451 0/0:6,0:6:18:0,18,199 0/0:12,0:12:36:0,36,407 0/0:24,0:24:42:0,42,741 0/1:29,8:.:99:115,0,790 0/0:7,0:7:21:0,21,242 0/0:20,0:20:60:0,60,672 0/0:15,0:15:45:0,45,427 0/1:50,11:.:99:148,0,1325
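The header-versus-record comparison described above can be checked mechanically rather than by eye. The following is a minimal sketch, not a GATK tool: the function name `vcf_unused_info` is made up for illustration, and it assumes a tab-delimited, uncompressed VCF whose header lines carry the standard `##INFO=<ID=...` prefix (the `##` appears to have been dropped from the header lines pasted in this post). It prints every INFO ID that the header declares but that no record uses.

```shell
#!/bin/sh
# Hypothetical helper: list INFO IDs declared in the header of a VCF
# but never present in the INFO column (column 8) of any record.
vcf_unused_info() {
    awk -F'\t' '
        # Header declaration: extract the ID between "ID=" and the next "," or ">".
        /^##INFO=<ID=/ {
            id = $0
            sub(/^##INFO=<ID=/, "", id)
            sub(/[,>].*/, "", id)
            declared[id] = 1
            next
        }
        # Skip all other header lines, including the #CHROM line.
        /^#/ { next }
        # Record line: split INFO on ";" and keep the key before "=".
        {
            n = split($8, fields, ";")
            for (i = 1; i <= n; i++) {
                key = fields[i]
                sub(/=.*/, "", key)
                used[key] = 1
            }
        }
        END {
            for (id in declared)
                if (!(id in used)) print id
        }
    ' "$1" | sort
}
```

For example, `vcf_unused_info "$JOINT/RAW_All_snp_indels_12SAMPLE.vcf"` would list the declared-but-unused tags one per line. An ID can legitimately appear in this list if the annotation simply was not computable at any site, so the output is a starting point for investigation rather than proof of a bug.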

So my questions are:

  1. Is it normal for ClippingRankSum to be missing when running with -G Standard -G AS_Standard?
  2. Why were the allele-specific annotations (which are present in the INFO header) not added to the variant records?
  3. Why is InbreedingCoeff not calculated for every variant?

Thanks..

Answers

  • MUHAMMADSOHAILRAZA, Beijing Institute of Genomics, CAS (Member)
    edited July 2016

    The VCF records shown above are the result of joint genotyping with "GenotypeGVCFs", using this command line:
    java -Xmx20g -jar ./GATK-3.6/GenomeAnalysisTK.jar -T GenotypeGVCFs \
    -nt 50 -rf BadCigar -log $LOG/QJ-Joint-HC.log \
    -R $REF \
    -V $OUTPUT/RAW_Sample1_snp_indels_AS.g.vcf \
    -V $OUTPUT/RAW_Sample2_snp_indels_AS.g.vcf \
    -V $OUTPUT/RAW_Sample3_snp_indels_AS.g.vcf \
    -V $OUTPUT/RAW_Sample4_snp_indels_AS.g.vcf \
    -V $OUTPUT/RAW_Sample5_snp_indels_AS.g.vcf \
    -V $OUTPUT/RAW_Sample6_snp_indels_AS.g.vcf \
    -V $OUTPUT/RAW_Sample7_snp_indels_AS.g.vcf \
    -V $OUTPUT/RAW_Sample8_snp_indels_AS.g.vcf \
    -V $OUTPUT/RAW_Sample9_snp_indels_AS.g.vcf \
    -V $OUTPUT/RAW_Sample10_snp_indels_AS.g.vcf \
    -V $OUTPUT/RAW_Sample11_snp_indels_AS.g.vcf \
    -V $OUTPUT/RAW_Sample12_snp_indels_AS.g.vcf \
    -D $KNOWN/dbsnp_138.b37.vcf \
    -o $JOINT/RAW_All_snp_indels_12SAMPLE.vcf

    Maybe the allele-specific annotations are absent because I did not pass the -G parameter on the command line for joint genotyping. But what about the ClippingRankSum and InbreedingCoeff annotations?
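If that guess is right (nothing in this thread confirms it), the joint-genotyping rerun would pass the same annotation groups to GenotypeGVCFs. The sketch below is a dry run only: the function name `build_joint_cmd` is hypothetical, it prints the command instead of executing it, and it omits the -nt, -rf, -log and -D options from the original command for brevity. It also builds the -V arguments in a loop instead of listing the twelve gVCFs by hand. Whether this would bring back ClippingRankSum or InbreedingCoeff is a separate question that this sketch does not address.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print a GenotypeGVCFs command line that
# requests the Standard and AS_Standard annotation groups.
# Usage: build_joint_cmd OUTPUT.vcf SAMPLE1.g.vcf [SAMPLE2.g.vcf ...]
build_joint_cmd() {
    out_vcf=$1
    shift
    printf 'java -Xmx20g -jar ./GATK-3.6/GenomeAnalysisTK.jar -T GenotypeGVCFs'
    # Printed literally so that $REF is expanded when the command is run.
    printf ' -R "$REF"'
    # One -V argument per input gVCF.
    for gvcf in "$@"; do
        printf ' -V %s' "$gvcf"
    done
    # Assumption: same annotation groups as the HaplotypeCaller run.
    printf ' -G Standard -G AS_Standard'
    printf ' -o %s\n' "$out_vcf"
}
```

Calling `build_joint_cmd joint.vcf "$OUTPUT"/RAW_*_snp_indels_AS.g.vcf` prints the full command for inspection before it is run for real.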

    I also noticed that, when -G Standard -G AS_Standard is absent from the command line, some annotations that appeared by default in previous GATK versions, such as MQ, are missing from the INFO field in GATK-3.4-46.

This discussion has been closed.