Question about FS score

Madhu (United States, Member)
edited September 2013


I have the following variant called by UnifiedGenotyper (GATK version: GenomeAnalysisTK-2.6-5):

chr9 139413211 . T G 7.60 . AC=1;AF=0.500;AN=2;BaseQRankSum=-7.913;DP=296;Dels=0.00;FS=37.414;HaplotypeScore=22.3462;MLEAC=1;MLEAF=0.500;MQ=70.00;MQ0=0;MQRankSum=0.508;QD=0.03;ReadPosRankSum=-3.354 GT:AD:DP:GQ:PL 0/1:180,115:282:35:35,0,3884

The FS score is 37.414. However, a closer look at the BAM file shows that the 115 reads supporting the alternate allele G are all on the + strand. Of the reads supporting the reference allele T at this position, 113 are on the + strand and 67 are on the - strand. Shouldn't the FS score be much higher for this variant?

Please help me understand whether I am misunderstanding the FS score or whether this is a bug.
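For reference, the expectation in this post can be checked with a rough sketch. It assumes FS is the Phred-scaled p-value of a two-sided Fisher's exact test on the 2x2 table of (ref, alt) x (+ strand, - strand) read counts, and it uses the strand counts quoted above. The function names are hypothetical, and GATK's own implementation may differ in details such as which reads are counted and how the table is handled, so this is an approximation of the idea and not a reproduction of GATK's calculation.

```python
from fractions import Fraction
from math import comb, log10


def fisher_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev):
    """Two-sided Fisher's exact test p-value for a 2x2 strand table.

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row_ref = ref_fwd + ref_rev
    row_alt = alt_fwd + alt_rev
    col_fwd = ref_fwd + alt_fwd
    total = row_ref + row_alt
    denom = comb(total, col_fwd)

    def prob(a):
        # Probability that exactly `a` of the forward reads are ref reads.
        return Fraction(comb(row_ref, a) * comb(row_alt, col_fwd - a), denom)

    observed = prob(ref_fwd)
    lo = max(0, col_fwd - row_alt)
    hi = min(row_ref, col_fwd)
    return sum(p for p in (prob(a) for a in range(lo, hi + 1)) if p <= observed)


def phred(p):
    """Phred-scale a p-value: -10 * log10(p)."""
    return -10 * log10(float(p))


# Counts from the post (assumed to be exactly what the caller would use):
# ref T: 113 on +, 67 on -; alt G: 115 on +, 0 on -.
p_value = fisher_two_sided(113, 67, 115, 0)
fs_estimate = phred(p_value)
print(f"p = {float(p_value):.3e}, Phred-scaled = {fs_estimate:.1f}")
```

Under these assumptions the Phred-scaled value comes out well above 37.414, which is consistent with the suspicion that the caller counted a different set of reads than the ones visible in the BAM file.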


  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    Hi Madhu,

    Hmm, at first glance I would also expect a higher FS value. Can you tell me if this data has been compressed with ReduceReads, by any chance?

  • Madhu (United States, Member)

    Hi Geraldine,

    Thanks for your reply. No, I didn't use ReduceReads.
    My pipeline is: Novoalign -> Picard (mark and remove duplicates) -> GATK.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    OK, just checking. ReduceReads (RR) has had a few bugs that caused annotation weirdness before.

    Here's a way to test what might be going on here. We recently added a new annotation (StrandBiasBySample) that simply reports the counts of the bases going into the FS calculation. Could you run VariantAnnotator (with -A StrandBiasBySample) from the latest nightly build on your VCF at this one site, and then post the result here?

  • Madhu (United States, Member)

    Hi Geraldine,

    Sure. I will try that and let you know.

