Adding annotations to the format field

I am struggling to add AlleleBalanceBySample and StrandBiasBySample to my joint VCF file. Everything up to and including the joint VCF was generated with GATK 3.2-2. I have tried using VariantAnnotator as well as repeating the GenotypeGVCFs step with the annotations specified. I have also tried the GenotypeGVCFs tool from 3.3-0, but I am trying to avoid running the HC again in the interest of time. For strand bias by sample, I can see why the VariantAnnotator approach would fail: the information is not contained in the joint VCF file, although it is there in the GVCF. For allele balance by sample, the information seems to be in the joint VCF already, so either VariantAnnotator or rerunning GenotypeGVCFs seems like it should work.

I should add that when I do try to add these annotations using GenotypeGVCFs, they are listed in the ##FORMAT section of the header, but they do not appear in the actual FORMAT field of the records.

Any clues or hints (or admonitions!) much appreciated.

-erikt

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @erikt
    Hi erikt,

    Did you add the BAM file to your VariantAnnotator command? Can you post the exact command you ran? Both of those annotations should work with VariantAnnotator.

    -Sheila

  • erikt, Member

    Hi Sheila,

    I did not! I didn't notice it in the VariantAnnotator arguments doc, but I see it now. I guess it is an engine-level argument, so it is not described in the tool's own argument descriptions. Below is my command; I'll try again with the BAMs passed in.

    java -XX:+UseSerialGC -Xmx4g -jar $GATK \
        -R $REF \
        -T VariantAnnotator \
        --variant $VCF \
        -o $OUTPUT \
        --annotation AlleleBalance \
        --annotation HomopolymerRun \
        --annotation StrandOddsRatio \
        --annotation AlleleBalanceBySample \
        --annotation StrandBiasBySample

    Thanks!
    -erikt

  • erikt, Member

    @Sheila I've tried passing a list of the BAM files to VariantAnnotator, and while it works for the AB annotation, I'm still not getting any SB annotation in the FORMAT field. The strange thing is that the annotation exists in the g.vcf files (in the header and in the FORMAT field of the body of the g.vcf), but it is not being included in the joint VCF file, even when the annotation is explicitly requested.

    java -XX:+UseSerialGC -Xmx30g -jar $GATK \
        -R $REF \
        -T VariantAnnotator \
        -nt 12 \
        --variant $VCF \
        -L /home/toorens/resources/intervals/agilent/S0274956/S0274956_Covered.bed \
        -o $OUTPUT \
        -I bams.list \
        --annotation AlleleBalance \
        --annotation HomopolymerRun \
        --annotation StrandOddsRatio \
        --annotation AlleleBalanceBySample \
        --annotation StrandBiasBySample
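
One way to see where a per-sample key such as SB goes missing is to compare the header declaration against the FORMAT column (column 9) of the records, first in a g.vcf and then in the joint VCF. The following is only a minimal sketch, assuming uncompressed, tab-delimited VCF files; `check_format_key` is a hypothetical helper name introduced here for illustration, not a GATK command.

```shell
# Report whether a FORMAT key is declared in the header and how many
# records actually list it in the FORMAT column (column 9).
# Assumption: the VCF is uncompressed and tab-delimited.
# check_format_key is a hypothetical helper, not part of GATK.
check_format_key() {
    key="$1"
    vcf="$2"
    awk -F'\t' -v key="$key" '
        # Header declaration, e.g. ##FORMAT=<ID=SB,Number=4,...>
        index($0, "##FORMAT=<ID=" key ",") == 1 { declared = 1; next }
        # Skip every other header line
        /^#/ { next }
        {
            # Column 9 is the colon-separated list of per-sample keys
            n = split($9, keys, ":")
            for (i = 1; i <= n; i++) {
                if (keys[i] == key) { used++; break }
            }
        }
        END {
            printf "%s: declared_in_header=%s records_with_key=%d\n",
                   key, (declared ? "yes" : "no"), used
        }
    ' "$vcf"
}
```

Calling `check_format_key SB sample.g.vcf` and then `check_format_key SB joint.vcf` (both placeholder filenames) would show the situation described above as `declared_in_header=yes` with `records_with_key=0` for the joint file.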

  • erikt, Member

    @Geraldine_VdAuwera I forgot to mention it, but I did try using the nightly build, without success. I was hoping to use the SB annotation as a stand-in for StrandAlleleCountsBySample without having to rerun HaplotypeCaller on some WGS data. Thanks for the clarification, and as always for all the hard work!