About MuTect2, VariantAnnotator, and other fields and computations

Bogdan, Palo Alto, CA, Member ✭✭

Dear Sheila and Geraldine,

Given a VCF file from MuTect2, could you please advise whether it is legitimate to run VariantAnnotator with the following annotations:

$GATK -T VariantAnnotator \
-R $REFERENCE_HG38 \
-I file.IR.RC.bam \
-V file.analysis-MUTECT2.vcf \
-A FisherStrand \
-A StrandOddsRatio \
-A BaseCounts \
-A HomopolymerRun \
-A MappingQualityRankSumTest \
-A QualByDepth \
-A RMSMappingQuality \
-A ReadPosRankSumTest \
-A VariantType \
-o output.vcf \
--disable_auto_index_creation_and_locking_when_reading_rods

Answers

  • Bogdan, Palo Alto, CA, Member ✭✭

    In addition, after collecting all these fields in the VCF file, would it be legitimate to filter in a similar way as for HaplotypeCaller, i.e. to exclude variants according to the hard-filtering criteria from the Best Practices? :)

    Fail SNPs that have:

    - QD < 2.0
    - FS > 60.0
    - MQ < 40.0
    - MQRankSum < -12.5
    - ReadPosRankSum < -8.0

    Fail INDELs that have:

    - QD < 2.0
    - FS > 200.0
    - ReadPosRankSum < -20.0

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)
    Accepted Answer

    @Bogdan, the filtering recommendations you reference are formulated specifically for germline variants. There are certain statistical assumptions that are made that may not hold for somatic variants. I can't tell you off the top of my head which would be more or less appropriate, but generally speaking I would say you should be extremely careful when attempting to translate germline guidelines into the somatic world. We'll try to document this in the future but in the meantime I would recommend you perform some careful evaluations of what seems appropriate on your data.
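
As a starting point for the kind of evaluation suggested above, here is a minimal sketch that only tallies how many calls each germline threshold would flag, without filtering anything. The thresholds are copied from the question; the function names (`parse_info`, `failed_rules`, `tally_failures`) are hypothetical helpers, and the INFO keys are assumed to be the usual GATK ones. Whether any of these cutoffs is meaningful for somatic calls is exactly the open question, so treat the counts as a diagnostic only.

```python
from collections import Counter

# Thresholds copied from the question (germline hard filters).
# Assumption: INFO keys follow the usual GATK naming (QD, FS, MQ, ...).
SNP_RULES = [
    ("QD", "<", 2.0),
    ("FS", ">", 60.0),
    ("MQ", "<", 40.0),
    ("MQRankSum", "<", -12.5),
    ("ReadPosRankSum", "<", -8.0),
]
INDEL_RULES = [
    ("QD", "<", 2.0),
    ("FS", ">", 200.0),
    ("ReadPosRankSum", "<", -20.0),
]


def parse_info(info_field):
    """Turn a VCF INFO string such as 'FS=1.2;SOR=0.7;DB' into a dict.

    Flag entries without '=' are stored with the value None.
    """
    info = {}
    if info_field == ".":
        return info
    for item in info_field.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        info[key] = value if sep else None
    return info


def failed_rules(info, is_snp):
    """Return the rules a record would fail; absent annotations are skipped."""
    rules = SNP_RULES if is_snp else INDEL_RULES
    failed = []
    for key, op, threshold in rules:
        if key not in info:
            continue  # annotation missing, so the rule cannot be evaluated
        try:
            value = float(info[key])
        except (TypeError, ValueError):
            continue  # flag or non-numeric value, skip
        if (op == "<" and value < threshold) or (op == ">" and value > threshold):
            failed.append(f"{key} {op} {threshold}")
    return failed


def tally_failures(records):
    """Count failures per rule over (info_string, is_snp) pairs."""
    tally = Counter()
    for info_field, is_snp in records:
        for rule in failed_rules(parse_info(info_field), is_snp):
            tally[rule] += 1
    return tally
```

Comparing these tallies between calls you trust and calls you do not (for example, validated versus unvalidated sites) may give a first impression of which annotations separate the two on your data.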

  • Bogdan, Palo Alto, CA, Member ✭✭

    However, when I ran VariantAnnotator with the annotations described above, it only added the following to the output VCF file:

    FisherStrand
    StrandOddsRatio
    BaseCounts
    VariantType

    Hmmm ... I was hoping that we could add QD and ReadPosRankSum. Is there any way to add this information to the somatic mutation calls in the VCF file? Thanks ;)
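
To check which of the requested annotations actually ended up in the output, a small script can count INFO keys across records. This is a minimal sketch, not part of GATK: the function names are hypothetical, and the mapping from annotation module to INFO key (for example `QualByDepth` to `QD`) is an assumption based on the usual GATK key names, so adjust it if your VCF header says otherwise.

```python
from collections import Counter

# Assumed mapping from the -A annotation names in the command above
# to the INFO keys they normally write; verify against the VCF header.
EXPECTED_KEYS = {
    "FisherStrand": "FS",
    "StrandOddsRatio": "SOR",
    "BaseCounts": "BaseCounts",
    "HomopolymerRun": "HRun",
    "MappingQualityRankSumTest": "MQRankSum",
    "QualByDepth": "QD",
    "RMSMappingQuality": "MQ",
    "ReadPosRankSumTest": "ReadPosRankSum",
    "VariantType": "VariantType",
}


def count_info_keys(vcf_lines):
    """Return (number of records, Counter of INFO keys) for VCF text lines."""
    counts = Counter()
    n_records = 0
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        n_records += 1
        for item in fields[7].split(";"):  # column 8 is INFO
            key = item.partition("=")[0]
            if key and key != ".":
                counts[key] += 1
    return n_records, counts


def annotation_summary(vcf_lines):
    """Map each requested annotation to the number of records carrying it."""
    _, counts = count_info_keys(vcf_lines)
    return {name: counts.get(key, 0) for name, key in EXPECTED_KEYS.items()}
```

For the file from the command above, one could call `annotation_summary(open("output.vcf"))`; annotations reported with a count of 0 were not written for any record.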

  • Bogdan, Palo Alto, CA, Member ✭✭

    Thank you Geraldine ... yes, certainly, looking forward to using future versions of MuTect with additional features ;) Thanks!

  • Bogdan, Palo Alto, CA, Member ✭✭

    And thanks again for replying so late in the evening!
