About Mutect2, VariantAnnotator, and other fields and computations

Bogdan, Palo Alto, CA (Member)

Dear Sheila and Geraldine,

Given a VCF file produced by Mutect2, could you please advise whether it is legitimate to run VariantAnnotator with the following annotations:

$GATK -T VariantAnnotator \
-R $REFERENCE_HG38 \
-I file.IR.RC.bam \
-V file.analysis-MUTECT2.vcf \
-A FisherStrand \
-A StrandOddsRatio \
-A BaseCounts \
-A HomopolymerRun \
-A MappingQualityRankSumTest \
-A QualByDepth \
-A RMSMappingQuality \
-A ReadPosRankSumTest \
-A VariantType \
-o output.vcf \
--disable_auto_index_creation_and_locking_when_reading_rods

Best Answer

  • Geraldine_VdAuwera, Cambridge, MA (admin)
    Accepted Answer

    @Bogdan, the filtering recommendations you reference are formulated specifically for germline variants. There are certain statistical assumptions that are made that may not hold for somatic variants. I can't tell you off the top of my head which would be more or less appropriate, but generally speaking I would say you should be extremely careful when attempting to translate germline guidelines into the somatic world. We'll try to document this in the future but in the meantime I would recommend you perform some careful evaluations of what seems appropriate on your data.

Answers

  • Bogdan, Palo Alto, CA (Member)

    In addition, after collecting all these fields in the VCF file, would it be legitimate to filter in the same way as for HaplotypeCaller, i.e. to exclude variants according to the hard-filtering criteria from the Best Practices:

    Fail SNPs that have:

    - QD < 2.0
    - FS > 60.0
    - MQ < 40.0
    - MQRankSum < -12.5
    - ReadPosRankSum < -8.0

    Fail indels that have:

    - QD < 2.0
    - FS > 200.0
    - ReadPosRankSum < -20.0
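
To make the SNP thresholds quoted above concrete, here is a minimal sketch that reports which of them each record would fail. It is not a GATK tool: `flag_snp_hard_filters` is a hypothetical helper name, it only looks at simple biallelic SNPs (single-base REF and ALT), and it skips any annotation that is absent from the INFO column instead of treating it as a failure. The thresholds are the germline ones from the post, so the output is only a way to inspect the data, not a recommendation for somatic calls.

```shell
# Hypothetical helper (not part of GATK): for each simple biallelic SNP,
# print CHROM, POS and the quoted germline SNP thresholds it fails,
# or PASS if it fails none. Missing annotations are skipped.
flag_snp_hard_filters() {
  awk -F'\t' '
    # Append a name to a comma-separated list.
    function add(list, name) { return (list == "") ? name : list "," name }

    /^#/ { next }                                    # skip header lines
    length($4) != 1 || length($5) != 1 { next }      # keep simple SNPs only
    {
      split("", info)                                # clear the array portably
      n = split($8, kv, ";")
      for (i = 1; i <= n; i++) {
        eq = index(kv[i], "=")
        if (eq > 0) info[substr(kv[i], 1, eq - 1)] = substr(kv[i], eq + 1)
      }
      fail = ""
      if (("QD" in info) && info["QD"] + 0 < 2.0) fail = add(fail, "QD")
      if (("FS" in info) && info["FS"] + 0 > 60.0) fail = add(fail, "FS")
      if (("MQ" in info) && info["MQ"] + 0 < 40.0) fail = add(fail, "MQ")
      if (("MQRankSum" in info) && info["MQRankSum"] + 0 < -12.5) fail = add(fail, "MQRankSum")
      if (("ReadPosRankSum" in info) && info["ReadPosRankSum"] + 0 < -8.0) fail = add(fail, "ReadPosRankSum")
      if (fail == "") fail = "PASS"
      print $1 "\t" $2 "\t" fail
    }
  ' "$1"
}
```

Usage would be along the lines of `flag_snp_hard_filters output.vcf`, where `output.vcf` is the annotated file from the command in the original question.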

  • Bogdan, Palo Alto, CA (Member)

    However, when I ran VariantAnnotator with the annotations listed above, only the following were added to the output VCF file:

    FisherStrand
    StrandOddsRatio
    BaseCounts
    VariantType

    Hmm... I was hoping to add QD and ReadPosRankSum as well. Is there any way to add this information to the somatic mutation calls in the VCF file? Thanks ;)
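
One way to check which annotations actually made it into the output is to count how many records carry each INFO key. The sketch below assumes a plain-text, tab-delimited VCF; `count_info_keys` is a hypothetical helper name, not a GATK command. Header lines declare annotations even when no record carries them, so the helper counts keys in the records themselves.

```shell
# Hypothetical helper (not part of GATK): for every INFO key found in the
# records of an uncompressed VCF, print the key and the number of records
# that carry it, sorted by key name.
count_info_keys() {
  awk -F'\t' '
    /^#/ { next }                          # skip header lines
    {
      n = split($8, kv, ";")
      for (i = 1; i <= n; i++) {
        key = kv[i]
        sub(/=.*/, "", key)                # drop "=value", keep the key
        if (key != "" && key != ".") count[key]++
      }
    }
    END { for (key in count) print key "\t" count[key] }
  ' "$1" | sort
}
```

Running `count_info_keys output.vcf` on the annotated file would show at a glance whether keys such as QD or ReadPosRankSum are missing from every record or only from some of them.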

  • Bogdan, Palo Alto, CA (Member)

    Thank you, Geraldine. Yes, certainly, I am looking forward to using future versions of Mutect with additional features ;) Thanks!

  • Bogdan, Palo Alto, CA (Member)

    And thanks again for replying so late in the evening!
