marking variants in repeat regions

Will_Gilks, University of Sussex, UK (Member)

Hi,

I'm hoping it's possible to mark variants in repetitive elements using VariantFiltration, so they can be selected out later. I've tried a few command variations, the most recent of which is:

GenomeAnalysisTK -R local_reference/dm6.fa \
    -T VariantFiltration \
    -V my.vcf \
    -filter "QD<5.0" -filterName "LowQD" \
    -mask ../masking_intervals/dmel6_repMask.gatk.intervals -filterNotInMask -maskName "RepMask" \
    -o my_marked.vcf

...fails with "No tribble type provided" for the intervals file.

Is it possible to do this at all with VariantFiltration? Note that I'd rather mark the suspect variants for removal later than remove them outright.

...Actually, in writing this, I just thought of creating a new VCF containing only the variants in the intervals, and then using that VCF to screen the original VCF. Sound good?

Sincerely,

William Gilks

Answers

  • Will_Gilks, University of Sussex, UK (Member)

    To answer my own question:

    To make a VCF of variants in repeats only:
    GenomeAnalysisTK -R local_reference/dm6.fa \
        -T SelectVariants \
        -V someones.vcf \
        -L ../masking_intervals/dmel6_repMask.gatk.intervals \
        -o repeats.vcf

    To mark variants that are in repeats, as well as those failing other quality thresholds:
    GenomeAnalysisTK -R local_reference/dm6.fa \
        -T VariantFiltration \
        -V someones.vcf \
        -filter "QD<3.0" -filterName "LowQD" \
        -filter "FS>30.000" -filterName "hiFS" \
        -mask repeats.vcf -maskName "RepEl" \
        -G_filter "DP<10" -G_filterName "lowiDP" \
        -o someones_marked.vcf
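
A note on the original error: "No tribble type provided" most likely arises because GATK 3 infers the type of a -mask file from its extension, and .intervals is not one it recognises. As a possible alternative to the intermediate repeats.vcf, the sketch below converts the intervals list to BED, which -mask should accept. It assumes one chr:start-end (or chr:pos) entry per line with 1-based inclusive coordinates, and contig names that contain no colon. intervals_to_bed is a hypothetical helper name, not part of GATK.

```shell
#!/bin/sh
# Hypothetical helper (not part of GATK): convert GATK-style intervals
# (chr:start-end or chr:pos, 1-based inclusive) to BED (0-based start, half-open end).
intervals_to_bed() {
    awk '
        NF == 0 || /^@/ { next }               # skip blank lines and @ header lines
        {
            if (split($1, loc, ":") != 2) next # assumes contig names contain no colon
            n = split(loc[2], range, "-")
            start = range[1]
            stop  = (n == 2) ? range[2] : range[1]
            printf "%s\t%d\t%d\n", loc[1], start - 1, stop
        }
    ' "$1"
}

# Example (input path as in the commands above; the output name is an assumption):
# intervals_to_bed ../masking_intervals/dmel6_repMask.gatk.intervals > dmel6_repMask.bed
```

The resulting BED file could then be given to -mask in place of repeats.vcf. Note that -filterNotInMask reverses the mask logic, so it would probably need to be left out when the aim is to flag variants that fall inside the repeats.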
