marking variants in repeat regions

Will_Gilks, University of Sussex, UK, Member ✭✭


I'm hoping it's possible to mark variants in repetitive elements using VariantFiltration, so they can be selected out later. I've tried a few command variations, the most recent of which is:

GenomeAnalysisTK -R local_reference/dm6.fa \
    -T VariantFiltration \
    -V my.vcf \
    -filter "QD<5.0" -filterName "LowQD" \
    -mask ../masking_intervals/dmel6_repMask.gatk.intervals -filterNotInMask -maskName "RepMask" \
    -o my_marked.vcf

This fails with "No tribble type provided" for the intervals file.

Is this possible at all with VariantFiltration? Note that I'd prefer to mark the suspect variants for later removal, rather than remove them outright.

Actually, in writing this, I just thought of creating a new VCF containing only the variants in the intervals, and then using that VCF to screen the original VCF. Does that sound good?
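
A note on the "No tribble type provided" error: one plausible cause, offered as an assumption rather than a confirmed diagnosis, is that -mask expects a feature file whose type GATK can recognise (such as BED or VCF), and a .intervals list is not one of those. A minimal sketch of converting the interval list to BED follows. The function name intervals_to_bed and the output name dmel6_repMask.bed are hypothetical, and the sketch assumes contig names contain no ":" or "-".

```shell
# Convert GATK-style intervals (contig:start-end, 1-based inclusive)
# read from stdin into BED (tab-separated, 0-based half-open) on stdout.
# Assumption: contig names contain neither ':' nor '-'.
intervals_to_bed() {
  awk -F'[:-]' '
    /^[@#]/ { next }                                         # skip header or comment lines
    NF == 3 { printf "%s\t%d\t%d\n", $1, $2 - 1, $3; next }  # contig:start-end
    NF == 2 { printf "%s\t%d\t%d\n", $1, $2 - 1, $2 }        # contig:position
  '
}
```

It could be run as `intervals_to_bed < ../masking_intervals/dmel6_repMask.gatk.intervals > dmel6_repMask.bed`, with the resulting BED file then passed to -mask. Also, -filterNotInMask marks records outside the mask, so it would probably need to be dropped in order to flag variants inside repeats.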


William Gilks

Best Answer


  • Will_Gilks, University of Sussex, UK, Member ✭✭

    To answer my own question:

    To make a VCF of variants in repeats only:
    GenomeAnalysisTK -R local_reference/dm6.fa \
        -T SelectVariants \
        -V someones.vcf \
        -L ../masking_intervals/dmel6_repMask.gatk.intervals \
        -o repeats.vcf

    To mark the variants in a VCF that fall in repeats, along with those failing other quality thresholds:
    GenomeAnalysisTK -R local_reference/dm6.fa \
        -T VariantFiltration \
        -V someones.vcf \
        -filter "QD<3.0" -filterName "LowQD" \
        -filter "FS>30.000" -filterName "hiFS" \
        -mask repeats.vcf -maskName "RepEl" \
        -G_filter "DP<10" -G_filterName "lowiDP" \
        -o someones_marked.vcf
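
To check how many records the command above actually marked, the FILTER column (column 7) of the output VCF can be tallied. This is a minimal sketch, assuming an uncompressed VCF on standard input; count_filters is a hypothetical helper name, not part of GATK.

```shell
# Tally FILTER values (VCF column 7) from an uncompressed VCF on stdin.
# A record carrying several filters, e.g. "LowQD;RepEl", is counted
# once under each filter name.
count_filters() {
  awk -F'\t' '
    /^#/ { next }                          # skip meta and header lines
    {
      k = split($7, f, ";")                # FILTER entries are ";"-separated
      for (i = 1; i <= k; i++) n[f[i]]++
    }
    END { for (name in n) printf "%s\t%d\n", name, n[name] }
  ' | sort
}
```

For example, `count_filters < someones_marked.vcf` prints one line per filter name with its count. As far as I know, the marked records could later be dropped with SelectVariants and its --excludeFiltered option.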
