
VariantFiltration of 1/1 only

Hi,

I'm trying to use VariantFiltration to keep only SNPs with a 1/1 genotype:

GenomeAnalysisTK -T VariantFiltration -R dm6.fa \
--variant H001.UniGenotyper.SNP.vcf \
--genotypeFilterExpression 'isHomVar==1' \
--genotypeFilterName "KeepHomVar" \
-o H001.homVar.SNP.vcf

The GATK log states:
INFO 14:27:26,801 HelpFormatter - Program Args: -T VariantFiltration -R /mnt/lustre/scratch/bioenv/wg39/LHm_analysis/reference_sequences/dmel/v6.0/dm6.fa --variant H001.UniGenotyper.SNP.vcf --genotypeFilterExpression isHomVar==1

However, both heterozygous and homozygous-variant SNPs are still written to the resulting VCF file.

Could you help me correct this? I'm guessing the filter expression is not correct, but I've tried a few alternatives to no avail.

Cheers,

Blue

Answers

  • Blue Member

    Solution using bash instead of GATK:

    # keep only homozygous non-reference (1/1) genotypes and header lines

    for i in *.UniGenotyper.SNP.vcf
    do
    grep -E '^#|1/1' "$i" > "${i//UniGenotyper.SNP.vcf/UG.SNP.11.vcf}"
    done;

    # convert to table using GATK

    for j in *.UG.SNP.11.vcf
    do
    GenomeAnalysisTK \
    -R dm6.fa \
    -T VariantsToTable \
    -V "$j" \
    -F CHROM -F POS -F ID -F QUAL -F DP \
    -o "${j%.vcf}.tb"
    done;
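
One caveat with the grep step above: it matches `1/1` anywhere on the line, so a record could slip through if some other column happened to contain those characters. A slightly stricter variant is sketched below. It assumes a single-sample VCF (the sample is column 10) and relies on GT being the first colon-separated subfield of the sample column. `keep_homvar` is a hypothetical helper name used only for illustration, not a GATK command.

```shell
#!/usr/bin/env bash
# Sketch: keep header lines plus records whose sample genotype (GT) is 1/1.
# Assumptions: single-sample VCF (sample in column 10) and GT is the first
# colon-separated subfield of that column.
# keep_homvar is a hypothetical helper name, not part of GATK.
keep_homvar() {
    awk -F'\t' '
        /^#/ { print; next }              # pass header lines through unchanged
        {
            split($10, sample, ":")       # sample[1] holds the GT subfield
            if (sample[1] == "1/1" || sample[1] == "1|1") print
        }
    ' "$1"
}
```

It could be dropped into the first loop in place of the grep call, for example `keep_homvar "$i" > "${i//UniGenotyper.SNP.vcf/UG.SNP.11.vcf}"`.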
