Notice:
If you happen to see a question you know the answer to, please do chime in and help your fellow community members. We encourage our fourm members to be more involved, jump in and help out your fellow researchers with their questions. GATK forum is a community forum and helping each other with using GATK tools and research is the cornerstone of our success as a genomics research community.We appreciate your help!

Test-drive the GATK tools and Best Practices pipelines on Terra


Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.

JEXL method to get the current sample name from single sample VCF file

simono101simono101 London, UKMember

I am experimenting with JEXL expressions in order to do some hard filtering on variants. I wonder if there is a method to tell the filter expression to operate on the current sample (assuming here a single sample VCF file) without knowing the sample name a priori e.g.

vc.getGenotype("Sample1").isHet()

Works just fine to sample heterozygous positions from a VCF where I know the sample will be called "Sample1", but can I refer to a sample name programmatically, e.g. something like: vc.getGenotype( sample() ).isHet()

Sorry if this is a really dumb question. (BTW I realise you could use a genotype_filter e.g. --genotypeFilterExpression "isHet == 1" to do the same thing, but I want to annotate the VCF in the FORMAT/FILTER field, rather than the INFO field for downstream variant selection operations.

Best Answer

Answers

  • simono101simono101 London, UKMember

    Hi @Geraldine_VdAuwera‌ thanks for replying. My solution was to use bcftools query in a script to get the sample name into a variable and then pass this in to the filter expressions, e.g.

    #!/bin/zsh
    sample=$(bcftools query -l my.vcf)  
    java -jar ~/Downloads/GenomeAnalysisTK.jar \
        -T VariantFiltration \
        -R ref.fasta \
        --variant my.vcf \
        --filterExpression "vc.getGenotype(\"$sample\").isHet()" \
        --filterName "HET_POS" \
        -o my.flt.vcf`
    
Sign In or Register to comment.