We've moved!
This site is now read-only. You can find our new documentation site and support forum for posting questions here.
Be sure to read our welcome blog!

JEXL method to get the current sample name from single sample VCF file

simono101simono101 London, UKMember

I am experimenting with JEXL expressions in order to do some hard filtering on variants. I wonder if there is a method to tell the filter expression to operate on the current sample (assuming here a single sample VCF file) without knowing the sample name a priori e.g.


Works just fine to sample heterozygous positions from a VCF where I know the sample will be called "Sample1", but can I refer to a sample name programmatically, e.g. something like: vc.getGenotype( sample() ).isHet()

Sorry if this is a really dumb question. (BTW I realise you could use a genotype_filter e.g. --genotypeFilterExpression "isHet == 1" to do the same thing, but I want to annotate the VCF in the FORMAT/FILTER field, rather than the INFO field for downstream variant selection operations.

Best Answer


  • simono101simono101 London, UKMember

    Hi @Geraldine_VdAuwera‌ thanks for replying. My solution was to use bcftools query in a script to get the sample name into a variable and then pass this in to the filter expressions, e.g.

    sample=$(bcftools query -l my.vcf)  
    java -jar ~/Downloads/GenomeAnalysisTK.jar \
        -T VariantFiltration \
        -R ref.fasta \
        --variant my.vcf \
        --filterExpression "vc.getGenotype(\"$sample\").isHet()" \
        --filterName "HET_POS" \
        -o my.flt.vcf`
Sign In or Register to comment.