
JEXL filtering using criteria for groups of individuals instead of all single?

medgenmedgen NorwayMember

I have used JEXL filtering to specify criteria for each normal in a cohort of control samples (for comparison to affected samples), like " -select ' vc.getGenotype("Normal1").isHomRef() +all_other_individual_criteria ' ".
With an increasing number of normals in the cohort, it is inconvenient to specify criteria for each sample. Is there a shortcut, a way to give general criteria for "ALL normals"?



  • SkyWarriorSkyWarrior TurkeyMember ✭✭✭
    edited October 2017

    From the source code of HTSJDK

    static {
            attributes.put("vc", (VariantContext vc) -> vc);
            attributes.put("CHROM", VariantContext::getContig);
            attributes.put("POS", VariantContext::getStart);
            attributes.put("TYPE", (VariantContext vc) -> vc.getType().toString());
            attributes.put("QUAL", (VariantContext vc) -> -10 * vc.getLog10PError());
            attributes.put("ALLELES", VariantContext::getAlleles);
            attributes.put("N_ALLELES", VariantContext::getNAlleles);
            attributes.put("FILTER", (VariantContext vc) -> vc.isFiltered() ? true_string : false_string);
            attributes.put("homRefCount", VariantContext::getHomRefCount);
            attributes.put("hetCount", VariantContext::getHetCount);
            attributes.put("homVarCount", VariantContext::getHomVarCount);

    I guess the homVarCount, homRefCount and hetCount JEXL variables were created for this purpose. You may need to fiddle with them a little. I can try that tomorrow.

    Regardless, if you have Java experience, or have someone with Java experience next to you, using HTSJDK directly is much easier than figuring out the proper JEXL. I will try to give feedback on this issue tomorrow.

  • SkyWarriorSkyWarrior TurkeyMember ✭✭✭
    edited October 2017

    I gave those JEXL statements a shot, and this example worked for me to get variants where homRefCount is 20:

    -select 'homRefCount == 20'

    You may combine this with specific conditions for your sample of interest. For example, if you have 100 samples and you are looking for 99 homRefs and 1 het or homVar, use this one:

    -select 'homRefCount == 99 && !vc.getGenotype("sampleofinterest").isHomRef()'

    This will give you sites where your sample of interest is not homRef (het or homVar) and all the rest are homRef.

    You may also use a threshold such as

    homRefCount >= 80

    in case some samples have no call (neither ref nor alt) at the position.
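    To make the logic of these expressions concrete, here is a minimal sketch in plain Python. It does not call GATK or HTSJDK; the genotype labels and the function names `hom_ref_count` and `select_site` are hypothetical and only model what the JEXL expression tests. Judging by the HTSJDK excerpt above, homRefCount is computed over every sample in the variant record, so the count includes affected samples as well as normals.

```python
from typing import Dict

# Hypothetical genotype labels for illustration; not GATK or HTSJDK identifiers.
HOM_REF, HET, HOM_VAR, NO_CALL = "HOM_REF", "HET", "HOM_VAR", "NO_CALL"


def hom_ref_count(genotypes: Dict[str, str]) -> int:
    """Count the samples whose genotype is homozygous reference."""
    return sum(1 for g in genotypes.values() if g == HOM_REF)


def select_site(genotypes: Dict[str, str], sample_of_interest: str,
                min_hom_ref: int) -> bool:
    """Model of: homRefCount >= N && !vc.getGenotype("sample").isHomRef()"""
    return (hom_ref_count(genotypes) >= min_hom_ref
            and genotypes[sample_of_interest] != HOM_REF)


# 99 hom-ref normals plus one heterozygous sample of interest (100 samples).
site = {f"normal{i}": HOM_REF for i in range(1, 100)}
site["sampleofinterest"] = HET

print(select_site(site, "sampleofinterest", 99))
```

    Note that in this model a sample of interest with no call also passes the "not homRef" test, so an explicit het or homVar check may be worth adding if no-calls matter for your data.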

    The word 'homRefCount' is hardcoded into the JEXL creator class. However, you are not bound to those names; you can probably call any public method of the VariantContext class as vc.anyPublicMethodName().

    For example, instead of homRefCount you may also use

    vc.getHomRefCount()

    Both will give you the same results.

    Here is the class documentation.


    This will save you a lot of time.

  • medgenmedgen NorwayMember

    Thanks, this looks very interesting and I look forward to trying this!

  • SheilaSheila Broad InstituteMember, Broadie ✭✭✭✭✭


    Thanks for the tips! We usually recommend making a for loop :smiley:


  • medgenmedgen NorwayMember

    "We usually recommend making a for loop" - and that means...?

  • SheilaSheila Broad InstituteMember, Broadie ✭✭✭✭✭


    Ah, it is a programming term where you iterate through the items that you want to run the same procedure on. In this case, you would write a for loop that applies the filters to each normal sample in an iterative way. I think if you google "for loop" you will get some good articles :smile:
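    A minimal sketch of such a loop in Python, assuming the goal is to build one JEXL expression that requires every normal to be homRef. The function name `build_all_hom_ref_expression` and the sample names are hypothetical; only the `vc.getGenotype("...").isHomRef()` clause comes from the question above. The loop writes one clause per normal and joins them with `&&`, so the per-sample criteria no longer have to be typed by hand.

```python
from typing import Iterable


def build_all_hom_ref_expression(sample_names: Iterable[str]) -> str:
    """Join one isHomRef() clause per sample into a single JEXL expression."""
    clauses = []
    for name in sample_names:  # the "for loop": same clause for each normal
        clauses.append(f'vc.getGenotype("{name}").isHomRef()')
    return " && ".join(clauses)


# Hypothetical sample names; replace with the normals in your own VCF.
normals = ["Normal1", "Normal2", "Normal3"]
expression = build_all_hom_ref_expression(normals)

# Print the argument in the form used in the question above.
print(f"-select '{expression}'")
```

    The printed string can then be pasted into the command line, with any criteria for the affected samples appended to it.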

