How to select variants which occur in multiple samples

airtime, Member
edited August 2014 in Ask the GATK team

Hi,

I want to select variants that occur in more than one sample in a *.vcf file, so I have tried many different commands with the SelectVariants walker.
For one specific sample, I get all variants with this command:

java -Xmx30G -jar GenomeAnalysisTK-3.2-2/GenomeAnalysisTK.jar -T SelectVariants -R gatk_karyotypic_hg19.fasta --variant snp.raw.vcf -o selected_raw.vcf -select '(vc.getGenotype("s_1")!= null && !vc.getGenotype("s_1").isHomRef())'

But if I extend the command with logical operators to cover a second sample, like this:

java -Xmx30G -jar GenomeAnalysisTK-3.2-2/GenomeAnalysisTK.jar -T SelectVariants -R gatk_karyotypic_hg19.fasta --variant snp.raw.vcf -o selected_raw.vcf -select '(vc.getGenotype("s_1")!= null && !vc.getGenotype("s_1").isHomRef()) && (vc.getGenotype("s_2")!= null && !vc.getGenotype("s_2").isHomRef())'

I get the following error:

Invalid JEXL expression detected for select-0 with message jexl.null

I have tried to find the JEXL error in the extended expression, but it looks logically correct to me.

Is there an error that I am not seeing?

Or is there a better way to find variants which occur in more than one sample?

Thanks,
Air

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @airtime

    Hi Air,

    Sorry for the late response.

    This may be a syntax error. You can try removing the parentheses that separate the two sample statements like this:

    -select '(vc.getGenotype("s_1") != null && !vc.getGenotype("s_1").isHomRef() && vc.getGenotype("s_2") != null && !vc.getGenotype("s_2").isHomRef())'
    

    I hope this works!

    -Sheila

  • airtime, Member

    Hi Sheila,
    Thank you for your response. In the meantime I have written a Python script to handle this.
    I will try your suggestion soon and report back on whether it works.

    Best Air
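
The Python script mentioned in the reply above is not shown in the thread. As an illustration only, a minimal sketch of that kind of filter could look like the following. It assumes a tab-separated VCF in which GT is the first FORMAT subfield, and the function names and the `min_samples` parameter are hypothetical. It also counts no-calls (`./.`) as not carrying the variant, which may differ from how the JEXL expressions in this thread behave.

```python
def is_non_ref(sample_field):
    """Return True if the GT subfield has at least one called non-reference allele."""
    # Assumption: GT is the first colon-separated subfield of the sample column.
    gt = sample_field.split(":")[0]
    alleles = gt.replace("|", "/").split("/")
    return any(a not in ("0", ".", "") for a in alleles)


def select_shared_variants(lines, min_samples=2):
    """Yield header lines, plus records that are non-ref in at least min_samples samples."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            yield line  # keep meta-information and the column header unchanged
            continue
        fields = line.split("\t")
        # Columns 0-8 are the fixed VCF fields plus FORMAT; samples start at column 9.
        carriers = sum(is_non_ref(f) for f in fields[9:])
        if carriers >= min_samples:
            yield line


if __name__ == "__main__":
    example = [
        "##fileformat=VCFv4.1",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts_1\ts_2\ts_3",
        "chr1\t100\t.\tA\tG\t50\t.\t.\tGT\t0/1\t1/1\t0/0",
        "chr1\t200\t.\tC\tT\t50\t.\t.\tGT\t0/1\t0/0\t./.",
    ]
    for record in select_shared_variants(example):
        print(record)
```

With the in-memory example, only the record at position 100 is kept, because it is the only one where two samples carry a non-reference allele. To run it on a real file, the lines of an opened VCF (for instance `snp.raw.vcf`) can be passed in place of the list.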

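
Writing the -select expression by hand becomes error-prone as more samples are added. One way to reduce typos is to generate the string; the sketch below assumes the same clause pattern as the expression suggested in the answer above, and the helper name `build_select_expression` is hypothetical. It only builds the text and does not check that GATK accepts it.

```python
def build_select_expression(samples):
    """Build a JEXL string that is true when every listed sample is present and not hom-ref."""
    clauses = []
    for name in samples:
        genotype = f'vc.getGenotype("{name}")'
        # Same two checks per sample as in the thread: genotype exists, and is not hom-ref.
        clauses.append(f"{genotype} != null && !{genotype}.isHomRef()")
    return "(" + " && ".join(clauses) + ")"


if __name__ == "__main__":
    # Paste the printed string after -select, wrapped in single quotes for the shell.
    print(build_select_expression(["s_1", "s_2"]))
```

Note that this expression requires the variant in all listed samples, so "any two of N samples" would need a different approach than a single conjunction.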