Using SelectVariants to select for multiple expressions

sagipolani Posts: 41 Member

Hi,

I am using both GATK's UnifiedGenotyper and samtools mpileup as callers.

I've used CombineVariants in order to merge the two sets into a single .vcf file as follows:

java -Xmx4g -jar GenomeAnalysisTK.jar -T CombineVariants -R reference.fasta --variant:GATK GATK.vcf --variant:samtools samtools.vcf -o GATK_samtools.union.vcf -genotypeMergeOptions PRIORITIZE -priority GATK,samtools --filteredrecordsmergetype KEEP_UNCONDITIONAL

Now, I would like to select all calls that were called by both callers, regardless of whether they've been filtered or not.

From opening the GATK_samtools.union.vcf file, I understand that I need to select for the following expressions:

set=Intersection
set=FilteredInAll
set=filterInGATK-samtools

(I was also wondering why I don't get an expression like 'filterInsamtools-GATK'. Does this have anything to do with the PRIORITIZE option?)
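
Before selecting on these values, it can help to tally which set annotations actually occur in the combined file. The following is a minimal sketch rather than a GATK feature: it assumes a tab-delimited VCF whose INFO column (column 8) carries a set=... entry, and count_sets is a hypothetical helper name.

```shell
# count_sets FILE: print each distinct value of the INFO "set" key
# followed by the number of records carrying it, sorted by name.
count_sets() {
  awk -F'\t' '
    /^#/ { next }                      # skip header lines
    {
      n = split($8, fields, ";")       # INFO is the 8th VCF column
      for (i = 1; i <= n; i++)
        if (fields[i] ~ /^set=/) counts[substr(fields[i], 5)]++
    }
    END { for (s in counts) printf "%s\t%d\n", s, counts[s] }
  ' "$1" | sort
}
```

Calling count_sets GATK_samtools.union.vcf would then list every set value present, which makes it easy to confirm the exact spelling to use in a -select expression.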

So... I've been trying to run the following with no luck (i.e. the output .vcf file doesn't contain any variants, but rather only the header):

java -Xmx4g -jar GenomeAnalysisTK.jar -T SelectVariants -R reference.fasta --variant GATK_samtools.union.vcf -select 'set == "Intersection"; -select 'set == "FilteredInAll";' -select 'set == "filterInGATK-samtools";' -o GATK_samtools.overlap.vcf

I've also tried the following, but in this case the output contains only the 'set=Intersection' variants, without the rest:

java -Xmx4g -jar GenomeAnalysisTK.jar -T SelectVariants -R reference.fasta --variant GATK_samtools.union.vcf -select 'set == 'Intersection';'FilteredInAll';'filterInGATK-samtools'" -o GATK_samtools.overlap.vcf

I'd appreciate any help on this.

Thanks!

Sagi

Answers

  • Geraldine_VdAuwera Posts: 8,273 Administrator, GATK Dev admin

    Hi Sagi,

    Have you looked at some records inside your combined variants file to check that the sets were annotated correctly?

    Geraldine Van der Auwera, PhD

  • sagipolani Posts: 41 Member

    Hi Geraldine,

    Yes of course. The annotations are fine.

    A little confused I must say... Any ideas?

    Thanks!

    Sagi

  • Geraldine_VdAuwera Posts: 8,273 Administrator, GATK Dev admin

    Hi Sagi,

    Sorry for not getting back to you earlier. The only thing I can think of is: have you tried selecting on a single set at a time?

    Geraldine Van der Auwera, PhD

  • sagipolani Posts: 41 Member

    Hi Geraldine,

    The idea is to select all variants shared across both call sets... I don't really see how I can do this without using CombineVariants and then SelectVariants...?

    Thanks!

    Sagi

  • Geraldine_VdAuwera Posts: 8,273 Administrator, GATK Dev admin

    I meant: try selecting just one set, e.g. the Intersection set, and see if that works. If it does, then you can move on to getting all of them; but first you want to make sure the tool is working properly on one set. This is to distinguish syntax issues from other problems.

    Geraldine Van der Auwera, PhD

  • sagipolani Posts: 41 Member

    Hi Geraldine,

    Yes, it does indeed work. How can I select for multiple expressions at once?

    Thanks!

    Sagi
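
One common way to cover several sets in a single run is to join the comparisons with JEXL's logical OR (||) inside one -select expression. The sketch below builds that expression from the three set names listed in the question and only invokes GATK if the jar and the combined VCF are present in the working directory. The variable names (SETS, SELECT_EXPR) are illustrative, and the behaviour should be checked against your GATK version.

```shell
# Set names taken from the question; adjust to whatever appears in your file.
SETS="Intersection FilteredInAll filterInGATK-samtools"

# Join one 'set == "<name>"' clause per set with JEXL's logical OR (||).
SELECT_EXPR=""
for s in $SETS; do
  clause="set == \"$s\""
  if [ -z "$SELECT_EXPR" ]; then
    SELECT_EXPR="$clause"
  else
    SELECT_EXPR="$SELECT_EXPR || $clause"
  fi
done
echo "$SELECT_EXPR"

# Run SelectVariants only when the inputs from the question are present.
if [ -f GenomeAnalysisTK.jar ] && [ -f GATK_samtools.union.vcf ]; then
  java -Xmx4g -jar GenomeAnalysisTK.jar -T SelectVariants \
    -R reference.fasta \
    --variant GATK_samtools.union.vcf \
    -select "$SELECT_EXPR" \
    -o GATK_samtools.overlap.vcf
fi
```

If you type the expression by hand instead, wrap the whole thing in one pair of single quotes so that the shell passes the inner double quotes through to GATK unchanged.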
