CombineVariants after GATK 3.3 HC and UG calling

ano1986, Belgium, Member

Hi,

We do variant calling with UG (-glm SNP) and HC (for both SNPs and indels) using GATK version 3.3, starting from the same sample.bam file.

At the end we would like to merge the two call sets into one .vcf.

  • We would like to have just one sample column, as if we were dealing with a single sample.
  • If a SNP is called by both UG and HC, we would like to keep the exact variant line as it was present in hc.vcf, so with no recalculation of, for example, the AC and DP values.
  • If a SNP was only called by UG, we would like to keep the exact variant line as it was present in ug.vcf.
  • For indels or SNPs that were only called by HC, we would like to keep the line as it was in hc.vcf.

I've tried to accomplish this with CombineVariants using the following command:
java -jar GenomeAnalysisTK.jar -T CombineVariants -R hg19_chr1-y.fasta -V:ug ug.vcf -V:hc hc.vcf --genotypemergeoption UNIQUIFY -o union.vcf

However, this produces a file with two sample columns, one for ug and one for hc. And when a variant is present in both ug.vcf and hc.vcf, the values in the INFO field are recalculated.

Is there a way to accomplish what we want using the CombineVariants tool?

Answers

  • Sheila, Broad Institute, Member, Broadie, Moderator, admin

    @ano1986
    Hi,

    I think you can accomplish that by using --genotypemergeoption PRIORITIZE and --rod_priority_list hc,ug (the names you assigned with -V:hc and -V:ug, listed in priority order).

    -Sheila

  • ano1986, Belgium, Member

    Thanks Sheila,

    It does mostly what I want.

    I get one 'SAMPLE' column where the FORMAT values are now as they should be.

    There is only one small problem in the INFO field. The values for AC, AF and AN are correct, but the quality parameters and the DP are still recalculated. When a variant is present in both ug.vcf and hc.vcf, the DP in the INFO field is the sum of the depths in ug and hc, whereas I just want to take over the INFO values from hc when the variant is detected in both.

    Issue · Github (by Sheila): issue number 575, state closed, closed by vdauwera
  • Sheila, Broad Institute, Member, Broadie, Moderator, admin

    @ano1986
    Hi,

    It looks like the only option to deal with this is -mergeInfoWithMaxAC, but that won't do what you want when the UG variant has the higher AC. Let me see if I can put in a feature request for what you want (or perhaps there is some argument I am missing!).

    -Sheila
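
Editor's note: the merge rules requested in the original post (keep every hc.vcf line verbatim, add ug.vcf lines only for sites HC did not call) can also be applied outside GATK. The following is a minimal Python sketch, not a GATK feature and not something proposed in this thread. The function names `site_key` and `merge_prefer_hc` are hypothetical. It assumes both files are uncompressed, single-sample VCFs with the same sample name, that a "same variant" means identical CHROM, POS, REF and ALT, and that contigs can be ordered by first appearance (HC file first). Only the HC header is kept, so INFO or FORMAT definitions that appear only in the UG header would have to be added by hand.

```python
import sys


def site_key(line):
    """Identify a record by CHROM, POS, REF and ALT (VCF columns 1, 2, 4, 5)."""
    fields = line.split("\t", 5)
    return (fields[0], int(fields[1]), fields[3], fields[4])


def merge_prefer_hc(hc_lines, ug_lines):
    """Return the HC header followed by the merged records.

    HC records are kept verbatim; UG records are added verbatim only
    when HC has no record with the same CHROM/POS/REF/ALT.
    """
    header = []
    records = {}        # site key -> original, unmodified line
    contig_order = {}   # contig name -> rank by first appearance (assumption)

    for raw in hc_lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            header.append(line)
            continue
        key = site_key(line)
        records[key] = line
        contig_order.setdefault(key[0], len(contig_order))

    for raw in ug_lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # the UG header is dropped in this sketch
        key = site_key(line)
        records.setdefault(key, line)  # an existing HC line wins
        contig_order.setdefault(key[0], len(contig_order))

    ordered = sorted(
        records, key=lambda k: (contig_order[k[0]], k[1], k[2], k[3])
    )
    return header + [records[k] for k in ordered]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python merge_prefer_hc.py hc.vcf ug.vcf > union.vcf
    with open(sys.argv[1]) as hc, open(sys.argv[2]) as ug:
        for out_line in merge_prefer_hc(hc, ug):
            print(out_line)
```

Because lines are copied rather than re-derived, INFO values such as DP and QUAL stay exactly as the chosen caller wrote them. The exact-match key is a simplification: a site that UG and HC represent differently (for example as part of a multiallelic record) would be treated as two separate variants.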
