combining GVCF from multi-sample in 'Allele-specific' and standard modes

splaisan, Leuven (Belgium), Member
edited September 2018 in Ask the GATK team


I would like to put the calls for my genome made in standard HaplotypeCaller (HC) mode side by side with those made in allele-specific (AS) mode, in order to evaluate the effect of the AS method. This is a diploid fly genome and I am looking for heterozygous variants associated with a phenotype, which makes the AS mode a priori attractive.
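One possible way to do the side-by-side comparison, sketched here under the assumption that each GVCF is genotyped on its own into a separate final VCF, is to reduce every record to a CHROM:POS:REF:ALT key and count the keys private to each file or shared by both. The function names vcf_site_keys and compare_calls are hypothetical helpers, not GATK tools, and only gzip, awk, sort, comm and wc are assumed to be installed.

```shell
#!/usr/bin/env bash
# Print one sorted, unique "CHROM:POS:REF:ALT" key per record of a
# gzip- or bgzip-compressed VCF.
vcf_site_keys() {
    gzip -cd "$1" \
        | awk -F'\t' '!/^#/ { print $1 ":" $2 ":" $4 ":" $5 }' \
        | LC_ALL=C sort -u
}

# Count the sites found only in the first file, only in the second file,
# and in both. Output is three tab-separated lines.
compare_calls() {
    local a b
    a=$(mktemp)
    b=$(mktemp)
    vcf_site_keys "$1" > "$a"
    vcf_site_keys "$2" > "$b"
    printf 'only_first\t%s\n'  "$(LC_ALL=C comm -23 "$a" "$b" | wc -l | tr -d ' ')"
    printf 'only_second\t%s\n' "$(LC_ALL=C comm -13 "$a" "$b" | wc -l | tr -d ' ')"
    printf 'shared\t%s\n'      "$(LC_ALL=C comm -12 "$a" "$b" | wc -l | tr -d ' ')"
    rm -f "$a" "$b"
}
```

It could be called as `compare_calls standard.vcf.gz AS.vcf.gz`, where both file names are placeholders. The comparison looks at sites only; it does not compare genotypes or annotation values.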

I read in article?id=9622 that, when converting a GVCF to VCF in AS mode, the optional arguments '-G StandardAnnotation -G AS_StandardAnnotation' should be added to the commands.

What about converting a merged GVCF of mixed types (one input called in standard mode, one in AS mode) to VCF? Do I add the optional arguments there too?
Can I do the following?

HaplotypeCaller 2x

maxram=48 # a lot of RAM on our server

# call GVCF from multi-sample in standard mode
# (-ERC GVCF is needed here too, otherwise the output is a plain VCF)
gatk --java-options "-Xmx${maxram}g" HaplotypeCaller \
   -R ${reference} \
   -I mappings/${bamfile} \
   -O variants/${prefix}.g.vcf.gz \
   -ERC GVCF

# call GVCF from multi-sample in 'Allele-specific' mode
gatk --java-options "-Xmx${maxram}g" HaplotypeCaller \
   -R ${reference} \
   -I mappings/${bamfile} \
   -O variants/${prefix}.AS.g.vcf.gz \
   -ERC GVCF \
   -G StandardAnnotation \
   -G AS_StandardAnnotation \
   -G StandardHCAnnotation


# combining the two GVCFs (paths match the HaplotypeCaller outputs above)
gatk CombineGVCFs \
   -R ${reference} \
   --variant variants/${prefix}.g.vcf.gz \
   --variant variants/${prefix}.AS.g.vcf.gz \
   -O 2-samples.g.vcf.gz

There is only ONE sample column in the combined GVCF output! :'(

=> Was this because my two samples have the same name?
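A quick way to test the same-name hypothesis is to list the sample names that each GVCF declares on its #CHROM header line before combining. HaplotypeCaller takes the sample name from the SM tag of the BAM read group, so two GVCFs called from the same BAM would carry the same name. The sketch below assumes only gzip, awk, sort and uniq are available; vcf_samples and duplicate_samples are made-up helper names, not GATK commands.

```shell
#!/usr/bin/env bash
# Print the sample names (columns 10 and up of the #CHROM header line) of a
# gzip- or bgzip-compressed (G)VCF, one name per line.
vcf_samples() {
    gzip -cd "$1" | awk -F'\t' '
        /^#CHROM/ { for (i = 10; i <= NF; i++) print $i; exit }'
}

# Print every sample name that is declared by more than one of the given
# files. No output means all sample names are distinct.
duplicate_samples() {
    local f
    for f in "$@"; do
        vcf_samples "$f"
    done | sort | uniq -d
}
```

For the two files above, this might be run as `duplicate_samples variants/${prefix}.g.vcf.gz variants/${prefix}.AS.g.vcf.gz`. If it prints a name, both inputs declare that sample, which would be consistent with the single column seen after merging.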


Do I add the two -G lines here, given that the first sample was NOT called in AS mode?

# converting to VCF
gatk --java-options "-Xmx${maxram}g" GenotypeGVCFs \
  -R ${reference} \
  -V 2-samples.g.vcf.gz \
  -G StandardAnnotation \
  -G AS_StandardAnnotation \
  -O 2-samples.vcf.gz
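Before deciding on the -G arguments, it may help to check which inputs actually declare allele-specific annotations in their headers. The sketch below lists the INFO IDs that start with AS_ in a compressed (G)VCF; the function name as_annotations is hypothetical, and the check only inspects the header, on the assumption that annotations present in the records are declared there.

```shell
#!/usr/bin/env bash
# List the allele-specific INFO annotations (IDs starting with AS_) declared
# in the header of a gzip- or bgzip-compressed (G)VCF, in header order.
as_annotations() {
    gzip -cd "$1" | awk '
        /^##INFO=<ID=AS_/ {
            id = $0
            sub(/^##INFO=<ID=/, "", id)   # drop the leading "##INFO=<ID="
            sub(/[,>].*$/, "", id)        # drop everything after the ID
            print id
        }
        !/^#/ { exit }                    # stop at the first data record
    '
}
```

Running `as_annotations variants/${prefix}.g.vcf.gz` and `as_annotations variants/${prefix}.AS.g.vcf.gz` would be expected to print nothing for the standard-mode file and a list of AS_ IDs for the AS-mode file.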

Thanks for any advice!
