AD value after GenotypeGVCFs

Hi,

I ran HC on several samples, then ran GenotypeGVCFs to merge all the gVCF files. (I'm using version 3.1.1.)
I've noticed that for some variants, although there are only 2 alleles, the AD values for all samples contain entries for more than 2 alleles.

For example: '0,83,0,0,0' for homozygous or '11,40,0,0,0' for heterozygous.

Below is the complete line:
chr15 82637079 . C T 91046.53 . AC=26;AF=0.867;AN=30;BaseQRankSum=-1.146e+00;ClippingRankSum=2.25;DP=2139;FS=0.000;InbreedingCoeff=-0.1588;MLEAC=26;MLEAF=0.867;MQ=39.29;MQ0=0;MQRankSum=2.18;QD=30.88;ReadPosRankSum=0.174 GT:AD:DP:GQ:PL 1/1:0,23,0,0,0:23:77:1081,77,0 0/1:11,40,0,0,0:51:99:1688,0,473 1/1:1,320,0,0,0:321:99:14836,972,0 1/1:2,168,0,0,0:170:99:7466,471,0 0/1:23,205,0,0,0:228:99:8646,0,649 ./.:.:3 1/1:0,3,0,0,0:3:9:135,9,0 1/1:0,139,0,0,0:139:99:6349,427,0 1/1:0,83,0,0,0:83:99:4251,307,0 0/1:7,60,0,0,0:67:99:2748,0,310 0/1:6,38,0,0,0:44:99:1921,0,124 1/1:0,294,0,0,0:294:99:15437,1085,0 1/1:0,67,0,0,0:67:99:3557,253,0 1/1:0,105,0,0,0:105:99:5289,362,0 1/1:0,243,0,0,0:243:99:11587,810,0 1/1:0,257,0,0,0:257:99:11929,821,0

Any suggestions on how I should interpret these values?

Thank you for your help,
Lily

Answers

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Hi Lily,

    Can you confirm that this is the output of GenotypeGVCFs and not CombineGVCFs? If so, can you tell me if this issue persists with the latest version (3.2)?

    By the way, the purpose of GenotypeGVCFs is not to merge the samples, it is to analyse them jointly. This is an important distinction.

  • Lily (Member)

    I used GenotypeGVCFs.
    I did it as suggested in the answer to a previous question I asked: "you will run HC on your samples individually to generate GVCFs, then run all your GVCFs through GenotypeGVCFs together. This will produce a multi-sample VCF that you can then put through VQSR."

    Should I use CombineGVCFs (after HC before VQSR)?

    Thanks,
    Lily

  • Sheila (Broad Institute; Member, Broadie)

    @Lily

    Hi Lily,

    Can you please share a snippet of your data that is causing this behavior? We would like to reproduce it ourselves. Directions on how to upload files to our server are here: http://gatkforums.broadinstitute.org/discussion/1894/how-do-i-submit-a-detailed-bug-report

    Thanks,
    Sheila

  • Lily (Member)

    Hi Sheila,

    Sure, I can share a snippet of the data, but just to understand:
    Which is better to use to merge the gVCF files after HC and before VQSR: GenotypeGVCFs or CombineGVCFs?
    I haven't tried GenotypeGVCFs in version 3.2 yet...

    Thanks,
    Lily

  • Lily (Member)

    Hi,

    I ran GenotypeGVCFs with version 3.2, and it looks good :-)

    Here is the newer output for the same example as above:
    chr15 82637079 . C T 91046.53 . AC=26;AF=0.867;AN=30;BaseQRankSum=-1.146e+00;ClippingRankSum=2.25;DP=2139;FS=0.000;GQ_MEAN=476.67;GQ_STDDEV=327.31;InbreedingCoeff=-0.1588;MLEAC=26;MLEAF=0.867;MQ=39.29;MQ0=0;MQRankSum=2.18;NCC=1;QD=27.84;ReadPosRankSum=0.174 GT:AD:DP:GQ:PL 1/1:0,23:23:77:1081,77,0 0/1:11,40:51:99:1688,0,473 1/1:1,320:321:99:14836,972,0 1/1:2,168:170:99:7466,471,0 0/1:23,205:228:99:8646,0,649 ./.:3,0:3 1/1:0,3:3:9:135,9,0 1/1:0,139:139:99:6349,427,0 1/1:0,83:83:99:4251,307,0 0/1:7,60:67:99:2748,0,310 0/1:6,38:44:99:1921,0,124 1/1:0,294:294:99:15437,1085,0 1/1:0,67:67:99:3557,253,0 1/1:0,105:105:99:5289,362,0 1/1:0,243:243:99:11587,810,0 1/1:0,257:257:99:11929,821,0

    Thanks for your help,
    Lily
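
The difference between the 3.1.1 and 3.2 outputs above can be checked mechanically: for a site with one REF and one ALT allele, each sample's AD field should hold two comma-separated depths. The following is only a minimal sketch in Python, not part of GATK. The function name `ad_length_mismatches` is hypothetical, the two example records are shortened copies of the lines posted in this thread (INFO trimmed, three samples kept), and the parser assumes whitespace-separated columns with no spaces inside any field.

```python
def ad_length_mismatches(vcf_line):
    """Return (sample_index, AD string) pairs whose AD entry count
    differs from the number of alleles (1 REF + number of ALTs)."""
    fields = vcf_line.split()  # assumption: no spaces inside any column
    n_alleles = 1 + len(fields[4].split(","))   # column 5 is ALT
    format_keys = fields[8].split(":")          # column 9 is FORMAT
    ad_index = format_keys.index("AD")
    mismatches = []
    for i, sample in enumerate(fields[9:]):
        values = sample.split(":")
        # Skip samples where AD is absent or reported as missing (".")
        if ad_index >= len(values) or values[ad_index] == ".":
            continue
        ad = values[ad_index]
        if len(ad.split(",")) != n_alleles:
            mismatches.append((i, ad))
    return mismatches


# Shortened copies of the records from this thread (INFO trimmed for brevity).
OLD_RECORD = ("chr15 82637079 . C T 91046.53 . AC=26;AN=30 GT:AD:DP:GQ:PL "
              "1/1:0,23,0,0,0:23:77:1081,77,0 "
              "0/1:11,40,0,0,0:51:99:1688,0,473 ./.:.:3")
NEW_RECORD = ("chr15 82637079 . C T 91046.53 . AC=26;AN=30 GT:AD:DP:GQ:PL "
              "1/1:0,23:23:77:1081,77,0 "
              "0/1:11,40:51:99:1688,0,473 ./.:3,0:3")

if __name__ == "__main__":
    print(ad_length_mismatches(OLD_RECORD))
    print(ad_length_mismatches(NEW_RECORD))
```

On the shortened 3.1.1 record this flags the two genotyped samples, whose AD fields carry five entries for a two-allele site; on the 3.2 record nothing is flagged.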

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Great, we're glad to hear it!

    To clarify the workflow for anyone who is confused by this: CombineGVCFs is just an optional step to merge files when you have very many individual files to process. In contrast, the purpose of GenotypeGVCFs is not to merge files, but to perform joint genotyping analysis, producing a multi-sample VCF file. So you either just run GenotypeGVCFs, or you run CombineGVCFs and then GenotypeGVCFs. In either case, you always need to run GenotypeGVCFs before VQSR.
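
The two routes described above (GenotypeGVCFs alone, or CombineGVCFs followed by GenotypeGVCFs) can be sketched as a small Python helper that only builds the command lines and does not run them. This is an illustration under stated assumptions: the helper name `joint_genotyping_commands`, the jar path and all file names are hypothetical, and the arguments follow the GATK 3.x command-line style (`-T`, `-R`, `--variant`, `-o`) used by the versions discussed in this thread; other releases may use a different syntax.

```python
def joint_genotyping_commands(gvcfs, reference, out_vcf, combine_first=False,
                              gatk_jar="GenomeAnalysisTK.jar",
                              combined_gvcf="combined.g.vcf"):
    """Build (but do not execute) GATK 3.x-style command lines for joint
    genotyping. All paths and default names here are hypothetical."""

    def gatk(tool, inputs, output):
        cmd = ["java", "-jar", gatk_jar, "-T", tool, "-R", reference]
        for path in inputs:
            cmd += ["--variant", path]
        return cmd + ["-o", output]

    commands = []
    if combine_first:
        # Optional step: merge many per-sample gVCFs into one gVCF first.
        commands.append(gatk("CombineGVCFs", gvcfs, combined_gvcf))
        gvcfs = [combined_gvcf]
    # Always required before VQSR: joint genotyping into a multi-sample VCF.
    commands.append(gatk("GenotypeGVCFs", gvcfs, out_vcf))
    return commands


if __name__ == "__main__":
    for cmd in joint_genotyping_commands(
            ["sample1.g.vcf", "sample2.g.vcf"], "reference.fasta",
            "cohort.vcf", combine_first=True):
        print(" ".join(cmd))
```

Either way, the last command in the returned list is the GenotypeGVCFs call, which matches the point that it always has to run before VQSR.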
