ReadBackedPhasing Read Depths

brcopeland, New York, New York, USA (Member)

Is it intended that ReadBackedPhasing does not propagate the read depths of merged SNVs (and if so, why not)? My lab is considering going back to the original VCF and taking the minimum read depth and allele depths across the merged SNVs, so that we have this information for such variants.

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @brcopeland

    Hi,

    Can you please post some before and after VCF records?

    Thanks,
    Sheila

  • brcopeland, New York, New York, USA (Member)

    Hi Sheila, sure. I had never previously encountered calls lacking AD/DP from HaplotypeCaller in v3.6, and the non-MNP records do retain their read depths after ReadBackedPhasing.

    Before:
    1 899937 rs143296006 G T 556.77 PASS AC=2;AF=1.00;AN=2;DP=14;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;POSITIVE_TRAIN_SITE;QD=29.00;SOR=5.283;VQSLOD=10.54;culprit=MQ;CAF=[0.2378,0.7622];COMMON=1;INT;KGPROD;KGPhase1;OTHERKG;R5;RS=143296006;RSPOS=899937;SAO=0;SSR=0;VC=SNV;VP=0x0500000a0001100016000100;WGT=1;dbSNPBuildID=134 GT:AD:DP:GQ:PGT:PID:PL 1/1:0,13:13:39:1|1:899928_G_C:585,39,0
    1 899938 rs147467971 G C 556.77 PASS AC=2;AF=1.00;AN=2;DP=14;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;POSITIVE_TRAIN_SITE;QD=26.28;SOR=5.283;VQSLOD=10.10;culprit=MQ;CAF=[0.2314,0.7686];COMMON=1;INT;KGPROD;KGPhase1;OTHERKG;R5;RS=147467971;RSPOS=899938;SAO=0;SSR=0;VC=SNV;VP=0x0500000a0001100016000100;WGT=1;dbSNPBuildID=134 GT:AD:DP:GQ:PGT:PID:PL 1/1:0,13:13:39:1|1:899928_G_C:585,39,0

    1 979097 . T C 1121.77 PASS AC=1;AF=0.500;AN=2;BaseQRankSum=0.213;ClippingRankSum=0.00;DP=61;ExcessHet=3.0103;FS=1.172;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.00;POSITIVE_TRAIN_SITE;QD=22.44;ReadPosRankSum=-1.194e+00;SOR=0.473;VQSLOD=10.76;culprit=MQ GT:AD:DP:GQ:PGT:PID:PL 0/1:20,30:50:99:0|1:979090_C_A:1150,0,923
    1 979098 . C T 1121.77 PASS AC=1;AF=0.500;AN=2;BaseQRankSum=-6.460e-01;ClippingRankSum=0.00;DP=59;ExcessHet=3.0103;FS=1.177;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.00;POSITIVE_TRAIN_SITE;QD=23.37;ReadPosRankSum=-9.090e-01;SOR=0.501;VQSLOD=10.66;culprit=MQ GT:AD:DP:GQ:PGT:PID:PL 0/1:20,28:48:99:0|1:979090_C_A:1150,0,923

    1 1770788 rs6657357 C G 49.77 PASS AC=1;AF=0.500;AN=2;BaseQRankSum=0.00;ClippingRankSum=0.00;DP=4;ExcessHet=3.0103;FS=0.000;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.00;POSITIVE_TRAIN_SITE;QD=12.44;ReadPosRankSum=0.431;SOR=2.303;VQSLOD=11.01;culprit=MQ;CAF=[0.7948,0.2052];COMMON=1;G5;G5A;GNO;HD;INT;KGPROD;KGPhase1;KGPilot123;OTHERKG;PH3;RS=6657357;RSPOS=1770788;SAO=0;SLO;SSR=0;VC=SNV;VLD;VP=0x05010008000117051f000100;WGT=1;dbSNPBuildID=116 GT:AD:DP:GQ:PGT:PID:PL 0/1:2,2:4:78:0|1:1770788_C_G:78,0,136
    1 1770789 rs6665287 T A 49.77 PASS AC=1;AF=0.500;AN=2;BaseQRankSum=1.38;ClippingRankSum=0.00;DP=4;ExcessHet=3.0103;FS=0.000;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.00;POSITIVE_TRAIN_SITE;QD=12.44;ReadPosRankSum=0.431;SOR=2.303;VQSLOD=10.48;culprit=MQ;CAF=[0.7948,0.2052];COMMON=1;G5;G5A;GNO;INT;KGPROD;KGPhase1;KGPilot123;OTHERKG;RS=6665287;RSPOS=1770789;SAO=0;SLO;SSR=0;VC=SNV;VLD;VP=0x05010008000117011e000100;WGT=1;dbSNPBuildID=116 GT:AD:DP:GQ:PGT:PID:PL 0/1:2,2:4:78:0|1:1770788_C_G:78,0,136

    After (the additional annotations are just from SnpEff having been run afterwards):
    1 899937 rs143296006;rs147467971 GG TC 556.77 PASS AC=2;AF=1.00;AN=2;ANN=TC|upstream_gene_variant|MODIFIER|PLEKHN1|ENSG00000187583|transcript|ENST00000379409|protein_coding||c.-1975_-1974delGGinsTC|||||1945|,TC|upstream_gene_variant|MODIFIER|PLEKHN1|ENSG00000187583|transcript|ENST00000379410|protein_coding||c.-1975_-1974delGGinsTC|||||1940|,TC|upstream_gene_variant|MODIFIER|PLEKHN1|ENSG00000187583|transcript|ENST00000379407|protein_coding||c.-1975_-1974delGGinsTC|||||1945|,TC|downstream_gene_variant|MODIFIER|KLHL17|ENSG00000187961|transcript|ENST00000463212|retained_intron||n.2079_2080delGGinsTC|||||2079|,TC|downstream_gene_variant|MODIFIER|KLHL17|ENSG00000187961|transcript|ENST00000466300|nonsense_mediated_decay||n.1171_1172delGGinsTC|||||27|,TC|downstream_gene_variant|MODIFIER|KLHL17|ENSG00000187961|transcript|ENST00000481067|retained_intron||n.393_394delGGinsTC|||||393|,TC|intron_variant|MODIFIER|KLHL17|ENSG00000187961|transcript|ENST00000338591|protein_coding|11/11|c.1700+27_1700+28delGGinsTC|||||| GT:GQ 1/1:39
    1 979097 . TC CT 1121.77 PASS AC=1;AF=0.500;AN=2;ANN=CT|missense_variant|MODERATE|AGRN|ENSG00000188157|transcript|ENST00000379370|protein_coding||c.1783_1784delTCinsCT|p.Ser595Leu|1833/7323|1783/6138|595/2045||,CT|upstream_gene_variant|MODIFIER|AGRN|ENSG00000188157|transcript|ENST00000479707|retained_intron||n.-1682_-1681delTCinsCT|||||1682|,CT|upstream_gene_variant|MODIFIER|AGRN|ENSG00000188157|transcript|ENST00000466223|retained_intron||n.-3484_-3483delTCinsCT|||||3484|,CT|upstream_gene_variant|MODIFIER|AGRN|ENSG00000188157|transcript|ENST00000478677|retained_intron||n.-3758_-3757delTCinsCT|||||3758|,CT|upstream_gene_variant|MODIFIER|AGRN|ENSG00000188157|transcript|ENST00000492947|retained_intron||n.-4812_-4811delTCinsCT|||||4812|,CT|downstream_gene_variant|MODIFIER|AGRN|ENSG00000188157|transcript|ENST00000477585|processed_transcript||n.2992_2993delTCinsCT|||||2992|,CT|downstream_gene_variant|MODIFIER|AGRN|ENSG00000188157|transcript|ENST00000469403|retained_intron||n.2320_2321delTCinsCT|||||2320| GT:GQ:HP:PQ 0/1:99:979097-1,979097-2:1160.19
    1 1770788 rs6657357;rs6665287 CT GA 49.77 PASS AC=1;AF=0.500;AN=2;ANN=GA|intron_variant|MODIFIER|GNB1|ENSG00000078369|transcript|ENST00000378609|protein_coding|1/11|c.-95-112_-95-111delAGinsTC||||||,GA|intron_variant|MODIFIER|GNB1|ENSG00000078369|transcript|ENST00000439272|protein_coding|1/7|c.-95-112_-95-111delAGinsTC||||||WARNING_TRANSCRIPT_INCOMPLETE,GA|intron_variant|MODIFIER|GNB1|ENSG00000078369|transcript|ENST00000434686|protein_coding|2/8|c.-95-112_-95-111delAGinsTC||||||WARNING_TRANSCRIPT_INCOMPLETE,GA|intron_variant|MODIFIER|GNB1|ENSG00000078369|transcript|ENST00000472614|processed_transcript|1/2|n.148-112_148-111delAGinsTC|||||| GT:GQ:HP:PQ 0/1:78:1770788-1,1770788-2:144.09

    Thanks,
    Brett

  • brcopeland, New York, New York, USA (Member)
    edited December 2016

    This actually demonstrates another issue: not only are AD and DP not propagated, but neither are MQ, MQRankSum, VQSLOD, etc.

  • brcopeland, New York, New York, USA (Member)

    Understood, Geraldine; that sounds like an ideal situation (though I can understand why it would be a low priority). For now, we just wrote some code to go back to the original VCF and retrieve this information.

    Thank you,
    Brett
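
For illustration only, here is a minimal Python sketch of the kind of lookup described in this thread: for each merged MNP, find the original per-SNV records it spans and take the minimum DP and the element-wise minimum AD. This is not the poster's actual code. The function names (`parse_sample`, `index_original`, `min_depths`) are hypothetical, the sketch assumes single-sample VCF data lines whose fields are separated by whitespace, and the example records are abbreviated versions of the 979097/979098 records above (INFO is shortened to one key).

```python
def parse_sample(fmt, sample):
    """Map FORMAT keys to the sample's values, e.g. {'GT': '0/1', 'AD': '20,30'}."""
    return dict(zip(fmt.split(":"), sample.split(":")))


def index_original(lines):
    """Index single-sample VCF data lines (pre-phasing) by (chrom, pos)."""
    index = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        f = line.split()
        index[(f[0], int(f[1]))] = parse_sample(f[8], f[9])
    return index


def min_depths(merged_line, index):
    """Return (min DP, element-wise min AD) over the SNVs merged into one MNP.

    Returns None when no original record with usable depths is found.
    """
    f = merged_line.split()
    chrom, pos, ref = f[0], int(f[1]), f[3]
    dps, ads = [], []
    for offset in range(len(ref)):
        sample = index.get((chrom, pos + offset))
        if sample is None:
            continue  # this base was not a variant call in the original
        dp, ad = sample.get("DP", "."), sample.get("AD", ".")
        if "." in (dp, ad) or "." in ad.split(","):
            continue  # depths missing for this record
        dps.append(int(dp))
        ads.append([int(x) for x in ad.split(",")])
    if not dps:
        return None
    return min(dps), [min(col) for col in zip(*ads)]


# Abbreviated records from the thread (INFO shortened for readability).
original = [
    "1 979097 . T C 1121.77 PASS DP=61 GT:AD:DP:GQ 0/1:20,30:50:99",
    "1 979098 . C T 1121.77 PASS DP=59 GT:AD:DP:GQ 0/1:20,28:48:99",
]
merged = "1 979097 . TC CT 1121.77 PASS AC=1 GT:GQ 0/1:99"

result = min_depths(merged, index_original(original))
print(result)  # (48, [20, 28])
```

Taking the minimum is a conservative choice: it reports the depth supported at every merged position rather than the depth at the best-covered one. The same lookup could be extended to site-level INFO annotations such as MQ, though how to combine those across sites is a separate decision.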
