DiagnoseTargets NO_READS flagged where there are reads

Hi again,
In my DiagnoseTargets output I have a lot of NO_READS, but most of the flagged targets have a SNP called by HC on the same set of BAM files:

# # # table(rowData(nextseq.dt.vcf)$FILTER)
# # #          LOW_COVERAGE LOW_COVERAGE;NO_READS              NO_READS 
# # #                  2136                  1114                 14775 
# # # NO_READS;POOR_QUALITY                  PASS          POOR_QUALITY 
# # #                   307                 56599                   479 

I have had a look at one site that is flagged as NO_READS, but there are reads:

line from unfiltered VCF from HC

chr1 109808776 . C G 106.25 . AC=1;AF=0.042;AN=24;BaseQRankSum=-1.059e+00;ClippingRankSum=-2.329e+00;DP=127;FS=0.000;GQ_MEAN=36.25;GQ_STDDEV=35.51;InbreedingCoeff=-0.0565;MLEAC=1;MLEAF=0.042;MQ=52.57;MQ0=0;MQRankSum=1.48;NCC=0;QD=6.64;ReadPosRankSum=-2.120e-01 GT:AD:DP:GQ:PL 0/0:4,0:4:12:0,12,87 0/0:12,0:12:36:0,36,297 0/0:7,0:7:21:0,21,169 0/0:21,0:21:60:0,60,514 0/0:8,0:8:24:0,24,172 0/0:6,0:6:6:0,6,90 0/0:11,0:11:24:0,24,254 0/0:11,0:11:27:0,27,288 0/0:11,0:11:27:0,27,255 0/1:9,7:16:99:141,0,172 0/0:11,0:11:30:0,30,321 0/0:9,0:9:27:0,27,228

line from DiagnoseTargets

chr1 109808776 . C <DT> . NO_READS END=109808776;GC=1.00;IDP=165.00 FT:IDP:LL:ZL NO_READS:6.00:0:0 NO_READS:20.00:0:0 NO_READS:11.00:0:0 NO_READS:22.00:0:0 NO_READS:10.00:0:0 NO_READS:11.00:0:0 NO_READS:14.00:0:0 PASS:13.00:0:0 NO_READS:9.00:0:0 PASS:15.00:0:0 PASS:19.00:0:0 NO_READS:15.00:0:0
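The per-sample fields in the DiagnoseTargets record above can be tallied with a few lines of Python. This is only a minimal sketch: the record is pasted in as a whitespace-separated string (a real VCF is tab-delimited), and the helper name `parse_dt_record` is made up for illustration. It counts the per-sample FT statuses, sums the per-sample IDP values, and lists any samples with zero depth.

```python
from collections import Counter

# DiagnoseTargets record for chr1:109808776, copied from the post.
# Assumption: fields are separated by whitespace here; a real VCF uses tabs.
RECORD = (
    "chr1 109808776 . C <DT> . NO_READS END=109808776;GC=1.00;IDP=165.00 "
    "FT:IDP:LL:ZL "
    "NO_READS:6.00:0:0 NO_READS:20.00:0:0 NO_READS:11.00:0:0 "
    "NO_READS:22.00:0:0 NO_READS:10.00:0:0 NO_READS:11.00:0:0 "
    "NO_READS:14.00:0:0 PASS:13.00:0:0 NO_READS:9.00:0:0 "
    "PASS:15.00:0:0 PASS:19.00:0:0 NO_READS:15.00:0:0"
)


def parse_dt_record(line):
    """Split one VCF-style record into site-level and per-sample fields.

    Hypothetical helper written for this example; not part of GATK.
    """
    fields = line.split()
    chrom, pos, _id, ref, alt, qual, flt, info, fmt = fields[:9]
    keys = fmt.split(":")
    # One dict per sample, e.g. {"FT": "NO_READS", "IDP": "6.00", ...}
    samples = [dict(zip(keys, s.split(":"))) for s in fields[9:]]
    # INFO entries are KEY=VALUE; flag-style entries get an empty value.
    info_map = dict(kv.partition("=")[::2] for kv in info.split(";"))
    return {
        "chrom": chrom,
        "pos": int(pos),
        "filter": flt.split(";"),
        "info": info_map,
        "samples": samples,
    }


rec = parse_dt_record(RECORD)

# How many samples carry each per-sample status?
status_counts = Counter(s["FT"] for s in rec["samples"])

# Sum of per-sample depths, to compare with the site-level IDP in INFO.
total_idp = sum(float(s["IDP"]) for s in rec["samples"])

# Indices of samples that really have no depth at this site.
zero_depth = [i for i, s in enumerate(rec["samples"]) if float(s["IDP"]) == 0]

print(status_counts)
print(total_idp, rec["info"]["IDP"])
print(zero_depth)
```

For this record the tally gives nine NO_READS samples and three PASS samples, the per-sample IDP values add up to the site-level IDP of 165, and no sample has a depth of zero, which is the discrepancy described in the question.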

Are there any other filters that these reads are not passing? I couldn't spot anything in the documentation, but I wouldn't put it past me!
I ran a fairly standard incantation of DiagnoseTargets, although I did use --interval_merging OVERLAPPING_ONLY.

Any thoughts?
Many thanks


  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie admin)

    Hi Anna,

    This could be due to the fact that HC does some reassembly of the reads: sometimes they get remapped a little differently, so covered regions become uncovered, and vice versa. You should have HC generate the -bamout file of reassembled reads and compare that to the original BAM that you ran DT on, to see if reads got moved around at all.

  • annat (Member)

    Hi @Geraldine_VdAuwera‌
    I don't think this is the case here, as there are no zero depths in the DiagnoseTargets output for this line. I have also looked in IGV and all seems in order.
    Any other ideas?


  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie admin)

    If I recall correctly, DiagnoseTargets does some thresholding over intervals, so it may be giving an overall NO_READS judgment to an interval that has per-site NO_READS judgments for a certain proportion of sites.

    Sorry for the vagueness -- the coverage analysis tools are not as well documented as they need to be, and even on our end we have some uncertainties about the desired behaviors of certain parts of the tools. We are starting a new project to remedy this situation, which will involve sitting down with the developer who wrote the tool and writing out all these details which currently are recorded only in his brain!
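To make the thresholding idea above concrete, here is a rough Python sketch of a proportion-based vote. It is purely illustrative and not taken from the GATK source: the function name `interval_status`, the `voting_threshold` parameter, and its 0.5 default are all assumptions. The rule it implements is simply "report a status for the whole interval if at least a given fraction of the individual judgments carry that status". The individual judgments could be per-site or per-sample.

```python
from collections import Counter


def interval_status(statuses, voting_threshold=0.5):
    """Combine individual status calls into one overall call.

    Hypothetical rule for illustration only (NOT GATK's actual logic):
    a non-PASS status is reported for the interval when the fraction of
    individual calls carrying it is at least `voting_threshold`.
    The 0.5 default is an assumed value.
    """
    n = len(statuses)
    if n == 0:
        return ["PASS"]
    counts = Counter()
    for call in statuses:
        # A single call may carry several statuses, e.g. "LOW_COVERAGE;NO_READS".
        for status in set(call.split(";")):
            if status != "PASS":
                counts[status] += 1
    flagged = sorted(s for s, c in counts.items() if c / n >= voting_threshold)
    return flagged or ["PASS"]


# Per-sample FT values from the DiagnoseTargets line in the question.
ft_calls = ["NO_READS"] * 7 + ["PASS", "NO_READS", "PASS", "PASS", "NO_READS"]

print(interval_status(ft_calls))        # 9 of 12 calls are NO_READS
print(interval_status(ft_calls, 0.8))   # a stricter, also assumed, threshold
```

Under this assumed rule, 9 of the 12 calls (75%) are NO_READS, so the overall call is NO_READS at a threshold of 0.5 and PASS at 0.8. It does not explain why individual samples with non-zero IDP are flagged NO_READS in the first place.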

  • annatannat Member

    Thanks Geraldine, I think I'll just use the depth from this tool for now.
