Error in GatherBQSRReports

manolis Member ✭✭
edited February 2018 in Ask the GATK team

Hi,

starting from paired-end WES data...

In the BaseRecalibrator step I used these regions/chromosomes as intervals:

chr1-22, chrX, chrY, chrM, the alternative contigs, and the HLA contigs. For the HLA contigs I appended ":1+" to the end of each name.
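
As a rough sketch of the naming rule just described, the snippet below appends ":1+" to HLA contig names and leaves all other names unchanged. The function name `to_interval` and the `HLA-*` pattern match are illustrative assumptions, not part of the original pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a contig name into an interval string.
# HLA contig names contain colons, so ":1+" is appended to them,
# as described above; every other name passes through unchanged.
to_interval() {
  local contig="$1"
  case "$contig" in
    HLA-*) printf '%s:1+\n' "$contig" ;;
    *)     printf '%s\n' "$contig" ;;
  esac
}

to_interval 'chr1'
to_interval 'HLA-DRB1*16:02:01'
```

Quoting the contig name matters here, because an unquoted `*` in an HLA name could be expanded by the shell.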

Next, in the GatherBQSRReports step, I saw this "error" in the log file:

### START LOG FILE

....
05:24:49.607 INFO GatherBQSRReports - Done initializing engine
05:24:54.887 INFO RecalibrationReport - Missing read group(s): /home/manolis/GATK4/IlluminaExomePairEnd/4.BAM/processing/15_1143_030_recaldata.csv
05:24:54.916 INFO RecalibrationReport - A00125.27
05:24:54.917 INFO RecalibrationReport - Missing read group(s): /home/manolis/GATK4/IlluminaExomePairEnd/4.BAM/processing/15_1143_031_recaldata.csv
05:24:54.917 INFO RecalibrationReport - A00125.27
...
...
05:24:54.972 INFO RecalibrationReport - Missing read group(s): /home/manolis/GATK4/IlluminaExomePairEnd/4.BAM/processing/15_1143_811_recaldata.csv
05:24:54.972 INFO RecalibrationReport - A00125.27
05:24:59.650 INFO GatherBQSRReports - Shutting down engine
[February 20, 2018 5:24:59 AM CET] org.broadinstitute.hellbender.tools.walkers.bqsr.GatherBQSRReports done. Elapsed time: 0.17 minutes.
Runtime.totalMemory()=5915017216

### END LOG FILE

15_1143 = sample ID; 030, 031, ... 811 = custom chr/contig ID

[ "," stands for "\n" ]
custom chr/contig ID 1-25 = chr1-22, X, Y, M (example: chr1 , ... , chrX , chrY , chrM)
custom chr/contig ID 26-286 = alternative contigs (example: chr1_KI270762v1_alt , chr1_KI270766v1_alt , ...)
custom chr/contig ID 287-811 = HLA contigs (example: HLA-A*01:01:01:01:1+ , HLA-A*01:01:01:02N:1+)

Examples of custom chr/contig IDs reported with "Missing read group(s)":

ID 30: chr1_GL383518v1_alt
ID 31: chr1_GL383519v1_alt
...
ID 811: HLA-DRB1*16:02:01:1+

IDs reported with "Missing read group(s)", in order:

030
031
055
061
077
104
111
118
150
176
191
212
213
214
215
... ... ...

The IDs not reported in the log file are OK... I think. All downstream steps through the HaplotypeCaller step work. I have not yet tried the steps downstream of HaplotypeCaller...

This "Missing read group(s)" error is repeated 563 times. The number of chr/contigs in the interval list is 811.
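
A count like the one above can be reproduced from a saved log. The following is a minimal sketch, assuming the log lines have the same layout as the excerpt shown earlier; the function name `missing_rg_reports` and the file name `gather.log` are placeholders.

```shell
#!/usr/bin/env bash
# Print the report path from every "Missing read group(s)" line on stdin.
# In a basic regular expression the parentheses are literal characters.
missing_rg_reports() {
  sed -n 's/.*Missing read group(s): //p'
}

# Usage (gather.log is a placeholder for the saved log file):
#   missing_rg_reports < gather.log | sort -u | wc -l
```

Piping through `sort -u` before counting guards against the same report being listed more than once.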

Any suggestions?

All the best

Answers

  • manolis Member ✭✭

    Hi, I add the read group information when converting the FASTQ files to a uBAM file.

    echo "> Create uBAM starting from fastq trimmed files"

    java -jar ${ph3} FastqToSam F1=${val1} F2=${val2} O=${uBAM} SO=queryname RG="${PU1}.${PU2}" SM=${SM} LB=${LB} PL=${PL}
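
    To check which read group IDs actually ended up in a file (for example the A00125.27 named in the log), the header can be inspected. This is only a sketch: `list_rg_ids` is a hypothetical helper, and it assumes `samtools` is installed to print the header.

    ```shell
    #!/usr/bin/env bash
    # Print the ID of every @RG line in SAM header text read from stdin.
    list_rg_ids() {
      awk -F'\t' '$1 == "@RG" {
        for (i = 2; i <= NF; i++)
          if ($i ~ /^ID:/) print substr($i, 4)
      }'
    }

    # Usage (prints the header of the uBAM and extracts the IDs):
    #   samtools view -H "${uBAM}" | list_rg_ids
    ```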

    I will try to apply the suggestions in the article.

    I will let you know, many thanks!

  • manolis Member ✭✭
    edited March 2018

    GATK v4.0.2.1

    Hi,

    I was adding the RG information only in the FastqToSam step (uBAM), but now I also add it in the BWA step. After the last step, "SetNmAndUqTags", ValidateSamFile reports no errors ("No errors found").
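
    One common way to pass read group information to BWA is the `-R` option of `bwa mem`. The sketch below builds such a header line from the same variables used in the FastqToSam command; `rg_line`, `ref.fa`, `r1.fq` and `r2.fq` are placeholders, and whether this matches the command actually used in this pipeline is an assumption.

    ```shell
    #!/usr/bin/env bash
    # Hypothetical helper: build the @RG header line for `bwa mem -R`.
    # Arguments: read group ID, sample, library, platform.
    # The output contains a literal backslash followed by "t" between fields,
    # which bwa converts to tab characters itself.
    rg_line() {
      printf '@RG\\tID:%s\\tSM:%s\\tLB:%s\\tPL:%s\n' "$1" "$2" "$3" "$4"
    }

    # Usage (ref.fa, r1.fq and r2.fq are placeholders):
    #   bwa mem -R "$(rg_line "${PU1}.${PU2}" "${SM}" "${LB}" "${PL}")" ref.fa r1.fq r2.fq
    ```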

    Now, I'm working only with "chr1-22,X,Y".

    I think that everything is OK, at least when using only the chromosomes above.

    Best

  • shlee Cambridge Member, Broadie ✭✭✭✭✭
    Accepted Answer

    Great to hear.