Unclear error message on missing @RG tag in header

TechnicalVault (Cambridge, UK; Member)
edited January 2014 in Ask the GATK team

When GATK encounters a read whose RG tag has no corresponding @RG line in the header, the error message implies that the read itself lacks an RG tag, rather than that the header is missing the entry. Could this be fixed, please, so that the two error conditions are distinguished? It would save people time when debugging their pipelines if they didn't have to go looking in the wrong place.

ERROR MESSAGE: SAM/BAM file SAMFileReader{/lustre/blah/DDD_MAIN5247030.bam} is malformed: Read HS7_7515:4:2101:12189:66438#2 is missing the read group (RG) tag, which is required by the GATK. Please use to fix this problem

The reads do have an RG tag (RG:Z:1#2 in the records below), but no @RG line with a matching ID exists in the header.

901282:HS7_7515:4:2101:12189:66438#2 99 1 37000590 60 75M = 37000629 114 * * X0:i:1 X1:i:0 BC:Z:CGATGTAT BD:Z:* MD:Z:75 PG:Z:MarkDuplicates RG:Z:1#2 BI:Z:* AM:i:37 NM:i:0 SM:i:37 MQ:i:60 QT:Z:BCAADFFE XT:A:U BQ:Z:*
901283:HS7_7515:4:2101:12189:66438#2 147 1 37000629 60 75M = 37000590 -114 * * X0:i:1 X1:i:0 BD:Z:* MD:Z:75 PG:Z:MarkDuplicates RG:Z:1#2 BI:Z:* AM:i:37 NM:i:0 SM:i:37 MQ:i:60 XT:A:U BQ:Z:*
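For anyone hitting the same message, here is a rough sketch of how the two conditions could be told apart outside GATK. It is plain Python over tab-separated SAM text (e.g. the output of samtools view -h), not GATK's own validation code; the function names and the example header are made up for illustration.

```python
def header_read_groups(header_lines):
    """Collect the ID of every @RG line found in a SAM header."""
    ids = set()
    for line in header_lines:
        if line.startswith("@RG"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("ID:"):
                    ids.add(field[3:])
    return ids


def classify_read(sam_line, known_ids):
    """Return a description of the read-group problem, or None if the read is fine."""
    fields = sam_line.rstrip("\n").split("\t")
    name = fields[0]
    rg = None
    # The first 11 columns are mandatory; optional tags start at column 12.
    for tag in fields[11:]:
        if tag.startswith("RG:Z:"):
            rg = tag[5:]
            break
    if rg is None:
        return f"{name}: read has no RG tag"
    if rg not in known_ids:
        return f"{name}: RG tag '{rg}' has no matching @RG line in the header"
    return None


if __name__ == "__main__":
    # Hypothetical header: it declares read group 1#1 but not 1#2.
    header = ["@HD\tVN:1.4\tSO:coordinate", "@RG\tID:1#1\tSM:example"]
    read = "HS7_7515:4:2101:12189:66438#2\t99\t1\t37000590\t60\t75M\t=\t37000629\t114\t*\t*\tRG:Z:1#2"
    print(classify_read(read, header_read_groups(header)))
```

With a header like the hypothetical one above, the read is reported as having an RG tag with no matching @RG line, which is the case the current GATK message hides.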

P.S. your spam filter is stopping me posting discussions with URLs in, could you whitelist any gatkforums dot broad institute dot org urls?

